Introduction
The SOAP web server performs the actual analysis, while this web site only helps present the results. Under the covers, the web site makes calls to the web server. This means that jobs launched via the web site run directly on the web server, and their results can be queried interchangeably via the web site or the web server. Jobs launched via the web server can also be inspected via the web site, which makes it possible to schedule a batch of jobs during the night and analyze them the next day using the web site. The web server is implemented using SOAP, and the WSDL file can be found here: SentWS.wsdl
The web server works asynchronously: it provides a set of methods that initiate analysis jobs, and methods that can be used to query the progress of these jobs and to gather the results once they finish. Several jobs can be issued in parallel, and an overload of jobs is handled with a queue. Depending on its characteristics, a job can last from a couple of minutes to half an hour (for fine-grained and custom analyses). Each job is identified by a unique job identifier that is created when the job is started and is then used to query that job's status and results. Additionally, there are a few methods that query general characteristics of the server, such as the organism datasets supported. A full description of the methods follows.
To create a client driver in Ruby you can use the following code:
require 'soap/wsdlDriver'
wsdl_url = "http://sent.dacya.ucm.es/wsdl/SentWS.wsdl"
driver = SOAP::WSDLDriverFactory.new(wsdl_url).create_rpc_driver
An example script can be found at sent.rb. An example execution is:
ruby sent.rb --org human --genes human.txt --factors 3
Where human.txt could be the file human.txt
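Because the server works asynchronously, a client typically starts a job and then polls until it reaches a final state. The following is a minimal sketch of such a polling loop, not part of the SENT distribution. It assumes that the `done`, `error` and `aborted` methods each take the job id as their only argument and return a boolean; the argument lists are not confirmed here. The helper `wait_for_job` and the `FakeDriver` class are hypothetical names introduced for illustration, with `FakeDriver` standing in for the real SOAP driver so that the sketch runs without network access.

```ruby
# Hypothetical helper: poll a job until it reaches a final state.
# Assumption: done/error/aborted each accept the job id and return a boolean.
def wait_for_job(driver, job_id, interval: 30, max_polls: 120)
  max_polls.times do
    if driver.done(job_id)
      # done is also true for failed or aborted jobs, so check those states.
      return :error   if driver.error(job_id)
      return :aborted if driver.aborted(job_id)
      return :done
    end
    sleep interval
  end
  :timeout
end

# Stand-in for the SOAP driver (illustration only): reports the job as
# finished after a fixed number of polls.
class FakeDriver
  def initialize(polls_until_done)
    @remaining = polls_until_done
  end

  def done(_job_id)
    @remaining -= 1
    @remaining <= 0
  end

  def error(_job_id)
    false
  end

  def aborted(_job_id)
    false
  end
end

state = wait_for_job(FakeDriver.new(3), "job-1", interval: 0)
puts state
```

With the real driver, the same helper could be called with the job id returned by one of the analysis methods, using a polling interval suited to jobs that last from minutes to half an hour.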
| Analysis | |||
|---|---|---|---|
| Method | Arguments | Returns | Description |
| analyze | | Job id | Performs a standard analysis. |
| fine_grained | | Job id | As before, but the word importance weights are computed in situ. This analysis takes more time and renders more detailed results. |
| custom | | Job id | The |
| Job Query | |||
| Method | Arguments | Returns | Description |
| status | | Status string | Returns a string with the status of the job. The possible values are, in order: |
| messages | | Array of message strings | These are messages generated by the analysis. They are usually verbose descriptions of the states the job goes through, but in the case of error and aborted, they can hold information about the nature of the error. |
| abort | | Nothing | Aborts the execution of an analysis job. |
| done, error, aborted | | Boolean: true or false | Checks if the job is in that particular state. Note that done is true when the job is in any of the states done, error, or aborted, not only in the state done. |
| info | | YAML structure with information about the job | This method returns a hash with information such as the genes used in the analysis and how they were translated to the native identifier format, the number of factors used, whether there is a job computing the literature index or if the job is |
| stems | | Content of the stem file | The stem file is a tab-separated file that lists each of the stems and a comma-separated list of the words that have been found to produce that stem. |
| associations | | Content of the associations file | The associations file lists all the associations between genes and PubMed ids. It is also tab-separated. The associations may be spread across several lines. |
| search_literature | | Contents of a tab-separated list: one PubMed id per line, followed by the score for the query | Queries the literature index of the job with the provided links and returns a list of value pairs containing the PubMed id and the score. The literature index must have been computed already with the literature method described below. |
| results | | Array of result ids | Returns an array of result identifiers that can then be used to retrieve the content of the actual result files. The ids correspond to the following files: |
| result | | String with the content of the file | The content of the file that the id represents, in |
| Job Extension | |||
| Method | Arguments | Returns | Description |
| recluster | | Job id | Takes the job denoted by the analysis job id and redoes the final part of the analysis, making a different number of clusters. By default the analysis takes the 10 executions of the factorization and makes as many clusters as factors, with roughly 10 factors each. The analysis is assigned a new job id, but the original analysis job status will reflect the state of the analysis as well. The results of the new re-clustering will be accessible from the original job id. This step takes considerably less time than redoing the complete analysis. |
| refactor | | Job id | As before, but redoes the analysis from the factorization step, not only from the clustering step. This method is especially helpful to save time with |
| build_index | | Job id | Builds a literature index with the job's associated literature that can be used in the |
| reset | | Nothing | If an extension job, like |
| clear_index | | Nothing | Resets all the literature index information of the job, including erasing the index. This is helpful if the |
| Other | |||
| Method | Arguments | Returns | Description |
| datasets | None | Array of identifier strings for datasets | Returns the identifiers for the datasets supported by the server. These are the ones that must be specified in the |
| description | | YAML hash with description information for the dataset | This information includes the organism, the number of genes and articles considered, and the supported format of the ids. |
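Several of the query methods return file contents as plain strings that the client has to parse. The sketch below shows one way to do so for the tab-separated outputs described for stems and search_literature. The helper names `parse_stems` and `parse_scores` are hypothetical, the exact file layout is inferred from the descriptions in the table, and the sample strings are invented for illustration rather than taken from a real job.

```ruby
# Parse stem file content. Assumed layout: one stem per line, a tab,
# then a comma-separated list of the words that produce that stem.
def parse_stems(content)
  content.each_line.with_object({}) do |line, stems|
    stem, words = line.chomp.split("\t", 2)
    next if stem.nil? || stem.empty?   # skip blank lines
    stems[stem] = (words || "").split(",").map(&:strip)
  end
end

# Parse search_literature output. Assumed layout: one PubMed id per line,
# a tab, then the score for the query.
def parse_scores(content)
  content.each_line.map do |line|
    pmid, score = line.chomp.split("\t", 2)
    next if pmid.nil? || pmid.empty?   # skip blank lines
    [pmid, score.to_f]
  end.compact
end

# Invented sample data, standing in for strings returned by the server.
sample_stems  = "bind\tbinding,binds\nactiv\tactivation,activate,activated\n"
sample_scores = "12345678\t0.82\n23456789\t0.41\n"

p parse_stems(sample_stems)
p parse_scores(sample_scores)
```

Keeping the parsing separate from the SOAP calls means the same helpers can be applied to results fetched earlier and stored on disk.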