The first step involves a coarse, overlapping clustering of the abstracts. References are classified into eight classes depending on their subject. Classes correspond to MeSH main categories, such as ‘Anatomy’, ‘Organisms’, ‘Chemicals and Drugs’, ‘Biological Sciences’, etc. (see http://www.nlm.nih.gov/mesh/meshhome.html). You can apply an initial filter to restrict the search to categories of interest, and you can also filter the search results by publication date (see page 2 of the tutorial).
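The category and date filtering described above can be pictured with a minimal Python sketch. It is only an illustration, not XplorMed's implementation: the `Abstract` record, its field names, and the `filter_abstracts` function are hypothetical, and each abstract is assumed to already carry its set of MeSH main categories (an abstract may belong to several, since the clustering is overlapping).

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Set


@dataclass
class Abstract:
    # Hypothetical record; the field names are assumptions for illustration.
    pmid: str
    categories: Set[str]   # MeSH main categories assigned to the abstract
    published: date


def filter_abstracts(
    abstracts: Iterable[Abstract],
    wanted_categories: Optional[Set[str]] = None,
    since: Optional[date] = None,
    until: Optional[date] = None,
) -> List[Abstract]:
    """Keep abstracts in at least one wanted category and inside the date range."""
    selected = []
    for abstract in abstracts:
        # Overlapping classes: one shared category is enough to keep the abstract.
        if wanted_categories and not (abstract.categories & wanted_categories):
            continue
        if since is not None and abstract.published < since:
            continue
        if until is not None and abstract.published > until:
            continue
        selected.append(abstract)
    return selected
```

Calling `filter_abstracts(records, wanted_categories={"Chemicals and Drugs"}, since=date(1995, 1, 1))` would keep only the abstracts in that category published from 1995 onwards; omitting all optional arguments returns every abstract.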
The next web page displays the keywords found in the selected abstracts. The method for computing keywords and the relations between them is described in the literature (5). The list of extracted keywords summarizes the subjects within the query results; keywords are listed in order of relevance (more important concepts are listed first). Considering the above example of heparin and Alzheimer, XplorMed gives expected terms (‘protein’, ‘heparin’, ‘alzheimer’ and ‘disease’) in addition to others that may be new to you, for example, ‘tau’ and ‘app’.
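To make the idea of a relevance-ordered keyword list concrete, here is a small Python sketch. It does not reproduce the method of reference 5. As a stand-in, it simply ranks words by the number of abstracts in which they occur (document frequency) after removing a few common words. The function name `rank_keywords` and the short `STOP_WORDS` list are assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real system would use a much larger one.
STOP_WORDS = {"the", "of", "and", "in", "to", "a", "is", "with", "that", "for"}


def rank_keywords(abstract_texts, top_n=10):
    """Rank words by how many abstracts contain them (a simple stand-in score)."""
    doc_freq = Counter()
    for text in abstract_texts:
        # Count each word once per abstract, ignoring case.
        words = set(re.findall(r"[a-z]+", text.lower()))
        doc_freq.update(words - STOP_WORDS)
    # Most frequent first; ties are broken alphabetically for a stable order.
    ranked = sorted(doc_freq.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]
```

With a handful of abstracts about heparin and Alzheimer, such a ranking would be expected to place frequent subject words near the top, although the ordering produced by XplorMed itself may differ because its scoring is more elaborate.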
At this stage, you can choose whether to go directly to the next step or to start a deeper analysis of the displayed subjects. The latter involves a context analysis of the subjects represented by the keywords and is outlined briefly below (see Context Analysis of the Subjects). If you instead proceed to the next step, several groups or chains of closely related keywords are presented to you.
You can modify the number of chains and their length by means of two parameters: alpha and score (see page 3 of the tutorial for details). Each chain is preceded by a number that indicates how many abstracts contain both words. By selecting one or more of these chains, you perform a sub-query of the original set. For example, suppose you are interested in protein domains that could bind heparin. Accordingly, you would inspect the pair {protein, domain}, which appears in 13 references. If you do not find what you want among the chains proposed by the system, you can select an alternative or additional word chain.
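The count shown in front of each chain, and the sub-query obtained by selecting chains, can be sketched as follows. This is a simplified illustration under stated assumptions: abstracts are held in a dictionary from identifier to text, words are matched exactly after lowercasing (so ‘domain’ would not match ‘domains’), and selecting several chains is assumed to keep the union of their abstracts. The function names are hypothetical, and the alpha and score parameters are not modelled.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric words in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def abstracts_with_chain(abstracts, chain):
    """Return the ids of abstracts that contain every word of the chain."""
    wanted = {word.lower() for word in chain}
    return [pmid for pmid, text in abstracts.items() if wanted <= tokens(text)]


def sub_query(abstracts, selected_chains):
    """Keep abstracts matching at least one selected chain (union; an assumption)."""
    keep = set()
    for chain in selected_chains:
        keep.update(abstracts_with_chain(abstracts, chain))
    return {pmid: text for pmid, text in abstracts.items() if pmid in keep}
```

Here `len(abstracts_with_chain(abstracts, ("protein", "domain")))` plays the role of the number displayed before the chain {protein, domain}, and `sub_query(abstracts, [("protein", "domain")])` returns the reduced set of abstracts for the next step.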
The next web page provides an ordered list of abstracts; those likely to be most interesting according to your selection are highlighted at the top (in our example, the papers dealing with the heparin binding domain). If you checked the boxes for cross-linking to the corresponding databases on the previous page, several hyperlinked symbols will label some abstracts (see Cross Linking to Molecular Biology Databases).
The filtered subset of papers can now be used as a new XplorMed starting point at the computation-of-keywords step (see above). Alternatively, you can expand this subset with new papers from among their MEDLINE neighbors (see Expanding the Query through Related Bibliography). New keywords focusing more closely on your subject of interest will appear at this stage. The procedure can be repeated as many times as needed, and the current set of abstracts can be retrieved at any stage.