NLP Web Services Guide

NLP Programmatic Interface:

NCIBI provides web service access to the NLP database, which contains segmented sentences and named entity tags from documents in PubMed and PubMed Central (Open Access). Programmatic access may be useful for integrating NLP data into an analytic workflow or data pipeline.

Base URL:

http://nlp.ncibi.org/fetch?

URL Parameters:

pmid
PMID is the PubMed ID for a document.

pmcid
PMCID is the PubMed Central ID for the document.

tagger
Tagger is the named entity tagger. Currently, only the “NameTagger” tagger is supported.

parser
Parser is the grammatical parser. Currently, only the “Stanford” parser is supported.

type
Type specifies the output generated by the tagger or parser. For the "nametagger" tagger, the only currently supported type is "gene". For the "stanford" parser, the only currently supported type is "phrase".

id
Id is the canonical identifier for the named entity tag. “NameTagger” identifiers for the “Gene” type are Entrez Gene IDs.

limit
Optional. Limit sets the maximum number of returned results; it defaults to 1000 if not set.

tool
Optional. Tool is a string with no internal spaces that identifies the resource using the service. Inclusion of this parameter allows us to track usage of the service.

email
Optional. Email associates an email address with the request. Inclusion of this parameter allows us to contact users if there are problems or if the software interface changes.

metadata
Optional. When set to "all", the output also includes the author, journal, title, and date of publication.

The following are valid parameter combinations:

pmid
pmcid
pmid & tagger & type
tagger & type & id
pmid & parser & type
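
The valid combinations listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the NCIBI samples: the class name NlpParamCheck and its method are hypothetical, and it assumes that the optional parameters (limit, tool, email, metadata) may accompany any valid combination, which the guide suggests but does not state outright. It requires Java 9 or later.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical client-side helper; the service itself remains the authority
// on which requests it accepts.
class NlpParamCheck {
    // Parameters the guide marks as optional.
    static final Set<String> OPTIONAL = Set.of("limit", "tool", "email", "metadata");

    // The five combinations the guide lists as valid.
    static final List<Set<String>> VALID = List.of(
        Set.of("pmid"),
        Set.of("pmcid"),
        Set.of("pmid", "tagger", "type"),
        Set.of("tagger", "type", "id"),
        Set.of("pmid", "parser", "type"));

    // Assumption: optional parameters are ignored when matching a combination.
    static boolean isValid(Set<String> paramNames) {
        Set<String> required = new HashSet<>(paramNames);
        required.removeAll(OPTIONAL);
        return VALID.contains(required);
    }

    public static void main(String[] args) {
        System.out.println(isValid(Set.of("pmid", "tagger", "type", "metadata")));
        System.out.println(isValid(Set.of("tagger", "type")));
    }
}
```

Under the stated assumption, the first call matches the "pmid & tagger & type" combination and the second matches none, because "tagger & type" needs either pmid or id.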

Examples:

http://nlp.ncibi.org/fetch?pmid=17523140
http://nlp.ncibi.org/fetch?pmcid=2672633
http://nlp.ncibi.org/fetch?pmid=17523140&tagger=nametagger&type=gene
http://nlp.ncibi.org/fetch?pmid=17523140&tagger=nametagger&type=gene&metadata=all
http://nlp.ncibi.org/fetch?tagger=nametagger&type=gene&id=11137
http://nlp.ncibi.org/fetch?tagger=nametagger&type=gene&id=11137&limit=10
http://nlp.ncibi.org/fetch?pmid=17523140&type=phrase&parser=Stanford
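
Request URLs like the examples above can be assembled from the base URL and a map of parameters. The sketch below is one way to do that in Java 10 or later; the class name NlpFetchUrl and the build method are hypothetical and do not come from the NCIBI samples. It only builds the URL string and does not contact the service.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for assembling fetch URLs.
class NlpFetchUrl {
    // Base URL as given in this guide.
    static final String BASE_URL = "http://nlp.ncibi.org/fetch?";

    // Joins name=value pairs with '&' in map iteration order,
    // percent-encoding each value (relevant for email addresses, for example).
    static String build(Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            String value = URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8);
            query.add(entry.getKey() + "=" + value);
        }
        return BASE_URL + query;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("tagger", "nametagger");
        params.put("type", "gene");
        params.put("id", "11137");
        params.put("limit", "10");
        System.out.println(build(params));
    }
}
```

Run as written, this prints the same URL as the "tagger & type & id" example with limit=10. The optional tool and email parameters can be added to the map in the same way.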

Output:

The XML returned is divided into two sections. The “Request” section echoes the input parameters and the values of unset parameters. The “Response” section holds the data from the query.
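
The two sections can be separated with the standard Java DOM parser. The sketch below is illustrative only: this guide does not document the XML schema, so the element names "Request" and "Response", the root element "Result", and the inline sample document are all assumptions that should be checked against a real response. The class name NlpResponseSections is likewise hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper; element names are assumed, not taken from the service.
class NlpResponseSections {
    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns the first element with the given tag name, or null if absent.
    static Element section(Document doc, String tagName) {
        NodeList matches = doc.getElementsByTagName(tagName);
        return matches.getLength() == 0 ? null : (Element) matches.item(0);
    }

    public static void main(String[] args) {
        // Made-up stand-in for a service response; the real layout may differ.
        String xml = "<Result><Request><pmid>17523140</pmid></Request>"
                   + "<Response></Response></Result>";
        Document doc = parse(xml);
        Element request = section(doc, "Request");
        Element response = section(doc, "Response");
        System.out.println("Request text: " + request.getTextContent());
        System.out.println("Response present: " + (response != null));
    }
}
```

In real use, the string passed to parse would be the body returned by a fetch request, and the names passed to section would be adjusted to match the actual element names.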

Sample Code For Retrieving and Parsing NCIBI NLP Data

The following five samples, provided in Java and Perl, show how to interact with the NCIBI web service by querying for PMIDs, sentences, and named entity tags within documents. Samples 1-3 query the NCIBI-WS. Sample 4 shows a combined query to the NCIBI-WS and eUtils. Sample 5 shows a query combining eUtils and three of NCIBI's web services: NLP, Gene2MeSH, and MiMI.

Sample 1 - Java, Perl
Sample 2 - Java, Perl
Sample 3 - Java, Perl
Sample 4 - Java
Sample 5 - Perl