
Downloading Using the OAI Service

Overview

The PubMed Central Canada OAI service (CAPMC-OAI) provides access to the metadata of all items in the PubMed Central Canada (CAPMC) archive, as well as to the full text of a large subset of these items.

CAPMC-OAI is an implementation of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), a standard for retrieving metadata from digital document repositories. The Open Archives Initiative (OAI) is an attempt to build a "low-barrier interoperability framework" for archives containing digital content and allows anyone to harvest metadata. Visit the Open Archives Initiative site for more information about the protocol and other activities of the OAI group.

CAPMC-OAI supports OAI-PMH version 2.0. It does not support earlier versions of the protocol.

Copyright

Most of the items in this archive are copyright protected, with copyright held by the author(s) or the depositing journal. In general, the OAI service cannot be used to retrieve the full text of articles in PMC. The only exceptions to this policy are for articles that are in the public domain and those that are made available under an Open Access provision (as defined in the Open Access List). See the PMC copyright notice for more information.

Using the CAPMC-OAI Service

The base URL for the service is http://pubmedcentralcanada.ca/oai.cgi?.

CAPMC OAI verbs: http://pubmedcentralcanada.ca/oai.cgi?verb=Identify

CAPMC metadata formats: http://pubmedcentralcanada.ca/oai.cgi?verb=ListMetadataFormats

CAPMC sets: http://pubmedcentralcanada.ca/oai.cgi?verb=ListSets
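
The request URLs above all follow one pattern: the base URL, then a verb argument, then any further arguments. A minimal sketch of assembling them is shown below. The helper name build_oai_url is hypothetical and not part of the service; the lowercase argument name "verb" follows OAI-PMH 2.0.

```python
from urllib.parse import urlencode

# Base URL as given in this document, without the trailing "?".
BASE_URL = "http://pubmedcentralcanada.ca/oai.cgi"


def build_oai_url(verb, params=None):
    """Return a request URL for an OAI-PMH verb (hypothetical helper).

    `params` is an optional dict of further arguments such as
    metadataPrefix, from, or set. A dict is used instead of keyword
    arguments because `from` is a reserved word in Python.
    """
    query = [("verb", verb)]
    query.extend((params or {}).items())
    return BASE_URL + "?" + urlencode(query)


# The three discovery requests listed above.
DISCOVERY_URLS = {
    verb: build_oai_url(verb)
    for verb in ("Identify", "ListMetadataFormats", "ListSets")
}

if __name__ == "__main__":
    for verb, url in DISCOVERY_URLS.items():
        print(verb, url)
```

The same helper can build record requests, for example build_oai_url("ListRecords", {"metadataPrefix": "oai_dc"}).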

High-Volume Retrievals

If you are using a script that makes more than 100 requests of any kind, please run it outside of the PMC system's peak hours. Do not make more than one request every 3 seconds, even at off-peak times. Peak hours are Monday to Friday, 5:00 AM to 9:00 PM, U.S. Eastern time.
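
A harvesting script can enforce both rules itself. The sketch below is one possible approach, not an official client: is_peak_hour and Throttle are hypothetical names, the peak window is treated as 5:00 AM inclusive to 9:00 PM exclusive (an assumption about the boundaries), and "America/New_York" is assumed to be the appropriate zone for U.S. Eastern time.

```python
import time
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

MIN_INTERVAL_SECONDS = 3.0  # at most one request every 3 seconds


def is_peak_hour(eastern_time):
    """True if the given U.S. Eastern datetime is Mon-Fri, 5:00 AM to 9:00 PM."""
    is_weekday = eastern_time.weekday() < 5  # Monday is 0, Friday is 4
    return is_weekday and 5 <= eastern_time.hour < 21


def eastern_now():
    """Current time in U.S. Eastern time (zone name is an assumption)."""
    return datetime.now(ZoneInfo("America/New_York"))


class Throttle:
    """Blocks so that successive wait() calls are at least min_interval apart."""

    def __init__(self, min_interval=MIN_INTERVAL_SECONDS,
                 clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._last = None      # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A script that expects to make more than 100 requests could check is_peak_hour(eastern_now()) before starting and call throttle.wait() before every request.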

Access to Full Text

Some CAPMC journals allow harvesting of the full text of all items, others allow it for only some items, and many do not allow it at all. See the PMC Open Access List for specifics.

Use metadataPrefix=pmc to get the full text.
e.g. http://pubmedcentralcanada.ca/oai.cgi?verb=ListRecords&from=2007-10-01&metadataPrefix=pmc

In addition, the parameter set=pmc-open identifies the complete collection of items in PMC for which the full text may be harvested.
e.g. http://pubmedcentralcanada.ca/oai.cgi?verb=ListRecords&metadataPrefix=pmc&set=pmc-open

Supported Data Formats

Records may be retrieved from the CAPMC archive in one of the following formats:

  • The U.S. National Library of Medicine (NLM) Journal Archiving and Interchange XML format, for metadata or full-text article records. A DTD, W3C Schema, and complete documentation for this format are available from http://dtd.nlm.nih.gov.
    • Use metadataPrefix=pmc_fm for metadata, and
    • metadataPrefix=pmc to get the full text of an item.
  • The Dublin Core format, for metadata only (metadataPrefix=oai_dc). Information about Dublin Core is available at http://dublincore.org.

Automatic Segmentation of Large Result Sets

If a ListIdentifiers request results in more than 1,000 hits, CAPMC-OAI will return the first 1,000 with a resumptionToken that can be used to get the remaining items.

If a ListRecords request results in more than 25 hits for PMC full text, 50 hits for PMC metadata or 250 hits for Dublin Core format metadata, CAPMC-OAI will return the first 25, 50 or 250 records, respectively, with a resumptionToken.
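
Following a segmented result set can be sketched as below. Under OAI-PMH 2.0, the follow-up request carries only the verb and the resumptionToken, and the final segment has an empty resumptionToken element or none at all. The function names are hypothetical, and fetch stands for any caller-supplied function that takes a URL and returns the response XML as a string; no network code is included here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "http://pubmedcentralcanada.ca/oai.cgi"
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 namespace


def extract_resumption_token(response_xml):
    """Return the resumptionToken text, or None if this is the last segment."""
    root = ET.fromstring(response_xml)
    token = root.find(".//" + OAI_NS + "resumptionToken")
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()


def next_page_url(verb, token):
    """Build the follow-up request: verb plus resumptionToken only."""
    return BASE_URL + "?" + urlencode({"verb": verb, "resumptionToken": token})


def iter_pages(fetch, first_url, verb):
    """Yield each response segment, following tokens until none remains.

    `fetch` is a caller-supplied callable (an assumption of this sketch)
    that should also apply the request-rate limits described earlier.
    """
    url = first_url
    while url is not None:
        page = fetch(url)
        yield page
        token = extract_resumption_token(page)
        url = next_page_url(verb, token) if token else None
```

Passing fetch in as a parameter keeps the paging logic separate from how requests are made and paced.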

Restrictions on Systematic Downloading of Articles

Crawlers and other automated processes may NOT be used to systematically retrieve batches of articles from the PMC Canada web site. Bulk downloading of articles from the main PMC Canada web site, in any way, is prohibited because of copyright restrictions.