Open access (OA) to the scientific literature means the removal of barriers (including price barriers) from accessing scholarly work. There are two parallel “roads” towards OA: OA journals and self-archiving [
]. OA journals make published articles immediately and freely available on their Web sites, a model mostly funded by charges paid by the author (usually through a research grant). The alternative for a researcher is “self-archiving”: publishing in a traditional journal, where only subscribers have immediate access, while making the article available on a personal and/or institutional Web site (including so-called repositories or archives), a practice allowed by many scholarly journals.
OA raises practical and policy questions for scholars, publishers, funders, and policymakers alike, including what the return on investment is when paying an article processing fee to publish in an OA journal, or whether investments into institutional repositories should be made and whether self-archiving should be made mandatory, as contemplated by some funders [
Among the arguments of OA proponents (and an expectation of scientists who publish OA articles) is that “open” work is more quickly recognized, as measured by citations. Critics of OA dispute this claim and argue that there is “no evidence that this will happen.” [
] Representatives of traditional publishers argue that the “established system of scientific/technical/medical publishing provides excellent levels of open access to scientists and the public alike,” implying that scientists have access to the literature anyway and that there would be little advantage in publishing OA. [
In fact, the evidence on the “OA advantage” is controversial. Previous research has based claims of an OA citation advantage mainly on studies looking at the impact of self-archived articles or articles that are found online (“openly accessible,” which some have argued to be different from open access in the narrower sense [
]). Most studies show an association between being online and being cited more often [
], although another study in the field of pediatrics seemed to suggest the opposite [
All these previous studies are cross-sectional and are subject to numerous limitations.
The first problem is self-selection. As most of these previous studies broadly define OA as “being found freely available online,” [
] alternative explanations for citation differences include that important (high-citation) articles are more likely to be posted online by authors or users as a result of the articles' importance; for example, because they are used for journal clubs [
] or coursework, or because authors post them on their homepages because they get so many requests from peers (Wren found that online accessible papers are clearly biased towards publications with “higher popular demand” [
]). In other words, one could argue that the articles are online because they are highly cited, rather than being highly cited because they are online. A mere association in a cross-sectional study tells us nothing about the direction of the relationship. Kurtz even argues that “the claims that the citation rate ratio of papers openly available on the internet versus those not available is caused by the increased readership of the open articles…(“OA advantage”) are somewhat overstated.” [
] Similarly, while the usual line of argument is that self-archiving leads to higher citations [
], alternative explanations include that top authors are more likely to be at top institutions that may be more likely to have an institutional repository, which smaller institutions do not have, or that authors selectively self-archive their best work as a “trophy.” [
] A recent analysis of articles published in four mathematics journals indicates that articles deposited in the arXiv received more citations than nondeposited articles, but the authors attribute the additional citations not to OA but to self-selection (quality differential) [
Secondly, especially in fields like physics, where pre- and post-publication on arXiv is quasi-standard, a relationship between self-archiving and higher citation may be due to other factors, such as earlier dissemination of results through preprints [
], a quality improvement through discussion of preprints [
], or an “outsider” position of authors who do not self-archive.
Thirdly, previous studies reported crude, unadjusted rate ratios that did not account or correct for differences in author and article characteristics between OA and non-OA publications. One could argue that the observed citation advantages of self-archived papers are a result of confounders; for example, publications with more authors are more likely to be self-archived (as it takes only one author to self-archive) and are also (independently of any OA effect) cited more often (e.g., through increased self-citations or because they might be of higher quality).
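The confounding argument can be made concrete with a small numerical sketch. The counts below are invented purely for illustration (they are not data from any study), and the Mantel-Haenszel estimator is used here only as one standard way to pool stratum-specific rate ratios; it is not presented as the adjustment method of this study. In this toy example, OA status has no effect within either author-count stratum, yet the crude ratio suggests a 50% citation advantage because OA articles are concentrated in the many-author stratum.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """Article and citation counts for one level of the confounder."""
    name: str
    oa_articles: int
    oa_citations: int
    non_oa_articles: int
    non_oa_citations: int


def crude_rate_ratio(strata):
    """Citations per article (OA vs. non-OA), ignoring the strata."""
    oa_rate = sum(s.oa_citations for s in strata) / sum(s.oa_articles for s in strata)
    non_oa_rate = (sum(s.non_oa_citations for s in strata)
                   / sum(s.non_oa_articles for s in strata))
    return oa_rate / non_oa_rate


def stratum_rate_ratio(s):
    """Citation rate ratio within a single stratum."""
    return (s.oa_citations / s.oa_articles) / (s.non_oa_citations / s.non_oa_articles)


def mantel_haenszel_rate_ratio(strata):
    """Mantel-Haenszel pooled rate ratio, with article counts as denominators."""
    num = sum(s.oa_citations * s.non_oa_articles / (s.oa_articles + s.non_oa_articles)
              for s in strata)
    den = sum(s.non_oa_citations * s.oa_articles / (s.oa_articles + s.non_oa_articles)
              for s in strata)
    return num / den


# Hypothetical counts: many-author papers are self-archived more often AND
# cited more often, but OA makes no difference within either stratum.
STRATA = [
    Stratum("few authors", oa_articles=20, oa_citations=100,
            non_oa_articles=80, non_oa_citations=400),
    Stratum("many authors", oa_articles=80, oa_citations=800,
            non_oa_articles=20, non_oa_citations=200),
]

crude = crude_rate_ratio(STRATA)
adjusted = mantel_haenszel_rate_ratio(STRATA)

for s in STRATA:
    print(f"{s.name}: rate ratio = {stratum_rate_ratio(s):.2f}")
print(f"crude rate ratio    = {crude:.2f}")
print(f"adjusted rate ratio = {adjusted:.2f}")
```

With these invented counts, the crude ratio is 1.5 while both stratum-specific ratios and the pooled ratio are 1.0, which is the pattern one would expect if the number of authors fully explained the apparent advantage.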
Little or no evidence is available on the citation impact of articles originally published as OA that is free of the various biases and additional advantages of self-archiving or “being online” that contribute to the previously observed OA effects. A “journal-level” analysis of journal impact factors concluded that OA journals are more often in the lower half of their subject category, although within the collection of OA titles, these journals ranked higher by immediacy index than by impact factor [
]. However, comparing the impact of OA journals against non-OA journals ignores differences in the journals' novelty, editorial policies, quality of peer review, and acceptance policies, which are strong confounders that are difficult to adjust for.
To answer the question of whether OA publications lead to a citation advantage, I chose an article-level approach, comparing the bibliometric impact of a cohort of articles from the same journal (Proceedings of the National Academy of Sciences [PNAS]) that offers both an OA and a non-OA publishing option, adjusted for different article and author characteristics.