The present overview of highly cited journals highlights three main features of the current status of data availability practices in this high impact scientific literature. First, there are heterogeneous instructions to investigators publishing in high impact journals, with some journals requiring public data availability as a condition for publication, others encouraging data sharing but having no binding instructions, and a few journals having no specific instructions at all. Second, nearly a third of the examined sample of 500 papers were not subject to any data availability policies, either because they were published in journals without such policies or with specific policies that do not cover the primary data upon which the research was based. Third, even when research is published in journals with specific instructions regarding data availability, more than half of publications did not adhere to the data availability instructions in their respective journals.
Our findings present a snapshot of data availability practices in recent literature. While the papers we reviewed were from 2009, it is unlikely that the situation has changed much over the past year, and we therefore believe that the present findings represent reasonably well the current state of the literature. Moreover, since the papers we reviewed were likely submitted 6-12 months prior to our recording of journal policies, some journals may have adopted data sharing policies in the interim, hence inflating our estimate of lack of adherence to data sharing policies. However, it is doubtful that policies related to data sharing have changed substantially over such a short period of time. We also focused our analysis on high impact journals, since the research that they publish has a pivotal role in the evolution of scientific investigation and it is essential that this pivotal research is reproducible. It is not likely that data availability practices are more common and more efficient in other journals with lower impact factor - the opposite seems more plausible, if anything. Therefore the present findings may well overestimate the prevalence of effective data sharing among investigators publishing across all peer-reviewed journals. In fact, some types of biomedical studies, in particular traditional epidemiological/observational investigations, may be underrepresented in our sample as compared with molecular and other clinical research. Some of these types of underrepresented studies have no established history of public data repositories and thus primary data availability may be a more critical deficiency in these fields. It is also worth noting that the association between higher impact factor and conditioning publication upon provision of materials/protocols may be confounded by type of journal, as experimental/basic science journals that typically have such conditions tend to have higher impact factors.
While this analysis highlights an important element of data sharing, that of public availability of primary data, there are other elements not evaluated here but still important to make the data sharing culture functional and efficient. For example, a statement of willingness to share raw data by the primary investigators does not always translate into true availability of data when requested by independent scientists. Empirical studies suggest that data withholding is not uncommon in the scientific community and may be influenced by industry relationships, perceptions of proprietary information and scientific priority, lack of resources, and personal investigator training and stances towards data sharing. Moreover, while all data web links of full primary datasets were verified as functioning in our analysis, this may reflect the temporal proximity of our analysis to the publication date of the articles, and some of these links may become unavailable a few years later.
Legislation to make results of clinical trials publicly available within one year of study completion may promote the culture of transparency in clinical trials research, but at present such legislation does not mandate making raw data from clinical trials publicly available. Indeed, widespread availability of clinical trial data may be hampered by financial incentives of journals to publish industry-sponsored trials, many of which may be bound by confidentiality agreements. Data sharing may be enhanced when granting agencies require investigators to share data, but regulatory barriers remain.
Finally, for data that were made available by investigators, we did not attempt to replicate their findings. Even when data are publicly available, published results are often not reproducible by independent investigators due to incomplete annotation or specification of data processing and analyses.
This empirical evaluation highlights opportunities for improvement. Journals should more routinely adopt policies for data sharing and expand the types of data that are subject to such policies, with the ultimate target of covering all types of data. Moreover, it is essential to develop mechanisms for journals to ensure that existing data availability policies are consistently followed by researchers and that published research findings are easily reproducible.