Database Search for Published Studies
An information specialist searched the Medline and EMBASE databases for all potentially relevant English-language scientific papers published between 1 January 1999 and 31 March 2005 that reported original research on live rats, mice and non-human primates (referred to hereafter as ‘primates’) carried out in publicly funded research establishments in the UK and the USA. (See supplementary online information for search terms.)
Databases were searched using the following search terms:
1. exp MICE
2. usa.in.
3. 1 and 2
4. exp great britain
5. england.in.
6. uk.in.
7. 4 or 5 or 6
8. 1 and 7
9. exp RATS
10. 9 and 2
11. 9 and 7
12. PRIMATES
14. Hominidae
15. 12 or 13
16. 15 not 14
17. Pan troglodytes
18. 16 or 17
19. exp CEBIDAE
20. exp MACACA
21. exp Papio
22. 18 or 19 or 20 or 21
23. 22 and 7
24. 22 and 2
25. 3 or 8 or 10 or 11 or 23 or 24
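The numbered search lines combine earlier result sets with Boolean operators, which behave like set operations. The following is a minimal Python sketch of that logic only. The record IDs are made up, the variable names are hypothetical, and the primate set (line 22) is taken as a given input rather than derived from its component lines.

```python
# Hypothetical record IDs returned by individual search lines (illustration only).
mice = {1, 2, 3, 4}        # line 1: exp MICE
rats = {5, 6, 7}           # line 9: exp RATS
primates = {8, 9}          # line 22: combined primate terms (taken as given)
usa = {1, 2, 5, 8, 10}     # line 2: usa.in.
uk = {3, 6, 9, 11}         # line 7: 4 or 5 or 6 (UK institution terms)

# "and" corresponds to set intersection, "or" to set union.
line_3 = mice & usa        # 1 and 2
line_8 = mice & uk         # 1 and 7
line_10 = rats & usa       # 9 and 2
line_11 = rats & uk        # 9 and 7
line_23 = primates & uk    # 22 and 7
line_24 = primates & usa   # 22 and 2

# line 25: 3 or 8 or 10 or 11 or 23 or 24
final = line_3 | line_8 | line_10 | line_11 | line_23 | line_24
print(sorted(final))
```

In this toy data, records that match a species term but neither country term, or a country term but no species term, drop out of the final set.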
An upper limit on the number of papers that would be included in the survey was set at 300 – made up of approximately 50 papers for each of three species and two countries. This limit was based on pragmatic considerations that included the time taken to assess and extract information from each publication. The sample size for surveys such as this is not normally based on formal statistical considerations, as there are no primary hypotheses being tested. There was therefore no need to formally power this study.
Selecting Published Studies
A sample of the most recently indexed abstracts was selected from the total number of potentially relevant publications identified in the database search. We chose the most recently indexed papers as an unbiased way of selecting the publications. When a journal is added to a database and becomes indexed, all previous issues are also indexed, which gave us a spread of publication years in the sample. The abstracts were appraised and publications were selected or rejected based on the exclusion criteria listed below. The full texts of the remaining publications were obtained. Each potentially relevant full text was numbered within its country-species stratum and the exact reference of each paper was recorded. Three-digit random numbers were generated using MINITAB, and the six lists were re-ordered using the random numbers. This stratified randomisation procedure was carried out to minimise bias, to ensure the total sample was representative of the six subgroups (i.e. three species and two countries), and to allow analysis of each subgroup in addition to the overall sample. The first fifty papers on each of the six randomised lists were considered. If a paper was not eligible, the next paper on the randomised list was taken. A second reviewer independently assessed the full texts of all the selected papers and finalised the list of included studies. Some further studies were excluded in this step.
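The stratified randomisation described above can be sketched as follows. This is an illustration in Python, not the MINITAB procedure the authors used. The function name, the paper identifiers, the eligibility check and the tie-breaking rule (original numbering) are all assumptions made for the example.

```python
import random

def select_stratum(papers, is_eligible, target=50, seed=0):
    """Re-order one country-species list by random numbers, then take eligible papers in order."""
    rng = random.Random(seed)
    # Attach a three-digit random number (000-999) to each numbered paper.
    keyed = [(rng.randint(0, 999), index, paper) for index, paper in enumerate(papers)]
    keyed.sort()  # assumption: ties on the random number keep the original numbering
    selected = []
    for _, _, paper in keyed:
        if is_eligible(paper):  # an ineligible paper is skipped and the next one is taken
            selected.append(paper)
        if len(selected) == target:
            break
    return selected

# Six hypothetical strata (2 countries x 3 species) with 80 placeholder papers each.
strata = {
    (country, species): [f"{country}-{species}-{n:03d}" for n in range(1, 81)]
    for country in ("UK", "USA")
    for species in ("mouse", "rat", "primate")
}

def is_eligible(paper_id):
    # Hypothetical check: pretend every tenth paper fails full-text review.
    return not paper_id.endswith("0")

sample = {
    stratum: select_stratum(papers, is_eligible, target=50, seed=i)
    for i, (stratum, papers) in enumerate(strata.items())
}
print(sum(len(chosen) for chosen in sample.values()))
```

Randomising within each stratum separately is what keeps the six subgroups equally represented, up to the limit of 50 papers each.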
All relevant English-language studies published between January 1999 and March 2005 were included if they reported original scientific research carried out in UK or USA publicly funded research establishments and their source(s) of funding were UK or USA publicly funded bodies, such as universities, charities or other non-industry funders (e.g. the NIH, USPHS, MRC and BBSRC). Studies that had any commercial/industry funding were included only if the majority of the other funding sources were UK or USA public funding sources and the work was carried out in a UK or USA publicly funded research establishment. Studies that had any non-UK or non-USA public funding were included only if the majority of the other funding sources were UK or USA public funding sources and the work was carried out in a UK or USA publicly funded research establishment. Studies whose funding source was not stated were included only if the research was carried out at a UK or USA publicly funded institution; in such cases, the absence of funding information was recorded.
We chose to limit our investigation to publicly funded research in the USA and UK because the funding for this study came from both US and UK publicly funded bodies, the two countries are highly influential in setting the scientific agenda, and because there should theoretically be no constraints on reporting publicly funded research for reasons of confidentiality or commercial sensitivity.
The survey was restricted to original scientific research studies using mice, rats, and primates. The experiments had to use live animals (including terminal anaesthesia) and state that they had used interventions licensed under the UK Animals (Scientific Procedures) Act 1986 (ASPA), or equivalent USA institutional guidelines for animal care and use. Rodents are the most widely used animals and primates are the most high-profile and ‘ethically sensitive’ group (for convenience, primates are designated a species here). Other species or groups such as fish, birds, rabbits and guinea-pigs are either used in small numbers or in more specialised areas of research. The sample sizes for these species would have been too small to draw any strong inferences about the reporting standards in these research areas. In addition, every such study that was included would reduce the statistical power of the study for drawing inferences about reporting and experimental design standards in studies involving more widely used species.
Publications were excluded if industry/commercial funding was the sole source of funding, or if the research was solely funded by an organisation not based in the USA or UK. In vitro studies, studies using tissue from animals killed before use, or that did not involve experimental procedures/testing, technical or methodological papers not involving actual experiments using animals, review articles, genetics papers reporting linkages of genes, studies with no abstract, and brief communications with no methods, were also excluded. No more than two papers were included from any single laboratory to ensure that the survey results were not unduly influenced by the bad – or good – practice of one particularly productive laboratory.
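The limit of two papers per laboratory can be expressed as a simple filter. The sketch below is hypothetical. The text does not say how a laboratory was identified or which papers were dropped when a laboratory had more than two, so the `lab` field and the keep-in-list-order rule are assumptions.

```python
from collections import Counter

def cap_per_laboratory(papers, max_per_lab=2):
    """Keep papers in list order, skipping any beyond the per-laboratory cap."""
    counts = Counter()
    kept = []
    for paper in papers:
        lab = paper["lab"]  # assumption: each paper carries a laboratory label
        if counts[lab] < max_per_lab:
            counts[lab] += 1
            kept.append(paper)
    return kept

# Placeholder records: laboratory A appears three times, laboratory B twice.
papers = [
    {"id": 1, "lab": "A"},
    {"id": 2, "lab": "A"},
    {"id": 3, "lab": "B"},
    {"id": 4, "lab": "A"},
    {"id": 5, "lab": "B"},
]
kept = cap_per_laboratory(papers)
print([paper["id"] for paper in kept])
```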
Unit of Analysis
The unit of analysis was the ‘main experiment’ reported in the paper. Many papers report the results of more than one experiment; accordingly, the number of experiments per paper was noted. For those studies that reported more than one experiment, the experiment that used the most animals was considered the ‘main experiment’. Details and results from the main experiment were used to complete the data collection sheets. Although the specific details described in this report relate to a single experiment assessed in each publication, the whole paper was searched for information relevant to that experiment, and to the way the experimental work was conducted and analysed in general.
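The rule for choosing the main experiment amounts to taking the experiment with the largest number of animals. A minimal Python sketch follows, with hypothetical field names and made-up animal counts. The text does not say how ties were handled, so picking the first experiment listed is an assumption.

```python
def main_experiment(experiments):
    """Return the main experiment and the number of experiments for one paper."""
    # max() returns the first maximal item, so a tie goes to the earliest experiment listed.
    main = max(experiments, key=lambda experiment: experiment["animals"])
    return main, len(experiments)

# Placeholder paper reporting three experiments.
paper = [
    {"label": "Experiment 1", "animals": 12},
    {"label": "Experiment 2", "animals": 40},
    {"label": "Experiment 3", "animals": 24},
]
main, n_experiments = main_experiment(paper)
print(main["label"], n_experiments)
```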
The Survey Process
The survey was carried out in two steps identified as phases 1 and 2.
Phase 1: quality of reporting
In phase 1, the full texts of the 271 included studies were divided equally between two assessors, both experienced statisticians (one from the UK and one from the USA). Assessor 1 analysed the even-numbered papers and assessor 2 the odd-numbered papers, each extracting the relevant information to complete the Quality of Reporting checklist (see Supporting Information S1). Any supplementary online data associated with the included publications were also accessed and analysed.
Phase 2: quality of experimental design and statistical analysis
In phase 2, a random sub-sample of 48 papers, chosen from the 271 papers evaluated in phase 1 and stratified by animal and by country (i.e. 8 papers × 3 species × 2 countries), was assessed. This number was selected as an appropriately sized sub-sample of the papers assessed in phase 1 and, as in phase 1, reflected the time necessary to complete the very detailed reports. The statistical methods and analysis of the papers were assessed to determine whether the experimental design and the statistical analysis were appropriate. This involved the expert judgement of two statisticians, both of whom assessed all 48 papers using the Quality of Experimental Design and Analysis checklist (see Supporting Information S1). The main experiment was the same as that analysed in phase 1. Errors of omission were noted.
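Drawing the phase 2 sub-sample amounts to taking 8 papers at random from each of the six country-species strata. The Python sketch below illustrates this. The software used for this step is not stated in the text, and the function name, the paper identifiers and the stratum sizes (40 placeholders each) are hypothetical.

```python
import random

def stratified_subsample(papers_by_stratum, per_stratum=8, seed=0):
    """Draw the same number of papers, without replacement, from every stratum."""
    rng = random.Random(seed)
    return {
        stratum: rng.sample(papers, per_stratum)
        for stratum, papers in papers_by_stratum.items()
    }

# Placeholder phase 1 papers; real stratum sizes are not given in the text.
phase_1 = {
    (country, species): [f"{country}-{species}-{n:02d}" for n in range(1, 41)]
    for country in ("UK", "USA")
    for species in ("mouse", "rat", "primate")
}
phase_2 = stratified_subsample(phase_1, per_stratum=8, seed=1)
print(sum(len(chosen) for chosen in phase_2.values()))
```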
Any disagreements or differences in interpretation of the checklists were resolved by consultation and discussion with a third assessor and, where necessary, the relevant studies were re-analysed. To allow for possible discrepancies between the two assessments, in phase 2 the mean of the results from the two statisticians is reported in all data summary tables. Overall agreement between the assessors was assessed once during each phase of the survey: in phase 1, both assessors applied the relevant checklist to the same sub-set of 30 (of 271) papers and their analyses were compared; in phase 2, all 48 papers were used to assess agreement.
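As a rough illustration of the two summaries mentioned here, the Python sketch below averages the two assessors' counts for one checklist item and computes a simple percentage agreement. Both are assumptions. The text does not specify what form the averaged "results" took or which agreement statistic was used, and the yes/no answers are made up.

```python
def mean_count(assessor_a, assessor_b):
    """Mean number of papers meeting a checklist item across the two assessors."""
    return (sum(assessor_a) + sum(assessor_b)) / 2

def percent_agreement(assessor_a, assessor_b):
    """Percentage of papers on which both assessors gave the same answer."""
    matches = sum(a == b for a, b in zip(assessor_a, assessor_b))
    return 100 * matches / len(assessor_a)

# Placeholder answers for one checklist item across six papers (True = item met).
assessor_a = [True, True, False, True, False, False]
assessor_b = [True, False, False, True, True, True]

print(mean_count(assessor_a, assessor_b))
print(percent_agreement(assessor_a, assessor_b))
```

Percentage agreement does not correct for agreement expected by chance, so a chance-corrected statistic may be preferable in practice.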