Tolerating bad health research (part 2): still as many bad trials, but more good ones too

A Correction to this article was published on 29 April 2025

This article has been updated

Abstract

Background

We previously published a study examining the risk of bias of trials included in a random selection of Cochrane systematic reviews. The purpose of our current study is to reassess the risk of bias of a cohort of trials included in those Cochrane reviews, to see whether our reassessment differs from the original Cochrane assessment, and to determine whether the funder, methodological support, or the involvement of a statistician affected the risk of bias.

Methods

We extracted data from 140 of 159 included trials from three countries, the UK, Canada, and Ireland, in our original cohort. The 19 remaining trials were excluded for a variety of reasons. We recorded the number of participants in the trial, the funder, if a statistician was involved in the trial, if there was any methodological support from a trials unit or clinical research facility, the sponsor, and whether the sponsor was involved in the design or conduct of the trial. The risk of bias of the 140 trials was re-assessed using the same tool as that used by the Cochrane authors.

Results

Our judgement of overall high risk of bias was broadly consistent with the original Cochrane authors. The proportion of high risk of bias trials remained more or less where it was at 55%, but the proportion of low risk of bias trials increased from 9 to 16%. The proportion of unclear risk of bias trials changed accordingly. Compared to the original assessments, we judged more studies to be low risk of bias across all domains. The greatest variation was in the two blinding categories (participants and personnel; outcome assessor) and ‘other bias’.

Conclusions

More than half of the trials in our UK, Canada, and Ireland cohort were at high risk of bias, highlighting significant challenges in ensuring the integrity and reliability of research findings. Addressing bias in clinical trials is essential to uphold the credibility of scientific research and to ensure that healthcare interventions are based on sound evidence, ultimately improving patient outcomes.

Background

We previously published a study examining the risk of bias of a random selection of Cochrane systematic reviews published between May 2020 and April 2021 [1]. Cochrane defines bias as a systematic error or deviation from the truth [2]. Bias can occur at any phase of a trial, including study design or data collection, as well as in the process of data analysis and publication [3]. Biases can lead to under- or over-estimation of the true intervention effect and can vary in magnitude: some are small (and trivial compared with the observed effect) and some are substantial (so that an apparent finding may be due entirely to bias) [2]. Bias in reporting can lead to misrepresentation of an intervention’s efficacy, which not only alters the public’s perception of the intervention but also the collective scientific understanding of its benefits and harms.

The link between bias and misrepresenting effect size has evidence to support it, although the certainty of that evidence varies for different types of bias. A recent systematic survey of meta-epidemiological studies to investigate the influence of various risk of bias domains on effect estimates found that inadequate random sequence generation and allocation concealment lead to an overestimate of treatment effects [4]. The certainty for this was judged to be moderate. The same authors also judged as moderate certainty evidence that a lack of blinding of patients leads to an overestimate of patient reported treatment effects but were more uncertain regarding other outcomes [4]. There was high certainty that unblinded outcome assessors overestimate subjective outcomes, but the authors were again uncertain about the effect on other outcomes. While uncertainty remains regarding the effect of other biases, e.g. the unblinding of personnel, data collectors, and data analysts, the authors highlight the importance of looking at the impact on outcomes when analysing the risk that blinding (or lack of it) has on a study [4].

Minimising bias should therefore be the goal of all trialists, but it is a goal often missed. Our earlier study [5] identified 96 reviews, co-authored by 546 reviewers from 49 Cochrane Review Groups, which included 1659 trials done in 84 countries. Of these trials, 1640 had a published risk of bias assessment determined by the Cochrane review authors: 1013 (62%) were judged as high risk of bias and 494 (30%) as uncertain risk of bias. Only 133 (8%) were rated as being at low risk of bias. Trials are hard to do, and as Hamilton and colleagues point out [6] in a response to our original article [5], researchers need to weigh up many factors when making design decisions. Some bias may remain, and this may be the best that can be done for a given trial.

We have presented and discussed our work many times since publishing it in 2022 [5], and colleagues and fellow methodologists have asked, amongst other things, whether the original Cochrane risk of bias assessments we used can be relied upon or whether there are some types of bias (blinding in particular) driving the overall risk of bias assessment. Perhaps Cochrane reviewers have been over-zealous? These are good questions and, moreover, ones to which we were unable to provide compelling answers. Neither were we able to respond to Hamilton and colleagues’ paper [6] without re-visiting the risk of bias assessments. The audiences for our talks also asked to what extent the funder of a trial influences the risk of bias or the extent to which having a methodologist or statistician on the trial team was linked to having a low risk of bias trial. The latter question about methodologists and statisticians is particularly relevant given that in our earlier work we recommended that trials should not be funded or given ethical approval unless a trial team had methodological and statistical expertise [5].

The purpose of the current study is to answer the questions posed by our colleagues for at least a subset of the trials included in our original cohort.

Methods

Sample

In our original study [1], we randomly selected up to two systematic reviews published between May 2020 and April 2021 from each of the 53 clinical Cochrane Review Groups. To be included, a review had to consider intervention effects rather than being a qualitative review or a review of reviews. We extracted data for 1659 randomised trials spread across 96 reviews from 49 of the 53 clinical Cochrane Review Groups. The remaining four Review Groups published no eligible reviews in our time period. For our current study, we selected all the UK, Irish, and Canadian trials from those included in our original study. A trial is considered to be UK, Irish, or Canadian if the trial is conducted in that country. In principle, we could have chosen trials from any country for this study, but we chose these three countries because the authors are resident in the UK and Ireland, and we included the Canadian trials to provide an international perspective.

Data extraction

The risk of bias assessments were done by the original review authors in our earlier study [5]. We extracted the following additional data from each trial publication: the number of participants in the trial, the funder, if a statistician was involved in the trial, if there was any methodological support from a trials unit or clinical research facility mentioned, the sponsor, and whether the sponsor was involved in the design or conduct of the trial. To determine whether a statistician was involved in the trial, we first looked to see if a statistician or statistics department was mentioned in the trial publication. We then checked the authors’ details section to see if any author was described as a statistician. If the article reported that a particular named author conducted the data analysis, we performed a Google search to establish if they were a statistician. We did not attempt to contact any study authors. All of the information extracted was recorded in MS Excel.

Risk of bias re-assessment

There were three iterations of our risk of bias reassessment. All reassessments were performed blind, i.e. the risk of bias reassessment was conducted independently without reference to the original risk of bias assessment. AD initially assessed the risk of bias of three trials. Independently of each other, FS and ST conducted a reassessment of the same three trials. This identified one discrepancy in the ‘other bias’ domain. AD conducted a reassessment of one further trial followed by an online meeting over Zoom, between AD, FS and ST, which provided an opportunity to address emerging issues. It became apparent that without a trial protocol to refer to, assessing the selective reporting bias domain, particularly for older trials, was challenging. Therefore, we devised a set of guidelines to provide more consistency for our risk of bias reassessment (Table 1). Following this, AD reassessed another three trials, and FS independently reassessed the same three trials. There was full agreement. AD conducted risk of bias assessments for all remaining trials (n = 133). GSH conducted a verification of all of AD’s reassessments and recorded discrepancies, along with her reasoning in cases of disagreement. In total, there were 43 discrepancies, 40 of which arose from a single error in reassessment concerning trials with no information on blinding (Guideline 2 in Table 1). For validation, FS conducted a risk of bias reassessment on a random sample of 10% of the trials (n = 14) and ST did the same for another 10% random sample. Of the trials randomly allocated to FS and ST, one was common to both, so a total of 27 trials were reassessed. These 27 trials were discussed on a Zoom meeting with AD to ensure consensus amongst the team. The solution implemented for any discrepancy was also then applied to the remaining 106 trials.
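The validation step described above can be sketched in a few lines. This is only an illustration: the trial identifiers, the seed, and the assumption that each reviewer's 10% sample was drawn independently from the 140 included trials are ours, and the overlap in any particular draw will not necessarily match the single shared trial reported above.

```python
import random


def draw_validation_samples(trial_ids, fraction=0.10, seed=0):
    """Draw two independent random samples and return both plus their union."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    size = round(len(trial_ids) * fraction)  # 10% of 140 trials = 14
    sample_fs = set(rng.sample(trial_ids, size))  # first reviewer's sample
    sample_st = set(rng.sample(trial_ids, size))  # second reviewer's sample
    return sample_fs, sample_st, sample_fs | sample_st


# Hypothetical identifiers standing in for the 140 included trials.
trial_ids = list(range(1, 141))
sample_fs, sample_st, combined = draw_validation_samples(trial_ids)
overlap = sample_fs & sample_st
print(f"sample sizes: {len(sample_fs)} and {len(sample_st)}")
print(f"shared trials: {len(overlap)}, distinct trials: {len(combined)}")

# The counts reported in the text: two samples of 14 with one trial in common.
reported_union = 14 + 14 - 1
not_double_checked = 133 - reported_union
print(f"reported: {reported_union} reassessed, {not_double_checked} remaining")
```

With the figures reported in the text, two samples of 14 sharing one trial give 27 distinct trials, which leaves 106 of the 133 trials assessed by AD alone.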

Table 1 Author agreed criteria for the re-assessment of risk of bias

Finally, an overall risk of bias assessment was conducted following Cochrane risk of bias guidelines [2]. When this was complete, the original risk of bias assessments done by the Cochrane review authors were added to the spreadsheet to facilitate our analysis.
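As a rough sketch of how an overall judgement can be derived from domain-level judgements, the function below applies the summary rule commonly used with the original Cochrane tool: high if any domain is high, low only if every domain is low, and unclear otherwise. It assumes that all seven domains are treated as key domains, which the text does not state explicitly; the function and variable names are hypothetical.

```python
# Domains of the original Cochrane risk of bias tool.
DOMAINS = (
    "random sequence generation",
    "allocation concealment",
    "blinding of participants and personnel",
    "blinding of outcome assessment",
    "incomplete outcome data",
    "selective reporting",
    "other bias",
)
VALID = {"low", "unclear", "high"}


def overall_risk_of_bias(judgements):
    """Summarise domain judgements, assuming every domain is a key domain."""
    values = [judgements[domain] for domain in DOMAINS]
    invalid = [value for value in values if value not in VALID]
    if invalid:
        raise ValueError(f"unexpected judgement(s): {invalid}")
    if "high" in values:
        return "high"  # any high-risk domain makes the trial high risk
    if "unclear" in values:
        return "unclear"  # no high-risk domain, but at least one unclear
    return "low"  # every domain judged low risk


# Hypothetical trial: everything low except unclear outcome-assessor blinding.
example = {domain: "low" for domain in DOMAINS}
example["blinding of outcome assessment"] = "unclear"
print(overall_risk_of_bias(example))
```

Under this rule a single domain judgement can change the overall rating, which is one reason the blinding domains matter so much for the overall result.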

Results

Our inclusion criteria gave us 113 UK trials, six Irish trials, and 40 Canadian trials. We felt that 159 trials (10% of the full original sample) was a reasonable sample to meet our objectives but was not so large that we would be unable to do the work within the time and resources we had available. Of the 159 trials, we were unable to obtain the publications for three UK trials, which reduced our sample to 156. One hundred and fifty trials had been assessed using the original Cochrane risk of bias tool, but nine trials were assessed using the new Cochrane ‘RoB2’ tool [7]. We excluded these nine trials from our reassessment for purely pragmatic reasons: the need for additional training for some members of our team to do an RoB2 assessment. Seven trials included in the reviews were not randomised and were excluded. Our final sample was therefore 140 trials.
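The flow from eligible trials to the final sample can be checked with simple arithmetic. The sketch below uses only the counts given in the paragraph above; the variable names are ours.

```python
# Eligible trials by country, as reported in the text.
trials_by_country = {"UK": 113, "Ireland": 6, "Canada": 40}

# Exclusions in the order they are described.
exclusions = [
    ("publication could not be obtained", 3),
    ("assessed with RoB2 rather than the original tool", 9),
    ("not randomised", 7),
]

eligible = sum(trials_by_country.values())
print(f"eligible trials: {eligible}")

remaining = eligible
for reason, count in exclusions:
    remaining -= count
    print(f"after excluding {count} ({reason}): {remaining}")

# Share of the 1659 trials in the original cohort.
share_of_original = 100 * eligible / 1659
print(f"eligible trials as a share of the original cohort: {share_of_original:.1f}%")
```

The running total ends at 140, and the 159 eligible trials are a little under 10% of the 1659 trials in the original cohort.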

Table 2 provides background information for the 140 trials included in our study. Most were UK trials (70%), and two thirds ran between 2001 and 2020. There was a wide range of Cochrane review groups covered by our sample, with the largest proportion of trials in the Common Mental Disorder Group (19.3%). Most trials had fewer than 200 participants.

Table 2 Background information for the 140 included trials

Table 3 compares the original risk of bias judgements by the Cochrane review authors to our new risk of bias judgements (low, unclear, and high risk of bias) for each of the Cochrane risk of bias domains. Compared to the original assessments, we judged more studies to be low risk of bias across all domains, which is also reflected in the overall risk of bias assessment. Our judgement of overall high risk of bias was broadly consistent with that of the original Cochrane authors, but we considered almost double the number of studies to be overall low risk of bias (22 compared to their 12). The situation is not improving: the trend in the overall risk of bias over time is for an increase in the proportion of high risk of bias trials (1990–1999 43%; 2000–2009 59%; 2010–2021 70%).
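The percentages quoted for low risk of bias trials follow directly from the counts above and the sample of 140 trials. A minimal check, with a hypothetical helper name:

```python
def percent(count, total):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


total_trials = 140
low_original = 12  # overall low risk of bias, original Cochrane authors
low_reassessed = 22  # overall low risk of bias, our reassessment

print(f"original: {percent(low_original, total_trials)}%")
print(f"reassessed: {percent(low_reassessed, total_trials)}%")
print(f"ratio: {low_reassessed / low_original:.2f}")
```

The ratio of about 1.8 is what the text describes as almost double.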

Table 3 Original Cochrane authors’ risk of bias assessments and our new assessments

The number and proportion of differences between our assessment and the original assessment for each risk of bias domain is shown in Table 4. The greatest differences are in the two blinding and the ‘Other bias’ domains.

Table 4 The number of differences between our judgement and the original judgement

Of the 49 changes we made from the original overall risk of bias domain scores, 28 of them reduced the risk of bias compared to the original Cochrane authors while 21 increased the risk of bias (Table 5).

Table 5 The changes in the overall bias from the original risk of bias judgement to our assessment

Obtaining information on the trial sponsor was difficult with 89% of trials not reporting it. Funding was reported for 64% of trials. Of those, 56% were academic/hospital funded, 35% were charity/foundation funded, and 9% were commercial. Fourteen percent (n = 20) had methodological support from a clinical trial unit/clinical research facility, and 32% (n = 45) had statistical support. Small numbers mean the influence of this support on the risk of bias (Table 6) is inconclusive.

Table 6 Other potential influences on trial bias

Discussion

Our original article concluded that most trials were at high risk of bias and were, to use our parlance, bad [5]. Some insightful questions from colleagues led us to reexamine a subset of 140 trials from our article and to do the risk of bias assessments ourselves. Having done that, our new conclusion is that most trials are bad, but there are more good trials than we originally thought. To put this into numbers, the proportion of high risk of bias trials (bad) remained more or less where it was at 55%, but the proportion of low risk of bias trials (good) increased from 9 to 16%. The proportion of unclear risk of bias trials was squeezed accordingly. The situation is not improving over time as described in our results and also reflects what we found in our original work [1].

Hamilton and colleagues argued convincingly that bad research is not all bad [6] and our positions are not so far apart. The perfect trial lives its entire life on paper and does not survive the transition from design to reality. Much like the Tooth Fairy, methodologically perfect trials are often spoken of but rarely seen. The point we made in our original article, and continue to make in this one, is that many trials are bad but do not need to be. With more methodological and statistical support at the design stage, many problems could be fixed fairly easily [8]. We acknowledge the compromises that are needed to do a trial and that trial teams must often choose from options that are less than ideal. But even with these caveats, we feel uneasy when fewer than one in five trials is judged to be low risk of bias. That seems like a lot of compromise to us.

There are other messages too. Some of our colleagues suspected that systematic reviewers struggled with how to deal with a lack of blinding when using risk of bias tools and our results suggest that they are correct. Issues around blinding are a key driver of high risk of bias ratings, and in the current study, we changed almost half of all blinding assessments. We looked at blinding together with the potential impact on the outcome and outcome measurement, not just whether there was blinding or not, which seems sensible and is anyway what Cochrane guidance recommends. Not all reviewers take this approach, and this led to a lot of our changes on the blinding domain. That said, it is worth noting that while our reassessments increased the number of low risk of bias judgements for blinding, we also increased the number of high risk of bias judgements. As with overall risk of bias, it was the number of unclear risk of bias judgements that decreased.

Our ‘rules’ for how to score some domains (Table 1) led to a consistent approach but also some substantial differences between our assessments and the originals for the ‘Selective reporting’ and ‘Other’ bias domains. Overall, we increased the proportion of trials judged as low risk of bias on these domains. We are comfortable with the rules we used, but the number of changes we made does underline the subjective nature of risk of bias assessment. It is a process that needs time, discussion, and training.

Despite the subjective nature of risk of bias assessment, and the substantial differences between our assessments and the originals for some domains, we and the original Cochrane authors remained in agreement about the proportion of trials that had serious risk of bias problems. In other words, this exercise has not overturned our original result regarding the proportion of trials we call bad.

Things are less rosy for us when it comes to the characteristics shown in Table 6. We recommended that trials should be neither funded nor approved unless there was statistical and methodological expertise on the team, two recommendations that Hamilton and colleagues disagreed with on the grounds of potential unintended consequences [6]. These authors then raise some sensible questions about how such recommendations would be implemented. Disappointingly for us, the current study was unable to give any sort of meaningful signal that trials with statistical and methodological support were more likely to have a lower overall risk of bias. Reporting of this information was poor, but still. Based on our cohort of 140 trials, it is not possible to say much at all about the impact of statistical and methodological support. We are also unable to say anything meaningful regarding whether funder or sponsor involvement affects the overall risk of bias of a trial, again chiefly because of incomplete reporting.

Strengths and limitations

We have reassessed risk of bias as a team, and had a substantial amount of discussion regarding it, and this is a strength. We have also directly addressed some questions that our colleagues raised. They suspected that the blinding domain may be a particular concern: we found that they were correct. We changed the blinding assessment just under half the time although mostly by shifting assessments out of unclear risk of bias and into either high or low risk. This mattered more for outcome assessors than for participants and personnel.

There are limitations too. We have reassessed around 10% of the original cohort and from just three of the 84 countries included in the original. Perhaps trials done elsewhere may produce different results. That said, we would expect the same issues around scoring of blinding and think that it is likely that for the remaining 90% of trials we would keep the number of trials judged as high risk of bias more or less the same, increase the number of low risk of bias trials, and reduce the number of unclear risk of bias trials accordingly. But that is a guess.

We did not obtain protocols and we did not try to contact authors, which are clear limitations. The latter limitation may be especially important for risk of bias domains highlighted by Wang et al. [4] as clearly influencing effect estimates (e.g. random sequence generation, allocation concealment, and blinding). In some ways, this does support one of our earlier recommendations [1] that trial teams use a risk of bias tool at design. Trial teams are the only people who have every scrap of relevant information at their fingertips. Tackling potential problems at design would, we still think, increase the number of low risk of bias trials and reduce the number judged as uncertain by reviewers working with imperfect knowledge, sometimes years later.

Implications for practice

Our implications for practice remain the same, and though they were never hierarchical, we have reordered them taking on board Hamilton et al.’s comments on bureaucracy and the risk of potentially causing more paperwork for trial teams.

  1. Train and support more methodologists and statisticians. If we have the trained personnel to participate in trials, we are on the road to success.

  2. Put more money into applied methodology research and supporting infrastructure. The trials industry is growing every year. We need the infrastructure to support it.

  3. Do not fund a trial unless the trial team contains methodological and statistical expertise. We have not conclusively shown that this benefits the risk of bias assessment, but we do believe that having trained professionals with experience in design and analysis of trials will do more good than harm. This is likely to prevent trial results not being fit for purpose and prevent research waste.

  4. Do not give ethical approval for a trial unless the trial team contains methodological and statistical expertise. We understand Hamilton et al.’s concerns, and we do not suggest a formal assessment of qualifications takes place or that someone is named as a methodologist simply to tick a box. Each ethics committee has an ethics form. It would be possible to have a section there where the chief investigator details the methodological and statistical training and experience within the team and their role in the trial design in a few sentences. What is written in that section would only be a discussion point if the ethics committee was concerned about potential design problems.

  5. Use a risk of bias tool at the trial design stage. We stand by this suggestion. Trial teams need to consider the risk of bias at the design stage, when there is an opportunity to reduce bias before it becomes entrenched.

Conclusion

Many trials are high risk of bias (bad), and they do not need to be. Despite the subjectivity of risk of bias assessments, both our team and the original Cochrane authors largely agreed on the prevalence of serious bias in trials. However, our reassessment revealed a higher proportion of good trials than previously thought. Moving forward, continued dialogue, training, and methodological rigour are essential for improving the quality and reliability of clinical trials. Nevertheless, it remains concerning that fewer than one in five trials are deemed to have low risk of bias, highlighting the need for improvement.

Data availability

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

Change history

References

  1. Pirosca S, et al. Tolerating bad health research: the continuing scandal. Trials. 2022;23(1):458.

  2. Higgins JP, Green S. Cochrane handbook for systematic reviews of interventions. Cochrane Collaboration and John Wiley & Sons Ltd; 2008.

  3. Jüni P, Altman DG, Egger M. Assessing the quality of controlled clinical trials. BMJ. 2001;323(7303):42–6.

  4. Wang Y, et al. Compelling evidence from meta-epidemiological studies demonstrates overestimation of effects in randomized trials that fail to optimize randomization and blind patients and outcome assessors. J Clin Epidemiol. 2024;165:111211.

  5. Pirosca S, et al. Tolerating bad health research: the continuing scandal. Trials. 2022;23(1):1–8.

  6. Hamilton F, Arnold D, Lilford R. Bad research is not all bad. Trials. 2023;24(1):680.

  7. Sterne JA, et al. RoB 2: a revised tool for assessing risk of bias in randomised trials. BMJ. 2019;366.

  8. Yordanov Y, et al. Avoidable waste of research related to inadequate methods in clinical trials. BMJ. 2015;350.

Acknowledgements

Thank you to all those who commented on our original paper and so led to this piece of work.

Funding

AD was funded by the HRB Trials Methodology Research Network (HRB TMRN 2021–1).

Author information

Contributions

ST conceived the project and contributed to the design, analysis, and write-up. FS funded the project and contributed to the design, analysis, and write-up. AD led on all stages of the project under the supervision of FS and ST. GSH contributed to the data extraction and risk of bias assessments and reviewed the final draft.

Corresponding author

Correspondence to Frances Shiely.

Ethics declarations

Ethics approval and consent to participate

This was an observational retrospective study of trial publications in the public domain so ethical approval was not required.

Consent for publication

All authors agree to the publication.

Competing interests

ST is an Editor-in-Chief of Trials. FS is an associate editor of Trials.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

“The original online version of this article was revised: Following the publication of the original article, we were notified that the third author’s last name was incorrectly tagged as Hayes instead of Shiely Hayes.”

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Daly, A., Treweek, S., Shiely Hayes, G. et al. Tolerating bad health research (part 2): still as many bad trials, but more good ones too. Trials 26, 110 (2025). https://doi.org/10.1186/s13063-025-08747-4

  • DOI: https://doi.org/10.1186/s13063-025-08747-4
