The Case For Accurate Reporting Of “Nonsignificant” Results

Empirical research based on experiments and data analysis requires an objective way to measure differences between treatment groups. The most common measure is the P-value, the outcome of a statistical test on the data, for which a significance threshold of P = 0.05 has become widely accepted. When a test yields a P-value above 0.05, the difference or relationship is deemed weak and, by extension, uninformative and uninteresting. P-values falling below this threshold suggest a strong, important, or “statistically significant” difference.
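
To make this concrete, here is a minimal sketch in Python of how such a P-value is typically computed for a two-group comparison (the group means, sample sizes, and random seed are our own illustrative choices, not values from the study):

```python
# Minimal illustration: computing a P-value for a two-group comparison
# with SciPy's independent-samples t-test. All numbers are made up.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
control = rng.normal(loc=10.0, scale=2.0, size=30)    # hypothetical control group
treatment = rng.normal(loc=11.0, scale=2.0, size=30)  # hypothetical treatment group

t_stat, p_value = stats.ttest_ind(control, treatment)
# By convention, P < 0.05 is labeled "significant" and P > 0.05
# "non-significant" -- the dichotomy discussed in this article.
print(f"t = {t_stat:.2f}, P = {p_value:.3f}")
```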

Statistically significant results therefore attract considerable interest in the research community, in contrast to equally well-designed and well-executed studies whose main relationship turned out statistically non-significant. This black-and-white perspective not only stems from a misinterpretation of P-values but, more importantly, encourages malpractice (Amrhein et al. 2019).

The over-representation of significant P-values in the scientific literature has been widely documented across several fields. One reason for this bias is selective reporting: significant results are more likely to be submitted for publication, and more likely to be accepted by editorial boards, owing to the false perception that significant results are more interesting and of higher scientific value than non-significant ones. Because of this perception, some researchers are inclined, consciously or not, to manipulate data and analyses until they obtain statistically significant results (P < 0.05). This phenomenon is known as P-hacking (Head et al. 2015).

In our study, we focused on the opposite scenario, in which researchers favor a non-significant outcome of a statistical test. We define reverse P-hacking as the manipulation of data and analyses to obtain a statistically non-significant result (i.e. P > 0.05). We reasoned that this could occur in experiments in which researchers randomly assign individuals to a control or treatment group and do not want the groups to differ. Such random assignment is often used to account for a confounding variable, typically a parameter like body size or age, that is not the focus of the study but may still affect the results.

Even under such a random setup, statistically significant results are expected to occur by chance alone in 5% of studies (given the commonly accepted threshold of P = 0.05). Failing to acknowledge the effect of a confounding variable could have far-reaching consequences. Imagine releasing a new medical treatment after a clinical trial showed no significant adverse effects on patients, only to realize afterward that the placebo group was significantly older than the treated group. The trial failed to acknowledge the confounding variable of age, which might explain the absence of a significant difference in side effects between groups: the older placebo group might have suffered more age-related health complications, making the side effects in the younger treated group appear non-significant by comparison.
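
The 5% figure is easy to check by simulation. The sketch below (our own illustration, with body size as a hypothetical confounder) repeatedly assigns individuals to two groups at random and counts how often a t-test on the confounder alone comes out significant:

```python
# Simulate many studies in which individuals are assigned to groups purely
# at random, and count how often the confounder differs "significantly".
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=42)
n_studies, n_per_group = 10_000, 30
false_positives = 0
for _ in range(n_studies):
    sizes = rng.normal(loc=50.0, scale=5.0, size=2 * n_per_group)  # confounder
    shuffled = rng.permutation(sizes)  # random assignment to groups
    group_a, group_b = shuffled[:n_per_group], shuffled[n_per_group:]
    if stats.ttest_ind(group_a, group_b).pvalue < 0.05:
        false_positives += 1

print(f"Significant by chance alone: {false_positives / n_studies:.1%}")  # ~5%
```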

We screened a representative sample of research articles published over 30 years within the discipline of behavioral ecology for these types of tests. We found that only 3 of 250 papers reported a significant treatment-control difference for a confounding variable, whereas about 12 to 13 papers (5% of 250) would be expected by chance. We conclude that the lower-than-expected number of significant P-values in the literature reporting effects associated with confounding variables could be caused by reverse P-hacking and/or selective reporting. Selective reporting could stem, for example, from editorial boards’ decisions to reject a paper based on an experimental flaw (i.e. the effect of the variable of interest cannot be disentangled from that of the confounding variable).
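
To see how far 3 significant results out of 250 falls below the expected rate, one can ask how probable such a low count would be if significant differences truly occurred in 5% of papers. The binomial test below is our own rough check, not the analysis from the paper:

```python
# If each of 250 papers had a 5% chance of a significant confounder test,
# how surprising is observing only 3? (Requires SciPy >= 1.7 for binomtest.)
from scipy.stats import binomtest

result = binomtest(k=3, n=250, p=0.05, alternative="less")
print(f"Expected under a 5% rate: {250 * 0.05:.1f} papers")
print(f"P(3 or fewer significant) = {result.pvalue:.2e}")  # very small
```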

Although we cannot isolate reverse P-hacking as the cause of too few significant P-values, our empirical study provides a proof of concept, and we hope that future studies will replicate it in their own disciplines. Much of the literature on publication bias consists of statisticians discussing “in principle” methods to detect and correct for publication bias, or of policy statements; these papers vastly outnumber studies that actually collect data. One of our main points was to show yet another way in which the use of P-values (and a dichotomy between significance and non-significance) can lead to poor scientific practices that create a discrepancy between data collection and analysis and what eventually appears in the literature.

For some, it might come as a surprise that randomization alone is not enough to deal with confounding variables. The 5% of “unlucky” studies that come out significant by chance alone may see their scientific conclusions undermined, or may need to justify some unexpected results. If one wants to minimize the likelihood of obtaining a significant difference in a confounding variable between treatment groups, we recommend balanced designs over simple randomization. This entails, for example, pre-sorting individuals based on the confounding variable (e.g. into groups of similar-sized animals) and then randomly allocating individuals from each category in turn to the treatment groups, as sketched below. In addition, thanks to the constant improvement of statistical methods, researchers can also control for confounding variables by including them in their statistical models, rather than attempting to control for them experimentally.
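
Here is a minimal sketch of such a balanced design (all names and numbers are illustrative): individuals are sorted by the confounder, and one member of each similar-sized pair is then randomly allocated to each group:

```python
# Balanced (blocked) design: sort by the confounder, then randomly assign
# one member of each similar-sized pair to each treatment group.
import numpy as np

rng = np.random.default_rng(seed=7)
body_sizes = rng.normal(loc=50.0, scale=5.0, size=40)  # hypothetical confounder

order = np.argsort(body_sizes)  # indices sorted by body size
control, treatment = [], []
for i in range(0, len(order), 2):            # blocks of two similar individuals
    pair = rng.permutation(order[i:i + 2])   # coin flip within each block
    control.append(pair[0])
    treatment.append(pair[1])

# The groups are now closely matched on the confounder by design.
print(body_sizes[control].mean(), body_sizes[treatment].mean())
```

By construction, the two groups end up closely matched on the confounder, so the chance imbalances described above become far less likely than under simple randomization.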

Whatever method is used to form treatment groups, researchers should always test for differences in confounding variables and report them accurately. Editorial board members and reviewers should flag the absence of such tests and thereby encourage better scientific practice in the community.

These findings are described in the article entitled “Evidence that nonsignificant results are sometimes preferred: Reverse P-hacking or selective reporting?”, recently published in the journal PLOS Biology.

Citation

  • Amrhein V, Greenland S, McShane B (2019) Scientists rise up against statistical significance. Nature 567: 305-307. doi: 10.1038/d41586-019-00857-9
  • Head ML, Holman L, Lanfear R, Kahn AT, Jennions MD (2015) The Extent and Consequences of P-Hacking in Science. PLoS Biol 13(3): e1002106. doi: 10.1371/journal.pbio.1002106
