New biomedical techniques, like next-generation genome sequencing, are creating vast amounts of data and transforming the scientific landscape. They’re leading to unimaginable breakthroughs – but leaving researchers racing to keep up. In this feature for Mosaic, Tom Chivers meets the biologists – junior and senior – who are learning to work with algorithms.
If you want to dive deeper into this topic, here’s some further reading. We’ve broken things down into key subtopics, but otherwise these links aren’t listed in any particular order – so feel free to dip in and out.
Big data in research – an overview
Emily Dreyfuss’s article ‘Want to make it as a biologist? Better learn to code’ reflects on the limitations facing expert researchers who were never taught programming skills.
This 2014 Forbes article anticipates how data might transform science: “It is as if we are back in the days of Leeuwenhoek, staring into a microscope for the first time and just beginning to understand the possibilities.”
Wet and dry labs
This 2017 study compares students’ perceptions of wet and dry labs. After comparing two experiments, it found that students considered the wet-lab work to be more like “real science”, although they felt the database work involved “more scientific tasks”.
David Heckerman’s 2012 speech ‘Biology: from wet to dry’ details how biology has progressed from one-off experiments that generated small amounts of data to a field with so much data readily available that a hypothesis can be tested without any wet-lab work at all.
In this 2015 interview, Sarah Teichmann, who works at the Wellcome Sanger Institute, explains how she combines computational methods with lab experimentation.
In Tom Chivers’s article for Mosaic, Professor Gil McVean observes that research labs 15 years ago were typically 90 per cent wet but are today 90 per cent computing. This shift has left some institutions with surplus lab space. For instance, the Greater Baton Rouge Business Report recently announced that 16,000 square feet of underutilised wet-lab space at Louisiana State University would be sold and converted into office space.
Early criticisms and challenges
The 2013 article ‘Why big data is bad for science’ claims that big data samples take longer to analyse, with an increased risk of spurious correlations or flukes.
This paper from 2008 explores the difficulties of daily cooperation between wet and dry researchers, in what is dubbed “the moist zone”.
Timo Hannay, former managing director of Digital Science, claimed in 2014 that the world of research was not doing enough to embrace the power of information. Read the details in his article ‘Science’s big data problem’.
More from Mosaic
Tom Chivers’s article on big data in science concludes our season on genomics, which has coincided with the 25th anniversary of the opening of the Wellcome Sanger Institute. Visit our genomics page for more information about the Sanger Institute and the impact it has had on science and healthcare.
Here are two other highlights from the series, both reporting on Sanger Institute projects that take a big data approach to science.
In ‘Searching for a diagnosis: how scientists are untangling the mystery of developmental disorders’, Linda Geddes explores how, 17 years since the first draft of the human genome, our genes are giving up their secrets and bringing hope to parents around the world.
And in ‘The DNA detectives hunting the causes of cancer’, Kat Arney investigates why cancer rates vary wildly across the world. To solve this mystery, scientists are tracking down causes of cancer by the fingerprints they leave in the genome.