It was 30 years ago that a group of ambitious scientists first began the greatest project in the history of genetic research. Beginning on 1 October 1990, they set out to sequence and map all of the genes – together known as the genome – of Homo sapiens. The first draft of the human genome was published in 2001 and the Human Genome Project (HGP) was officially completed in April 2003. The consequences were momentous. The project provided a base map for all future genetic research and drastically cut the cost of future genome mapping. Today more than a million human genomes have been sequenced. A wide array of research has emerged, providing the opportunity to delve into human genetic variation and discover drugs that target specific proteins in the body, and allowing people to send off a drop of saliva and receive a breakdown of their heritage.
In a recent big-data analysis encompassing more than 700,000 scientific studies, researchers from the Center for Complex Network Research (CCNR), directed by Professor Albert-László Barabási, set out to track the trends in genetic research that followed the HGP. Perhaps unsurprisingly, they identified an explosion in genetic research following publication, although the specifics of some of this research were more surprising.
Genes can either be coding genes (which means that they lead to the formation of proteins which have a direct effect in the body) or non-coding. Once considered ‘junk DNA’, the non-coding portion of the genome has seen a boom in research since the HGP. Far from being junk, there is now a greater understanding that these genes work together to bring the genome to life. ‘With the big data analysis that we did, we actually noticed that the increase really happens after the Human Genome Project. The explosion in attention to the non-coding genes is very important,’ says Deisy Morselli Gysi, a postdoctoral research associate at CCNR.
When it comes to the more famous coding genes, the analysis revealed what most researchers already know: that huge attention has been paid to a small proportion of ‘superstar genes’, leaving others completely unstudied. There are some valid reasons for this. The gene called TP53 is the subject of hundreds of publications a year because it is crucial to cell growth and death, and leads to cancer when inactivated or altered. But this concentration does mean that potentially significant avenues of research have yet to even begin, something that was flagged as a problem on the tenth anniversary of the draft genome’s publication. The authors point to a ‘rich-gets-richer dynamic’ that drives this trend and raise a critical question for the future: ‘A challenge now for biology is to disentangle the motivations for what gets studied next. Are researchers putting money, time and effort into what is most important or urgent, or into more of the same because that will reliably win grants and plaudits?’
In some ways this is exciting, as it means that genetic diseases suffered today could be treated in the future. Of the roughly 20,000 proteins revealed by the HGP as potential drug targets, the authors show that only about 10 per cent – 2,149 – have so far been targeted by approved drugs.
‘I think that one of the fun things about this analysis is the questions it begins to open up,’ says Alexander Gates, an associate research scientist at CCNR. ‘We can layer on and actually measure the extent that money and research agendas drove some of these trends. We can ask, is it possible to actually quantify which came first, the money or the knowledge?’