The Human Genome Project revealed that only ~1-2% of our genome makes functional proteins, while the role of the remaining 98-99% remains enigmatic. Researchers have tried to uncover the mysteries surrounding this remainder, and this article sheds light on our current understanding of its role and its implications for human health and disease.
When the Human Genome Project (HGP) was completed in April 2003 [1], it was expected that knowing the entire sequence of the human genome, which consists of about 3 billion base pairs or ‘pairs of letters’, would make the genome an open book. Researchers hoped to pinpoint exactly how an organism as complex as a human being works, and thereby to identify our predispositions to various diseases, understand why diseases occur and find cures for them. However, the picture became perplexing when scientists were able to decipher the function of only a small part of the genome (~1-2%), the part that makes the functional proteins that determine our phenotype. This 1-2% of the DNA follows the central dogma of molecular biology: DNA is first copied into RNA, specifically messenger RNA (mRNA), by a process called transcription, and the mRNA is then used to produce protein by a process called translation. In the language of the molecular biologist, this 1-2% of the human genome codes for functional proteins. The remaining 98-99% has been referred to as ‘junk DNA’ or ‘dark matter’: it does not produce any of the functional proteins mentioned above and is carried along as ‘baggage’ every time a human being is born. To understand the role of this remaining 98-99% of the genome, the ENCODE (ENCyclopedia Of DNA Elements) project [2] was launched in September 2003 by the National Human Genome Research Institute (NHGRI).
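For readers who find code easier to follow than prose, the two steps of the central dogma can be illustrated with a toy Python sketch. Everything here is an illustrative assumption rather than part of any cited study: the short DNA string is invented, the function names `transcribe` and `translate` are hypothetical, and the codon table contains only the five codons this example needs (taken from the standard genetic code). Real transcription and translation involve far more machinery than string replacement and table lookup.

```python
# Toy illustration of the central dogma: DNA -> mRNA -> protein.
# Partial codon table: only the codons used in this example
# (values follow the standard genetic code).
CODON_TABLE = {
    "AUG": "Met",   # also the start codon
    "GCU": "Ala",
    "UGG": "Trp",
    "AAA": "Lys",
    "UAA": "STOP",
}


def transcribe(coding_strand):
    """Transcription, simplified: the mRNA matches the DNA coding
    strand with thymine (T) replaced by uracil (U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translation, simplified: read codons (triplets) from the first
    AUG until a stop codon and look each one up in the table."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, so nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "Xaa")  # Xaa = not in toy table
        if amino_acid == "STOP":
            break
        peptide.append(amino_acid)
    return peptide


# Hypothetical 15-letter coding sequence, invented for this example.
dna = "ATGGCTTGGAAATAA"
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)
print("-".join(protein))
```

In this picture, the protein-coding 1-2% of the genome is the part that yields a meaningful result from `translate`, while the remaining 98-99% is sequence that is never turned into protein this way.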
The ENCODE project findings have revealed that the majority of the ‘dark matter’ consists of noncoding DNA sequences that function as essential regulatory elements, turning genes on and off in different types of cells and at different points in time. The spatial and temporal actions of these regulatory sequences are still not completely clear, as some regulatory elements are located very far from the gene they act upon, while others lie close to it.
The composition of some regions of the human genome was known even before the launch of the Human Genome Project: ~8% of the human genome is derived from viral genomes embedded in our DNA as human endogenous retroviruses (HERVs) [3]. These HERVs have been implicated in providing innate immunity to humans by acting as regulatory elements for genes that control immune function. The functional significance of this 8% was corroborated by the findings of the ENCODE project, which suggested that the majority of the ‘dark matter’ functions as regulatory elements.
In addition to the ENCODE project findings, a vast amount of research data from the past two decades suggests a plausible regulatory and developmental role for the ‘dark matter’. Genome-wide association studies (GWAS) have shown that the majority of DNA variants associated with common diseases and traits lie in noncoding regions [4], and that variations in these regions influence the onset and severity of a large number of complex diseases such as cancers, heart disease, brain disorders and obesity, among many others [5,6]. These studies have also revealed that a majority of the non-coding DNA sequences in the genome are transcribed (copied from DNA into RNA but not translated) into non-coding RNAs, and that perturbation of their regulation leads to differential disease-causing effects [7]. This suggests that non-coding RNAs can play a regulatory role in the development of disease [8].
Further, some of the ‘dark matter’ remains as non-coding DNA and functions in a regulatory manner as enhancers. As the word suggests, enhancers act by enhancing (increasing) the expression of certain proteins in the cell. This was shown in a recent study in which the enhancer effects of a non-coding region of DNA were found to make patients susceptible to complex autoimmune and allergic diseases such as inflammatory bowel disease [9,10], leading to the identification of a new potential therapeutic target for the treatment of inflammatory diseases. Enhancers in the ‘dark matter’ have also been implicated in brain development: studies in mice have shown that deletion of these regions leads to abnormalities in brain development [11,12]. These studies might help us better understand complex neurological diseases such as Alzheimer’s and Parkinson’s. ‘Dark matter’ has also been shown to play a role in the development of blood cancers [13] such as chronic myelocytic leukemia (CML) and chronic lymphocytic leukemia (CLL).
Thus, ‘dark matter’ represents a more important part of the human genome than previously realised, and it directly influences human health by playing a regulatory role in the development and onset of human diseases, as described above.
Does that mean that all of the ‘dark matter’ is either transcribed into non-coding RNAs or plays an enhancer role as non-coding DNA, acting as regulatory elements associated with the predisposition to, onset of and variation in the various diseases that afflict humans? The studies performed so far point strongly in that direction. More research in the coming years will help us delineate the function of the entire ‘dark matter’ precisely, which should lead to the identification of novel targets in the hope of finding cures for the debilitating diseases that afflict the human race.
1. “Human Genome Project Completion: Frequently Asked Questions”. National Human Genome Research Institute (NHGRI). Available online at https://www.genome.gov/human-genome-project/Completion-FAQ Accessed on 17 May 2020.
2. Smith D., 2017. The mysterious 98%: Scientists look to shine light on the ‘dark genome’. Available online at https://phys.org/news/2017-02-mysterious-scientists-dark-genome.html Accessed on 17 May 2020.
3. Soni R., 2020. Humans and Viruses: A Brief History of Their Complex Relationship And Implications for COVID-19. Scientific European Posted 08 May 2020. Available online at https://www.scientificeuropean.co.uk/humans-and-viruses-a-brief-history-of-their-complex-relationship-and-implications-for-COVID-19 Accessed on 18 May 2020.
4. Maurano MT, Humbert R, Rynes E, et al. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012 Sep 7;337(6099):1190-5. DOI: https://doi.org/10.1126/science.1222794
5. A Catalog of Published Genome-Wide Association Studies. http://www.genome.gov/gwastudies.
6. Hindorff LA, Sethupathy P, et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A. 2009;106:9362-9367. DOI: https://doi.org/10.1073/pnas.0903103106
7. St. Laurent G, Vyatkin Y, and Kapranov P. Dark matter RNA illuminates the puzzle of genome-wide association studies. BMC Med 12, 97 (2014). DOI: https://doi.org/10.1186/1741-7015-12-97
9. The Babraham Institute 2020. Uncovering how ‘dark matter’ regions of the genome affect inflammatory diseases. Posted 13 May, 2020. Available online at https://www.babraham.ac.uk/news/2020/05/uncovering-how-dark-matter-regions-genome-affect-inflammatory-diseases Accessed on 14 May 2020.
10. Nasrallah, R., Imianowski, C.J., Bossini-Castillo, L. et al. 2020. A distal enhancer at risk locus 11q13.5 promotes suppression of colitis by Treg cells. Nature (2020). DOI: https://doi.org/10.1038/s41586-020-2296-7
11. Dickel, D. E. et al. 2018. Ultra conserved enhancers are required for normal development. Cell 172, Issue 3, P491-499.E15, January 25, 2018. DOI: https://doi.org/10.1016/j.cell.2017.12.017
12. ‘Dark matter’ DNA influences brain development DOI: https://doi.org/10.1038/d41586-018-00920-x
13. Dark-matter matters: Discriminating subtle blood cancers using the darkest DNA DOI: https://doi.org/10.1371/journal.pcbi.1007332