Monday 4 January 2016

Being Human - The Human Accelerated Regions of our genome (HARs).


In the middle of the last decade, advances in DNA technology allowed scientists to sequence the genome of the Chimpanzee (1). The door was thus open to compare the human genome with that of our nearest relative. The aim was to identify what changes have led to humanity's unique abilities. Of particular interest were our cognitive abilities, speech and language, and our upright mode of locomotion.
A major step towards this goal was achieved when Pollard and her colleagues published a pair of papers in 2006 (2 and 3).

In the first paper Pollard (2) and her co-authors said the following:

 
“Recent sequencing and assembly of the genome of the common chimp (Pan troglodytes) offers an unprecedented opportunity to understand primate evolution and to identify those changes in the ancestral hominoid genome which gave rise to the modern human species [1]. Primate genome comparisons are expected to shed light on questions as diverse as the origins of speech [2,3] and the progression of HIV infection to AIDS [4]. Whereas the aim of comparative studies of human and rodent genomes [5,6] is typically to identify genomic elements that are evolutionarily conserved (and therefore presumably functionally important given the ~150 million years of evolution separating the species), we look to the chimpanzee genome to better understand what is uniquely human about our genome. One goal is to find DNA elements that show evidence of rapid evolution in the human lineage, where “accelerated” or “rapid” refers to a general increase in the rate of nucleotide substitution. Pollard et al. [7] used comparative genomics to identify 49 such human accelerated regions (HARs) that are evolving very slowly in vertebrates but have changed significantly in the human lineage. The most accelerated of these, HAR1, was found to be a novel RNA gene expressed during neocortical development [7]. In this paper, we investigate the properties of a larger set of 202 carefully screened HARs in order to unravel the evolutionary forces at work behind the fastest evolving regions of the human genome.
 
To address questions of human-specific molecular evolution it is not sufficient to simply identify all nucleotide differences between the human and chimpanzee genomes. Despite being a small fraction of the human genome, the number of human bases that differ from the corresponding chimp base is still large (nearly 29 million bases), and it is likely that most of these differences do not have a functional consequence. Furthermore, many authors, starting with the seminal work of King and Wilson [8], have suggested that the majority of the changes that distinguish humans from other hominoids will be found in the 98.5% of the genome that is non-coding DNA, which is a vast territory to search. To identify changes that may be functional, we focus on the set of regions of the human genome of at least 100 base pairs (bp) that appear to have been under strong negative selection up to the common ancestor of human and chimp (as evidenced by high sequence identity between chimp and rodents), but exhibit a cluster of changes in human compared to chimp. Our expectation is that the selective constraint on the most extremely accelerated regions of the human genome may have switched from negative to positive (and possibly back to negative) some time in the last 5−6 million years.”
Note: To allow the reader to fully appreciate Pollard et al.’s argument I have included the ‘nested’ references from the above under a sub-heading “Pollard 2 References” in my references section at the end of this post; my own references appear in round (n) brackets.
 
Put simply, what Pollard et al. were saying was:
·         In placental mammals certain genomic areas have been conserved over a vast stretch of time - 150 million years.
·         The most likely reason for this long period of conservation is that they are “functionally important”.
Yong (9) neatly describes functionally important DNA thus: “For years, we’ve known that only 1.5 percent of the genome actually contains instructions for making proteins, the molecular workhorses of our cells. But ENCODE has shown that the rest of the genome – the non-coding majority – is still rife with “functional elements”. That is, it’s doing something.
It contains docking sites where proteins can stick and switch genes on or off. Or it is read and ‘transcribed’ into molecules of RNA. Or it controls whether nearby genes are transcribed (promoters; more than 70,000 of these). Or it influences the activity of other genes, sometimes across great distances (enhancers; more than 400,000 of these). Or it affects how DNA is folded and packaged. Something.”
·         The authors compared the conserved areas of the Chimp and human DNA to look for areas of rapid evolution, which they named “human accelerated regions” (HARs).
·         They found 202 such HAR regions of DNA.
·         To identify functional changes in the human genome as compared to that of the Chimp the authors focus on these HARs.
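To make the screening idea concrete, here is a minimal Python sketch of that logic. It is not Pollard's actual program (the published analysis applied a more formal statistical test to multi-species alignments rather than raw counts); the function names, the 100 bp window, the thresholds and the toy sequences are all my own assumptions, chosen only for illustration.

```python
def count_differences(a, b):
    """Number of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def find_accelerated_windows(human, chimp, outgroup, window=100,
                             max_conserved_diffs=1, min_human_diffs=4):
    """Return (start, chimp_vs_outgroup_diffs, human_vs_chimp_diffs) for each
    window that is nearly identical between chimp and the outgroup (a crude
    proxy for long-term conservation) yet carries a cluster of human changes.
    The thresholds are illustrative assumptions, not published values."""
    hits = []
    for start in range(0, len(human) - window + 1, window):
        end = start + window
        conserved = count_differences(chimp[start:end], outgroup[start:end])
        human_changes = count_differences(human[start:end], chimp[start:end])
        if conserved <= max_conserved_diffs and human_changes >= min_human_diffs:
            hits.append((start, conserved, human_changes))
    return hits


# Toy, made-up alignment: 200 aligned bases, with chimp and outgroup identical.
chimp = "ACGT" * 50
outgroup = chimp
human_bases = list(chimp)
for position in (110, 120, 130, 140, 150):  # five human-only changes
    human_bases[position] = "T"
human = "".join(human_bases)

candidates = find_accelerated_windows(human, chimp, outgroup)
print(candidates)
```

On this toy data only the second window is reported: it is identical between chimp and outgroup but carries five human-specific changes, which is the pattern the bullet points describe.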
 
The paper’s results show the following:
 
·         The normalized human substitution rate exceeds the rate in the chimp-rodent phylogeny in all of the HARs.
·         The divergence between the human and chimpanzee genomes is higher in the top 49 HARs.
·         Directly comparing substitution rates per site in the human and chimp branches (over the same period of evolutionary time), the human rate is an average of seven times higher than the chimp rate in HAR1–HAR5.
·         The HAR elements themselves are significantly more diverged from chimpanzee than surrounding sequences.
·         The index of dispersion (i.e., the ratio of the variance in the number of substitutions on a lineage to the mean number) ... in HAR1–HAR5 is much larger than the expected value of 1 ... and therefore these data are compatible with strong selection on the human lineage.
·         All of the observed human-specific changes in HAR1–HAR5 occurred after human diverged from chimp.
·         These findings are in agreement with the hypothesis, first proposed by King and Wilson in 1975, that the majority of chimp-human phenotypic differences can be explained by differential control of transcriptional networks [8] which may be expected to occur primarily in the non-coding DNA and in particular in the HAR regions identified (own italics).
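For readers who want to see what the index of dispersion looks like in practice, here is a small Python sketch. Under a steady, Poisson-like molecular clock the variance in substitution counts equals the mean, so the ratio is about 1; a burst of change on one lineage pushes it well above 1. The substitution counts below are invented for illustration and are not taken from the paper, and using the sample variance is my own choice.

```python
from statistics import mean, variance


def index_of_dispersion(substitution_counts):
    """Ratio of the sample variance to the mean of substitution counts
    observed on several lineages for the same region."""
    average = mean(substitution_counts)
    if average == 0:
        raise ValueError("index of dispersion is undefined when the mean is zero")
    return variance(substitution_counts) / average


# Hypothetical counts for one region on five lineages; the fourth lineage
# stands in for a branch with a burst of substitutions.
steady_clock = [1, 2, 3, 2, 2]
one_fast_lineage = [0, 1, 0, 14, 1]

steady_ratio = index_of_dispersion(steady_clock)
burst_ratio = index_of_dispersion(one_fast_lineage)
print(steady_ratio, burst_ratio)
```

The first list gives a ratio below 1, while the list with one fast lineage gives a ratio far above 1, which is the kind of signal the authors describe for HAR1–HAR5.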
In the second paper of 2006, Pollard et al. looked more closely at the top-ranked region of significant evolutionary acceleration. They reported that the most dramatic of these ‘human accelerated regions’, HAR1, "is part of a novel RNA gene (HAR1F) that is expressed specifically in Cajal–Retzius neurons in the developing human neocortex from 7 to 19 gestational weeks, a crucial period for cortical neuron specification and migration. HAR1F is co-expressed with reelin, a product of Cajal–Retzius neurons that is of fundamental importance in specifying the six-layer structure of the human cortex".
In other words, the change in the HAR1 region is more than likely responsible for humanity's differences with respect to higher functions such as sensory perception, generation of motor commands, spatial reasoning, conscious thought, and language.
The impact of these two papers was immense. The identification of the HARs opened the door for researchers to investigate the differences between us and our nearest hominid relative, the Chimpanzee, and to answer the question of what TRULY makes us human.
This is all well and good, but HOW did Pollard et al. accomplish this? Her popular science piece of 2012 for Scientific American (4) explains the process of hunting for the differences between the Chimp and Human genomes:
“To facilitate the hunt, I wrote a computer program that would scan the human genome for the pieces of DNA that have changed the most since humans and chimps split from a common ancestor. Because most random genetic mutations neither benefit nor harm an organism, they accumulate at a steady rate that reflects the amount of time that has passed since two living species had a common forebear (this rate of change is often spoken of as the “ticking of the molecular clock”). Acceleration in that rate of change in some part of the genome, in contrast, is a hallmark of positive selection, in which mutations that help an organism survive and reproduce are more likely to be passed on to future generations. In other words, those parts of the code that have undergone the most modification since the chimp-human split are the sequences that most likely shaped humankind.
In November 2004, after months of debugging and optimizing my program to run on a massive computer cluster at the University of California, Santa Cruz, I finally ended up with a file that contained a ranked list of these rapidly evolving sequences.”
 
Pollard further explains what she did next:
 
“We spent the next year finding out all we could about the evolutionary history of HAR1 by comparing this region of the genome in various species, including 12 more vertebrates that were sequenced during that time. It turns out that until humans came along, HAR1 evolved extremely slowly. In chickens and chimps—whose lineages diverged some 300 million years ago—only two of the 118 bases differ, compared with 18 differences between humans and chimps, whose lineages diverged far more recently. The fact that HAR1 was essentially frozen in time through hundreds of millions of years indicates that it does something very important; that it then underwent abrupt revision in humans suggests that this function was significantly modified in our lineage.”
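The numbers in that passage can be turned into a rough back-of-the-envelope rate comparison. The Python sketch below is only that: it divides differences by sites and by time since divergence, assumes a round 6 million years for the human–chimp split, and ignores the fact that differences accumulate along both branches. The function name is mine.

```python
def rate_per_site_per_myr(differences, sites, divergence_myr):
    """Crude rate: differences per site per million years since divergence."""
    return differences / (sites * divergence_myr)


HAR1_LENGTH = 118  # bases, as quoted above

# 2 differences between chicken and chimp over roughly 300 million years.
chicken_chimp_rate = rate_per_site_per_myr(2, HAR1_LENGTH, 300)

# 18 differences between human and chimp; 6 million years is an assumed
# round figure for the split, not a value from the quoted passage.
human_chimp_rate = rate_per_site_per_myr(18, HAR1_LENGTH, 6)

fold_increase = human_chimp_rate / chicken_chimp_rate
print(f"chicken-chimp: {chicken_chimp_rate:.2e} per site per Myr")
print(f"human-chimp:   {human_chimp_rate:.2e} per site per Myr")
print(f"fold increase: {fold_increase:.0f}")
```

With these inputs the human–chimp comparison comes out roughly 450 times faster per site per million years than the chicken–chimp one, which gives a feel for what "abrupt revision" means here.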
 
The result was the two papers outlined above. A nice illustration accompanies her earlier 2009 piece (5), also in Scientific American.
 
Photo credit: Pollard (5)
 
Since then a huge amount of research has gone into looking at the HARs. In 2012 Pollard herself (4) summarised this work:
 
“HAR1 resides in two overlapping genes. The shared HAR1 sequence gives rise to an entirely new type of RNA structure, adding to the six known classes of RNA genes. These six major groups encompass more than 1,000 different families of RNA genes, each one distinguished by the structure and function of the encoded RNA in the cell... HAR1 is also the first documented example of an RNA-encoding sequence that appears to have undergone positive selection..”
 
“So, too, is the FOXP2 gene, which contains another of the fast-changing sequences I identified and is known to be involved in speech. ..FOXP2 extracted from a Neandertal fossil and found that these extinct humans had the modern human version of the gene, perhaps permitting them to enunciate as we do.”
 
“.. human brain volume has more than tripled since the chimp-human ancestor—a growth spurt that genetics researchers have only begun to unravel.
One of the best-studied examples of a gene linked to brain size in humans and other animals is ASPM. Genetic studies of people with a condition known as microcephaly, in which the brain is reduced by up to 70 percent, uncovered the role of ASPM and another gene—CDK5RAP2—in controlling brain size. More recently, researchers at the University of Chicago, the University of Michigan and the University of Cambridge have shown that ASPM experienced several bursts of change over the course of primate evolution, a pattern indicative of positive selection. At least one of these bursts occurred in the human lineage since it diverged from that of chimps and thus was potentially instrumental in the evolution of our large brains
.. Amazingly, more than half of the genes located near HARs are involved in brain development and function..”
 
A little more detail on HAR1 activity from Carta Anthropology (6):
 
“Human Accelerated Regions 1 (HAR1) is part of the cis-antisense RNA gene pair HAR1F and HAR1R, which are expressed in neurons during human embryonic cortical development and adult brain. HAR1 is conserved in amniotes as far back as frogs, but 18 base pair substitutions have occurred specifically in the human lineage leading to a secondary structure change in HAR1F that is unique to humans. HAR1F co-expresses with reelin, a protein important to the proper layering of the human cortex, suggesting an important role for HAR1 in human brain development. In addition, HAR1 expression is repressed by REST, and it has been hypothesized that changes in HAR1 expression may contribute to Huntington’s disease phenotypes.”

Photo credit: Pollard (5)

Back to Pollard’s 2012 article. Having pointed out some of the positive effects of the Human Accelerated Regions of our genome, Pollard notes that there are some negative consequences associated with HARs:
 
PtERV1 is a relic retro-virus that plagued ancient chimps, gorillas and humans living in Africa about four million years ago. Its effects can be found on the genes we have inherited from our ancestors.

Researchers reconstructed the original PtERV1 sequence and re-created this ancient retrovirus. They then performed experiments to see how well the human and great ape versions of the TRIM5α gene could restrict the activity of the resurrected PtERV1 virus. Their results indicate that most likely a single change in human TRIM5α enabled our ancestors to fight PtERV1 infection more effectively than our primate cousins could. However, these same shifts make it much harder for us to fight HIV. This finding is helping researchers to understand why HIV infection leads to AIDS in humans but less frequently does so in nonhuman primates.
 
Photo credit: Pollard (5)

In 2014 Pollard also co-authored a review article (7) with Hubisz on the work carried out on HARs:
 
“Transgenic [gene regulatory] enhancer assays also enable the activity of a human ncHAR sequence to be compared to its ortholog from chimpanzee or other mammals. Of 26 ncHAR enhancers that have been tested using both human and non-human primate sequences, seven drive human-specific expression patterns in mouse embryos at day 11.5. The tissues with differential expression are limb (HAR2, 2xHAR114), eye (HAR25), forebrain (2xHAR142, 2xHAR238), and the midbrain–hindbrain boundary (2xHAR164, 2xHAR170). The functional implications of these expression differences remain to be discovered, but it is tempting to speculate that changes in the development of these tissues could influence human anatomy and traits such as fine motor skills, spoken language, and cognition.”
 
In other words, non-coding HARs (ncHARs) are implicated in the development of the limbs, the eye, the forebrain and the midbrain–hindbrain boundary, and thus may affect the development of fine motor skills, language and the higher reasoning skills seen in humanity.

The 2012 article in Scientific American had some people bamboozled though. One online comment (8) in particular made me smile:


“If the 118 base pair sequence that makes up HAR1 have been so highly conserved over 300 million years with only 2 base pair substitutions since chickens and chimps diverged, what type of natural selection process could account for 18 base pair changes in the span of 6 millions years since we split with the chimps. And why haven't we seen examples of 4, 6, 8 or more base pair variations in any other species? Is it possible that the only viable genetic variation for the HAR1 sequence would be the ancestral and the human versions, with nothing in between? If so, what are the odds that random mutation could be responsible for 18 base pair changes all occurring at the same time in such a highly conserved piece of DNA code? I think these questions should be answered by the author!”

With a little thought, the commenter could have answered his own question. During the estimated six million years (too low a number, in my opinion) since our split from our last common ancestor with Chimpanzees, humanity has been on a HUGE genetic odyssey. Which species led to, and/or contributed genes to, humanity among Ardipithecus, Australopithecus, Homo habilis, Homo erectus, Homo ergaster, Homo heidelbergensis and lastly Homo neanderthalensis is still an open question. However, since interbreeding between Neanderthals and humans was discovered through the sequencing of Neanderthal genomes, much work has been carried out to understand in which species HARs first began to appear. In their review article (7) Hubisz and Pollard give their take on when HARs first appear on the human tree:
 
“Genomes from archaic hominins and diverse modern humans provide information about when along the human lineage HAR mutations arose. We analyzed ncHARs for mutations shared with a Neanderthal [11] and a Denisovan [12] using other primates (100-way alignments;http://genome.ucsc.edu) to polarize differences. We estimate that 7.1% of human–chimp differences in ncHARs occurred after divergence from archaic hominins and 2.7% are shared. The post-archaic fraction is similar to that observed in targeted sequencing of HARs captured from an Iberian Neanderthal fossil [31]. Compared to chimp–human differences in flanking regions and phastCons elements, those in ncHARs are significantly more likely to be pre-archaic (90% show derived allele only in Neanderthal and Denisovan; both P < 0.01). Thus, the archaic hominins provide some evidence for a depletion of accelerated evolution in the past 1 million years of human evolution compared to earlier in our lineage.”

(Note: nested references in this passage are given below under ‘Hubisz and Pollard references’.)

Basically, my reading of the above is that 92.9% (100 - 7.1) of the human–chimp differences in ncHARs arose BEFORE the split between the ancestors of modern humans and the archaic species. In other words, MOST of this accelerated evolution had occurred well BEFORE we, modern humans, emerged in Africa ca. 200,000 years ago. Whoa! Now there’s something to ponder!
 
So to answer the commenter’s query: “Of course the base pair changes didn’t all happen at once; they happened during our genetic journey, but mainly early on.”

 
Readers of my blog may wonder where I am going with all this in the light of the other types of post on this blog.. Well all I can say for now is read the statement of intent at the top of the blog and give it a guess. Best answer wins a copy of ‘The Last Giant of Beringia’..
 
References 
1. Chimpanzee Sequencing and Analysis Consortium (2005) Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437: 69–87.
Full article available at
http://www.nature.com/nature/journal/v437/n7055/full/nature04072.html
 
2. Pollard K.S., Salama S.R., King B., Kern A.D., Dreszer T., Katzman S., Siepel A., Pedersen J.S., Bejerano G., Baertsch R., et al.
Forces shaping the fastest evolving regions in the human genome. PLoS Genet. 2006;2:1599–1611.
Found at http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.0020168#pgen-0020168-b007
 
3. Pollard K.S., Salama S.R., Lambert N., Lambot M.A., Coppens S., Pedersen J.S., Katzman S., King B., Onodera C., Siepel A., et al. An RNA gene expressed during cortical development evolved rapidly in humans. Nature. 2006;443:167–172.
Abstract available at http://www.ncbi.nlm.nih.gov/pubmed/16915236
 
4. Pollard K.S., 2012. Secrets of Our Success. What makes us different? Scientific American Volume 22, Issue 1s
 
5. Pollard K.S. What makes us Human? Sci Am. 2009 May; 300(5):44-9.
 
6. Carta Anthropology.  Retrieved from:
http://carta.anthropogeny.org/moca/topics/human-accelerated-region-1-har1
 
7. Melissa J Hubisz and Katherine S Pollard. Exploring the genesis and functions of Human Accelerated Regions sheds light on their role in human evolution. Current Opinion in Genetics & Development 2014, 29:15–21
Download at http://www.sciencedirect.com/science/article/pii/S0959437X14000781
 
8. Comments on “What makes us Human”. Retrieved from:
http://www.scientificamerican.com/article/what-makes-us-human/
 
9. ENCODE: the rough guide to the human genome By Ed Yong 9/5/2012
Found at http://blogs.discovermagazine.com/notrocketscience/2008/06/14/rna-gene-separates-human-brains-from-chimpanzees/

 
Pollard 2 References
1. Chimpanzee Sequencing and Analysis Consortium (2005) Initial sequence
of the chimpanzee genome and comparison with the human genome.
Nature 437: 69–87.

2. Enard W, Przeworski M, Fisher S, Lai C, Wiebe V, et al. (2002) Molecular
evolution of FOXP2, a gene involved in speech and language. Nature 418:
869–872.

3. Holden C (2004) The origin of speech. Science 303: 1316–1319.

4. Varki A (2000) A chimpanzee genome project is a biomedical imperative.
Genome Res 10: 1065–1070.

5. Waterston R, Lindblad-Toh K, Birney E, Rogers J, Abril JF, et al. (2002)
Initial sequencing and comparative analysis of the mouse genome. Nature
420: 520–562.

6. Rat Genome Sequencing Project (2004) Genome sequence of the brown
Norway rat yields insights into mammalian evolution. Nature 428: 493-521.

7. Pollard KS, Salama SR, Lambert N, Coppens S, Pedersen JS, et al. (2006) An
RNA gene expressed during cortical development evolved rapidly in humans. Nature.
E-pub ahead of print 16 August 2006.

8. King MC, Wilson AC (1975) Evolution at two levels in humans and
chimpanzees. Science 188: 107–116.

Hubisz and Pollard references
11. K. Prufer, F. Racimo, N. Patterson, F. Jay, S. Sankararaman, S. Sawyer, A. Heinze, G. Renaud, P.H. Sudmant, C. de Filippo, et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature, 505 (2014), pp. 43–49
Pdf download available at http://dash.harvard.edu/handle/1/12717373?frbrVersion=10
 
12. M. Meyer, M. Kircher, M.T. Gansauge, H. Li, F. Racimo, S. Mallick, J.G. Schraiber, F. Jay, K. Prufer, C. de Filippo, et al. A high-coverage genome sequence from an archaic Denisovan individual. Science, 338 (2012), pp. 222–226

31. H.A. Burbano, R.E. Green, T. Maricic, C. Lalueza-Fox, M. de la Rasilla, A. Rosas, J. Kelso, K.S. Pollard, M. Lachmann, S. Paabo
Analysis of human accelerated DNA regions using archaic hominin genomes
PLoS ONE, 7 (2012), p. e32877
 
