The Tree of Genes

I think that I shall never see/a tree as complex as a tree (with apologies to Joyce Kilmer)

By this time in your life (however old you are), you have heard the letters DNA and RNA countless times… or at least once each. You have been told about genomes, genes, and chromosomes, although you may have muddled which is which. You may have a sense that someone somewhere has mapped out the complete human genome and that it is a large bowl of alphabet soup… with a very short alphabet of four letters: A, C, T, and G. You may even have seen references to the genomes of other creatures going through the process of “sequencing,” even if this sequencing business remains a mystery.

This is the thing… all of this genetic data has implications for how biologists (and lay people) have thought about the tree of life since we started grouping similar-looking creatures together into domain, kingdom, phylum, class, order, family, genus, and species. This “tree” metaphor started taking root (yuk-yuk) in the Middle Ages and became more complicated as more creatures were discovered (this being the usual term that we use when we find creatures that knew they were there all along without being “discovered”). The way we all look is how our genetic makeup is expressed. That genetic makeup carries a much more complicated message about how we are all (by all I mean all living entities on this planet) related.

haeckel_arbol_bn
Attribution

First, all living entities contain some type of polymeric nucleotide, whether it is single-stranded DNA or RNA or double-stranded DNA or RNA. But let’s back up. What is life? The generally accepted biological definition is that life must exhibit seven characteristics:  organisms maintain homeostasis, are composed of cells, undergo metabolism, can grow, adapt to their environment, respond to stimuli, and reproduce. This definition covers most of what we perceive as living with the exception of viruses (there is more on those critters at What is life?). A lot of that definition is dependent on polymeric nucleotides in one way or another. There are other definitions of life but this is sufficient for our purposes.

The number of base pairs determined in the human genome is about 3,079,843,747 (3.1 billion). A base pair is either adenine paired through hydrogen bonding to thymine OR guanine paired through hydrogen bonding to cytosine (fun fact: guanine got its name from the biological source from which it was isolated – seabird “guano,” more commonly referred to as poop):

dna_chemical_structure-svg
Attribution

In RNA, thymine is replaced by uracil, another nucleobase. (A base attached to a ribose or deoxyribose sugar is called a nucleoside; nucleosides do not have the phosphate “backbone” that links the sugars together to form RNA or DNA chains, respectively.)
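For readers who like to see rules written down as code, here is a minimal Python sketch of the two ideas above: A pairs with T and G pairs with C, and RNA uses U where DNA uses T. The function names and the sample sequence are made up for illustration, and the second function only swaps the alphabet; it does not model how a cell actually copies DNA into RNA.

```python
# Pairing rules for DNA bases, as described in the text.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base that pairs with each letter of a DNA strand."""
    return "".join(DNA_PAIRS[base] for base in strand.upper())


def to_rna(strand: str) -> str:
    """Write a DNA sequence in the RNA alphabet by swapping T for U."""
    return strand.upper().replace("T", "U")


example = "GATTACA"  # hypothetical sequence, for illustration only
print(complement(example))  # CTAATGT
print(to_rna(example))      # GAUUACA
```

Taking the complement twice returns the original strand, which is one small reason the double helix is such a tidy way to store a message.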

Within this 3.1 billion base pair code of life, there are currently estimated to be about 21,000 genes, each of which is a message for creating a protein of one type or another, some of which are structural proteins (i.e. make up parts of our bodies) and some of which have a wide variety of functions (e.g. digesting carbohydrates, initiating certain activities in a cell, etc.). The 3.1 billion base pairs are distributed across twenty-three chromosomes, which are packages of DNA with between 50 and 2,000 genes each (fun fact: the chromosome with only 50 genes is the Y chromosome that must be present for a male to be “expressed;” the Y chromosome as a whole contains about 59,000,000 base pairs). These twenty-three are paired with a matching set (one set inherited from each parent) so that most cells contain forty-six chromosomes, useful during cell division, reproduction, and all that crucial life business. The following illustration may help with the geography of DNA, genes, and chromosomes and how they are related:

chromatin_structures
The major structures in DNA compaction: DNA, the nucleosome, the 10-nanometer “beads-on-a-string” fiber, the 30-nanometer fiber, and the metaphase chromosome.
(Attribution)

An odd factor is that there are long segments interspersed among these genes that do not seem to have any function: as far as has been determined to date, they are not transcribed into RNA or translated into any kind of protein. They are like sets of blank pages interspersed in groups within a novel, which is entirely readable without the blank pages. Some geneticists hypothesize that these unused segments are bits of genetics that may have been used by some previous version of ourselves or may be conserved from other creatures in our long evolutionary history but have gone silent for so long that they are like scaffolds left standing outside (or inside) a completely finished building; they aren’t needed but there they are!

So here’s another interesting fact. Although we think quite highly of ourselves, we are by no means the creature on this planet with the greatest number of chromosomes. With 46 chromosomes per cell, we sit snugly between the rabbit (44) and the chimpanzee, gorilla, and hare (48), or if you prefer, between the Syrian hamster and cultivated tobacco. The kingfisher has 132 bite-sized nuggets of genetic material, while the adder’s tongue fern contains about 1,200 chromosomes.

ophioglossum_closeup
Ophioglossum vulgatum contains roughly 1,200 chromosomes compared to our measly 46

Does this mean we’re less or more than any other life form? No. The genes on these varying numbers of chromosomes occur in different zones of their distinct strands of DNA and are translated in different ways. There are expressed similarities between humans and rabbits (e.g. we both have ears that stick out from our heads, erm, we’re both mammals who carry our young internally) and between humans and gorillas (e.g. 96% of base pair sequences are identical; if only the gene sequences are compared, that figure rises to 98%), but the genes code for different traits in the final results (i.e. there are similarities but we’re obviously not “the same”).
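A figure like “96% identical” comes down to counting matching letters. Here is a rough Python sketch of that counting, assuming the two sequences have already been lined up to the same length (real genome comparisons have to cope with insertions, deletions, and rearrangements, which this ignores). The function name and both sequences are invented for illustration and are not real human or gorilla DNA.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must already be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Two hypothetical 25-letter sequences that differ at a single position.
first = "ACGTACGTACGTACGTACGTACGTA"
second = "ACGTACGTACGTTCGTACGTACGTA"
print(percent_identity(first, second))  # 96.0
```

One changed letter in twenty-five already gives 96%; stretched over billions of base pairs, a few percent leaves room for a very large number of differences.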

The implication of genetics is that the “tree of life” model is less about external or anatomical similarities and differences than about how genes on chromosomes are expressed. This has resulted in various reconsiderations of how the “tree” concept can be replaced by more genetically-based models. The bottom line is this: how are sequences of base pairs (DNA) interpreted by RNA and various resulting proteins to produce the life forms all around us? As usual, the models are more complicated.

The following is a “tree of life” based on genomic sequences. How do you read this (aside from the fact that it is impossibly rich in detail and the print is illegible)? At the center of the diagram are a couple of lines radiating out from a short line. Each of these lines branches into two more, those branch into two again, and this goes on until you end up with various genus and species names arrayed around the edge of the circle. The genera and species that came before the displayed names are not listed; if this diagram is complicated, imagine what a mess it would be if all life forms were listed!

tree_of_life_svg-svg
For a higher resolution version of this file, please go here.

This does not resemble the classical Haeckel tree of life anymore… except, perhaps, if one imagined looking down on a tree from directly above and saw all the branches radiating out from the trunk.
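The “keep branching until you reach names at the rim” reading can be sketched in a few lines of Python. In this toy version each tuple is an unnamed branch point and each string is a named tip; the five groups and their arrangement are a drastically simplified assumption chosen for illustration, not data taken from the diagram.

```python
# Each tuple is an unnamed branch point; each string is a named tip at the rim.
# The grouping is a heavily simplified illustration, not real genomic data.
toy_tree = ("bacteria", ("archaea", ("plants", ("fungi", "animals"))))


def tips(node):
    """List the named tips under a node, reading left to right."""
    if isinstance(node, str):
        return [node]
    names = []
    for child in node:
        names.extend(tips(child))
    return names


def depth(node):
    """Count the branch points between this node and its deepest tip."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(child) for child in node)


print(tips(toy_tree))   # ['bacteria', 'archaea', 'plants', 'fungi', 'animals']
print(depth(toy_tree))  # 4
```

Only the tips carry names, just as in the circular diagram; the branch points stand in for ancestors that are not listed.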

There’s a fascinating website called Open Tree of Life taking shape as more and more genomic data rolls in from laboratories around the world. This model really gets into the details of how these various branches articulate outwards and become the huge variety of life, some of which we have seen but most of which we will never see as we are typically only vaguely familiar with the life forms that make up our immediate environment. To interact with Open Tree of Life (OTL), click on any round node in the model and a new page will open with all the critters mapped to date that fall under that category. To date, the OTL project only has about 50,000 creatures in this database; there are estimated to be 2.6 million separate life forms known, so there is quite a bit of work to do to complete the project. Nonetheless, this is a far more accurate and dynamic view of life on earth than the featured image ever could have hoped to illustrate.

A really great outcome of this project is something called OneZoom. This website starts with a clickable version of the image below (instructions: go to the website and click on any of the bubbles):

top-level-onezoom-map
Snipped from OneZoom

Each click will take you down that particular rabbit hole of life forms. I suggest clicking on the ladybug: (1) you are likely to have some familiarity with these happy-looking little insects (unless you’ve had a swarm of them in your area) and (2) you are probably going to be familiar with the other creatures you encounter down that particular node of life.

One other site you may like to visit and browse (it will take you a while – set aside a few years… or just nibble!) is the Encyclopedia of Life site. The site was the result of a talk by Edward O. Wilson, one of the great biologists (and certainly the greatest myrmecologist) of our current era (and perhaps of all time). He’s no spring chicken, so be patient as he tells you about his love of all of these critters that should be our friends and fellow travelers.

To close out this episode of “It’s More Complicated Than You May Have Not Been Considering in the First Place,” here is a Smithsonian Institution site for a standing exhibit on genomes:

https://unlockinglifescode.org

And here is a website that delves into genetics in more detail:

https://ghr.nlm.nih.gov/primer

And just because genetics should never be considered without attempting to understand the complex nuttiness of life within any individual cell, feast your eyes and ears on this video (prepare to be astonished at all the work you’re doing while you’re attempting to relax):

One final sentence: If you’re young and scientifically inclined, consider learning as much as you can about genetics and its affiliated disciplines (e.g. sequencing technologies); this will be a growth area for decades to come.

Featured image

(As I am not a biologist or a geneticist, I hope I haven’t made egregious, unforgivable errors that might offend The Biology Yak et al.)

Author: makingsenseofcomplications

I have an academic background in literature and, separately, science. My career has been in industry in positions of increasing responsibility assisting in the drug development process - one of the most amazing intellectual pursuits of the human mind, among many other amazing intellectual pursuits. I am interested in films, philosophy, history, art, music, science (obviously), literature (also obviously), some video gaming, human behavior, and many other topics. I wish there was more time in every day because we have a world that is full of amazing phenomena that are considered too superficially by too many. Although my first and last names are fictional, I think I believe in all of the stuff you read here, although I retain the right in perpetuity of changing my thoughts about anything written herein.

12 thoughts on “The Tree of Genes”

  1. Very interesting blog, thank you for sharing. Complicated as the new Tree of Life is, it puts animals right next to fungi, which is just great. What I don’t get is, where are all the plants?

    1. There are several “tree” diagrams in the piece. The most useful for answering your question is the diagram from OneZoom. If you click on the ladybug, it will take you to another set of branches. As animals and plants are both eukaryotes, that is where the plants live. The large disc is just a tree of various bacteria, huge though it is. And thank you!

    1. I would have to know where you are to answer the question 😊 I ask myself the same every time I find myself at the grocery trying to decide which of countless brands to buy.
      More seriously, we are terrible at appreciating the amount of time that has passed to “get here.” So, lots of time and a tremendous number of incremental changes in our four-letter alphabet over that time is an overly facile way of responding to a very complicated question. As a scientist, I, dear sir, do not know for a certainty but we keep on finding clues in history’s broad and deep landscape and that data is reinforced by what has been determined through genomics.

    1. Yes, and then there’s RNAi and all that it implies. The danger in choosing one of these topics is that there is so much there, so one has to chart a narrow path through. I was really just trying to talk about modifications to the phylogenetic tree model but so much has to be referenced just to get there!

  2. As always, an enlightening article m.
    There are a few points that I would like to say in this matter:
    1. The definition of life could embrace a few more life forms if it is cell/cells.
    2. I didn’t know about the origin of the word guanine. Ty.
    3. Introns as per my knowledge do have functions. They are more like policemen- regulatory fxn, transposon,variable splicing and yet unknown fxns.
    4. The sites you provided references to. Great idea👍

      1. Yes, there must be, we did evolve, we must have left behind a few things we don’t now need.
        My mistake I thought the empty areas you were referring to were introns (areas in DNA that don’t code for proteins)
