I plan to write quite a few posts that will require some knowledge of genetics. I had started writing a new post on a genetic disease I have exposure to, called hemochromatosis, and to do this I started writing about DNA and genetics in general. What I quickly realized is that my readers would have to be well versed in genetics to follow all the details, and that's a lot to ask. So I've decided to undertake the task of introducing DNA to you myself. The study of DNA and genetics is so fascinating that it's worthy of a detailed introduction. I will probably revise this post in the future to fill in any interesting and useful gaps, but for the moment, this is a solid grounding that will allow you to follow many genetics-related news stories and, hopefully, my future posts. I hope you enjoy learning about it as much as I did.
DNA
DNA stands for deoxyribonucleic acid, although that does not help describe it if you do not know what nucleic acids are, so let's discuss them. Nucleic acids are composed of compounds called nucleotides. Nucleotides are simply small compounds, and in DNA there are essentially only four types, distinguished by the base they carry: adenine, guanine, cytosine, and thymine, pictured below.
These are abbreviated to A, G, C and T for convenience. There are actually additional parts to these molecules, not shown in the picture above, and these are also important. They form what is called the backbone of DNA. You can think of it as akin to a scaffold, to which each of the A, T, G and C compounds is anchored. Like a scaffold, this backbone provides a degree of structural stability, which is important in biological molecules.
THE STRUCTURE OF DNA
In the same way that letters from a particular language come together to form legible words, A, T, G and C combine to form a "legible" linear sequence, or strand, of DNA. A strand could therefore be represented by a long sequence of letters such as "ATGCTGACCGGTAATGCCGTGCA". Indeed, this is how molecular biologists depict DNA sequences, and remarkably there is a lot of information embedded in this deceptively simple code. In fact, all of the information required to build a human being comes from a sequence just 3.2 billion letters long.
Interestingly, DNA does not normally exist as a single strand. Instead, two strands combine, aligning opposite each other. The alignment is not random, but follows a simple set of rules based on the chemistry of the A, T, G, and C compounds. Molecules of A must be paired with molecules of T, and molecules of G must be paired with molecules of C. So, for the sequence above, the double-stranded version would actually look like this:
ATGCTGACCGGTAATGCCGTGCA
TACGACTGGCCATTACGGCACGT
Once combined, these two strands twist around each other to form a 3D structure called a double helix. In the picture (left) you can see how one strand wraps around the other. This whole structure is what is meant when we refer to a molecule of DNA. There is actually a great video on all of this here, and I recommend it. A final point: we inherit one complete, double-stranded copy of our DNA from our mother and one from our father. Therefore, our genetic material is a 50:50 mix of our parents.
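The pairing rule described above (A with T, G with C) is simple enough to sketch in a few lines of Python. This is only an illustration: the names PAIRING and complement are made up for this post, and the example ignores strand direction, which real sequence tools have to take into account.

```python
# Base-pairing rules from the text: A pairs with T, and G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the partner strand, letter by letter (hypothetical helper)."""
    return "".join(PAIRING[base] for base in strand)


strand = "ATGCTGACCGGTAATGCCGTGCA"
partner = complement(strand)

# Print the two strands one above the other, as in the example above.
print(strand)
print(partner)
```

Applying the rule twice gets you back to the original strand, which is why either strand carries enough information to rebuild the other.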
WHAT DOES DNA DO?
In terms of what DNA does in the body, the best way to describe it is as an enormous instruction manual. This manual codes for the building of an entire cell, but since cells differ (liver cells are distinct from bone cells, blood cells and so on), different pages of the instruction manual are used to create each type. This feat of cellular engineering is performed via the production of molecules called proteins. More on these later, but for the moment, think of them as small autonomous building machines. They're very cool. DNA determines your sex, your eye colour, the number of fingers and toes you have, your ability to metabolise foods, and partially your intelligence. This is an exciting realisation, because what it says is that if you can change your DNA, you can change your physiology.
The regions of a DNA molecule that result in the production of a protein are called coding regions, or genes. They are essentially just small sections of one or other of the strands of DNA which have the right sequence of A, G, T, and C to produce a specific protein molecule. We as humans have approximately 20,000 genes, resulting in roughly 30,000 unique proteins (one gene can result in the production of more than one type of protein).
Because cells are very small, and there's not much space in there, DNA is packaged into more compact structures. DNA wraps itself around a set of small, roughly spherical proteins called histones, similar to winding in the string of a yo-yo. The picture below depicts the whole process very well. The end result of this compacting process is a new structure called a chromosome.
Annunziato, A. (2008) DNA packaging: Nucleosomes and chromatin. Nature Education 1(1):26
CHROMOSOMES
Different organisms have different numbers of chromosomes, but there is no relation between the number of chromosomes and the complexity of an organism. For example, a humble hedgehog is commonly listed as having 88 chromosomes, and some butterflies have over 200 pairs, whereas we have 46 chromosomes arranged as 23 pairs, with 23 coming from our mother and 23 from our father.
Chromosomes are important since they are how genetic material is passed from parents to offspring. A single chromosome will contain many genes. For example, chromosome 6 alone contains nearly 2,000 genes, some of which are known to be linked to diseases such as hemochromatosis, diabetes and epilepsy. If something goes wrong with the dishing out of chromosomes during the formation of an embryo, then the consequences can be severe. Patau syndrome, Edwards syndrome, and Klinefelter syndrome are all examples of this.
HEREDITARY GENETICS
Hereditary genetics refers to how genes are passed down from parents to offspring. Biologists will often refer to this as Mendelian genetics, after an Augustinian friar called Gregor Mendel who studied pea plants in the 1850s and 1860s. What Mendel discovered was that if you breed small pea plants with other small pea plants, the offspring will be small. Similarly, if you breed tall pea plants with other tall pea plants, the offspring will be tall. But what happens if you breed small pea plants with tall pea plants? Well, the answer depends on how dominant the gene for "tallness" actually is.
DOMINANT AND RECESSIVE GENES
The physical traits of the pea plants will actually depend on the type of gene that is being studied. You have probably heard the terms dominant and recessive before. A dominant trait is one that requires just one copy of the gene for the trait to occur in the offspring. A recessive trait requires two copies of the gene for the trait to occur in the offspring.
As an example, let's assume there is a gene called "T" that determines how tall a pea plant can be. All tall pea plants therefore must have the "T" gene. We know small pea plants don't have this gene, so these are denoted as "t". Think of it as a defunct gene for tallness. Since we have genetic material from the mother and the father, there are actually two copies we need to think about. With two versions, "T" and "t", these can combine in only four ways: "TT", "Tt", "tT", and "tt". "Tt" and "tT" are equivalent, since they differ only in which parent supplied which copy, but the other combinations are genetically distinct.
HETEROZYGOUS AND HOMOZYGOUS
These terms arise frequently in genetics, but they are very simple. Heterozygous simply means you have two non-identical copies of the same gene, so Tt or tT. Conversely, homozygous means you have two identical copies of the same gene, so TT or tt. Some diseases only arise if you are homozygous, i.e. you require two copies of the disease gene. You could also be what's called a "carrier", where you have one normal copy of the gene and one disease-carrying copy. You may not have the symptoms of the disease, but you could pass it down to your offspring. Genetics is sneaky like that!
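To make the terminology concrete, here is a small Python sketch that lists the four combinations of "T" and "t" and labels each one. The function names zygosity and phenotype are invented for this example, and it assumes tallness is a dominant trait, so a single "T" is enough to make a plant tall.

```python
from itertools import product


def zygosity(genotype: str) -> str:
    """Label a two-letter genotype such as "Tt" (hypothetical helper)."""
    first, second = genotype
    return "homozygous" if first == second else "heterozygous"


def phenotype(genotype: str) -> str:
    """Assumes "T" (tall) is dominant over "t" (small)."""
    return "tall" if "T" in genotype else "small"


# One copy from the mother and one from the father: TT, Tt, tT, tt.
genotypes = ["".join(pair) for pair in product("Tt", repeat=2)]

for genotype in genotypes:
    print(genotype, zygosity(genotype), phenotype(genotype))
```

Note that "Tt" and "tT" get identical labels, which is the sense in which they are the same combination.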
AUTOSOMAL AND X-LINKED
It is common to hear the terms autosomal or sex-linked when reading about genetic diseases. These terms refer to the type of chromosome on which the gene responsible for the disease is located. As mentioned, we have 23 pairs of chromosomes. The first 22 are simply labelled 1 through 22, but the last pair are the X and Y chromosomes. If you have two X chromosomes you are female; if you have an X and a Y chromosome (XY) you are male. The X and Y chromosomes are called sex chromosomes, while chromosomes 1 through 22 are called autosomal chromosomes. So, for example, cystic fibrosis is an autosomal recessive genetic disorder, which means it does not involve the X or Y chromosome. It's actually down to a small change in one gene on chromosome 7. Diseases caused by genes on the X or Y chromosomes are called sex-linked diseases, or X-linked when the gene is on the X chromosome. Colour blindness is an example of this. Because men only have one copy of the X chromosome they tend to suffer more, since a woman with two X chromosomes has a better chance of having one normal copy of the gene.
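The reason men are hit harder by X-linked conditions can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the condition is treated as recessive, "X" stands for a normal X chromosome, "x" for an X carrying the faulty gene, the mother is a carrier ("X", "x"), and the father is unaffected ("X", "Y"). The helper names are made up.

```python
from fractions import Fraction
from itertools import product


def is_affected(child: tuple) -> bool:
    """Affected only if every X copy the child has is the faulty "x"."""
    x_copies = [c for c in child if c != "Y"]
    return all(c == "x" for c in x_copies)


def affected_fraction(children: list) -> Fraction:
    """Share of the listed children who are affected."""
    return Fraction(sum(is_affected(c) for c in children), len(children))


mother = ("X", "x")  # carrier: one normal copy, one faulty copy
father = ("X", "Y")  # unaffected

# Each child gets one chromosome from the mother and one from the father.
children = list(product(mother, father))
sons = [c for c in children if "Y" in c]
daughters = [c for c in children if "Y" not in c]

print("sons affected:", affected_fraction(sons))
print("daughters affected:", affected_fraction(daughters))
```

Under these assumptions a son has no second X to fall back on, while a daughter always receives a normal copy from her father.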
THE PREDICTIVE POWER OF GENETICS
Because of the simple rules governing the behaviour of some genes, you can actually calculate the probability of a given trait being passed from parents to their offspring. So in the example above, if tallness is governed by a dominant gene, then "TT", "tT" and "Tt" will all be tall offspring, while "tt" will be short. The calculation can be represented using what are referred to as Punnett squares, shown below.
The genes provided by the mother are shown in red, and the genes from the father in blue. The results of the combinations are shown in orange and/or white. In the first example (a) both parents are heterozygous for being tall. The result of their mating is that 3 of the 4 possible gene combinations result in tall offspring (shown in orange). In (b) one parent, the mother, is heterozygous for being tall, while the father is homozygous for being small, so only 2 of the 4 gene combinations can result in a tall child. Finally, in (c) both parents are homozygous for being small, and they can only produce small children. Therefore, if the parents in (a) have a child, that child has a 75% chance of being tall. If they have two children, the odds of both of them being tall are 0.75 x 0.75 = 0.5625, or just over 56%. A similar calculation can be performed in example (b).
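The three Punnett squares above can be reproduced with a few lines of Python. This is a sketch rather than a genetics tool: the names punnett and chance_tall are invented for this post, and it assumes, as in the example, that "T" (tall) is dominant and that each of the four boxes in the square is equally likely.

```python
from fractions import Fraction
from itertools import product


def punnett(mother: str, father: str) -> list:
    """Return the four boxes of the square, e.g. ["TT", "Tt", "tT", "tt"]."""
    return [m + f for m, f in product(mother, father)]


def chance_tall(mother: str, father: str) -> Fraction:
    """Fraction of boxes containing at least one dominant "T"."""
    boxes = punnett(mother, father)
    tall = sum(1 for genotype in boxes if "T" in genotype)
    return Fraction(tall, len(boxes))


a = chance_tall("Tt", "Tt")  # both parents heterozygous
b = chance_tall("Tt", "tt")  # heterozygous mother, homozygous small father
c = chance_tall("tt", "tt")  # both parents homozygous small

print("(a)", a, "(b)", b, "(c)", c)
print("two tall children in (a):", a * a, "=", float(a * a))
```

Using Fraction keeps the results exact, so the two-child calculation comes out as 9/16 rather than a rounded decimal.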
FINAL REMARKS
One final thing. The term genome is becoming more common in the media. A genome is simply the complete set of DNA, including all of the genes, for a given organism. I have a genome which is different from yours. It contains ~20,000 genes, wrapped around histone proteins and tightly packaged into chromosomes, of which we have 46. The entire set of 20,000 genes codes for ~30,000 proteins, which together build our cells, move oxygen around our body, metabolise our food and allow us to see and move. Each gene is merely a segment of double-stranded DNA, made up of small compounds abbreviated to A, T, G, and C. So, really, it's almost as simple as A, B, C...
- Some interesting information on the human genome project
- A scientific paper on DNA packaging: Nucleosomes and Chromatin
- A very cool map of your chromosomes: Chromosome Map