The Human Genome Project
Scientific Prison. Sydney Brenner remarked that the task of mapping human DNA could be like a vast prison for scientists. He joked that offenders could be allocated a "stretch" of DNA to map, its length depending on the severity of the crime. Minor offenses such as borrowing a colleague's bottles of enzyme without asking would merit a kilobase of DNA, while scientific fraud would attract a punishment of a whole chromosome to analyze. (p. 62 in S. Aldridge 1996. The Thread of Life. Cambridge Univ. Press.)
The Human Genome Project is best understood as the 20th century's version of the discovery and consolidation of the periodic table (Lander 1996). In the period from 1869 to 1889 chemists realized that it was possible to systematically enumerate all atoms and to arrange them in an array that captured their similarities and differences. The building blocks of chemistry were rendered finite, and the predictability of matter gave rise to the chemical industry and the theory of quantum mechanics.
The Human Genome Project aims to produce biology's periodic table: not 100 elements, but 100,000 genes; not a rectangle reflecting electron valences, but a tree structure depicting ancestral and functional affinities among the human genes. The biological periodic table will make it possible to define a unique "signature" for each building block. Just as chemists can recognize atoms by mass and charge alone, biologists will be able to build detectors that allow each gene to be recognized from 20 well-chosen nucleotides or each protein from a distinctive fragment. As Lander (1996) states, "We live in a time of breathtaking transitions in the biological sciences. Molecular genetics has spawned a new revolution every decade and has now brought us to the brink of a global vista on life."
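The idea that 20 well-chosen nucleotides are enough to recognize a gene can be checked with simple counting. The sketch below is only an illustration and is not taken from Lander (1996): it assumes a genome of about 3,306 million bases (the sum of the chromosome sizes in Table 1) and treats DNA as if it were a random sequence, which real genomes are not. The function names are hypothetical.

```python
GENOME_BASES = 3_306_000_000  # assumption: sum of the Table 1 chromosome sizes


def possible_sequences(length):
    """Number of distinct DNA sequences of a given length (4 bases per position)."""
    return 4 ** length


def shortest_unique_length(genome_bases):
    """Smallest length whose sequence count exceeds the number of genome positions."""
    length = 1
    while possible_sequences(length) <= genome_bases:
        length += 1
    return length


print(possible_sequences(20))                  # 1099511627776
print(shortest_unique_length(GENOME_BASES))    # 16
print(possible_sequences(20) / GENOME_BASES)   # a few hundredfold headroom
```

Under these assumptions a 20-base signature leaves a margin of a few hundredfold over the bare minimum of 16 bases. Repeated sequences in real genomes are one reason the nucleotides have to be "well-chosen" rather than picked at random.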
Fundamental Principles of Genetics and Inheritance
Identification and Evaluation of Genes
Codon | Amino acid or signal |
UUU | Phenylalanine |
UUA | Leucine |
UUG | Leucine |
AUU | Isoleucine |
AUG | START |
UAA | STOP |
AAG | Lysine |
GAU | Aspartic acid |
GAG | Glutamic acid |
UGA | STOP |
Note that UUA and UUG are redundant codes: they both encode the amino acid leucine.
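The codon table above can be read as a simple lookup from triplets to amino acids. The following is a minimal sketch, assuming only the ten codons listed in the table (the full genetic code has 64) and a made-up example message; the names `CODON_TABLE` and `read_codons` are hypothetical.

```python
# Partial codon table: only the ten codons listed in the table above.
CODON_TABLE = {
    "UUU": "Phenylalanine",
    "UUA": "Leucine",
    "UUG": "Leucine",
    "AUU": "Isoleucine",
    "AUG": "START",
    "UAA": "STOP",
    "AAG": "Lysine",
    "GAU": "Aspartic acid",
    "GAG": "Glutamic acid",
    "UGA": "STOP",
}


def read_codons(mrna):
    """Split an mRNA string into triplets and look each one up.

    Reading stops after the first STOP codon. Codons missing from the
    partial table are reported as "unknown"; trailing bases that do not
    fill a whole triplet are ignored.
    """
    meanings = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        meaning = CODON_TABLE.get(mrna[i:i + 3], "unknown")
        meanings.append(meaning)
        if meaning == "STOP":
            break
    return meanings


message = "AUGUUUUUAUUGAAGUAA"  # hypothetical sequence, for illustration only
print(read_codons(message))
# ['START', 'Phenylalanine', 'Leucine', 'Leucine', 'Lysine', 'STOP']
```

The two adjacent leucines in the output come from different codons (UUA and UUG), which is the redundancy noted above.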
Genetic abnormalities are either monogenic or polygenic: the former are single-gene diseases (e.g. Tay-Sachs, sickle-cell anemia), and the latter are diseases caused by multiple genes (e.g. most cancers). About 3,000 genetic disorders have been identified (Dawson 1996). Genetic tests for diseases such as cystic fibrosis, breast cancer, colon cancer, and sickle-cell anemia are being developed or are already in use.
Biogenetic firms have rushed to market genetic tests for an ever-increasing number of gene-related disorders. In the future a diagnostician will turn to, say, page 250 of Chapter 10 and, comparing it to your genomic readout, see how a misspelling or typographic error in your DNA might give your family HCM (hypertrophic cardiomyopathy--thickening of the heart muscle). Your DNA is divided into introns, which are noncoding stretches of filler, and exons, which are the protein-coding regions. Within the exons are control elements, special DNA sequences that modulate the duration, the amplitude and the area (tissue site)--like the liver, brain or heart--of protein expression. Also within the exons are the codons--sequential triplets of DNA bases that specify which of the body's 20 amino acids will be added next to the long chain of amino acids that make up different proteins. Patients who have misspellings at codon 403 get one wrong amino acid in the chain of thousands. That is what a mutation is. Each genome is filled with misalignments. Some never manifest themselves, some cause allergies. But a misspelling at 403 can kill you, misshaping the head of the myosin protein in such a way as to impair the interaction of the main heart proteins--myosin and actin--causing the heart, second by second of every day over the course of a life, to improperly contract and thicken (from C. Siebert, New York Times Sunday Magazine, September 17, 1995).
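The effect of a single-codon "misspelling" can be sketched in a few lines. This is an illustration only: the four-codon message is invented and is not the myosin gene, the lookup uses a handful of codons from the table earlier in this section, and the helper names are hypothetical.

```python
# A few codons from the table earlier in this section.
CODONS = {
    "AUG": "START",
    "UUA": "Leucine",
    "UUG": "Leucine",
    "GAU": "Aspartic acid",
    "GAG": "Glutamic acid",
    "UAA": "STOP",
}


def translate(mrna):
    """Look up each complete triplet; codons outside the small table are 'unknown'."""
    usable = len(mrna) - len(mrna) % 3
    return [CODONS.get(mrna[i:i + 3], "unknown") for i in range(0, usable, 3)]


def mutate_codon(mrna, codon_number, new_codon):
    """Return a copy of mrna with the codon at 1-based codon_number replaced."""
    start = (codon_number - 1) * 3
    return mrna[:start] + new_codon + mrna[start + 3:]


def changed_positions(before, after):
    """1-based codon numbers where the encoded amino acid differs."""
    pairs = zip(translate(before), translate(after))
    return [n for n, (a, b) in enumerate(pairs, start=1) if a != b]


normal = "AUGGAGUUAUAA"                     # hypothetical: AUG GAG UUA UAA
missense = mutate_codon(normal, 2, "GAU")   # glutamic acid becomes aspartic acid
silent = mutate_codon(normal, 3, "UUG")     # leucine stays leucine

print(changed_positions(normal, missense))  # [2]
print(changed_positions(normal, silent))    # []
```

The first change swaps exactly one amino acid in the chain, as in the codon 403 example; the second changes the DNA but not the protein, one way a misspelling may never manifest itself.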
MOLECULAR BIOLOGY
Molecular biology has emerged during the last decade as one of the most profound developments in the biological and biomedical sciences. Study of life at the molecular level is creating a basic epistemological shift in biological research from an approach that is hypothesis-driven to one that is discovery-driven. Broad acceptance of this new strategy is having a major impact on how scientific research is funded and conducted at national and international levels. The fundamental approach of molecular biology is the study of life through genomics, which is leading to the creation of a universal periodic table of life that will reflect common genetic properties and patterns of ancestral and functional affinities among the genes of both plants and animals, thereby unlocking the record of 3.5 billion years of evolutionary innovation. Comparison of related organisms will reveal regulatory regions and key architectural features of proteins that can be used as Rosetta stones for translating and understanding informational pathways and for deciphering biological complexity. Ultimately the field is leading to the creation of global tools of genomics that will revolutionize the medical, health, environmental and agricultural sciences.
The transition of molecular biology from a research "tool" to a transformational concept began in the mid-1980s with the creation of the Human Genome Project (HGP). Although the original objective of the HGP was to create a sequence map for humans, it quickly became clear that the same approach could benefit from knowledge of the genomic sequences of model organisms such as bacteria, yeast, nematodes, Drosophila and mice. The field of informatics began to emerge as the demands for storing, processing and analyzing the enormous amounts of sequence data rapidly increased.
Genomics research is reframing biology as an informational science concerned with how to decipher and manipulate information of three types: (1) the one-dimensional digital code of genes and chromosomes, the new science of which is referred to as genomics (the study of many genes); (2) the three-dimensional structure of proteins (the so-called folding problem), which catalyze life as well as give it shape and form, the new science of which is called proteomics (the study of many proteins); and (3) four-dimensional complex systems and networks such as the brain, which involve emergent properties of memory, consciousness and the ability to learn. This third area is referred to as systems biology and is concerned with identifying elements, determining function, correlating expression, and defining subsystems.
Deciphering information classified according to this scheme provides a framework for intervention through three types of genomic control: (1) transcriptional control between the genomic sequence and mRNA; (2) translational control between mRNA and the protein product; and (3) post-translational control between the protein product and the functional protein product. The importance of this framework is that it underscores the concept of information flow from genome through system and provides epistemological continuity for designing experiments and analyzing data from a wide range of model species.
Indeed, the concept of comparative genomics is based on the unity of life: the informational pathways of different organisms have shared processes, genes, regulatory regions, and even chromosome functions. For example, Stanford biologists discovered that a set of highly conserved proteins is encoded by a minority of genes in two organisms whose genomes have been completely sequenced, the nematode and yeast. These genes carry out the core biological processes shared by these two eukaryotes, including intermediary metabolism, DNA and RNA metabolism, protein folding and degradation. It is likely that these same core processes are conserved in mammals, including humans.
IMPORTANCE OF MOLECULAR BIOLOGY AND GENOMICS
Both the short- and long-term importance of sequence-based biological research is profound. First, the agriculture of the future will be based on precisely honed genetic fitness of the domestic ungulates (sheep, goats, cattle, pigs), of poultry (chickens and turkeys), and of the main grasses (wheat, rice, maize), crucifers and legumes. Knowledge gained in a few major crops, ungulates or poultry species can be pooled and applied across the board. This is because the order of genes in most of the related species is conserved. Thus genetic engineering in one species for resistance to disease, to parasite and/or insect attack, for rapid growth, or for quantity and quality of product can be used in the other related ones. Advances in one species will have a multiplier effect.
Second, the conventional quantitative genetic or artificial selection approach to plant and animal breeding will be replaced by the use of genetic engineering and cloning. This will change both the quantitative and qualitative traits of domestic plants and animals. Quantitative traits can be perpetuated through cloning (dairy cows with the highest milk production, beef cattle with the greatest rates of gain, and sheep with the most prolific growth of wool), creating herds of hyper-producers. But no species of plant or animal possesses genes to produce all proteins or amino acids. Thus qualitatively different traits can be genetically engineered, such as blue roses, rice varieties with maize-type proteins, and cows that produce goat's milk. These types of plants and animals could never be created through selection since their genomes do not have genes to produce these proteins or traits.
Third, new biomedical and agricultural technologies are emerging, many of which did not even exist in concept several decades ago. These include the use of transgenic domestic animals such as cows, sheep, and goats as bioreactors ("pharming"): living factories that produce therapeutic human proteins using a promoter that directs the expression of these proteins to the mammary glands for milk-derived proteins (e.g. insulin, hepatitis B vaccine) or to the blood (e.g. haemoglobin). Another emerging area is xenotransplantation, the harvesting of organs (kidneys, hearts) from transgenic animals such as pigs for transplantation into humans. Molecular approaches are being used in an attempt to discover a protein which will "cloak" the foreign organ in the human body to prevent rejection.
Fourth, comparative biology will help to identify enhancements for humans as well as for domestic plants and animals that lie within the realm of possibility through their existence in other living creatures. The enhancements for humans might include hearing (bats), olfaction (canines), vision (falcons), and oxygen efficiency (crocodiles) and for domestic animals could include the incorporation of genes from selected wild species for rapid growth, high speed, milk and eggs with high protein content, large or small size, docile or aggressive behavior, intelligence or search specificity. Once genomic sequences are known and gene functions specified, genetic engineering for a host of enhancements may become routine.
Fifth, future ethical, legal and social efforts will require acute scientific vision to anticipate the problems and propose safeguards. For example, individuals will be faced with the choice of whether to obtain global views of their own genomes and the need to interpret the information. Genomics research has implications for the genetics of intelligence, propensities for substance abuse and addiction, for homosexuality, for risk taking and for impulsive and violent behaviors. Issues surrounding the privacy of genetic information, genetic counseling such as for preimplantation diagnosis of embryos, for therapeutic abortion, and for germline engineering will pose important ethical and legal problems for social scientists and family planning specialists. Economists will be increasingly concerned with implications ranging from natural resource economics to marketing, development and international trade; textile scientists with designer fabrics; and environmental scientists with toxin-eating bacteria. In short, all aspects of academia will be impacted by this new biology.
Future Goals of Genomics (after Lander 1996)
The current goals of the Human Genome Project include: (1) developing a physical map of the chromosomes; (2) sequencing; and (3) determining gene function. The future goals include the following:
Issues arising
"A genetic revolution has been occurring in biology for several decades, and it is rapidly affecting the population at large. The development of the Human Genome project is fueling the revolution, accelerating the discovery of new genetic connections to old human problems. First, the targets will be the genetics of disease, then the genetics of deviant behavior, and then, as many feel is likely, the genetics of human enhancement." (from Conrad, P. 1996. Growing concerns, Science 274, 1147).
LITERATURE
Anderson, W. F. 1998. Human gene therapy. Nature, 392 (supp):25-30.
Dawson, K. 1996. Genetics: a scientific sketch. Pp. 5-12 in: Birth to Death. D. C. Thomasma and T. Kushner (Eds.). Cambridge University Press, Cambridge.
Lander, E. S. 1996. The new genomics: global views of biology. Science 274:536-539.
Schuler, G. D. et al. 1996. A gene map of the human genome. Science 274:540-546.
Schwartz, R. 1996. Genetic knowledge: some legal and ethical questions. Pp. 21-34 in: Birth to Death. D. C. Thomasma and T. Kushner (Eds.). Cambridge University Press, Cambridge.
Table 1. Number of megabases (Mb; mega = million) in each of the 22 autosomes plus the X and Y human chromosomes (Chr), with example diseases or traits associated with genes on each (from Schuler, G. D. et al. 1996). For example, chromosome 1 has approximately 236 million bases (i.e. adenine, cytosine, guanine or thymine).
Chr | Mb | Example |
1 | 236 | Gaucher disease; Alzheimer's |
2 | 255 | some colon cancers; Waardenburg syndrome (deafness, etc.) |
3 | 214 | one type of lung cancer |
4 | 203 | Huntington's disease; Ellis-van Creveld syndrome (6-fingered dwarfism) |
5 | 194 | diastrophic dysplasia; plant homolog of human steroid |
6 | 183 | juvenile onset diabetes |
7 | 171 | association with cystic fibrosis |
8 | 155 | Werner's syndrome (premature aging); Burkitt lymphoma |
9 | 145 | melanoma associated with mutation; tuberous sclerosis |
10 | 144 | multiple endocrine neoplasia; gyrate atrophy of eye retina |
11 | 144 | multi-disease system w/predisposition to cancers; cardiac arrhythmia |
12 | 143 | Zellweger syndrome; susceptibility to phenylketonuria |
13 | 114 | Wilson's disease (basal ganglia of brain); breast cancer |
14 | 109 | Alzheimer's disease associations |
15 | 110 | Marfan syndrome |
16 | 131 | adult polycystic kidney disease |
17 | 92 | many mutations associated with early-onset breast and ovarian cancer |
18 | 85 | loss of DPC4 gene causes pancreatic cancer to grow rapidly |
19 | 67 | myotonic dystrophy; coronary artery disease |
20 | 72 | severe immunodeficiency caused by missing enzyme, adenosine deaminase |
21 | 60 | Lou Gehrig's disease |
22 | 56 | Neurofibromatosis; DiGeorge syndrome |
X | 164 | mental retardation; Duchenne muscular dystrophy |
Y | 59 | testis-determining factor |
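The chromosome sizes in Table 1 can be added up to estimate the size of the whole genome. The sketch below simply copies the megabase column of the table into a dictionary; the variable names are hypothetical, and the total is only as accurate as the 1996 figures it is built from.

```python
# Sizes in megabases, copied from Table 1 (Schuler et al. 1996).
CHROMOSOME_MB = {
    "1": 236, "2": 255, "3": 214, "4": 203, "5": 194, "6": 183,
    "7": 171, "8": 155, "9": 145, "10": 144, "11": 144, "12": 143,
    "13": 114, "14": 109, "15": 110, "16": 131, "17": 92, "18": 85,
    "19": 67, "20": 72, "21": 60, "22": 56, "X": 164, "Y": 59,
}

total_mb = sum(CHROMOSOME_MB.values())
largest = max(CHROMOSOME_MB, key=CHROMOSOME_MB.get)
smallest = min(CHROMOSOME_MB, key=CHROMOSOME_MB.get)

print(total_mb)   # 3306 megabases, i.e. about 3.3 billion bases
print(largest)    # 2
print(smallest)   # 22

# Share of the total carried by each chromosome, as a percentage.
for name, size in CHROMOSOME_MB.items():
    print(f"{name:>2}: {100 * size / total_mb:.1f}%")
```

By these figures the two largest chromosomes together hold roughly as much DNA as the ten smallest entries in the table combined.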