Primer on Molecular Genetics
DOE Human Genome Program
Date Published: June 1992
U.S. Department of Energy
Office of Energy Research
Office of Health and Environmental Research
Washington, DC 20585
The "Primer on Molecular Genetics" is taken from the June 1992 DOE Human Genome 1991-92
Program Report. The primer is intended to be an introduction to basic principles of molecular
genetics pertaining to the genome project.
Human Genome Management Information System
Oak Ridge National Laboratory
1060 Commerce Park
Oak Ridge, TN 37830
Voice: 865/576-6669
Fax: 865/574-9888
E-mail: bkq@ornl.gov
Revised and expanded by Denise Casey (HGMIS) from the primer contributed by Charles Cantor and
Sylvia Spengler (Lawrence Berkeley Laboratory) and published in the Human Genome 1989–90
Program Report.
Contents
Introduction
    DNA
    Genes
    Chromosomes
Mapping and Sequencing the Human Genome
    Mapping Strategies
        Genetic Linkage Maps
        Physical Maps
            Low-Resolution Physical Mapping
                Chromosomal map
                cDNA map
            High-Resolution Physical Mapping
                Macrorestriction maps: Top-down mapping
                Contig maps: Bottom-up mapping
    Sequencing Technologies
        Current Sequencing Technologies
        Sequencing Technologies Under Development
        Partial Sequencing to Facilitate Mapping, Gene Identification
    End Games: Completing Maps and Sequences; Finding Specific Genes
Model Organism Research
Informatics: Data Collection and Interpretation
    Collecting and Storing Data
    Interpreting Data
    Mapping Databases
    Sequence Databases
        Nucleic Acids (DNA and RNA)
        Proteins
Impact of the Human Genome Project
Glossary
Introduction
The complete set of instructions for making an organism is called its genome. It
contains the master blueprint for all cellular structures and activities for the lifetime of
the cell or organism. Found in every nucleus of a person’s many trillions of cells, the
human genome consists of tightly coiled threads of deoxyribonucleic acid (DNA) and
associated protein molecules, organized into structures called chromosomes (Fig. 1).
Fig. 1. The Human Genome at Four Levels of Detail.
Apart from reproductive cells (gametes) and
mature red blood cells, every cell in the human body contains 23 pairs of chromosomes, each a
packet of compressed and entwined DNA (1, 2). Each strand of DNA consists of repeating
nucleotide units composed of a phosphate group, a sugar (deoxyribose), and a base (guanine,
cytosine, thymine, or adenine) (3). Ordinarily, DNA takes the form of a highly regular double-
stranded helix, the strands of which are linked by hydrogen bonds between guanine and cytosine
and between thymine and adenine. Each such linkage is a base pair (bp); some 3 billion bp
constitute the human genome. The specificity of these base-pair linkages underlies the mechanism
of DNA replication illustrated here. Each strand of the double helix serves as a template for the
synthesis of a new strand; the nucleotide sequence (i.e., linear order of bases) of each strand is
strictly determined. Each new double helix is a twin, an exact replica, of its parent. (Figure and
caption text provided by the LBL Human Genome Center.)
If unwound and tied together, the strands of DNA would stretch more than 5 feet but
would be only about 2 nanometers (roughly 80 billionths of an inch) wide. For each organism, the components of these
slender threads encode all the information necessary for building and maintaining life,
from simple bacteria to remarkably complex human beings. Understanding how DNA
performs this function requires some knowledge of its structure and organization.
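The "more than 5 feet" figure can be checked with a rough calculation. The sketch below is illustrative only and rests on assumptions that are not part of the primer: a spacing of about 0.34 nanometers between adjacent base pairs (the usual value for the common B form of DNA), and the reading that the 3 billion bp cited in the Fig. 1 caption describes one copy of each chromosome, so that a cell with 23 pairs of chromosomes carries two copies. The names used are hypothetical.

```python
# Rough length estimate for the DNA in one human cell.
# Assumptions (not stated in the primer): 0.34 nm of helix length per base
# pair, and two copies of a 3-billion-bp genome per cell.

BASE_PAIRS_PER_COPY = 3_000_000_000  # "some 3 billion bp" (Fig. 1 caption)
RISE_PER_BP_NM = 0.34                # assumed spacing between base pairs
COPIES_PER_CELL = 2                  # 23 pairs of chromosomes
METERS_PER_FOOT = 0.3048


def dna_length_m(base_pairs, rise_nm=RISE_PER_BP_NM):
    """Length in meters of a DNA molecule with the given number of base pairs."""
    return base_pairs * rise_nm * 1e-9


one_copy_m = dna_length_m(BASE_PAIRS_PER_COPY)
per_cell_m = one_copy_m * COPIES_PER_CELL
per_cell_ft = per_cell_m / METERS_PER_FOOT

print(f"one copy: {one_copy_m:.2f} m")
print(f"per cell: {per_cell_m:.2f} m ({per_cell_ft:.1f} ft)")
```

Under these assumptions the total comes to about 2 meters per cell, which is consistent with the statement that the strands would stretch more than 5 feet.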
DNA
In humans, as in other higher organisms, a DNA molecule consists of two strands that
wrap around each other to resemble a twisted ladder whose sides, made of sugar and
phosphate molecules, are connected by “rungs” of nitrogen-containing chemicals called
bases. Each strand is a linear arrangement of repeating similar units called nucleotides,
which are each composed of one sugar, one phosphate, and a nitrogenous base (Fig. 2). Four
different bases are present in DNA: adenine (A), thymine (T), cytosine (C), and guanine (G).
The particular order of the bases arranged along the sugar-phosphate backbone is called the
DNA sequence; the sequence specifies the exact genetic instructions required to create a
particular organism with its own unique traits.
The two DNA strands are held together by weak bonds between the bases on each strand,
forming base pairs (bp). Genome size is usually stated as the total number of base pairs;
the human genome contains roughly 3 billion bp (Fig. 3).
Each time a cell divides into two daughter cells, its full genome is duplicated; for humans
and other complex organisms, this duplication occurs in the nucleus. During cell division
the DNA molecule unwinds and the weak bonds between the base pairs break, allowing the
strands to separate. Each strand directs the synthesis of a complementary new strand, with
free nucleotides matching up with their complementary bases on each of the separated
strands. Strict base-pairing rules are adhered to: adenine will pair only with thymine (an
A-T pair) and cytosine with guanine (a C-G pair). Each daughter cell receives DNA molecules
made of one old and one new strand (Figs. 1 and 4). The cell’s adherence to these
base-pairing rules ensures that each new double helix is an exact copy of the original.
This minimizes the incidence of errors (mutations) that may greatly affect the resulting
organism or its offspring.
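The base-pairing rules and the copying process described above can be illustrated with a minimal sketch. It is a simplified model, not a description of the cellular machinery: each strand is a plain string of the letters A, T, C, and G, both strands are written position by position in the same left-to-right order, and strand direction is ignored. The function names `complement` and `replicate` are hypothetical.

```python
# Simplified model of base pairing and DNA duplication.
# Assumption: a strand is a string over A, T, C, G; strand direction is ignored.

PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}  # A-T and C-G pairs only


def complement(strand):
    """Build the partner strand by pairing each base with its complement."""
    return "".join(PAIRING[base] for base in strand)


def replicate(old_top, old_bottom):
    """Separate a double helix and synthesize a new partner for each old strand.

    Returns two daughter helices, each made of one old and one new strand.
    """
    if complement(old_top) != old_bottom:
        raise ValueError("strands are not complementary")
    new_bottom = complement(old_top)  # new strand templated by the old top strand
    new_top = complement(old_bottom)  # new strand templated by the old bottom strand
    return (old_top, new_bottom), (new_top, old_bottom)


parent = ("ACTG", complement("ACTG"))
daughter_1, daughter_2 = replicate(*parent)
print(parent, daughter_1, daughter_2)
```

Because each base accepts only one partner, both daughter helices come out identical to the parent, which is the point the paragraph makes about exact copying.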
Fig. 2. DNA Structure. The four nitrogenous bases of DNA are arranged along the
sugar-phosphate backbone in a particular order (the DNA sequence), encoding all genetic
instructions for an organism. Adenine (A) pairs with thymine (T), while cytosine (C) pairs
with guanine (G). The two DNA strands are held together by weak bonds between the bases.
A gene is a segment of a DNA molecule (ranging from fewer than 1 thousand bases to several
million), located in a particular position on a specific chromosome, whose base sequence
contains the information necessary for protein synthesis.
(Fig. 2 labels: phosphate molecule; deoxyribose sugar molecule; nitrogenous bases A, C, T,
and G; weak bonds between bases; sugar-phosphate backbone.)