Adeno-associated virus (AAV): A small, stable virus that is not known to cause disease in humans. The naturally occurring form of the virus has only two genes, which are removed in the construction of AAV vectors for gene delivery. AAV and AAV vectors generally induce only a mild immune response.

Adenovirus: A virus that causes clinical conditions such as the common cold and respiratory infections.

Agarose gel electrophoresis: A technique used to separate DNA fragments (and proteins) by their size. An electric current is used to propel the DNA (or proteins) through a porous gel matrix.

Allele: A particular sequence variation of a gene or a segment of a chromosome.

Alternative splicing: The processing of an RNA transcript into different mRNA molecules by including some exons and excluding others.

Amino acid: The basic building block of a protein. There are 20 different amino acids commonly found in proteins. The genetic code specifies the sequence of amino acids in a protein.

Amplification: The repeated copying of a DNA sequence.

Annealing: The hydrogen bonding between complementary DNA (or RNA) strands to form a double helix.

Annotation: The process of locating genes, their coding regions, and their functions.

Anonymized data: Data that cannot be traced back to their donor.

Anticodon: A 3-base sequence in a tRNA molecule that base-pairs with its complementary codon in an mRNA molecule.
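
As a minimal sketch of this pairing (the function name `anticodon_for` is hypothetical, and wobble pairing at the third codon position is ignored), the anticodon that pairs perfectly with a codon is its RNA reverse complement, because the two strands pair in antiparallel orientation:

```python
# Watson-Crick pairs in RNA; wobble pairing is deliberately ignored here.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_for(codon):
    """Return the perfectly matching anticodon, written 5' to 3'."""
    codon = codon.upper()
    if len(codon) != 3 or any(base not in RNA_COMPLEMENT for base in codon):
        raise ValueError("codon must be three RNA bases (A, C, G, U)")
    # Reverse first (antiparallel strands), then complement each base.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon))

print(anticodon_for("AUG"))  # CAU
```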

Assembly: Putting sequenced fragments of DNA into their correct order along the chromosome.

Autosome: Any of the numbered chromosomes that are not involved with sex determination.

Bacterial Artificial Chromosome (BAC): A chromosome-like structure constructed using recombinant-DNA technology. It is used to clone large DNA inserts (100 to 300 kb) into E. coli cells.

Base pair: Two nitrogenous bases (adenine and thymine/uracil or guanine and cytosine) held together by weak hydrogen bonds.

Base-pair substitution: A type of mutation where one base pair is replaced with a different one; also called a point mutation.

Base: One of five chemicals (also called nitrogenous bases) found in nucleic acids (DNA and RNA): adenine, thymine, guanine, cytosine, and uracil.

Bioethics: The study of ethical issues raised by developments in life science technologies.

Bioinformatics: The study of collecting, sorting, and analyzing DNA and protein sequence information using computers and statistical techniques.

BLAST (Basic Local Alignment Search Tool): A computer program that searches for sequence similarities. It can be used to identify homologous genes in different organisms.

Candidate gene: A gene that is suspected of being associated with a particular disease.

Carrier: A person who is heterozygous for a mutation associated with a genetic disease. Usually, a carrier does not display symptoms of the disease but may pass the mutation on to offspring.

cDNA (complementary DNA): A DNA molecule synthesized from an mRNA molecule. cDNAs can be used experimentally to determine the sequence of an mRNA.

Centromere: The compact, constricted region of a chromosome, often near its center.

Chromosome: A rod-like structure found in the cell nucleus. It consists of one long DNA molecule with its associated proteins.

Clone: 1) a genetically identical copy of an individual cell or organism; 2) an exact copy of a DNA sequence.

Cloning vector: A DNA molecule, such as a modified plasmid or virus, that can be used to clone other DNA molecules in a suitable host cell. Cloning vectors must be able to replicate in the host cell and must possess restriction enzyme cut sites that allow the DNA molecules targeted for cloning to be inserted and retrieved.

Coding DNA or region: A sequence of DNA that is translated into protein; also called exons (in eukaryotes).

Codon: A three-base sequence in a DNA or mRNA molecule that specifies a specific amino acid or termination signal; the basic unit of the genetic code.

Combined DNA Index System (CODIS): A database maintained by the FBI. It includes DNA profiles of convicted offenders in the United States.

Comparative genomics: The process of learning about human genetics by comparing human DNA sequences with those from other organisms.

Consanguineous: Describing marriage or mating among related individuals.

Conserved sequence: A DNA (or amino acid) sequence that has remained relatively unchanged throughout evolution. Such a sequence is under selective pressure and therefore resistant to change.

Contig: A contiguous sequence of DNA created by assembling shorter, overlapping sequenced fragments of a chromosome (whether natural or artificial, as in BACs). The term also refers to a list or diagram showing an ordered arrangement of cloned overlapping fragments that collectively contain the sequence of an originally continuous DNA.

Cosmid: A cloning vector derived from a bacterial virus. It can accommodate about 40 kb of inserted DNA.

Cycle sequencing: A DNA sequencing technique that combines the chain termination method developed by Fred Sanger with aspects of the polymerase chain reaction.

Deletion: A type of mutation caused by the loss of one or more adjacent base pairs from a gene.

Deoxyribose: The five-carbon sugar component of DNA. It has one less hydroxyl group than ribose, the sugar component of RNA.

Dideoxynucleotides (ddNTPs): Synthetic nucleotides lacking both 2' and 3' hydroxyl groups. They act as chain terminators during DNA sequencing reactions.

Directed sequencing: Successively sequencing DNA from adjacent stretches of chromosome.

DNA (deoxyribonucleic acid): The hereditary material that exists as a double-stranded helical molecule made up of a deoxyribose sugar-phosphate backbone and the four nitrogenous bases adenine (A), cytosine (C), guanine (G), and thymine (T).

DNA chip: A microarray of oligonucleotides or cDNA clones fixed on a surface. DNA chips are commonly used to test for sequence variation in a known gene, or to profile gene expression in an mRNA preparation.

DNA ligase: An enzyme able to form a phosphodiester bond between adjacent but unlinked nucleotides in a double helix.

DNA polymerase: An enzyme that adds nucleotides to a replicating DNA strand.

DNA probe: A chemically synthesized, often radioactively labeled, segment of DNA used to visualize a genomic sequence of interest by hydrogen-bonding to its complementary sequence.

DNA replication: The process of replicating a double-stranded DNA molecule.

Dominant: In human genetics, describes any trait that is expressed in the heterozygous condition.

Draft sequence: DNA sequence with lower accuracy than finished sequence. Some segments may be missing or in the wrong orientation.

Duplication: A chromosome rearrangement that duplicates a given region of DNA. Duplications may occur in tandem or the sequences may be inverted.

Electropherogram: Sequence data produced by an automated DNA sequencing machine. The data are a series of colored peaks where each peak represents one of the four different DNA bases.

Escherichia coli: The bacterium commonly used as a host cell for cloning segments of genomic DNA.

Ethical, Legal, and Social Implications (ELSI) Program: A program established in 1990 by the founders of the Human Genome Project to anticipate and address the ethical, legal, and social issues that arise as a result of human genetic research. ELSI programs are funded by both the Department of Energy and the National Institutes of Health.

Euchromatin: The gene-rich areas of a chromosome.

Eukaryote: An organism whose cells contain a nucleus in which the chromosomes are located, in contrast to prokaryotes, whose chromosome is not enclosed in a nucleus.

Exon: A segment of a gene that codes for a portion of a protein. Exons are interspersed with noncoding introns.

Expressed Sequence Tag (EST): A short DNA sequence from a coding region that is used to identify a gene.

Finished sequence: Sequence produced to an accuracy of no more than 1 error in 10,000 bases. Finished sequences are in the proper orientation and have few or no gaps.

Fluorescence in situ hybridization (FISH): A technique that uses fluorescent molecules to locate the position of a DNA sequence along the chromosome.

Founder mutation: A mutation carried by an individual or a small number of people who are among the founders of a present-day population.

Frame shift mutation: A type of mutation in which an insertion or deletion of a number of bases not divisible by three changes the identities of the codons following the mutation. Often this creates stop codons that cause premature termination of the protein.

Functional genomics: The study of genomes to determine the biological function of all the genes and their products.

Gene expression: The process by which a gene is transcribed into RNA and then translated into a protein.

Gene: A region of DNA that can encode one or more polypeptides or RNA products.

Genetic code: The mapping between the set of 64 possible three-base codons and the amino acids or stop codons specified by each of the triplets.
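
The mapping can be written out compactly. The sketch below builds the standard genetic code as a Python dictionary of all 64 RNA codons and uses it to translate a short mRNA. The names `GENETIC_CODE` and `translate` are illustrative, amino acids use one-letter codes, and `*` marks a stop codon. Variant codes, such as the mitochondrial one, are not covered.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids ("*" = stop) in the order UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# product() varies the last base fastest, matching the order of AMINO_ACIDS.
GENETIC_CODE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Translate an mRNA from its first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = GENETIC_CODE[mrna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(len(GENETIC_CODE))       # 64
print(translate("AUGGCCUAA"))  # MA
```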

Genome: The complete DNA content of an organism.

Genomics: The comprehensive study of whole sets of genes and their interactions rather than single genes or proteins.

Genotype: The allelic composition of an individual for one or more genes.

Germ cell: A haploid egg or sperm cell.

Haploid: A single set of chromosomes found in the sperm and eggs of an animal or the pollen and egg of a plant.

Haplotype: A specific combination of alleles or sequence variations that are likely to be inherited together.

Heritability: The proportion of variation in a trait among individuals in a population that can be attributed to genetic effects.

Heterochromatin: The compact, gene-poor regions of a chromosome. They contain many repeated sequences.

Heterozygote: An individual that possesses two different alleles for a given gene.

Homologous genes: Genes having similar structures and functions.

Homozygote: An individual that possesses two identical alleles for a given gene.

Informed consent: The ethical practice of obtaining consent to undergo a medical procedure or participate in a medical study while respecting individual choice and protecting an individual from harm.

Insertion: A type of mutation caused by the addition of one or more adjacent base pairs to a gene.

Intron: A gene region that is not translated into protein. Introns are interspersed with coding regions called exons.

Karyotype: A photomicrograph that arranges a cell's chromosomes to show their number, size, and type.

Kilobase (kb): A unit of DNA length corresponding to one thousand bases.

Library: An unordered collection of clones whose relationship to each other can be shown by physical mapping.

Linkage map: A map of the relative positions of genes and other regions of a chromosome, produced by tracking how often loci are inherited together.

Linkage: The proximity of two or more loci (especially genes) on a chromosome.

Locus: The position on a chromosome where a gene, or some other sequence, is located.

LOD score (log of the odds): A statistical estimate of whether two loci are likely to lie close together on a chromosome and consequently to be inherited together. It is the base-10 logarithm of the odds in favor of linkage, so a LOD score of 3 (odds of 1,000 to 1) or higher is considered evidence that two loci lie close together.
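
For illustration, a LOD score can be computed for the simplest textbook case: phase-known meioses in which each one can be classified as recombinant or non-recombinant. That simplification, along with the function name `lod_score` and its parameters, is an assumption of this sketch and does not describe how linkage software handles real pedigrees. The score compares the likelihood of the data at a proposed recombination fraction `theta` with the likelihood under no linkage (`theta` = 0.5).

```python
import math

def lod_score(recombinants, meioses, theta):
    """log10 of the likelihood at recombination fraction theta versus 0.5."""
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must be between 0 and 0.5")
    non_recombinants = meioses - recombinants
    linked = theta ** recombinants * (1 - theta) ** non_recombinants
    unlinked = 0.5 ** meioses
    if linked == 0.0:
        # The data are impossible at this theta (e.g. theta = 0 with recombinants).
        return float("-inf")
    return math.log10(linked / unlinked)

# Ten meioses with no recombinants, evaluated at theta = 0:
print(round(lod_score(0, 10, 0.0), 2))  # 3.01
```

In this idealized case, ten informative meioses with no recombinants are just enough to reach the conventional threshold of 3.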

Megabase: A unit of DNA length corresponding to 1 million bases.

Messenger RNA (mRNA): An RNA molecule that serves as a template for the synthesis of a protein.

Microsatellite: Repetitive stretches of short DNA sequences that are used as markers to track the inheritance of genes.

Missense mutation: A type of mutation that results in the substitution of one type of amino acid for another in a given location in a polypeptide chain.

Multiple sequence alignment: A bioinformatics tool that compares multiple DNA or amino acid sequences and aligns them to highlight their similarities.

Mutagen: A chemical or physical agent that interacts with DNA to promote the appearance of mutations.

Mutation: A change in a DNA sequence with respect to a reference sequence.

Nonsense mutation: A type of mutation that changes an amino acid codon to one of the three stop codons, resulting in a shorter and usually nonfunctional protein.

Nucleic acid: A large polymer consisting of a linear stretch of nucleotides, as in DNA and RNA.

Nucleotide: The building block of a nucleic acid, consisting of a five-carbon sugar covalently bonded to a nitrogenous base (adenine, thymine, guanine, cytosine, or uracil) and a phosphate group.

Oligonucleotide: A short, synthetically made stretch of single-stranded DNA.

Open reading frame (ORF): A stretch of DNA that, when translated into an amino acid sequence, does not contain an internal stop codon. An ORF can be evidence that a DNA sequence is part of a gene.
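
A simple ORF scan can be sketched as follows. The sketch rests on assumptions that go beyond the definition above: it requires each ORF to begin with ATG and end at a stop codon, examines only the three reading frames of the given strand, and uses the hypothetical names `find_orfs` and `min_codons`. Gene-finding software applies many further criteria.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each ATG...stop ORF on the given strand.

    start and end are 0-based, end is exclusive, and min_codons counts the
    codons before the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next ATG in this frame
    return sorted(orfs)

print(find_orfs("CCATGAAATGAGG"))  # [(2, 11, 'ATGAAATGA')]
```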

Ortholog: A homologous sequence found in different species and derived from a common ancestor.

Paralog: A homologous sequence in the same organism derived from gene duplication.

Pedigree: A family tree describing the occurrence of heritable traits across as many generations as possible.

Penetrance: The proportion of individuals carrying a particular genotype who express the associated phenotype.

Phenotype: The physical traits of an organism that are determined by the genotype.

Phylogenetic tree: A tree-like diagram that depicts the evolutionary relationships between different organisms.

Physical map: A map showing the locations of identifiable markers spaced along the chromosomes. A physical map may be constructed from a set of overlapping clones.

Plasmid: A small circular DNA molecule found in bacteria that replicates independently of the chromosome. Plasmids are used as cloning vectors.

Point mutation: A type of mutation that involves changing a single base in a DNA sequence.

Polygenic inheritance: A type of inheritance in which more than one gene contributes to a phenotype.

Polymerase chain reaction (PCR): An enzyme-mediated technique that allows specific DNA sequences to be amplified.

Polymorphism: A relatively common DNA sequence variation within a population at a given chromosomal location.

Predisposition: The condition of having a genotype that increases the risk of developing a genetic disease if certain environmental conditions are present.

Prokaryote: A cell or organism without a membrane-bound nucleus. Bacteria are prokaryotes.

Protein: A macromolecule consisting of one or more amino acid chains. Proteins carry out most of the cell functions.

Proteome: The full complement of proteins produced by a genome.

Proteomics: The study of the full set of proteins encoded by a genome and their interactions.

Pseudogene: A DNA sequence similar to that of an active gene. Pseudogenes have collected mutations that render them inactive.

Reading frame: The way an mRNA is read as a series of triplet codons during translation. There are three possible reading frames for any mRNA, and the correct reading frame is set by recognition of the AUG initiation codon.

Recessive: A trait is recessive if it is manifest only in the homozygous condition.

Recombinant DNA: A DNA molecule consisting of DNA from different sources; made using restriction enzymes and DNA ligase.

Recombination (also called crossing over): The process by which two homologous chromosomes exchange genetic material during the formation of eggs and sperm.

Repetitive DNA: A DNA sequence that is present in many identical or similar copies in the genome. The copies can be tandemly repeated or dispersed.

Restriction enzyme: An endonuclease isolated from bacteria that recognizes a specific DNA sequence and cuts the DNA at or near it. Restriction enzymes are used in genetic engineering.

Restriction fragment length polymorphism (RFLP): Differences in DNA sequence on homologous chromosomes that result in restriction fragments of varying lengths that can be detected using DNA probes.

Retrovirus: A virus that carries its genetic material as RNA, rather than DNA. Retroviruses use reverse transcriptase to copy their RNA into DNA, which is then inserted into the chromosomes of infected cells.

Reverse transcriptase: An RNA-dependent DNA polymerase isolated from retrovirus-infected cells. It synthesizes a complementary DNA from an RNA template.

Ribonucleic acid (RNA): A type of nucleic acid consisting of nucleotides with a ribose sugar and the nitrogenous bases adenine, cytosine, guanine, and uracil (A, C, G, and U); usually single-stranded; functions in protein synthesis and as the genome of some viruses. Common types include mRNA, tRNA, and rRNA.

RNA splicing: The process by which introns are removed and exons are spliced together from an RNA transcript to produce an mRNA molecule.

Sense strand: The DNA strand of a gene that is complementary in sequence to the template (antisense) strand, and identical to the transcribed mRNA sequence (except that DNA contains T where RNA has U). Gene sequences found in databases are, by convention, usually those of the sense strand, written in the 5' to 3' direction.

Sequence tagged site (STS): A short stretch of DNA whose sequence occurs once in the genome and whose location is known. It serves as a landmark used in the mapping and assembly of a genome.

Short tandem repeat (STR): A short (2 to 5 bases) DNA sequence that repeats itself in tandem. STRs are used in DNA profiling.
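
In DNA profiling, the quantity of interest at an STR locus is the number of back-to-back copies of the repeat unit. A minimal sketch of counting them is given below. The function name `longest_tandem_run` and the example sequence are made up for illustration, and real profiling pipelines work from amplified fragments and not from a plain string search.

```python
import re

def longest_tandem_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    unit = re.escape(motif.upper())
    # The lookahead lets a run start at every position, so no run is missed
    # because an earlier, shorter match consumed part of it.
    pattern = re.compile(f"(?=((?:{unit})+))")
    runs = (len(m.group(1)) // len(motif) for m in pattern.finditer(sequence.upper()))
    return max(runs, default=0)

print(longest_tandem_run("TTGATAGATAGATAGATACC", "GATA"))  # 4
```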

Shotgun sequencing: The process of breaking a long DNA sequence (or an entire genome) into many small pieces, sequencing the pieces, and assembling the fragments.
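
The assembly step can be illustrated with a toy greedy algorithm that repeatedly merges the two reads with the longest suffix-to-prefix overlap. This is a deliberately simplified sketch with hypothetical names (`overlap`, `greedy_assemble`, `min_len`). It assumes error-free reads from a single strand, does not handle reads contained entirely within other reads, and can be misled by repeats, all of which production assemblers must deal with.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b, or 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge reads by best overlap until no overlap of min_len remains."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # remaining reads cannot be joined; they stay separate contigs
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))  # ['ATGGCGTACCTGA']
```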

Silent (synonymous) mutation: A type of mutation that changes a codon but does not alter the amino acid encoded. Such mutations may still have effects on mRNA splicing or stability.

Single Nucleotide Polymorphism (SNP): A common single-base pair variation in a DNA sequence.

Somatic cell: Any cell in a multicellular organism except a sperm or egg cell.

Splice site: The point in the sequence of the RNA transcript at which splicing takes place. Splice sites are found at exon-intron boundaries.

Start (or initiation) codon: The first AUG (methionine) codon to be used by the ribosome at the start of translation.

Stop codon: The codons UAA, UGA, or UAG, which cause the termination of translation.

Telomere: The end of a chromosome. Telomeres contain repeated DNA sequences and are associated with the replication and stability of the chromosome.

Transcription factor: A protein that binds to regulatory regions and controls gene expression.

Transcriptome: The full complement of activated genes as represented by the set of mRNAs and transcripts, in a particular tissue at a particular time.

Transformation: The process of introducing foreign DNA into a cell, or of a cell becoming cancerous.

Translocation: A type of chromosome aberration in which a sequence of DNA from one chromosome is moved to another chromosome.

Transposon: A short DNA sequence that has the ability to move from one chromosomal position to another.

Vector: A DNA molecule that replicates independently in a host cell. It is used to ferry a foreign DNA sequence into a cell to be cloned.

Yeast Artificial Chromosome (YAC): A yeast-derived DNA sequence that can be spliced with a large fragment of foreign DNA and inserted into yeast cells to be amplified and sequenced.

