Molecular Evolution Glossary
A ~ B ~ C ~ D ~ E ~ F ~ G ~ H ~ I ~ L ~ M ~ N ~ O ~ P ~ R ~ S ~ T ~ U ~ W
A
Adaptation. Evolutionary changes driven by
positive selection that increase
fitness.
Allele. A particular genetic variant at a given
locus.
Analogy. In evolutionary terms,
analogy indicates
similarity through convergence rather than
homology.
B
Bootstrapping. A method for attaching confidence values to the branches of a
phylogenetic tree.
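As a rough illustration of the resampling step behind bootstrapping, the sketch below builds one pseudo-replicate alignment by sampling columns with replacement. The function name, the dictionary layout and the toy sequences are assumptions made for this example only. In the usual procedure, a tree is built from each of many replicates and the confidence value for a branch is the proportion of replicate trees that contain it.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one pseudo-replicate: alignment columns sampled with replacement."""
    length = len(next(iter(alignment.values())))
    # Draw as many column indices as there are columns, allowing repeats.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}

# Hypothetical toy alignment (all sequences must have the same length).
alignment = {"seqA": "ACGTACGT", "seqB": "ACGTACGA", "seqC": "ACCTACGA"}
replicate = bootstrap_alignment(alignment, random.Random(1))
for name, seq in replicate.items():
    print(name, seq)
```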
Branch. Nodes on a
phylogenetic tree are joined by branches, which represent a particular period of evolutionary time. Often (but not always), the length of the
branch will indicate the amount of evolutionary change that has taken place.
C
Clade. A group of terminal
nodes (e.g. species,
genes or proteins) that share a common ancestral
node.
Coding Sequence. The part of an mRNA sequence (a transcribed
protein-coding gene with any introns spliced out) that encodes the protein sequence (the "cistron") - translated using the
Genetic Code.
Coding Substitution. See
non-synonymous substitution.
Codon. A three-letter section of a
coding sequence that encodes a single amino acid. The degenerate nature of the
genetic code (64
codons coding for only 20 amino acids plus a STOP signal) means that changes at the third position of a
codon often constitute a
synonymous substitution.
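To make the link between codons and substitution types concrete, here is a minimal sketch that builds the standard genetic code and classifies a single codon change as synonymous, missense or nonsense. The names `CODON_TABLE` and `classify_substitution` are invented for this example, and the sketch assumes the original codon encodes an amino acid (it rejects stop codons as input).

```python
BASES = "TCAG"
# Standard genetic code, ordered by first, second, then third base (T, C, A, G).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def classify_substitution(codon_before, codon_after):
    """Classify a codon change using the glossary's terms."""
    before = CODON_TABLE[codon_before]
    after = CODON_TABLE[codon_after]
    if before == "*":
        raise ValueError("this sketch assumes the original codon is not a stop")
    if before == after:
        return "synonymous"
    if after == "*":
        return "nonsense"
    return "missense"

print(classify_substitution("GAA", "GAG"))  # third-position change, Glu -> Glu
print(classify_substitution("GAA", "GTA"))  # second-position change, Glu -> Val
print(classify_substitution("TAC", "TAA"))  # Tyr -> stop
```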
Coevolution. The
evolution of two interacting entities, where changes in one drive changes in the other.
Coevolution can happen at many levels: between species, between proteins and even between residues within a protein.
Conservation. Evolutionary similarity, as opposed to
divergent evolution.
Conservative Substitution. A substitution that replaces one amino acid with another of similar physicochemical properties.
Convergent Evolution. The independent acquisition of the same trait in different evolutionary events.
D
Deleterious. A mutation/allele that reduces the
fitness of the carrier.
Distance Matrix. An all-by-all matrix of distances derived from pairwise comparisons of
OTUs used for
phylogenetic tree construction.
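One simple way to fill a distance matrix is the p-distance: the proportion of aligned positions at which two sequences differ. The sketch below is only an illustration under that assumption; the function names and toy OTUs are hypothetical, and real analyses usually apply a substitution model to correct for multiple changes at the same site.

```python
def p_distance(seq1, seq2):
    """Proportion of aligned, ungapped positions that differ."""
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return sum(a != b for a, b in pairs) / len(pairs)

def distance_matrix(alignment):
    """All-by-all matrix as a dict of dicts: matrix[otu1][otu2] -> distance."""
    names = list(alignment)
    return {a: {b: p_distance(alignment[a], alignment[b]) for b in names}
            for a in names}

# Hypothetical aligned sequences, one per OTU.
otus = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACCTACGA"}
matrix = distance_matrix(otus)
print(matrix["A"]["B"], matrix["A"]["C"], matrix["B"]["C"])
```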
Divergent Evolution. The divergence of homologues through time following a
speciation or
duplication event.
Duplication. An evolutionary event where genetic material is duplicated and subsequently two copies are inherited.
Duplications can occur at many levels, including parts of genes/proteins (exons/domains), whole genes/operons, whole chromosomes or even whole genomes (WGD).
E
Evolution. In the context of molecular evolution,
evolution is the change in
allele frequencies within a population, ultimately leading to
fixation.
F
Fitness. The relative success of an organism or mutation under
selection compared to wildtype.
Fitness (and
selection) is highly context-dependent and the same
phenotype may have a very different
fitness in different environments.
Fixation. When an
allele reaches a frequency of (effectively) 100%.
G
Gene. "Gene" can refer to a physical functional genetic locus, or the fundamental information unit of heredity and
evolution (as in "gene pool"). Although often used synonymously with "protein-coding gene", it should be remembered that it does not always mean this.
Gene Family. A family of
homologous genes that are related through
gene duplication events. For multi-domain proteins, different domains may be members of different families and have distinct evolutionary histories.
Genetic Code. The three-letter code that is used for translating the
coding sequence of a
protein-coding gene into amino acids.
Genotype. The genetic makeup of an individual.
H
HGT. See
Horizontal Gene Transfer.
HTU. See
Hypothetical Taxonomic Unit.
Homologous. Two sequences that show
homology (i.e. shared evolutionary ancestry).
Homology. Relationship through shared evolutionary ancestry.
Homology Search. Searching a sequence database for protein or nucleotide sequences with sequence
similarity to a given query.
Homoplasy. Independent
evolution of the same trait in different taxa. When mapped onto a correct phylogeny, such a trait will appear
polyphyletic.
Homoplasy can confuse attempts to construct a
phylogeny by making the affected taxa look more similar to each other than they should.
Horizontal Gene Transfer. Inheritance/incorporation of genetic material from a source other than parents, e.g. virus, plasmid etc.
Hypothetical Taxonomic Unit. An internal
node of a
phylogenetic tree.
I
Indel. Genetic insertion/deletion.
Informative Site. A nucleotide or amino acid position that is able to group two or more sequences together to the exclusion of the rest. Used in
maximum parsimony.
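Under the usual working definition, an alignment column is parsimony-informative when at least two different character states each occur in at least two sequences. The sketch below applies that rule; the function names and toy sequences are assumptions for illustration, and gap characters are simply ignored.

```python
from collections import Counter

def is_informative(column):
    """True if at least two states each occur in at least two sequences."""
    counts = Counter(c for c in column if c != "-")
    return sum(1 for n in counts.values() if n >= 2) >= 2

def informative_sites(sequences):
    """Return the 0-based indices of informative columns in an alignment."""
    return [i for i, column in enumerate(zip(*sequences))
            if is_informative(column)]

# Hypothetical aligned sequences.
sequences = ["ACGTA", "ACGCA", "ATACA", "ATATG"]
print(informative_sites(sequences))  # [1, 2, 3]
```

Column 0 is invariant and column 4 differs in only one sequence, so neither can group sequences together to the exclusion of the rest.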
Informative Trait. A character that is able to group two or more species together to the exclusion of the rest. Used in
maximum parsimony.
L
Locus. A physical genetic location in a genome.
Long Branch Attraction. An artefact of
maximum parsimony where
homoplasy tends to attract sequences that are very divergent from the rest of the tree.
Long Branch Migration. An artefact of
distance matrix phylogenetic tree methods where rapidly evolving lineages that are very divergent from the rest of the tree tend to migrate to the
root of the tree (and each other).
M
MRCA. Most Recent Common Ancestor. The most recent shared evolutionary ancestor of a group of species or proteins. This is not (usually) a literal individual but will instead refer to a population.
MSA. See
Multiple Sequence Alignment.
Maximum Likelihood. A
phylogenetic tree construction method that selects the best tree by maximising the likelihood, i.e. the probability of the observed sequence data given the
phylogeny and an evolutionary model.
Maximum Parsimony. A
phylogenetic tree construction method that selects the best tree by maximising
parsimony (i.e. minimising evolutionary changes).
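The "minimising evolutionary changes" step can be sketched with Fitch's algorithm, which counts the minimum number of state changes needed for one site on a fixed rooted binary tree. This is only the scoring half of the method: a full maximum parsimony analysis would also search across tree topologies. The nested-tuple tree encoding and the toy data are assumptions for this example.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for one site.

    tree is either a leaf name (str) or a (left, right) tuple of subtrees;
    states maps each leaf name to its observed character state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # The children can share a state, so no extra change is needed here.
        return common, left_cost + right_cost
    # No shared state: one change must have occurred on a branch below.
    return left_set | right_set, left_cost + right_cost + 1

# Hypothetical site pattern for four OTUs.
states = {"A": "T", "B": "T", "C": "G", "D": "G"}
tree1 = (("A", "B"), ("C", "D"))
tree2 = (("A", "C"), ("B", "D"))
print(fitch(tree1, states)[1])  # 1 change
print(fitch(tree2, states)[1])  # 2 changes
```

For this site, the first topology needs fewer changes and would therefore be preferred under parsimony.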
Midpoint Rooting. Rooting a
phylogeny on the
branch that is equidistant from the two most distant OTUs.
Missense Mutation. A
non-synonymous substitution that replaces the encoded amino acid with a different amino acid (i.e. not a stop codon).
Molecular Clock. The prediction of the Neutral Theory that, if most fixed changes are the result of
neutral mutations,
molecular evolution will occur at a reasonably regular clock-like rate, determined primarily by the
neutral mutation rate.
Molecular Evolution. The study of the
evolution of DNA and protein sequences.
Monophyletic. A trait (physical or genetic) that occurs within a single
clade on a
phylogenetic tree and thus can be explained by a single evolutionary event.
Multiple Sequence Alignment. The alignment of
homologous DNA or protein sequences. An alignment of two sequences is referred to as a pairwise sequence alignment.
N
Negative Selection. See
purifying selection.
Neighbour-Joining. A distance-matrix based
phylogenetic tree construction method that does not assume a
molecular clock.
Neofunctionalisation. The process by which, following
gene duplication, a new protein function evolves in one of the duplicates.
Neutral Evolution. The accumulation (
fixation) of
neutral mutations over time by
random genetic drift.
Neutral Mutation. A mutation that does not affect
fitness.
Node. A point on a
phylogenetic tree representing either an extant species/protein/gene (a terminal node) or a speciation/duplication event (internal node) ancestral to all species/proteins/genes in that
clade.
Non-Synonymous Substitution. A substitution in a
coding sequence of DNA that affects the protein sequence encoded.
Nonsense Mediated Decay. Process by which an mRNA encoding a premature stop
codon may result in RNA degradation and no expression, rather than expression of a truncated protein.
Nonsense Mutation. A
non-synonymous substitution that replaces an amino acid with a stop codon, thereby prematurely ending translation and truncating the protein. Transcripts carrying the premature stop codon may be subject to
nonsense mediated decay.
O
OTU. See
operational taxonomic unit.
Operational Taxonomic Unit. A sequence or organism used as a terminal
node of a
phylogenetic tree.
Orthology. Proteins/genes that are related by
speciation events. Typically the same protein in different species, although subsequent
gene duplications can result in complex one-to-many or many-to-many
orthology relationships. A type of
homology.
Outgroup. An
operational taxonomic unit (
OTU) known (or presumed) to
branch off ancestrally to all other
OTUs in the
phylogeny. Often used for
rooting or
parsimony analysis.
Outgroup Rooting. Rooting a
phylogeny on the
branch leading to the
outgroup.
P
Pairwise Sequence Alignment. See
MSA.
Parallel Evolution. Independent
evolution in closely related species that follows the same trajectory.
Paralogy. Different members of a
gene family, related by
duplication events. Easiest remembered as a different protein in the same species (in contrast to
orthology) but it should be remembered that paralogues will also be present in different species if there have been subsequent
speciation events. A type of
homology.
Paraphyletic. A distinct genetic or physical trait that is shared by all the individuals in a
clade barring those belonging to one or more
monophyletic groups.
Parsimony. The simplest explanation for an observation. The smallest number of changes needed to explain the data.
Phenotype. The expressed product of the genotype, upon which
selection acts.
Phylogenetic Tree. Graphic representation of a
phylogeny. Extant species/proteins/genes and historical events (
speciation and
duplication) form nodes on the tree, which are joined by branches.
Phylogeny. The evolutionary relationship of species/genes/proteins.
Pleiotropy. A single
gene or mutation can affect several different traits. This is known as
pleiotropy. This is particularly important when considering the role of
selection in
evolution: a mutation may be beneficial for one trait but have neutral or even
deleterious effects on other traits.
Point Mutation. A single nucleotide substitution.
Polymorphism. The presence of at least two
alleles of a particular genetic
locus in the population.
Polyphyletic. A distinct genetic or physical trait that is shared by individuals in different
clades and needs multiple evolutionary events (gains and/or losses of the trait) to explain the observed pattern.
Polyploidy. The presence of multiple genome copies, typically as the result of ancestral
whole genome duplications (WGD). (Many domestic crops are
polyploid as a result of hybridisation and artificial selection.)
Population Genetics. The study of genetic variation,
selection and
evolution in populations.
Positive Selection. The evolutionary force that increases the frequency of (and ultimately fixes) an
allele that gives a selective advantage by increasing the
fitness of the individual possessing the
allele.
Protein-coding Gene. A region of DNA (genetic
locus) that encodes one or more protein sequences.
Purifying Selection. The evolutionary force that removes
deleterious (harmful) alleles/mutations that lower the
fitness of the individual possessing the
allele. To be selected against, the decrease in
fitness must be strong enough to overcome
random genetic drift.
R
RGD. See
Random Genetic Drift.
Radical Substitution. A
non-synonymous substitution that replaces an amino acid with another that has very different physicochemical properties.
Random Genetic Drift. Random changes in
allele frequencies over time due to chance differences in inheritance of different alleles.
RGD is stronger in smaller populations.
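The effect of population size on drift can be explored with a small simulation. The sketch below assumes a simple Wright-Fisher-style model (a haploid population of constant size, non-overlapping generations, one neutral allele); the function name and parameters are made up for this example. It follows the allele until it is either lost or reaches fixation.

```python
import random

def drift_until_absorbed(pop_size, start_freq, rng, max_generations=100_000):
    """Return (generations elapsed, final allele frequency) under pure drift."""
    count = round(pop_size * start_freq)
    for generation in range(max_generations):
        if count == 0 or count == pop_size:
            return generation, count / pop_size
        freq = count / pop_size
        # Each individual in the next generation inherits the allele by chance.
        count = sum(rng.random() < freq for _ in range(pop_size))
    return max_generations, count / pop_size

rng = random.Random(42)
for pop_size in (10, 100):
    times = [drift_until_absorbed(pop_size, 0.5, rng)[0] for _ in range(50)]
    print(pop_size, sum(times) / len(times))
```

Comparing the average time to loss or fixation for the two population sizes gives a feel for how much faster allele frequencies wander in the smaller population.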
Root. The hypothetical ancestral (
MRCA)
node (and
HTU) of a
phylogenetic tree.
Rooting. Defining the ancestral point (and
HTU) of a
phylogenetic tree.
S
SNP. Single nucleotide
polymorphism.
Selection. The evolutionary force that alters
allele frequencies (genotypes) based on the changes in
fitness (phenotypes) they confer.
Selection may be natural or artificial.
Silent Substitution. See
synonymous substitution.
Similarity. The observation that two things resemble each other. Without further definition regarding the nature of the similarity, this is a fairly meaningless term.
Speciation. The evolutionary process by which one ancestral species diverges into two distinct descendant species.
Subfunctionalisation. The partitioning of existing protein functions following
gene duplication.
Synonymous Substitution. A
point mutation in coding DNA that does not result in a change of the encoded amino acid.
Synteny. The physical co-localisation of
genes on the same chromosome.
T
Topology. The branching order of a
phylogenetic tree.
U
UPGMA. A simple distance-matrix based
phylogenetic tree construction method that assumes a
molecular clock.
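A bare-bones sketch of UPGMA clustering is given below. It repeatedly joins the two closest clusters and sets the distance from the new cluster to every other cluster to the size-weighted average of the two joined clusters. Only the topology is returned, as nested tuples; branch lengths are omitted to keep the example short. The input layout (a complete, symmetric dict of dicts) and the toy distances are assumptions for illustration.

```python
def upgma(matrix):
    """Cluster OTUs by UPGMA and return the topology as nested tuples.

    matrix[a][b] is the distance between OTUs a and b (complete, symmetric).
    """
    sizes = {name: 1 for name in matrix}          # OTUs per cluster
    d = {a: dict(row) for a, row in matrix.items()}  # working copy
    while len(d) > 1:
        # Find the closest pair of current clusters.
        a, b = min(
            ((x, y) for x in d for y in d if x != y),
            key=lambda pair: d[pair[0]][pair[1]],
        )
        merged = (a, b)
        # Size-weighted average equals the mean over all OTU pairs.
        new_row = {
            c: (sizes[a] * d[a][c] + sizes[b] * d[b][c]) / (sizes[a] + sizes[b])
            for c in d if c not in (a, b)
        }
        sizes[merged] = sizes[a] + sizes[b]
        for c in (a, b):
            del d[c]
        for c, value in new_row.items():
            d[c].pop(a)
            d[c].pop(b)
            d[c][merged] = value
        d[merged] = new_row
    return next(iter(d))

# Hypothetical clock-like (ultrametric) distances between four OTUs.
pairs = {("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
         ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10}
matrix = {name: {} for name in "ABCD"}
for (x, y), dist in pairs.items():
    matrix[x][y] = matrix[y][x] = dist

print(upgma(matrix))  # ('D', ('C', ('A', 'B')))
```

Because the averaging treats all lineages as having evolved at the same rate, distances that are far from clock-like can lead this procedure to the wrong topology.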
W
WGD. See
Whole Genome Duplication.
Whole Genome Duplication. A rare evolutionary event that doubles the genetic content of an organism and gives rise to
polyploidy.
© RJ Edwards 2012. Last modified 4 Jun 2012.