Tatsuya Ota
(ota (at) soken.ac.jp) of the
Graduate University of Advanced Studies, Hayama, Japan
has written a package,
DISPAN, (Genetic Distance and Phylogenetic Analysis),
which computes for gene frequency data the heterozygosity, gene diversity, Nei's standard genetic
distance or the DA distance, and their standard error.
It also constructs phylogenies using the
neighbor-joining (NJ) method or the UPGMA method. These trees can also be
bootstrapped. A tree editor allows the user to rearrange the tree and print it out.
The package consists of two programs, GNKDST and TREEVIEW. The first is a
rewrite of a program by A. K. Roychoudhury, Y. Tateno, D. Graur, N. Saitou,
and R. Schwartz; the second was written by Koichiro Tamura. DISPAN
is distributed as DOS executables (which can run under Windows in a Command
tool window). It is available at the Molecular Evolution and Phylogenetics
software
web page at Pennsylvania
State University at http://mep.bio.psu.edu/readme.html,
and is also available by ftp
from ftp.bio.indiana.edu
in directory molbio/ibmpc as files dispan.*.
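As a rough illustration of the kind of quantity DISPAN reports, the following Python sketch computes Nei's standard genetic distance from allele frequencies. It is not DISPAN's code: the function name and the input layout (one list of allele frequencies per locus) are assumptions made for this example, and it omits the sample-size bias correction and the standard errors that the package provides.

```python
import math

def nei_standard_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance D = -ln(Jxy / sqrt(Jx * Jy)).

    freqs_x, freqs_y: one entry per locus, each a list of allele
    frequencies with alleles in the same order in both populations.
    (Hypothetical input layout, chosen for this sketch.)
    """
    jx = jy = jxy = 0.0
    for px, py in zip(freqs_x, freqs_y):
        jx += sum(p * p for p in px)               # homozygosity in X
        jy += sum(q * q for q in py)               # homozygosity in Y
        jxy += sum(p * q for p, q in zip(px, py))  # cross-population identity
    # The number of loci cancels in the ratio, so sums serve as well as means.
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)
```

Two populations with identical frequencies give a distance of zero; the distance is undefined when the populations share no alleles at any locus.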
Sudhir Kumar,
(S.Kumar (at) asu.edu),
of the Center for Evolutionary Functional Genomics at Arizona State
University, Tempe, Arizona
has written PHYLTEST, version 2.0. It is a DOS
executable program for testing phylogenetic hypotheses about four
clusters of DNA sequences. It implements comparison of three alternative
phylogenetic trees for four monophyletic clusters of sequences, the
four-cluster analysis: Rzhetsky, A, S. Kumar, and M. Nei. 1995.
Four-cluster analysis: a simple method to test phylogenetic hypotheses.
Molecular Biology and Evolution 12: 163-167.
It can also carry out the interior branch test, which tests the null hypothesis that an interior
branch length is zero (Rzhetsky, A. and M. Nei. 1992.
A simple method for estimating and testing minimum-evolution trees.
Molecular Biology and Evolution 9: 945-967), as
well as the estimation of average pairwise distances (and standard errors)
within and between clusters of sequences and
relative rate tests and the computation of the time of divergence.
PHYLTEST is distributed from two places:
http://mep.bio.psu.edu/readme.html
ftp.bio.indiana.edu in directory
molbio/ibmpc
TREECON (contact: yvdp (at) uia.ua.ac.be) is a program for the
construction and drawing of phylogenetic trees based on distance data.
Several equations are included to convert dissimilarity into evolutionary
distance and several methods (such as neighbor-joining) are included for
inferring the tree topology. It also includes bootstrap analysis and
has good facilities for drawing trees. The
program is available for free and runs on PCs under Windows.
It is described in several papers. Its web page is at
http://bioinformatics.psb.ugent.be/software_details.php?id=3, and it can be
downloaded from there. In spite of some statements in documentation files
that a fee is asked for it, I believe that in recent years it has been
available for free.
Andrey Rzhetsky
(ar345 (at) columbia.edu) of the Department
of Medical Informatics at Columbia University, New York
and Masatoshi Nei of the Institute of Molecular and Evolutionary Genetics at Pennsylvania State
University have produced
METREE version 1.2, a program for carrying out the minimum-evolution
distance matrix method. METREE runs on
DOS systems and on Windows (under a Command Tool window). It computes
minimum evolution distance matrix trees from DNA and amino acid sequence data
and tests the statistical significance of
topological differences and of the branch lengths. Different distance
matrix measures may be used. The package is menu driven and the TREEVIEW
program written by Koichiro Tamura for
visualizing and printing out the final tree is also included. The method is
described in the paper by A. Rzhetsky and M. Nei. 1992. A simple method for
estimating and testing minimum-evolution trees. Molecular Biology and
Evolution 9: 945-967, and the program is described in
a paper by A. Rzhetsky and M. Nei. 1994. METREE: a program package for
inferring and testing minimum-evolution trees. Computer Applications in
the Biological Sciences (CABIOS) 10: 409-12.
METREE is distributed from these places:
http://mep.bio.psu.edu/readme.html (see the
Phylogenetic Analysis section of the page)
ftp.bio.indiana.edu in directory
molbio/ibmpc
Richard Desper, of the National Center for Biotechnology Information in Bethesda, Maryland (desper (at) ncbi.nlm.nih.gov) and Olivier Gascuel of the LIRMM (Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier), Montpellier, France (gascuel (at) lirmm.fr) have written FastME, a fast program for the minimum evolution distance matrix method. It is described as faster than neighbor-joining methods, more accurate than them, and as accurate as least squares methods. It can analyze multiple data sets as part of bootstrapping analyses. Its methods are described in two papers. Its web page is at
http://atgc.lirmm.fr/fastme/.
Naruya Saitou
of the Laboratory for Evolutionary Genetics, National Institute of Genetics, Japan (nsaitou (at) genes.njg.ac.jp) has produced TreeTree,
a set of programs for neighbor-joining distance matrix analysis with
bootstrapping. Macintosh executables are provided, and
documentation and Pascal or C source code are also included in the package.
The package consists of three main programs: NJ, a standard neighbor-joining
program, NJorg, which makes an unrooted neighbor-joining tree, and
bootNJ, which bootstraps the analysis, given a data file with multiple
distance matrices, one for each bootstrap replicate.
It can be downloaded by
anonymous ftp from ftp.nig.ac.jp
in directory pub/mac/TreeTree.
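For readers unfamiliar with the neighbor-joining method that these programs implement, the pair-selection step can be sketched in Python as follows. This is an illustration only and is not taken from TreeTree; the function name is hypothetical, and the sketch covers just the choice of which two taxa to join, not the branch-length calculation or the reduction of the matrix that follow.

```python
def nj_pick_pair(dist):
    """Pick the pair (i, j) that minimises the neighbor-joining criterion
    Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j),
    where r(i) is the sum of row i of the distance matrix.
    Returns ((i, j), Q). Ties go to the first pair encountered.
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best_q is None or q < best_q:
                best_pair, best_q = (i, j), q
    return best_pair, best_q
```

Note that the pair chosen is not necessarily the pair with the smallest distance: the row sums correct for taxa that are far from everything else.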
Olivier Gascuel
(gascuel (at) lirmm.fr) of
the Laboratoire d'Informatique, de Robotique et de Micro-Electronique de
Montpellier (LIRMM) of the Universite de Montpellier II, France has written
BIONJ, an improved version of Neighbor-Joining
based on a simple model of sequence data. It follows the same
agglomerative scheme as NJ but uses a simple, first-order
model of the variances and covariances of evolutionary distance
estimates. This model is appropriate when these estimates are
obtained from aligned sequences. It retains the speed advantages of
Neighbor-Joining while using a slightly different criterion to select
pairs of taxa to join, one which will perform better when distances
between taxa are large. It is described in the paper: Gascuel, O. 1997.
BIONJ: An Improved Version of the NJ Algorithm Based on a Simple Model
of Sequence Data. Molecular Biology and Evolution 14: 685-695.
C source code and Sun, DOS, and Macintosh executables of BIONJ are available at
its web page at
http://atgc.lirmm.fr/bionj/.
William J. Bruno of the Los Alamos National
Laboratory (billb (at) lanl.gov) has released
nneighbor, a modification of the
PHYLIP Neighbor-Joining distance
matrix program that avoids negative branch lengths (its name means
Non-Negative Neighbor). The program is available as generic C code.
It is available at
one of Bruno's
web pages at
http://www.t10.lanl.gov/billb/related_links.html.
William J. Bruno, Nicholas D. Socci, and Aaron L. Halpern
of the Los Alamos National Laboratory (billb (at) lanl.gov) have
produced weighbor (WEIGHted neighBOR joining or perhaps
Weighted nEIGHBOR joining), version 1.0.1,
a distance matrix program for
performing a weighted version of the Neighbor-Joining method. The
weighting used is for nucleotide sequences and more correctly reflects the
uncertainty of the longer distances in the tree than does ordinary
Neighbor-Joining. It is thus closer to approximating maximum likelihood
and will be more accurate than Neighbor-Joining on large trees.
It is described in a paper:
Bruno, W. J., N. D. Socci, and A. L. Halpern 2000. Weighted
neighbor joining: a likelihood-based approach to distance-based
phylogeny reconstruction. Molecular Biology and Evolution 17:
189-197. Weighbor
is available as C source code and as PowerMac and DOS executables from
its web site
at http://www.t10.lanl.gov/billb/weighbor/index.html.
It is also available as a web server at the Institut Pasteur
in Paris.
http://www.sanger.ac.uk/Software/analysis/quicktree/
Paul Lewis, (plewis (at) uconnvm.uconn.edu),
of the Department of Ecology and Evolutionary
Biology, University of Connecticut, and Dmitri Zaykin, then of North Carolina
State University,
have written GDA version 1(d13),
a set of programs to carry out many of the statistical methods for
analyzing gene frequencies and sequence data that are described in
Bruce Weir's book Genetic Data Analysis II
(Sinauer Associates, Sunderland, Massachusetts, 1996). The programs run under
Windows and include the calculation of UPGMA and Neighbor-Joining phylogenies.
The program is described in a
Web site
maintained by Paul Lewis at
http://hydrodictyon.eeb.uconn.edu/people/plewis/software.php
There is also a link there to a command-line-only version of GDA by
Chris Basten that runs under Mac OS X.
The relevant feature for the purposes of this listing is the ability of
the programs to compute a number of distances.
Mark Miller
(Mark.Miller (at) cnr.usu.edu) of the
Department of Forest, Range, and Wildlife, Utah State University, Logan, Utah
has written TFPGA (Tools For Population Genetics
Analysis), a Windows program for the analysis of allozyme and
molecular population genetic data. It can calculate genetic distances.
In addition, this program calculates descriptive statistics,
and F-statistics, and performs tests for Hardy-Weinberg equilibrium, exact tests for genetic
differentiation, Mantel tests, and UPGMA cluster analyses. Additional features include the ability
to analyze hierarchical data sets as well as data from either codominant markers such as allozymes
or dominant markers such as AFLPs or RAPDs. It is available from
his web page at
http://bioweb.usu.edu/mpmbio/index.htm as a Windows executable.
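Several programs in this part of the list offer UPGMA clustering. As a minimal sketch of that method (not the code of TFPGA or of any other program listed here), the following Python function repeatedly joins the two closest clusters and averages their distances to the remaining clusters, weighted by cluster size. The function name and the nested-tuple output format are assumptions made for this example.

```python
def upgma(dist, labels):
    """UPGMA clustering of a symmetric distance matrix.

    Returns a nested tuple (left, right, height), where height is half the
    distance between the two clusters joined; leaves are the labels.
    """
    n = len(labels)
    clusters = {i: (labels[i], 1) for i in range(n)}   # id -> (tree, size)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])  # closest pair
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        new_d = {}
        for c in clusters:
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            # size-weighted average of the distances to the merged cluster
            new_d[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = ((tree_a, tree_b, dab / 2), size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0]
```

UPGMA assumes a molecular clock, which is why each join is assigned a single height rather than two separate branch lengths.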
François Bonhomme
(genetix (at) univ-montp2.fr)
has released Genetix version 4.05. This is a Windows
executable program that does a wide variety of population genetic
procedures. The part relevant to the present list is that it
computes the Nei and the Cavalli-Sforza genetic distances, both with and
without bias correction. It also calculates F statistics and linkage
disequilibrium, and performs permutation tests on the results.
One advantage (or limitation, depending on your perspective) is that
the interface is in French.
Genetix is available from
its web site
(in French) at http://www.univ-montp2.fr/~genetix/genetix/genetix.htm.
Steven Kalinowski
http://www.montana.edu/kalinowski/Software/TreeFit.htm
Immanuel Yap and Rebecca Nelson
http://www.irri.org/science/software/winboot.asp.
Jaap Buntjer
http://www.dpw.wau.nl/pv/PUB/pt/
María Jesús Martín and Joaquín Dopazo
at the R&D Department of TDI (TDI-EMBNet), Spain, (martin (at) tdi.es or dopazo (at) tdi.es) have developed OSA (Optimal Sequence
Analysis), version 2.0. It finds, within large sequences, those regions with an information
content similar to that of the whole sequence and it selects, among
them, the shortest ones. This program was formerly called ORF.
The algorithm used is based on comparing pairwise genetic distances, calculated
for windows of variable size and position, to the
distance matrix obtained for the whole sequence. Either uncorrected
genetic distances or Jukes-Cantor distances can be used.
Two methods are used to set cutoff levels: simulation-based significance
values or bootstrapping. A variety of options for search among possible
windows are available. The method has been described in a paper:
M. J. Martín, F. Gonzalez-Candelas, F. Sobrino and J. Dopazo. 1995. A method for
determining the position and size of optimal sequence regions for
phylogenetic analysis. Journal of Molecular Evolution 41: 1128-1138.
OSA uses aligned sequences in a number of common formats as input.
It runs on UNIX based machines. Currently Gnu Pascal source code and also executable versions
for Solaris and IRIX operating systems are available.
The program can analyze up to 50 sequences of a maximum length of 10,000 bp.
It can be obtained at
its website
http://www.tdi.es/programas/osa-i.htm.
It can also be obtained by ftp
from ftp.ebi.ac.uk in directory pub/software/unix/osa,
and from ftp.no.embnet.org in directory pub/programs/dist/OSA. In all of these places the file names are
osa-solaris.2.4.tar.Z for the Solaris version, and
osa-irix.5.3.tar.Z for the Irix version.
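The two distance measures that OSA offers, the uncorrected distance and the Jukes-Cantor distance, can be sketched in Python as below. This is an illustration rather than OSA's own code (which is in Pascal); the function names are hypothetical, and sites where either sequence has a gap or an ambiguity code are simply skipped, which is an assumption of this sketch.

```python
import math

def p_distance(seq_a, seq_b):
    """Uncorrected distance: proportion of compared sites that differ."""
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]   # skip gaps and ambiguity codes
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance d = -(3/4) ln(1 - 4p/3)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The correction matters most for divergent sequences: at p = 0.1 it raises the distance only slightly, while it grows without bound as p approaches 0.75.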
Mikael Thollesson
(lddist (at) artedi.ebc.uu.se), of the Department of Molecular Evolution, Evolutionary Biology Centre, Uppsala University, Sweden has written LDDist version 1.3, which calculates LogDet distances from DNA and protein sequences. It accommodates rate variation from site to site as well, by excluding invariant sites or by allowing different rates for different sites to be preassigned. LDDist is described in a paper: Thollesson, M. 2004. LDDist: a Perl module for calculating LogDet pair-wise distances for protein and nucleotide sequences. Bioinformatics 20: 416-418. LDDist is, as this says, written in Perl and C++. It is distributed in source code from its web site at http://artedi.ebc.uu.se/molev/software/LDDist.html
Joyce Miller Hersh
(msmead (at) doctorbeer.com), formerly of the Whitehead Institute at MIT (and currently a high-tech
patent attorney) wrote RESTSITE,
version 1.2,
a package of DOS programs for computing distances between species based on
restriction sites or restriction fragments. The programs also include
NJTREE and UPGMA which can infer phylogenies by the Neighbor-Joining and
UPGMA distance matrix methods. The programs are written in Microsoft C:
source code is available too. The programs, documentation, and source code are distributed by
its Web site, http://www-genome.wi.mit.edu/~jmiller/restsite.htm.
The programs were described in a paper: Miller, J. C. 1991. RESTSITE: A phylogenetic program that sorts
raw restriction data. Journal of Heredity 82: 262-263.
Doug McElroy
(Doug.McElroy (at) wku.edu) of
Western Kentucky University distributes REAP, the
Restriction Enzyme Analysis Package, written by him, Paul Moran, Eldredge
Bermingham, and Irv Kornfield. REAP can calculate distances from restriction sites,
restriction fragments data, and from nucleotide sequences (the Kimura
2-parameter distance). REAP is a package of DOS executables available
from McElroy's web site
at http://bioweb.wku.edu/faculty/mcelroy/.
It is described in the paper:
McElroy, D., P. Moran, E. Bermingham, and I. Kornfield. 1992. REAP: An integrated environment
for the manipulation and phylogenetic analysis of restriction data.
Journal of Heredity 83: 157-158.
Ken Rice
(ken_a_rice (at) gsk.com) of
GlaxoSmithKline Beecham, Upper Merion, Pennsylvania (and adjunct faculty at the
University of Pennsylvania)
has produced AMP (Accepted Mutation Parsimony), a program which calculates
stepmatrices for protein parsimony analysis, for use in PAUP*
and MacClade. It uses transition probabilities under
models of protein evolution to calculate these stepmatrices.
It is available as C source code for Unix, from its
web site at
http://www.cis.upenn.edu/~krice/.
Genetics Computer Group
("GCG"), a subsidiary of Accelrys, Inc., produces the GCG Wisconsin Package, version 10.3, a leading package of sequence search and analysis programs, together with updates of the leading sequence databases. Included are programs for tree-based multiple sequence alignment, calculation of distances, and estimating phylogenies by the neighbor-joining and UPGMA distance matrix methods. Its web page is at http://accelrys.com/products/additional-products.html.
There it is announced that distribution and support of the GCG package will
cease in June 2008. It is not made clear how to order it before then.
Prices are no
longer given on the web site (you are asked to contact them). The most
recently posted prices (several years ago) were $5,000 (plus $3,000 per
year thereafter) for an academic installation ($18,000 and $6,000 for a
nonacademic installation). These have probably changed since then.
Peter Rice, Alan Bleasby, and Jon Ison
http://emboss.sourceforge.net/what/
MacVector, Inc., PMB 150, PO Box 582, 1939 High House
Rd., Cary, NC 27519
sells MacVector version 9.5.2, a
sequence analysis program for Mac OS and Mac OS X systems. It carries out
sequence search, alignment using ClustalW, and
either UPGMA or Neighbor-Joining distance matrix methods. It has many other
features including gene finding, motif searching, protein secondary structure
and hydrophobicity prediction, and prediction of restriction digests and
primer sites. Version 7.2 onwards can run natively on Mac OS X systems.
It can be ordered through
its web page
at http://www.macvector.com. Its
price for academic use was formerly $2,500, and for commercial use $5,000.
Currently they do not give prices on their web page, but they have said to
me that the above is slightly more expensive than what they charge now.
Accelrys, Inc., 10188 Telesis Court, Suite 100, San Diego, California 92121, USA (Phone: +1 858 799 5000, Fax: +1 858 799 5100) sells Discovery Studio Gene version 1.5, a sequence analysis package for Windows with the same functionality as their Macintosh package MacVector. It can do sequence search, alignment using ClustalW, and UPGMA or Neighbor-Joining distance matrix methods with a variety of distance measures and with bootstrap analysis available. It also has many other features including primer design, gene finding, motif searching, protein secondary structure and hydrophobicity prediction, and prediction of restriction digests. It runs on Windows systems and can be ordered through its Accelrys web page at http://www.accelrys.com/dstudio/ds_gene/index.html. Its
price for academic use is $2,500, and for commercial use $5,000.
Solltech, Inc. (ZevSolltec (at) aol.com),
Technology Innovation Center, Oakdale, Iowa 52319, USA,
distributes DENDRON, a
computer-assisted system for analyzing DNA fingerprinting gels. It reads
and compares gel images. One feature is an average-linkage clustering algorithm
that can produce trees from the gel images. For information and pricing,
contact Solltech. The DENDRON
web page is
at http://www.geocities.com/solltech/dendron/.
BioRad,
division of Sadtler USA, Inc. (Sadtler_USA_Sales (at) bio-rad.com),
distributes
Fingerprinting II Informatix Software, a package for quantitative
RFLP and Fingerprinting analysis. The package is available for
PC and Macintosh, and includes average linkage clustering of the gel
patterns. The web page
for this software is at http://www.bio-rad.com.
(To find the page for the programs you have to choose the links there
for "Life Sciences Research", "Software", and then "Fingerprinting II
Informatix Software".)
BioRad's headquarters are at
1000 Alfred Nobel Drive,
Hercules, California 94547, and their phone number is (510) 724-7000.
Other contact information is available on their web page. One
mention of the price elsewhere is that it is from $5,170 up to $20,119.
Philipp Schlüter
http://homepage.univie.ac.at/philipp.maria.schlueter/famd.html
James McInerney
(J.McInerney (at) nhm.ac.uk), of
The Natural History Museum, London, has written GCUA
(General Codon Usage Analysis). It does codon usage and amino acid usage
statistics, and also performs correspondence analysis/principal components
analysis on both codon usage and amino acid usage statistics. Its
relevance to the present list is that it also produces a distance matrix,
based on Relative Synonymous Codon Usage (RSCU) statistics, whose format is
PHYLIP/PAUP*4.0 -compatible. Although McInerney cautions that this matrix
should not be used for phylogenetic inference, I wonder whether this
distance does not have some phylogenetic information.
It is available as SunOS, PowerMac, and Linux binaries at the moment, and will
be followed soon by DOS, Windows, SGI, DEC and other binaries.
The code isn't available yet, "because it is so embarassingly poor".
It can be retrieved via anonymous ftp
from ftp.nhm.ac.uk in directory pub/gcua.
David T. Pride (dpride (at) partners.org),
formerly of Vanderbilt University, has written Swaap version
1.02. Swaap performs sliding window analyses on nucleotide sequences, computing
a large variety of statistics on the sequences. The relevant feature for
this listing is the ability to compute four different distance measures
between sequences, either on full sequences or on sliding windows.
Swaap is distributed as a Windows executable from
the Swaap
and Swaap PH web site
at http://www.bacteriamuseum.org/SWAAP/SwaapPage.htm#Swaap
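The idea of computing a distance in sliding windows, as opposed to over the full sequences, can be sketched in Python as follows. This is only an illustration and not Swaap's method: it uses the simple proportion of differing sites as the distance, and the function name and the `window` and `step` parameters are assumptions made for this example.

```python
def sliding_window_p_distance(seq_a, seq_b, window, step=1):
    """Proportion of differing sites in each window along two aligned sequences.

    Returns a list of (window_start, distance) pairs, with 0-based starts.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        diffs = sum(x != y for x, y in zip(a, b))
        results.append((start, diffs / window))
    return results
```

Plotting the resulting values against window position shows where along the alignment two sequences diverge most.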
David T. Pride (dpride (at) partners.org),
formerly of Vanderbilt University, has written Swaap PH version
1.02. Swaap PH computes many different kinds of statistics on nucleotide
frequencies and oligonucleotide frequencies in sliding windows along
nucleotide sequences. It can compute distances based on these frequencies.
Swaap PH is a Windows executable available from
the Swaap and Swaap PH web site
at http://www.bacteriamuseum.org/SWAAP/SwaapPage.htm#Swaap
Mathieu Blanchette
of the McGill University Centre for Bioinformatics (blanchem (at) mcb.mcgill.ca) and David Sankoff
of the Department of Mathematics and Statistics of the University of
Ottawa, Canada
have produced DERANGE2, a program to reconstruct the
history of two gene maps using weighted inversions, transpositions
and inverted transpositions. It can thus construct a set of distances
based on the gene orders (not the sequences of the genes themselves).
It is available as standard C source code
and can readily be compiled on Unix systems. It is available from
his software web page at
http://www.mcb.mcgill.ca/~blanchem/software.html and
it is also available by anonymous
ftp from ftp.ebi.ac.uk in directory pub/software/unix.
Xun Gu,
of the Department of Genetics, Development and Cell Biology and the Center for Bioinformatics and Biological Statistics at Iowa State University, Ames, Iowa (xgu (at) iastate.edu) together with Wei Huang, Dongping Xu, and Hongmei Zhang has produced GeneContent, version 1.0.2, a program to compute distances between whole genomes from their gene contents. It uses the presences and absences of gene families (or an extended version which also notes the presence of a gene family only as a single copy). It also uses these distances to compute a Neighbor-Joining tree of genomes. It is described in a paper: Gu, X., W. Huang, D. Xu, and H. Zhang. 2005. GeneContent: software for whole-genome phylogenetic analysis. Bioinformatics 21: 1713-1714. The paper is available as a PDF at the Gu lab web site. GeneContent is available as Windows or Linux executables from the Gu laboratory software web site at http://xgu.zool.iastate.edu/software.html
Laurent Excoffier
of the Computational and Population Genetics Lab of the Institute of Zoology, University of Bern, Switzerland (laurent.excoffier (at) zoo.unibe.ch) has produced MINSPNET, a program that produces a minimum spanning tree and network from a distance matrix. It is available as a Windows executable. It can be obtained from a web page which lists software from that lab at http://cmpg.unibe.ch/services/software.htm.
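To show what a minimum spanning tree of a distance matrix is, here is a small Python sketch using Prim's algorithm. It is not MINSPNET's code, and the function name is hypothetical. It returns a single tree; when several edges tie, other equally short trees exist, and this sketch does not attempt to assemble them into a network.

```python
def minimum_spanning_tree(dist):
    """Prim's algorithm on a symmetric distance matrix.

    Returns a list of edges (i, j, distance) forming one minimum spanning tree.
    """
    n = len(dist)
    if n == 0:
        return []
    in_tree = [False] * n
    in_tree[0] = True
    # best[k] = (shortest known distance from k to the tree, tree node it links to)
    best = [(dist[0][k], 0) for k in range(n)]
    edges = []
    for _ in range(n - 1):
        # closest node not yet in the tree
        j = min((k for k in range(n) if not in_tree[k]), key=lambda k: best[k][0])
        cost, i = best[j]
        edges.append((i, j, cost))
        in_tree[j] = True
        for k in range(n):
            if not in_tree[k] and dist[j][k] < best[k][0]:
                best[k] = (dist[j][k], j)
    return edges
```

For n objects the result always has n - 1 edges, and the sum of their lengths is the smallest possible for any tree connecting all of the objects directly.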
Francis Yeh (francis.yeh (at) ualberta.ca) of the
Department of Renewable Resources at the University of Alberta, Canada, has
released POPGENE version 1.32, a free program for the analysis of genetic
variation among and within populations using co-dominant and dominant markers.
The feature that is relevant to the present list is that it can compute
a number of genetic distances for gene frequencies.
It is distributed as a Windows executable from
its home page at
http://www.ualberta.ca/~fyeh/index.htm.
Warren Kovach
of Kovach Computing Services, Anglesey, Wales (info (at) kovcomp.co.uk) has produced MVSP,
a comprehensive multivariate statistical package for the PC platform.
It can do many kinds of analyses (principal components, clustering, etc.)
but the features relevant to this listing are clustering with a variety
of methods and a variety of distance measures, including Li and Nei's
restriction sites distance. MVSP may be ordered from Kovach Software
through its
web site
at http://www.kovcomp.com/mvsp/.
MVSP 3.1 for Windows
costs UK £85 or US$ 150 for an academic license.
A version on CD with a printed manual is £20 ($35) more.
Commercial licenses are £115 ($185).
Version 2.2 for DOS costs UK £65 or US$ 100.
Free evaluation versions, which work for a limited period, can be
downloaded from
the Kovach Computing download web page at
http://www.kovcomp.co.uk/downl2.html#mvsp. An evaluation
version of version 2.2 for
DOS is also available for downloading by ftp from
garbo.uwasa.fi in directory pc/stat/.
MVSP is also distributed by Exeter Software at
its web site
at http://www.ExeterSoftware.com/cat/kovach/mvsp.html
Version 3.1 costs $185 for an academic license, $265 for a commercial license.
There are discounts for multi-user licenses.
Other vendors include Rockware and
GeoMem.
János Podani of the Department of Plant Taxonomy and Ecology,
Eötvös Loránd University, Budapest, Hungary (podani (at) ludens.elte.hu)
has developed SYN-TAX 2000, a general package for
clustering. It can calculate a wide variety of distance coefficients from
numerical data, and can perform hierarchical clustering, nonhierarchical
clustering, and ordination. This includes, in addition to many clustering
methods, minimum spanning trees and additive trees by Neighbor-Joining.
SYN-TAX 2000 is available as commercial software from Exeter Software at
its web site there
at http://www.ExeterSoftware.com/cat/syntax/syntax.html. It costs $350 for an educational license, $450 for a commercial
license. Podani also maintains his own SYN-TAX web site at http://ramet.elte.hu/~podani/SYN2000.html where there are descriptions, screen shots, some free upgrades of
certain program components, and also an older DOS executable
version, 5.1, and a Macintosh version, SYN-TAX 5.02. There is a demo
version available for the DOS version, and both the DOS and Mac versions
are sold, each for $150 (for educational use $200), and both together for
$300. Over the years various versions of SYN-TAX have been described in papers.
The most recent description in a journal is: Podani, J. 1993. SYN-TAX 5.0: Computer programs for multivariate data analysis in ecology and systematics.
Abstracta Botanica 17: 289-302.
John Archer and David Robertson
http://www.manchester.ac.uk/bioinformatics/ctree
Simon Goodman, then of the Institute of Cell, Animal, and
Population Biology of the University of Edinburgh produced
RSTCALC, version 2.2. It is primarily
intended to perform analyses of population structure, genetic
differentiation and gene flow using microsatellite data.
It calculates estimates of the Rst measure of differentiation among a number of
populations, but in
addition you can also use RSTCALC to obtain estimates of the delta-mu^2 distance measure.
Its calculations are described in a paper:
Goodman, S. J. 1997. Rst Calc: a collection of computer programs for
calculating estimates of genetic differentiation from microsatellite data and determining their
significance. Molecular Ecology 6: 881-885.
The program runs on Windows and is available from
its web site
http://helios.bto.ed.ac.uk/evolgen/rst/rst.html as a Windows
executable.
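The delta-mu^2 distance mentioned above is the squared difference between two populations in mean microsatellite allele size, averaged over loci. A minimal Python sketch follows. It is not RSTCALC's code; the function name and the input layout (for each population, one list of sampled allele sizes per locus) are assumptions made for this example.

```python
def delta_mu_squared(pop_x, pop_y):
    """Delta-mu^2 distance between two populations.

    pop_x, pop_y: one entry per locus, each a list of allele sizes
    (repeat counts) sampled from that population, loci in the same order.
    """
    if len(pop_x) != len(pop_y) or not pop_x:
        raise ValueError("populations must share the same non-empty set of loci")
    total = 0.0
    for alleles_x, alleles_y in zip(pop_x, pop_y):
        mean_x = sum(alleles_x) / len(alleles_x)
        mean_y = sum(alleles_y) / len(alleles_y)
        total += (mean_x - mean_y) ** 2   # squared difference in mean size
    return total / len(pop_x)             # average over loci
```

For example, with mean repeat counts of 11 and 14 at one locus and 20 and 22 at another, the distance is (9 + 4) / 2 = 6.5.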
Daniel Montagnon (Daniel.Montagnon (at)
wanadoo.fr) of the Institut d'Embryologie, Faculté de
Médecine, Strasbourg, France has written
YCDMA (Y Chromosome Data MAnagement), version 1.2. This is
a data management program for microsatellite data. It can do a wide variety
of management tasks, maintaining and manipulating databases of genotypes,
calculating gene frequencies, and converting file formats.
For the purposes of this listing, its relevant feature is the calculation of
a variety of gene frequency genetic distances between populations, and
a squared copy number microsatellite genetic distance. YCDMA is written in
Microsoft Visual Basic. It is available as a Windows executable from
its web site
at http://perso.wanadoo.fr/daniel.montagnon/YCDMAAng.htm.
http://web.unife.it/progetti/genetica/Giorgio/giorgio_soft.html
William J. Bruno and Lars Arvestad
(billb (at) t10.lanl.gov) of the Theoretical Biology and
Biophysics Group at Los Alamos National Laboratory,
have released DISTANCE, version 1.0.
It estimates the most general reversible substitution matrix
corresponding to a given collection of aligned DNA sequences.
This matrix can then be used to calculate evolutionary distances between pairs
of sequences. The method is described in a paper:
Arvestad, L. and W. J. Bruno. 1997. Estimation of reversible substitution
matrices from multiple pairs of sequences. Journal of Molecular Evolution
45: 696-703. The program is written in C, and distributed from
its web site at
http://www.t10.lanl.gov/evolution/, along with Sun SPARC
binaries.
Stephane Guindon and Olivier Gascuel
http://www.lirmm.fr/~guindon/gamma.html
Gaston Gonnet and Chantal Korostensky
of the Computational Biochemistry Research Group at ETH in Zürich, Switzerland, have made available Darwin, Data Analysis and Retrieval With Indexed Nucleotide/peptide sequences, version 2.1. It is an environment which enables the user to carry out a variety of kinds of analysis with sequences, including phylogeny methods. These seem to include distance matrix, split decomposition, and a form of likelihood method. Darwin is available as executables for Solaris, Intel-compatible Linux, Irix, and HP/Compaq/Digital Alpha machines. These are available free if the user registers by filling out a form at the download page at the Darwin web page and returning it by fax or regular mail, or if that doesn't work, by e-mailing to (sekwr (at) inf.ethz.ch) including the user's
postal address. The executables can then be transferred to the user by ftp
or by e-mail of encoded files. Details and distribution policies are
explained further at
Darwin's web page.
Darwin is also made available as
a server.
Vladimir Makarenkov
(casgrain (at) magellan.umontreal.ca) of the
Département de Sciences Biologiques of the
Université de Montréal have released
T-REX (Tree and Reticulogram rEconstruXion), version 4.0a1.
This program performs several methods of fitting an additive distance
(distance in a nonclocklike tree) to a given dissimilarity. The
methods available include Sattath and Tversky's ADDTREE method,
Nei and Saitou's Neighbor-Joining method, Gascuel's UNJ Unweighted
Neighbor-Joining method, his BIONJ method, the
Circular order reconstruction method of Makarenkov and Leclerc
(1997), and Yushmanov (1984),
and the MW weighted least-squares method by Makarenkov (1997) and
Makarenkov and Leclerc (1998). A number of methods for fitting trees to
distance matrices that have missing values are also available.
Nucleotide sequence distance can be computed from sequences using
many of the widely-used distances.
The program can also carry out bootstrap and jackknife resampling to
assess strength of support for features of the trees.
It also allows construction and plotting
of "reticulograms" that show departures from treelike structure, and
interactive manipulation of the tree and reticulogram diagrams.
It is described in the paper: Makarenkov, V. 2001. T-Rex: reconstructing and
visualizing phylogenetic trees and reticulation networks.
Bioinformatics 17: 664-668.
Executables for Windows (the 4.0a1 version) and for Macintosh (the version
1.2a4 executable for PowerMacs) and an executable for a 32-bit DOS version
are available at
the T-REX web site at
http://www.labunix.uqam.ca/~makarenv/trex.html. C++ source
code is also available there. A web
server for T-REX with more tree construction and manipulation methods is
also available.
http://www.bio.umontreal.ca/casgrain/en/labo/permute/index.html
Lars Sommer Jermiin of the School of Biological Sciences
of the University of Sydney, Australia (lars.jermiin (at) usyd.edu.au)
(formerly of the John Curtin School of
Medical Research of the Australian National University, Canberra, Australia) has released
K2WuLi version 1.0, a program to calculate the Kimura 2-parameter
distance among DNA sequences, to compute its standard deviation, and to carry out
the relative rate test of Wu and Li (Wu, C.-I. and W.-H. Li. 1985. Evidence for
higher rates of nucleotide substitution in rodents than in man.
Proceedings of the National Academy of Sciences, USA 82:
1741-1745), in the form suggested by Muse and Weir (Muse, S. W. and B. S.
Weir. 1992. Testing for equality of evolutionary rates. Genetics
132: 269-276).
The program is available as a DOS executable with Turbo Pascal source code
as well from
its web page
at http://jcsmr.anu.edu.au/dmm/humgen/lars/k2wulitop.htm.
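As an illustration of the kind of calculation K2WuLi performs, here is a minimal Python sketch of the Kimura 2-parameter distance and its large-sample standard error for a pair of aligned sequences. It is not taken from K2WuLi: the function name k2p_distance and the decision to skip sites containing gaps or ambiguity codes are assumptions made for this example, and the relative rate test is not shown.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Return (distance, standard_error) under Kimura's 2-parameter model.

    Hypothetical helper for illustration; sites where either sequence
    is not A, C, G or T are skipped (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        n += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if n == 0:
        raise ValueError("no comparable sites")
    p = transitions / n      # proportion of transitional differences
    q = transversions / n    # proportion of transversional differences
    w1 = 1 - 2 * p - q
    w2 = 1 - 2 * q
    if w1 <= 0 or w2 <= 0:
        raise ValueError("sequences too divergent for the K2P formula")
    d = -0.5 * math.log(w1) - 0.25 * math.log(w2)
    # Large-sample variance of the estimate (Kimura 1980)
    a_coef = 1 / w1
    b_coef = 0.5 * (1 / w1 + 1 / w2)
    var = (a_coef ** 2 * p + b_coef ** 2 * q
           - (a_coef * p + b_coef * q) ** 2) / n
    return d, math.sqrt(var)
```

A call such as k2p_distance("ACGTACGTAC", "ACGTACGTAT") returns the distance and its standard error as a pair of floats.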
Clare Constantine
of the Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia (constantine (at) wehi.edu.au)
and colleagues at the Division of Veterinary and Biomedical Sciences at
Murdoch University, Perth, Australia have written
GeneStrut, a Macintosh program which computes a range of
standard measures for the analysis of genetic
structure from discrete genetic data. The input data are multilocus genotypes.
It can calculate genotypic and allelic frequencies,
statistics for Hardy-Weinberg disequilibrium,
genetic diversity within populations,
genetic identities between populations, and indices of population
structure (F-statistics). For our purposes the important feature is
that it can also calculate Nei's genetic distance between populations, with
standard deviations. It is described in the paper:
Constantine C. C., R. P. Hobbs and A. J. Lymbery. 1994.
FORTRAN programs for analysing population structure from
multilocus genotype data. Journal of Heredity 85: 336-337.
It is available as a Macintosh executable, at
its web site
at http://numbat.murdoch.edu.au/vetschl/imgad/GenStrut.htm.
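For readers unfamiliar with Nei's genetic distance, the following Python sketch shows the basic calculation from allele frequencies, D = -ln(I), where I is the normalized identity between two populations. It is only an illustration and is not GeneStrut's code: the function name, the input layout (one dictionary of allele frequencies per locus), and the omission of any sample-size correction or standard deviation are assumptions of this example.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x and pop_y are lists with one dict per locus, each mapping an
    allele label to its frequency (hypothetical layout for this sketch).
    No small-sample bias correction is applied.
    """
    if len(pop_x) != len(pop_y):
        raise ValueError("populations must be scored at the same loci")
    jx = jy = jxy = 0.0
    for freq_x, freq_y in zip(pop_x, pop_y):
        for allele in set(freq_x) | set(freq_y):
            x = freq_x.get(allele, 0.0)
            y = freq_y.get(allele, 0.0)
            jx += x * x    # homozygosity in population X
            jy += y * y    # homozygosity in population Y
            jxy += x * y   # identity between X and Y
    if jx == 0 or jy == 0:
        raise ValueError("no allele frequencies supplied")
    # The sums over loci are not divided by the number of loci because
    # that factor cancels in the ratio below.
    identity = jxy / math.sqrt(jx * jy)
    if identity <= 0:
        return math.inf  # no alleles shared at any locus
    return -math.log(identity)
```

For example, nei_standard_distance([{"A": 0.5, "B": 0.5}], [{"A": 1.0}]) compares a polymorphic population with one fixed for allele A at a single locus.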
Jérôme Goudet, of the Department of Ecology and Evolution of the
University of Lausanne, Switzerland (jerome.goudet (at) unil.ch)
has written FSTAT, version 2.9.3.2, a program to
estimate and test gene diversity statistics from codominant markers.
For our purposes, the important feature is its ability to calculate the
Nei and Cockerham/Weir families of distance measures. It can
convert data in its own format to and from the format of Genepop.
Version 2.9.3.2 is a Windows executable; an earlier version, 1.2, which
is a DOS executable, is also available. Both can be downloaded
from its web site
at http://www2.unil.ch/izea/softwares/fstat.html.
Michel Raymond and François Rousset
of the Equipe Génétique et Environnement of the Institut des
Sciences de l'Evolution at the University of Montpellier II, France
(Raymond (at) isem.univ-montp2.fr and Rousset (at) isem.univ-montp2.fr)
have written and distributed Genepop version 3.4,
a program to carry out a variety of population genetics tests. It can
test assumptions of Hardy-Weinberg and linkage equilibrium,
run a log-likelihood (G) based test of differentiation between populations,
use Slatkin's rare allele method to estimate number of migrants per generation,
and calculate allele frequencies. For our purposes the relevant feature is
its ability to calculate Fst and Rst measures of population differentiation,
which are genetic distances. It is described in a paper: Raymond, M. and
F. Rousset. 1995. GENEPOP (version 1.2) population genetic software for exact
tests and ecumenicism. Journal of Heredity 86: 248-249.
Genepop is a DOS executable that can run under Windows in a Command Tool
window. It can be downloaded from the University of Montpellier at http://ftp.cefe.cnrs.fr/PC/MSDOS/GENEPOP/ or from
a link at the Curtin University of Technology
in Australia at http://wbiomed.curtin.edu.au/genepop/ where
a Genepop server is also available.
Laurent Excoffier
http://lgb.unige.ch/arlequin/.
Olivier Hardy and Xavier Vekemans
http://www.ulb.ac.be/+sciences/ecoevol/spagedi.html
Julio Rozas, J. C. Sánchez-DelBarrio, X. Messeguer and Ricardo Rozas of the Departament de Genètica, Universitat
de Barcelona, Spain (jrozas (at) ub.edu) have released
DnaSP version 4.0.6, a software package for the analysis of
nucleotide polymorphism from aligned DNA sequence data. DnaSP can estimate
several measures of DNA sequence variation within and between populations
(in noncoding, synonymous or nonsynonymous sites), as well as linkage
disequilibrium, recombination, gene flow and gene conversion parameters.
It can also carry out several tests of neutrality.
Additionally, it can estimate the confidence intervals of some test statistics
by the coalescent. The results of the analyses are displayed in tabular and graphic form.
For the purposes of this web site, the relevant features are the
calculation of measures of population divergence, including divergence
with the Jukes-Cantor correction, which can be used as a
distance in phylogeny reconstruction. DnaSP is described in the papers:
It is distributed as a Windows executable from
its web site
at http://www.ub.es/dnasp/.
Kevin Thornton
http://molpopgen.org/software/lseqsoftware.html
Daniel Montagnon (Daniel.Montagnon (at)
wanadoo.fr) of the Institut d'Embryologie, Faculté de
Médecine, Strasbourg, France has written NSA
(Nucleotide Sequences Analyzer), version 1.2. It is a general program
for reading in sequences and writing them out in a variety of data
formats, with the ability to select particular sets of sites and sequences.
For our purposes, the relevant feature is the ability to calculate
a number of different nucleotide sequence distances, as well as some
simple protein sequence distances. These include the Jukes-Cantor,
Kimura, and Tamura-Nei distances, as well as a simple protein distance
based on the fraction of similar amino acids. These can also have a
correction for a gamma distribution of rates across sites. The program
is written in Visual Basic, and is available as a Windows executable from
its web site
at http://perso.wanadoo.fr/daniel.montagnon/NSAAng.htm
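To make the gamma correction mentioned above concrete, here is a short Python sketch of the Jukes-Cantor distance computed from the observed fraction of differing sites, with an optional gamma distribution of rates across sites. It is not NSA's code (NSA is written in Visual Basic); the function name jukes_cantor and the parameter name gamma_shape are assumptions made for this example.

```python
import math


def jukes_cantor(p, gamma_shape=None):
    """Jukes-Cantor distance from the observed fraction p of differing sites.

    If gamma_shape is given, rates across sites are assumed to follow a
    gamma distribution with that shape parameter. Hypothetical helper
    written for illustration only.
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    base = 1 - 4 * p / 3
    if gamma_shape is None:
        # Equal rates at all sites
        return -0.75 * math.log(base)
    if gamma_shape <= 0:
        raise ValueError("gamma_shape must be positive")
    # Gamma-distributed rates: the logarithm is replaced by a power
    return 0.75 * gamma_shape * (base ** (-1 / gamma_shape) - 1)
```

With the same p, the gamma-corrected distance is larger than the uncorrected one, and it approaches the uncorrected value as gamma_shape grows.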
John Brzustowski,
of the Department of Biological Sciences of the University of Alberta, Canada (jbrzusto (at) ualberta.ca),
wrote qclust, a program to carry out a number of
clustering methods including Neighbor-Joining. The neighbor-joining method
has been improved over our own Neighbor program, so as to be able to handle
large numbers of taxa much more quickly. The program is available
along with another program, calcdist which calculates
distances from 0/1 data. The programs are available
as C source and as DOS executables from
its web
page at http://www.biology.ualberta.ca/jbrzusto/dosclust.html.
A more interactive version of the program is also available as Java from
a web page
at http://www2.biology.ualberta.ca/jbrzusto/cluster.php.
(Brzustowski has declared that both of these programs are unsupported software,
and he will not answer questions about them).
http://pubmlst.org/software/analysis/start/manual/index.shtml
http://profdist.bioapps.biozentrum.uni-wuerzburg.de/.
Olivier Langella
(Olivier.Langella (at) pge.cnrs-gif.fr)
of the Laboratoire PGE, CNRS UPR9034, Gif sur Yvette, France, distributes
Populations, version 1.2.30. It can calculate a wide
variety of distances from multiple-allele diploid or haploid genotypes
and from microsatellite data, and can
also infer phylogenies by distance methods including Neighbor-Joining and
UPGMA. It can bootstrap the data across loci and/or across individuals
when constructing phylogenies. The trees can be trees of populations or
trees of individuals.
Populations is available as a free download from its web site at
http://bioinformatics.org/~tryphon/populations/, as source code and as executables for Windows.
Patrick Meirmans http://www.bentleydrummer.nl/software
Allen Rodrigo, Alexei Drummond, and Matthew Goode
of the Computational and Evolutionary Biology Laboratory, School of Biological Sciences, University of Auckland, New Zealand (a.rodrigo (at) auckland.ac.nz and m.goode (at) auckland.ac.nz) have released vCEBL 0.3a, the virtual Computational and Evolutionary Biology Laboratory. This is a graphical user interface around a functional programming language for evolutionary inferences. The system is written in Java using the PAL project classes as its components. This alpha release provides the basic user interface and some component packages. The following analyses and tools are available in vCEBL 0.3a: http://www.cebl.auckland.ac.nz/pages/vcebl.html. It requires
Java VM 1.1.1 or higher. It can also be obtained there as an applet for your
browser, with some features lacking.
Le Sy Vinh
(Vinh (at) cs.uni-duesseldorf.de) of the Bioinformatics Institute of the University of Düsseldorf, Germany and Arndt von Haeseler (arndt.von.haeseler (at) univie.ac.at) of the Centre for Integrative Bioinformatics Vienna (CIBIV) have released STC (Shortest Triplet Clustering). This method constructs k-representative sets from triplets of species. The resulting clustering method is O(n^2) in speed and can handle thousands of species with good accuracy. It is described in a paper: Vinh, L. S. and A. von Haeseler. 2005. Shortest triplet clustering: reconstructing large phylogenies using representative sets. BMC Bioinformatics 6: 92. The program is available as Linux and as Windows executables at its web site at http://www.bi.uni-duesseldorf.de/software/stc/
Naoko Takezaki
of the Center for Information Biology of the National Institute of Genetics, Mishima, Japan (ntakezak (at) lab.nig.ac.jp) has written sendbs.
It computes average nucleotide substitutions within and between populations. The
method is described in the paper by M. Nei and L. Jin (1989, Molecular
Biology and Evolution 6: 290-300). However, sendbs differs from
their method by using a bootstrap across sites to obtain standard errors of
the distances.
It also constructs a tree of populations using a neighbor-joining method.
It is distributed as source code for Unix, and also as a DOS executable, from
her web site
at http://www.cib.nig.ac.jp/dda/ntakezak.html#sendbs.
It is also distributed by ftp from the Indiana ftp server
and through the software page of Masatoshi Nei's lab
at Pennsylvania State University at
http://www.bio.psu.edu/People/Faculty/Nei/Lab/software.htm.
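The idea of bootstrapping across sites to obtain a standard error can be sketched in a few lines of Python. This is a simplified illustration, not sendbs itself: it treats a single pair of aligned sequences rather than populations, uses the plain proportion of differing sites as the distance, and the function names, the default of 200 replicates, and the fixed random seed are all assumptions of this example.

```python
import random
import statistics


def p_distance(seq1, seq2):
    """Proportion of aligned sites at which the two sequences differ."""
    return sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)


def bootstrap_se(seq1, seq2, replicates=200, seed=0):
    """Bootstrap standard error of the distance, resampling sites.

    Each replicate draws alignment columns with replacement, keeping the
    two sequences paired, and recomputes the distance. The standard
    deviation of the replicate distances estimates the standard error.
    """
    if len(seq1) != len(seq2) or len(seq1) == 0:
        raise ValueError("sequences must be aligned and non-empty")
    rng = random.Random(seed)  # fixed seed so results are repeatable
    n = len(seq1)
    estimates = []
    for _ in range(replicates):
        columns = [rng.randrange(n) for _ in range(n)]
        resampled1 = [seq1[i] for i in columns]
        resampled2 = [seq2[i] for i in columns]
        estimates.append(p_distance(resampled1, resampled2))
    return statistics.stdev(estimates)
```

Any other distance function taking two aligned sequences could be substituted for p_distance without changing the resampling loop.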
Applied Maths BVBA
of Keistraat 120, 9830 Sint-Martens-Latem, Belgium has released GelCompar II, a comprehensive 1-d gel analysis program. It includes capabilities of clustering data taken from the gels. These are described as "including phylogenetic and dimensioning algorithms". The phylogeny algorithms include a number of distance-matrix clustering methods. It is also said to be able to carry out generalized parsimony. GelCompar II is a Windows program. It is described at its web site at http://www.applied-maths.com/gc/gc.htm. A detailed
brochure is available for downloading there. GelCompar II is commercial
software. For price and ordering information contact them by phone at
+32 9 2222 100, fax them at +32 9 2222 102, e-mail them at
info (at) applied-maths.com, or use the information request form at
their web pages. Their U.S. Sales Office is at Applied Maths Inc.,
512 East 11th Street, Suite 207, Austin, Texas 78701,
phone +1 512-482-9700, fax +1 512-482-9708 (email is info-us (at) applied-maths.com). (One company vending GelCompar II sells the whole
package for $20,000, though if only the basic module and the cluster analysis
module are ordered the price is $5,400).
Probal Chaudhuri
http://www.isical.ac.in/~probal/main.htm. It is described as
available as C++ source code, Windows executables, Linux executables and
Mac OS X universal executables.
Mike Sanderson
(sanderm (at) email.arizona.edu) of the Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona has written r8s, version 1.71, a program to infer divergence times in a phylogeny by smoothing rates of evolution so that they approximate a molecular clock (allowing a "relaxed" clock). The program is given a tree with branch lengths as input, smooths the rates on this tree, and infers divergence times. Sanderson's main approaches to smoothing divergence times are described in his papers: http://loco.biosci.arizona.edu/r8s/index.html.
Torsten Eriksson
of the Bergius Botanical Garden, Stockholm, Sweden (torsten (at) bergianska.se)
has released the r8s bootstrap kit. This is a number
of Perl scripts and three general command blocks for PAUP* and r8s which enable bootstrapping
analyses with r8s. It is available from his
software web site
at http://www.bergianska.se/index_forskning_soft.html.
Kai Chan (kaichan (at) stanford.edu)
of the Department of Biological Sciences, Stanford University, Stanford,
California, and Brian Moore
(brian.moore (at) yale.edu) of the Department of Ecology and
Evolutionary Biology, Yale University, New Haven, Connecticut
have released SymmeTREE version 1.1. It is a program
to test whether branches of a tree have diversified at different rates,
and along which branches the significant shifts of diversity have occurred.
This is evaluated using the species diversity of different parts of the tree.
The program is described in a paper: Chan, K. M. A. and B. R. Moore. 2004.
SYMMETREE: whole-tree analysis of differential diversification rates.
Bioinformatics Advance Access publication November 30, 2004.
The program is available as executables for Windows, Mac OS X, and Linux
and as source code for other flavors of Unix. It is distributed from
its web site at
http://www.phylodiversity.net/bmoore/software.html.