clique
version 3.696

Clique -- Compatibility Program

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program uses the compatibility method for unrooted two-state characters to obtain the largest cliques of characters and the trees which they suggest. This approach originated in the work of Le Quesne (1969), though the algorithms were not precisely specified until the later work of Estabrook, Johnson, and McMorris (1976a, 1976b). These authors proved the theorem that a group of two-state characters which were pairwise compatible would be jointly compatible. This program uses an algorithm inspired by the Kent Fiala - George Estabrook program CLINCH, though closer in detail to the algorithm of Bron and Kerbosch (1973). I am indebted to Kent Fiala for pointing out that paper to me, and to David Penny for describing to me his branch-and-bound approach to finding the largest cliques, from which I have also borrowed. I am particularly grateful to Kent Fiala for catching a bug in versions 2.0 and 2.1 which resulted in those versions failing to find all of the cliques which they should. The program computes a compatibility matrix for the characters, then uses a recursive procedure to examine all possible cliques of characters.

After one pass through all possible cliques, the program knows the size of the largest clique, and during a second pass it prints out the cliques of the right size. It also, along with each clique, prints out the tree suggested by that clique.
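
As a rough illustration of the two steps described above (this is not the program's own code), the following Python sketch builds the pairwise compatibility matrix for the small test data set given later in this document and then enumerates maximal cliques with a plain Bron-Kerbosch recursion. The function names are made up for this example, and the branch-and-bound refinements mentioned above are left out.

```python
# Test data set from this document: species name -> binary character states.
DATA = {
    "Alpha":   "110110",
    "Beta":    "110000",
    "Gamma":   "100110",
    "Delta":   "001001",
    "Epsilon": "001110",
}

def compatible(col_a, col_b):
    """Two-state characters are compatible on an unrooted tree unless all
    four state pairs (00, 01, 10, 11) occur among the species."""
    return len(set(zip(col_a, col_b))) < 4

def compatibility_matrix(data):
    rows = list(data.values())
    cols = ["".join(r[i] for r in rows) for i in range(len(rows[0]))]
    n = len(cols)
    return [[compatible(cols[i], cols[j]) for j in range(n)] for i in range(n)]

def maximal_cliques(matrix):
    """Basic Bron-Kerbosch enumeration (no pivoting, no branch-and-bound)."""
    n = len(matrix)
    neighbours = [{j for j in range(n) if j != i and matrix[i][j]}
                  for i in range(n)]
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            found.append(sorted(clique))
            return
        for v in sorted(candidates):
            expand(clique | {v}, candidates & neighbours[v],
                   excluded & neighbours[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(range(n)), set())
    return found

matrix = compatibility_matrix(DATA)
matrix_rows = ["".join("1" if x else "." for x in row) for row in matrix]
cliques = maximal_cliques(matrix)
largest_size = max(len(c) for c in cliques)
# Report characters with 1-based numbers, as the program output does.
largest = [[i + 1 for i in c] for c in cliques if len(c) == largest_size]

print("\n".join(matrix_rows))
print(largest)
```

For this data set the sketch reproduces the matrix and the single largest clique (characters 1, 2, 3 and 6) shown in the test set output.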

INPUT, OUTPUT, AND OPTIONS

Input to the algorithm is standard, but the "?", "P", and "B" states are not allowed. This is a serious limitation of this program. If you want to find large cliques in data that has "?" states, I recommend that you use MIX instead with the T (Threshold) option and the value of the threshold set to 2.0. The theory underlying this is given in my paper on character weighting (Felsenstein, 1981b).

The options are chosen from a menu, which looks like this:


Largest clique program, version 3.69

Settings for this run:
  A   Use ancestral states in input file?  No
  F              Use factors information?  No
  W                       Sites weighted?  No
  C          Specify minimum clique size?  No
  O                        Outgroup root?  No, use as outgroup species  1
  M           Analyze multiple data sets?  No
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3        Print out compatibility matrix  No
  4                        Print out tree  Yes
  5       Write out trees onto tree file?  Yes

  Y to accept these or type the letter for one to change

The A (Ancestors), F (Factors), O (Outgroup), M (Multiple Data Sets), and W (Weights) options are the usual ones, described in the main documentation file.

If you use option A (Ancestors) you should also choose it in the menu. When the Ancestors option is invoked, the compatibility matrix calculation in effect assumes that the data contain another species that has all the ancestral states. This changes the compatibility patterns in the proper way. The Ancestors option also requires information on the ancestral states of each character to be in the input file.

The O (Outgroup) option will take effect only if the tree is not rooted by the Ancestral States option.

The C (Clique Size) option indicates that you wish to specify a minimum clique size and print out all cliques (and their associated trees) greater than or equal to that size. The program prompts you for the minimum clique size.

Note that this allows you to list all cliques (each with its tree) by simply setting the minimum clique size to 1. If you do one run and find that the largest clique has 23 characters, you can do another run with the minimum clique size set at 18, thus listing all cliques within 5 characters of the largest one.

Output involves a compatibility matrix (using the symbols "." and "1") and the cliques and trees.

If you have used the F option there will be two lists of characters for each clique, one the original multistate characters and the other the binary characters. It is the latter that are shown on the tree. When the F option is not used the output and the cliques reflect only the binary characters.

The trees produced show, on each branch, the points at which derived character states arise in the characters that define the clique. There is a legend above the tree showing which binary character is involved. Of course if the tree is unrooted you can read the changes as going in either direction.

The program runs very quickly but if the maximum number of characters is large it will need a good deal of storage, since the compatibility matrix requires ActualChars x ActualChars boolean variables, where ActualChars is the number of characters (in the case of the factors option, the total number of true multistate characters).

ASSUMPTIONS

Basically the following assumptions are made:

  1. Each character evolves independently.
  2. Different lineages evolve independently.
  3. The ancestral state is not known.
  4. Each character has a small chance of being one which evolves so rapidly, or is so thoroughly misinterpreted, that it provides no information on the tree.
  5. The probability of a single change in a character (other than in the high rate characters) is low but not as low as the probability of being one of these "bad" characters.
  6. The probability of two changes in a low-rate character is much less than the probability that it is a high-rate character.
  7. The true tree has segments which are not so unequal in length that two changes in a long segment are as easy to envisage as one change in a short segment.

The assumptions of compatibility methods have been treated in several of my papers (1978b, 1979, 1981b, 1988b), especially the 1981 paper. For an opposing view arguing that the parsimony methods make no substantive assumptions such as these, see the papers by Farris (1983) and Sober (1983a, 1983b), but also read the exchange between Felsenstein and Sober (1986).

A constant available for alteration at the beginning of the program is the form width, "FormWide", which you may want to change to make it as large as possible consistent with the page width available on your output device, so as to avoid the output of cliques and of trees getting wrapped around unnecessarily.


TEST DATA SET

     5    6
Alpha     110110
Beta      110000
Gamma     100110
Delta     001001
Epsilon   001110


TEST SET OUTPUT (with all numerical options on)


Largest clique program, version 3.69

 5 species,   6  characters
Species  Character states
-------  --------- ------

Alpha       11011 0
Beta        11000 0
Gamma       10011 0
Delta       00100 1
Epsilon     00111 0

Character Compatibility Matrix (1 if compatible)
--------- ------------- ------ -- -- -----------

                     111..1
                     111..1
                     111..1
                     ...111
                     ...111
                     111111


Largest Cliques
------- -------


Characters: (  1  2  3  6)


  Tree and characters:

     2  1  3  6
     0  0  1  1

             +1-Delta     
       +0--1-+
  +--0-+     +--Epsilon   
  !    !
  !    +--------Gamma     
  !
  +-------------Alpha     
  !
  +-------------Beta      

remember: this is an unrooted tree!


consense

version 3.696

Consense -- Consensus tree program

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Consense reads a file of computer-readable trees and prints out (and may also write out onto a file) a consensus tree. At the moment it carries out a family of consensus tree methods called the Ml methods (Margush and McMorris, 1981). These include strict consensus and majority rule consensus. Basically the consensus tree consists of monophyletic groups that occur as often as possible in the data. If a group occurs in more than a fraction l of all the input trees it will definitely appear in the consensus tree.

The tree printed out has at each fork a number indicating how many times the group which consists of the species to the right of (descended from) the fork occurred. Thus if we read in 15 trees and find that a fork has the number 15, that group occurred in all of the trees. The strict consensus tree consists of all groups that occurred 100% of the time, the rest of the resolution being ignored. The tree printed out here includes groups down to 50%, and below it until the tree is fully resolved.

The majority rule consensus tree consists of all groups that occur more than 50% of the time. Any other percentage level between 50% and 100% can also be used, and that is why the program in effect carries out a family of methods. You have to decide on the percentage level, figure out for yourself what number of occurrences that would be (e.g. 15 in the above case for 100%), and resolutely ignore any group below that number. Do not use numbers at or below 50%, because some groups occurring (say) 35% of the time will not be shown on the tree. The collection of all groups that occur 35% or more of the time may include two groups that are mutually contradictory and cannot appear in the same tree. In this program, as the default method I have included groups that occur less than 50% of the time, working downwards in their frequency of occurrence, as long as they continue to resolve the tree and do not contradict more frequent groups. In this respect the method is similar to the Nelson consensus method (Nelson, 1979) as explicated by Page (1989) although it is not identical to it.

The program can also carry out Strict consensus, Majority Rule consensus without the extension which adds groups until the tree is fully resolved, and other members of the Ml family, where the user supplies the fraction of times the group must appear in the input trees to be included in the consensus tree. For the moment the program cannot carry out any other consensus tree method, such as Adams consensus (Adams, 1972, 1986) or methods based on quadruples of species (Estabrook, McMorris, and Meacham, 1985).

INPUT, OUTPUT, AND OPTIONS

Input is a tree file (called intree) which contains a series of trees in the Newick standard form -- the form used when many of the programs in this package write out tree files. Each tree starts on a new line. Each tree can have a weight, which is a real number and is located in comment brackets "[" and "]" just before the final ";" which ends the description of the tree. When the input trees have weights (like [0.01000]) then the total number of trees will be the total of those weights, which is often a number like 1.00. When a tree doesn't have a weight it will be assigned a weight of 1. This means that when we have tied trees (as from a parsimony program) three alternative tied trees will be counted as if each was 1/3 of a tree.
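
A minimal Python sketch of how such weights might be read, assuming the weight is the last bracketed comment before the final semicolon. The regular expression, the function names, and the example trees are illustrative only and are not taken from Consense itself.

```python
import re

# Assumed layout: an optional "[weight]" comment immediately before the final ";".
WEIGHT_RE = re.compile(r"\[\s*([-+0-9.eE]+)\s*\]\s*;\s*$")

def tree_weight(newick, default=1.0):
    """Return the weight in the trailing comment, or the default weight of 1."""
    match = WEIGHT_RE.search(newick.strip())
    return float(match.group(1)) if match else default

def total_weight(trees):
    """The 'number of trees' is the sum of the weights."""
    return sum(tree_weight(t) for t in trees)

# Hypothetical input: four tied trees, each carrying a weight of 0.25.
tied = [
    "((A,B),(C,D))[0.25];",
    "((A,C),(B,D))[0.25];",
    "((A,D),(B,C))[0.25];",
    "(A,(B,(C,D)))[0.25];",
]
# Hypothetical input: trees without weights, each counted as one tree.
plain = ["((A,B),(C,D));", "((A,C),(B,D));", "((A,D),(B,C));"]

print(total_weight(tied), total_weight(plain))
```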

Note that this program can correctly read trees whether or not they are bifurcating: in fact they can be multifurcating at any level in the tree.

The options are selected from a menu, which looks like this:


Consensus tree program, version 3.69

Settings for this run:
 C         Consensus type (MRe, strict, MR, Ml):  Majority rule (extended)
 O                                Outgroup root:  No, use as outgroup species  1
 R                Trees to be treated as Rooted:  No
 T           Terminal type (IBM PC, ANSI, none):  ANSI
 1                Print out the sets of species:  Yes
 2         Print indications of progress of run:  Yes
 3                               Print out tree:  Yes
 4               Write out trees onto tree file:  Yes

Are these settings correct? (type Y or the letter for one to change)

Option C (Consensus method) selects which of four methods the program uses. The program defaults to using the extended Majority Rule method. Each time the C option is chosen the program moves on to another method, the others being in order Strict, Majority Rule, and Ml. Here are descriptions of the methods. In each case the fraction of times a set appears among the input trees is counted by weighting by the weights of the trees (the numbers like [0.6000] that appear at the ends of trees in some cases).

Strict
A set of species must appear in all input trees to be included in the strict consensus tree.

Majority Rule (extended)
Any set of species that appears in more than 50% of the trees is included. The program then considers the other sets of species in order of the frequency with which they have appeared, adding to the consensus tree any which are compatible with it until the tree is fully resolved. This is the default setting.

Ml
The user is asked for a fraction between 0.5 and 1, and the program then includes in the consensus tree any set of species that occurs among the input trees more than that fraction of the time. The Strict consensus and the Majority Rule consensus are extreme cases of the Ml consensus, being for fractions of 1 and 0.5 respectively.

Majority Rule
A set of species is included in the consensus tree if it is present in more than half of the input trees.
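
The four methods can be sketched in Python along the following lines. This is an illustration under simplifying assumptions, not the algorithm used in Consense: every tree has weight 1, names carry no branch lengths, the trees are read as rooted exactly as written with the first species as the outgroup, and ties between equally frequent groups in the extended method are broken arbitrarily. All function names are invented for the example; the trees are the test set of input trees from this document.

```python
from collections import Counter

TREES = [
    "(A,(B,(H,(D,(J,(((G,E),(F,I)),C))))));",
    "(A,(B,(D,((J,H),(((G,E),(F,I)),C)))));",
    "(A,(B,(D,(H,(J,(((G,E),(F,I)),C))))));",
    "(A,(B,(E,(G,((F,I),((J,(H,D)),C))))));",
    "(A,(B,(E,(G,((F,I),(((J,H),D),C))))));",
    "(A,(B,(E,((F,I),(G,((J,(H,D)),C))))));",
    "(A,(B,(E,((F,I),(G,(((J,H),D),C))))));",
    "(A,(B,(E,((G,(F,I)),((J,(H,D)),C)))));",
    "(A,(B,(E,((G,(F,I)),(((J,H),D),C)))));",
]

def leaf_groups(newick):
    """Leaf-name sets enclosed by each pair of parentheses."""
    groups, stack, name = set(), [[]], ""
    for ch in newick:
        if ch == "(":
            stack.append([])
        elif ch in ",);":
            if name:
                stack[-1].append(name)
                name = ""
            if ch == ")":
                group = stack.pop()
                groups.add(frozenset(group))
                stack[-1].extend(group)
        elif not ch.isspace():
            name += ch
    return groups

def count_groups(trees):
    counts = Counter()
    for tree in trees:
        groups = leaf_groups(tree)
        n = max(len(g) for g in groups)
        # Drop the full set and the "everything but the outgroup" set.
        counts.update(g for g in groups if len(g) < n - 1)
    return counts

def ml_consensus(counts, total, fraction):
    """Strict for fraction 1, majority rule for fraction 0.5."""
    if fraction >= 1.0:
        return {g for g, c in counts.items() if c >= total}
    return {g for g, c in counts.items() if c > fraction * total}

def compatible(a, b):
    """Two groups can share a tree if they are disjoint or nested."""
    return a.isdisjoint(b) or a <= b or b <= a

def extended_majority(counts, total):
    chosen = list(ml_consensus(counts, total, 0.5))
    rest = sorted((g for g in counts if g not in chosen),
                  key=lambda g: -counts[g])
    for g in rest:  # most frequent first; ties in arbitrary order
        if all(compatible(g, h) for h in chosen):
            chosen.append(g)
    return set(chosen)

counts = count_groups(TREES)
strict = ml_consensus(counts, len(TREES), 1.0)
majority = ml_consensus(counts, len(TREES), 0.5)
extended = extended_majority(counts, len(TREES))
print(len(strict), len(majority), len(extended))
```

On the test trees the counts agree with the test set output below (for example, the group F, I occurs 9 times and H, J occurs 4 times). Which of the groups occurring twice completes the extended tree depends on the tie-breaking order, so that last group may differ from the one Consense prints.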

Option R (Rooted) toggles between the default assumption that the input trees are unrooted trees and the selection that specifies that the tree is to be treated as a rooted tree and not re-rooted. Otherwise the tree will be treated as outgroup-rooted and will be re-rooted automatically at the first species encountered on the first tree (or at a species designated by the Outgroup option).

Option O is the usual Outgroup rooting option. It is in effect only if the Rooted option selection is not in effect. The trees will be re-rooted with a species of your choosing. You will be asked for the number of the species that is to be the outgroup. If we want to outgroup-root the tree on the line leading to a species which appears as the third species (counting left-to-right) in the first computer-readable tree in the input file, we would select menu option O and specify species 3.

Output is a list of the species (in the order in which they appear in the first tree, which is the numerical order used in the program), a list of the subsets that appear in the consensus tree, a list of those that appeared in one or another of the individual trees but did not occur frequently enough to get into the consensus tree, followed by a diagram showing the consensus tree. Each subset in these lists is shown as a row of symbols, each either "." or "*". The species that are in the set are marked by "*". Every ten species there is a blank, to help you keep track of the alignment of columns. The order of symbols corresponds to the order of species in the species list. Thus a set that consisted of the second, seventh, and eighth out of 13 species would be represented by:

          .*....**.. ...
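
The layout of such a row can be reproduced with a few lines of Python. This is only a sketch of the notation; the function name set_row is hypothetical.

```python
def set_row(members, n_species):
    """members: 1-based numbers of the species that belong to the set."""
    symbols = ["*" if i in members else "." for i in range(1, n_species + 1)]
    # Insert a blank after every ten species.
    blocks = ["".join(symbols[i:i + 10]) for i in range(0, n_species, 10)]
    return " ".join(blocks)

print(set_row({2, 7, 8}, 13))  # prints .*....**.. ...
```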

Note that if the trees are unrooted the final tree will have one group, consisting of every species except the Outgroup (which by default is the first species encountered on the first tree), which always appears. It will not be listed in either of the lists of sets, but it will be shown in the final tree as occurring all of the time. This is hardly surprising: in telling the program that this species is the outgroup we have specified that the set consisting of all of the others is always a monophyletic set. So this is not to be taken as interesting information, despite its dramatic appearance.

Option 1 in the menu gives you the option of turning off the writing of these sets into the output file. This may be useful if you are primarily interested in getting the tree file.

Option 4 is the usual tree file option. If this is on (it is by default) then the final tree will be written onto an output tree file (whose default name is "outtree").

Branch Lengths on the Consensus Tree?

Note that the lengths on the tree on the output tree file are not branch lengths but the number of times that each group appeared in the input trees. This number is the sum of the weights of the trees in which it appeared, so that if there are 11 trees, ten of them having weight 0.1 and one weight 1.0, a group that appeared in the last tree and in 6 others would be shown as appearing 1.6 times and its branch length will be 1.6. This means that if you take the consensus tree from the output tree file and try to draw it, the branch lengths will be strange. I am often asked how to put the correct branch lengths on these (this is one of our Frequently Asked Questions).
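
The arithmetic in this example can be checked with a short Python sketch; the list layout and variable names are assumptions made for illustration.

```python
# Eleven trees: ten with weight 0.1, followed by one with weight 1.0.
weights = [0.1] * 10 + [1.0]

# The group appears in six of the light trees and in the last tree.
trees_with_group = [0, 1, 2, 3, 4, 5, 10]

# The number written on the branch is the sum of the weights of those trees.
support = sum(weights[i] for i in trees_with_group)
print(round(support, 2))  # prints 1.6
```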

There is no simple answer to this. It depends on what "correct" means. For example, if you have a group of species that shows up in 80% of the trees, and the branch leading to that group has average length 0.1 among that 80%, is the "correct" length 0.1? Or is it (0.80 x 0.1)? There is no simple answer.

However, if you want to take the consensus tree as an estimate of the true tree (rather than as an indicator of the conflicts among trees) you may be able to use the User Tree (option U) mode of the phylogeny program that you used, and use it to put branch lengths on that tree. Thus, if you used Dnaml, you can take the consensus tree, make sure it is an unrooted tree, and feed that to Dnaml using the original data set (before bootstrapping) and Dnaml's option U. As Dnaml wants an unrooted tree, you may have to use Retree to make the tree unrooted (using the W option of Retree and choosing the unrooted option within it). Of course you will also want to change the tree file name from "outtree" to "intree".

If you used a phylogeny program that does not infer branch lengths, you might want to use a different one (such as Fitch or Dnaml) to infer the branch lengths, again making sure the tree is unrooted, if the program needs that.

Future

The program uses the consensus tree algorithm originally designed for the bootstrap programs. It is quite fast, and execution time is unlikely to be limiting for you (assembling the input file will be much more of a limiting step). In the future, if possible, more consensus tree methods will be incorporated (although the current methods are the ones needed for the component analysis of bootstrap estimates of phylogenies, and in other respects I also think that the present ones are among the best).


TEST SET OF INPUT TREES

(A,(B,(H,(D,(J,(((G,E),(F,I)),C))))));
(A,(B,(D,((J,H),(((G,E),(F,I)),C)))));
(A,(B,(D,(H,(J,(((G,E),(F,I)),C))))));
(A,(B,(E,(G,((F,I),((J,(H,D)),C))))));
(A,(B,(E,(G,((F,I),(((J,H),D),C))))));
(A,(B,(E,((F,I),(G,((J,(H,D)),C))))));
(A,(B,(E,((F,I),(G,(((J,H),D),C))))));
(A,(B,(E,((G,(F,I)),((J,(H,D)),C)))));
(A,(B,(E,((G,(F,I)),(((J,H),D),C)))));


TEST SET OUTPUT


Consensus tree program, version 3.69

Species in order: 

  1. A
  2. B
  3. H
  4. D
  5. J
  6. G
  7. E
  8. F
  9. I
  10. C



Sets included in the consensus tree

Set (species in order)     How many times out of    9.00

.......**.                   9.00
..********                   9.00
..****.***                   6.00
..***.....                   6.00
..***....*                   6.00
..*.*.....                   4.00
..***..***                   2.00


Sets NOT included in consensus tree:

Set (species in order)     How many times out of    9.00

.....**...                   3.00
.....*****                   3.00
..**......                   3.00
.....****.                   3.00
..****...*                   2.00
.....*.**.                   2.00
..*.******                   2.00
....******                   2.00
...*******                   1.00


Extended majority rule consensus tree

CONSENSUS TREE:
the numbers on the branches indicate the number
of times the partition of the species into the two sets
which are separated by that branch occurred
among the trees, out of   9.00 trees

                                          +-----------------------C
                                          |
                                  +--6.00-|               +-------H
                                  |       |       +--4.00-|
                                  |       +--6.00-|       +-------J
                          +--2.00-|               |
                          |       |               +---------------D
                          |       |
                  +--6.00-|       |                       +-------F
                  |       |       +------------------9.00-|
                  |       |                               +-------I
          +--9.00-|       |
          |       |       +---------------------------------------G
  +-------|       |
  |       |       +-----------------------------------------------E
  |       |
  |       +-------------------------------------------------------B
  |
  +---------------------------------------------------------------A


  remember: this is an unrooted tree!

contchar

version 3.696

Gene Frequencies and Continuous Character Data Programs

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

The programs in this group use gene frequencies and quantitative character values. One (Contml) constructs maximum likelihood estimates of the phylogeny, another (Gendist) computes genetic distances for use in the distance matrix programs, and the third (Contrast) examines correlation of traits as they evolve along a given phylogeny.

When the gene frequencies data are used in Contml or Gendist, this involves the following assumptions:

  1. Different lineages evolve independently.
  2. After two lineages split, their characters change independently.
  3. Each gene frequency changes by genetic drift, with or without mutation (this varies from method to method).
  4. Different loci or characters drift independently.

How these assumptions affect the methods will be seen in my papers on inference of phylogenies from gene frequency and continuous character data (Felsenstein, 1973b, 1981c, 1985c).

The input formats are fairly similar to the discrete-character programs, but with one difference. When Contml is used in the gene-frequency mode (its usual, default mode), or when Gendist is used, the first line contains the number of species (or populations) and the number of loci and the options information. There then follows a line which gives the numbers of alleles at each locus, in order. This must be the full number of alleles, not the number of alleles which will be input: i. e. for a two-allele locus the number should be 2, not 1. There then follow the species (population) data, each species beginning on a new line. The first 10 characters are taken as the name, and thereafter the values of the individual characters are read free-format, preceded and separated by blanks. They can go to a new line if desired, though of course not in the middle of a number. Missing data is not allowed - an important limitation. In the default configuration, for each locus, the numbers should be the frequencies of all but one allele. The menu option A (All) signals that the frequencies of all alleles are provided in the input data -- the program will then automatically ignore the last of them. So without the A option, for a three-allele locus there should be two numbers, the frequencies of two of the alleles (and of course it must always be the same two!). Here is a typical data set without the A option:

     5    3
2 3 2
Alpha      0.90 0.80 0.10 0.56
Beta       0.72 0.54 0.30 0.20
Gamma      0.38 0.10 0.05  0.98
Delta      0.42 0.40 0.43 0.97
Epsilon    0.10 0.30 0.70 0.62

whereas here is what it would have to look like if the A option were invoked:

     5    3
2 3 2
Alpha      0.90 0.10 0.80 0.10 0.10 0.56 0.44
Beta       0.72 0.28 0.54 0.30 0.16 0.20 0.80
Gamma      0.38 0.62 0.10 0.05 0.85  0.98 0.02
Delta      0.42 0.58 0.40 0.43 0.17 0.97 0.03
Epsilon    0.10 0.90 0.30 0.70 0.00 0.62 0.38

The first line has the number of species (or populations) and the number of loci. The second line has the number of alleles for each of the 3 loci. The species lines have names (filled out to 10 characters with blanks) followed by the gene frequencies of the 2 alleles for the first locus, the 3 alleles for the second locus, and the 2 alleles for the third locus. You can start a new line after any of these allele frequencies, and continue to give the frequencies on that line (without repeating the species name).
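
The relationship between the two layouts can be sketched in Python: given the frequencies of all but one allele at each locus, the missing frequency is one minus the sum of the others. The function names below are hypothetical, and the parsing assumes the name occupies exactly the first 10 characters of the line, as described above.

```python
def parse_species_line(line):
    """Split a data line into the 10-character name and its numbers."""
    name = line[:10].strip()
    values = [float(x) for x in line[10:].split()]
    return name, values

def expand_frequencies(freqs, alleles_per_locus):
    """Append the omitted allele frequency at each locus."""
    full, pos = [], 0
    for n_alleles in alleles_per_locus:
        given = freqs[pos:pos + n_alleles - 1]
        pos += n_alleles - 1
        full.append(given + [1.0 - sum(given)])
    if pos != len(freqs):
        raise ValueError("number of frequencies does not match allele counts")
    return full

# First species of the data set without the A option, with alleles "2 3 2".
name, values = parse_species_line("Alpha      0.90 0.80 0.10 0.56")
full = expand_frequencies(values, [2, 3, 2])
rounded = [[round(p, 2) for p in locus] for locus in full]
print(name, rounded)
```

Rounded to two decimals, the result matches the Alpha line of the data set written for the A option.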

If all alleles of a locus are given, it is important to have them add up to 1. Roundoff of the frequencies may cause the program to conclude that the numbers do not sum to 1, and stop with an error message.

While many compilers may be more tolerant, it is probably wise to make sure that each number, including the first, is preceded by a blank, and that there are digits both preceding and following any decimal points.

Contml and Contrast also treat quantitative characters (the continuous-characters mode in Contml, which is option C). It is assumed that each character is evolving according to a Brownian motion model, at the same rate, and independently. In reality it is almost always impossible to guarantee this. The issue is discussed at length in my review article in Annual Review of Ecology and Systematics (Felsenstein, 1988a), where I point out the difficulty of transforming the characters so that they are not only genetically independent but have independent selection acting on them. If you are going to use Contml to model evolution of continuous characters, then you should at least make some attempt to remove genetic correlations between the characters (usually all one can do is remove phenotypic correlations by transforming the characters so that there is no within-population covariance and so that the within-population variances of the characters are equal -- this is equivalent to using Canonical Variates). However, this will only guarantee that one has removed phenotypic covariances between characters. Genetic covariances could only be removed by knowing the coheritabilities of the characters, which would require genetic experiments, and selective covariances (covariances due to covariation of selection pressures) would require knowledge of the sources and extent of selection pressure in all variables.

Contrast is a program designed to infer, for a given phylogeny that is provided to the program, the covariation between characters in a data set. Thus we have a program in this set that allows us to take information about the covariation and rates of evolution of characters and make an estimate of the phylogeny (Contml), and a program that takes an estimate of the phylogeny and infers the variances and covariances of the character changes. But we have no program that infers both the phylogenies and the character covariation from the same data set.

In the quantitative characters mode, a typical small data set would be:

     5   6
Alpha      0.345 0.467 1.213  2.2  -1.2 1.0
Beta       0.457 0.444 1.1    1.987 -0.2 2.678
Gamma      0.6 0.12 0.97 2.3  -0.11 1.54
Delta      0.68  0.203 0.888 2.0  1.67
Epsilon    0.297  0.22 0.90 1.9 1.74

Note that in the latter case, there is no line giving the numbers of alleles at each locus. In this latter case no square-root transformation of the coordinates is done: each is assumed to give directly the position on the Brownian motion scale.

For further discussion of options and modifiable constants in Contml, Gendist, and Contrast see the documentation files for those programs.

contml

version 3.696

Contml - Gene Frequencies and Continuous Characters Maximum Likelihood method

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program estimates phylogenies by the restricted maximum likelihood method based on the Brownian motion model. It is based on the model of Edwards and Cavalli-Sforza (1964; Cavalli-Sforza and Edwards, 1967). Gomberg (1966), Felsenstein (1973b, 1981c) and Thompson (1975) have done extensive further work leading to efficient algorithms. Contml uses restricted maximum likelihood estimation (REML), which is the criterion used by Felsenstein (1973b). The actual algorithm is an iterative EM Algorithm (Dempster, Laird, and Rubin, 1977) which is guaranteed to always give increasing likelihoods. The algorithm is described in detail in a paper of mine (Felsenstein, 1981c), which you should definitely consult if you are going to use this program. Some simulation tests of it are given by Rohlf and Wooten (1988) and Kim and Burgman (1988).

The default (gene frequency) mode treats the input as gene frequencies at a series of loci, and square-root-transforms the allele frequencies (constructing the frequency of the missing allele at each locus first). This enables us to use the Brownian motion model on the resulting coordinates, in an approximation equivalent to using Cavalli-Sforza and Edwards's (1967) chord measure of genetic distance and taking that to give distance between particles undergoing pure Brownian motion. It assumes that each locus evolves independently by pure genetic drift.
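
The geometric idea behind the square-root transformation can be illustrated with a small Python sketch. Because the allele frequencies at a locus sum to 1, their square roots are the coordinates of a point on a sphere of radius 1, and the straight-line distance between two such points is a chord of that sphere. The function names are invented for this example, and the scaling conventions actually used by Contml and Gendist are not reproduced here.

```python
import math

def sqrt_coordinates(locus_freqs):
    """Square-root-transform the allele frequencies at one locus."""
    return [math.sqrt(p) for p in locus_freqs]

def chord_length(freqs_a, freqs_b):
    """Straight-line distance between the two transformed points."""
    xa, xb = sqrt_coordinates(freqs_a), sqrt_coordinates(freqs_b)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(xa, xb)))

# Second locus (three alleles) of Alpha and Beta from the example data set
# in the gene frequencies documentation file.
alpha = [0.80, 0.10, 0.10]
beta = [0.54, 0.30, 0.16]

radius_squared = sum(x * x for x in sqrt_coordinates(alpha))
print(radius_squared, chord_length(alpha, beta))
```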

The alternative continuous characters mode (menu option C) treats the input as a series of coordinates of each species in N dimensions. It assumes that we have transformed the characters to remove correlations and to standardize their variances.

A word about microsatellite data

Many current users of Contml use it to analyze microsatellite data. There are three ways to do this:

The input file

The input file is as described in the continuous characters documentation file above. Options are selected using a menu:


Continuous character Maximum Likelihood method version 3.69

Settings for this run:
  U                       Search for best tree?  Yes
  C  Gene frequencies or continuous characters?  Gene frequencies
  A   Input file has all alleles at each locus?  No, one allele missing at each
  O                              Outgroup root?  No, use as outgroup species 1
  G                      Global rearrangements?  No
  J           Randomize input order of species?  No. Use input order
  M                 Analyze multiple data sets?  No
  0         Terminal type (IBM PC, ANSI, none)?  ANSI
  1          Print out the data at start of run  No
  2        Print indications of progress of run  Yes
  3                              Print out tree  Yes
  4             Write out trees onto tree file?  Yes

  Y to accept these or type the letter for one to change

Option U is the usual User Tree option. Options C (Continuous Characters) and A (All alleles present) have been described in the Gene Frequencies and Continuous Characters Programs documentation file. The options G, J, O and M are the usual Global Rearrangements, Jumble order of species, Outgroup root, and Multiple Data Sets options.

The M (Multiple data sets) option does not allow multiple sets of weights instead of multiple data sets, as there are no weights in this program.

The G and J options have no effect if the User Tree option is selected. User trees are given with a trifurcation (three-way split) at the base. They can start from any interior node. Thus the tree:

     A
     !
     *--B
     !
     *-----C
     !
     *--D
     !
     E

can be represented by any of the following:

     (A,B,(C,(D,E)));
     ((A,B),C,(D,E));
     (((A,B),C),D,E);

(there are of course 69 other representations as well obtained from these by swapping the order of branches at an interior node).
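
One way to convince yourself that the three strings above describe the same unrooted tree is to compare the partitions of the species that their interior branches define. The Python sketch below does this for name-only Newick strings; the function names and the choice of species A as the reference species are assumptions made for the example.

```python
def leaf_groups(newick):
    """Leaf-name sets enclosed by each pair of parentheses."""
    groups, stack, name = set(), [[]], ""
    for ch in newick:
        if ch == "(":
            stack.append([])
        elif ch in ",);":
            if name:
                stack[-1].append(name)
                name = ""
            if ch == ")":
                group = stack.pop()
                groups.add(frozenset(group))
                stack[-1].extend(group)
        elif not ch.isspace():
            name += ch
    return groups

def unrooted_splits(newick, reference="A"):
    """Interior-branch partitions, each recorded as the side without `reference`."""
    groups = leaf_groups(newick)
    all_leaves = max(groups, key=len)
    splits = set()
    for g in groups:
        side = all_leaves - g if reference in g else g
        # Keep only partitions with at least two species on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
    return splits

forms = ["(A,B,(C,(D,E)));", "((A,B),C,(D,E));", "(((A,B),C),D,E);"]
split_sets = [unrooted_splits(f) for f in forms]
print(split_sets[0] == split_sets[1] == split_sets[2])
```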

The output file

The output has a standard appearance. The topology of the tree is given by an unrooted tree diagram. The lengths are given in a table below the topology, together with a rough confidence interval for each length. Negative lower bounds on length indicate that rearrangements may be acceptable.

The units of length are amounts of expected accumulated variance (not time). The log likelihood (natural log) of each tree is also given, and it is indicated how many topologies have been tried. The tree does not necessarily have all tips contemporary, and the log likelihood may be either positive or negative (this simply corresponds to whether the density function does or does not exceed 1) and a negative log likelihood does not indicate any error. The log likelihood allows various formal likelihood ratio hypothesis tests. The description of the tree includes approximate standard errors on the lengths of segments of the tree. These are calculated by considering only the curvature of the likelihood surface as the length of the segment is varied, holding all other lengths constant. As such they are most probably underestimates of the variance, and hence may give too much confidence in the given tree.

One should use caution in interpreting the likelihoods that are printed out. If the model is wrong, it will not be possible to use the likelihoods to make formal statistical statements. Thus, if gene frequencies are being analyzed, but the gene frequencies change not only by genetic drift, but also by mutation, the model is not correct. It would be as well-justified in this case to use Gendist to compute the Nei (1972) genetic distance and then use Fitch, Kitsch or Neighbor to make a tree. If continuous characters are being analyzed, but if the characters have not been transformed to new coordinates that evolve independently and at equal rates, then the model is also violated and no statistical analysis is possible. Doing such a transformation is not easy, and usually not even possible.

If the U (User Tree) option is used and more than one tree is supplied, the program also performs a statistical test of each of these trees against the one with highest likelihood. If there are two user trees, the test done is one which is due to Kishino and Hasegawa (1989), a version of a test originally introduced by Templeton (1983). In this implementation it uses the mean and variance of log-likelihood differences between trees, taken across loci. If the two trees' means are more than 1.96 standard deviations different then the trees are declared significantly different. This use of the empirical variance of log-likelihood differences is more robust and nonparametric than the classical likelihood ratio test, and may to some extent compensate for any lack of realism in the model underlying this program.
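
The two-tree comparison described above can be sketched as follows. This is only an illustration under stated assumptions, not the code Contml uses: the function name kh_test and the per-locus log-likelihoods are hypothetical, and the variance of the summed difference is taken to be the number of loci times the sample variance of the per-locus differences.

```python
import math


def kh_test(loglik_a, loglik_b, critical=1.96):
    """Compare two trees from their per-locus log-likelihoods.

    Returns the total log-likelihood difference, its estimated standard
    deviation, and whether the difference exceeds `critical` standard deviations.
    """
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # empirical (sample) variance of the per-locus differences
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    total = sum(diffs)
    sd_total = math.sqrt(n * variance)        # s.d. of the sum over n loci
    return total, sd_total, abs(total) > critical * sd_total


# Hypothetical per-locus log-likelihoods for two user trees.
tree_a = [-1.0, -2.0, -1.5, -1.2]
tree_b = [-1.1, -2.3, -1.4, -1.9]
total, sd_total, significant = kh_test(tree_a, tree_b)
print(total, sd_total, significant)
```

Because the variance comes from the observed spread of the differences across loci rather than from the model, the test does not depend on the model's own prediction of that variance.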

If there are more than two trees, the test done is an extension of the KHT test, due to Shimodaira and Hasegawa (1999). They pointed out that a correction for the number of trees was necessary, and they introduced a resampling method to make this correction. The version used here is a multivariate normal approximation to their test; it is due to Shimodaira (1998). The variances and covariances of the sum of log likelihoods across loci are computed for all pairs of trees. To test whether the difference between each tree and the best one is larger than could have been expected if they all had the same expected log-likelihood, log-likelihoods for all trees are sampled with these covariances and equal means (Shimodaira and Hasegawa's "least favorable hypothesis"), and a P value is computed from the fraction of times the difference between the tree's value and the highest log-likelihood exceeds that actually observed. Note that this sampling needs random numbers, and so the program will prompt the user for a random number seed if one has not already been supplied. With the two-tree KHT test no random numbers are used.
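
The sampling step just described can be sketched in Python as below. This is a rough illustration under stated assumptions, not the program's own code: the function name sh_normal_approx and the input values are hypothetical. The sketch draws the multivariate normal samples as random linear combinations of the centered per-locus log-likelihoods, which gives them the required covariances (those of the sums across loci) and equal means without any matrix factorization.

```python
import math
import random


def sh_normal_approx(loglik, replicates=10000, seed=1):
    """loglik[t][k] is the log-likelihood of tree t at locus k.

    Returns one P value per tree: the fraction of replicates in which the
    sampled difference from the best tree exceeds the observed difference.
    """
    rng = random.Random(seed)
    ntrees, nloci = len(loglik), len(loglik[0])
    totals = [sum(row) for row in loglik]
    best = max(totals)
    observed = [best - t for t in totals]
    # Center each tree's per-locus values. The scale factor makes the covariance
    # of the sampled sums equal nloci times the sample covariance across loci.
    scale = math.sqrt(nloci / (nloci - 1))
    centered = [[(x - totals[t] / nloci) * scale for x in loglik[t]]
                for t in range(ntrees)]
    exceed = [0] * ntrees
    for _ in range(replicates):
        z = [rng.gauss(0.0, 1.0) for _ in range(nloci)]
        sample = [sum(c * zk for c, zk in zip(centered[t], z))
                  for t in range(ntrees)]
        top = max(sample)
        for t in range(ntrees):
            if top - sample[t] > observed[t]:
                exceed[t] += 1
    return [e / replicates for e in exceed]


# Hypothetical per-locus log-likelihoods for three user trees at six loci.
tree0 = [-10.0, -12.0, -11.0, -9.0, -13.0, -10.0]
tree1 = [-9.9, -12.3, -10.8, -9.4, -12.9, -10.2]
tree2 = [x - 5.0 for x in tree0]
p_values = sh_normal_approx([tree0, tree1, tree2])
print(p_values)
```

The fixed seed stands in for the random number seed that the program asks the user to supply.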

In either the KHT or the SH test the program prints out a table of the log-likelihoods of each tree, the differences of each from the highest one, the variance of that quantity as determined by the log-likelihood differences at individual loci, and a conclusion as to whether that tree is or is not significantly worse than the best one.

One problem which sometimes arises is that the program is fed two species (or populations) with identical transformed gene frequencies: this can happen if sample sizes are small and/or many loci are monomorphic. In this case the program "gets its knickers in a twist" and can divide by zero, usually causing a crash. If you suspect that this has happened, check for two species with identical coordinates. If you find them, eliminate one from the problem: the two must always show up as being at the same point on the tree anyway.
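
A check of this kind can be done before running the program. The following minimal Python sketch (the function name identical_populations and the data are hypothetical) lists pairs of populations whose coordinates all coincide. It compares the values as given; it assumes that populations with identical input frequencies will also have identical transformed frequencies.

```python
def identical_populations(data, tolerance=0.0):
    """Return pairs of names whose coordinates differ by at most `tolerance`."""
    names = list(data)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            a, b = data[names[i]], data[names[j]]
            if all(abs(x - y) <= tolerance for x, y in zip(a, b)):
                pairs.append((names[i], names[j]))
    return pairs


# Hypothetical gene frequencies at three loci; PopB and PopC coincide.
frequencies = {
    "PopA": [0.30, 1.00, 0.55],
    "PopB": [0.10, 1.00, 0.40],
    "PopC": [0.10, 1.00, 0.40],
}
print(identical_populations(frequencies))
```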

The constants available for modification at the beginning of the program include "epsilon1", a small quantity used in the iterations of branch lengths, "epsilon2", another not quite so small quantity used to check whether gene frequencies that were fed in for all alleles do not add up to 1, "smoothings", the number of passes through a given tree in the iterative likelihood maximization for a given topology, "maxtrees", the maximum number of user trees that will be used for the Kishino-Hasegawa-Templeton test, and "namelength", the length of species names. There is no provision in this program for saving multiple trees that are tied for having the highest likelihood, mostly because an exact tie is unlikely anyway.

The algorithm does not run as quickly as the discrete character methods but is not enormously slower. Like them, its execution time should rise as the cube of the number of species.

TEST DATA SET

This data set was compiled by me from the compilation of human gene frequencies by Mourant (1976). It appeared in a paper of mine (Felsenstein, 1981c) on maximum likelihood phylogenies from gene frequencies. The names of the loci and alleles are given in that paper.

    5    10
2 2 2 2 2 2 2 2 2 2
European   0.2868 0.5684 0.4422 0.4286 0.3828 0.7285 0.6386 0.0205
0.8055 0.5043
African    0.1356 0.4840 0.0602 0.0397 0.5977 0.9675 0.9511 0.0600
0.7582 0.6207
Chinese    0.1628 0.5958 0.7298 1.0000 0.3811 0.7986 0.7782 0.0726
0.7482 0.7334
American   0.0144 0.6990 0.3280 0.7421 0.6606 0.8603 0.7924 0.0000
0.8086 0.8636
Australian 0.1211 0.2274 0.5821 1.0000 0.2018 0.9000 0.9837 0.0396
0.9097 0.2976


TEST SET OUTPUT (WITH ALL NUMERICAL OPTIONS TURNED ON)


Continuous character Maximum Likelihood method version 3.69


   5 Populations,   10 Loci

Numbers of alleles at the loci:
------- -- ------- -- --- -----

   2   2   2   2   2   2   2   2   2   2

Name                 Gene Frequencies
----                 ---- -----------

  locus:         1         2         3         4         5         6
                 7         8         9        10

European     0.28680   0.56840   0.44220   0.42860   0.38280   0.72850
             0.63860   0.02050   0.80550   0.50430
African      0.13560   0.48400   0.06020   0.03970   0.59770   0.96750
             0.95110   0.06000   0.75820   0.62070
Chinese      0.16280   0.59580   0.72980   1.00000   0.38110   0.79860
             0.77820   0.07260   0.74820   0.73340
American     0.01440   0.69900   0.32800   0.74210   0.66060   0.86030
             0.79240   0.00000   0.80860   0.86360
Australian   0.12110   0.22740   0.58210   1.00000   0.20180   0.90000
             0.98370   0.03960   0.90970   0.29760


  +-----------------------------------------------------------African   
  !  
  !             +-------------------------------Australian
  1-------------3  
  !             !     +-----------------------American  
  !             +-----2  
  !                   +Chinese   
  !  
  +European  


remember: this is an unrooted tree!

Ln Likelihood =    38.71914

Between     And             Length      Approx. Confidence Limits
-------     ---             ------      ------- ---------- ------
  1       African        0.09693444   (  0.03123910,  0.19853605)
  1          3           0.02252816   (  0.00089799,  0.05598045)
  3       Australian     0.05247406   (  0.01177094,  0.11542376)
  3          2           0.00945315   ( -0.00897717,  0.03795670)
  2       American       0.03806240   (  0.01095938,  0.07997877)
  2       Chinese        0.00208822   ( -0.00960622,  0.02017434)
  1       European       0.00000000   ( -0.01627246,  0.02516630)



version 3.696

Contrast -- Computes contrasts for comparative method


© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program implements the contrasts calculation described in my 1985 paper on the comparative method (Felsenstein, 1985d). It reads in a data set of the standard quantitative characters sort, and also a tree from the treefile. It then forms the contrasts between species that, according to that tree, are statistically independent. This is done for each character. The contrasts are all standardized by branch lengths (actually, square roots of branch lengths).
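
The calculation can be sketched in a few lines of Python. This is an illustration of the contrasts algorithm of the 1985 paper, not the source code of Contrast; the names parse_newick and contrasts are made up here, and the sketch handles only strictly bifurcating trees and one character at a time. The tree and the values for the first character are those of the test data set given later in this document.

```python
import math


def parse_newick(text):
    """Parse a Newick tree with branch lengths into (name, length, children)."""
    text = text.strip().rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1                              # skip "("
            children.append(node())
            while text[pos] == ",":
                pos += 1                          # skip ","
                children.append(node())
            pos += 1                              # skip ")"
        start = pos
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        name = text[start:pos]
        length = 0.0                              # the root has no branch length
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            length = float(text[start:pos])
        return name, length, children

    return node()


def contrasts(tree, values):
    """Return the standardized contrasts for one character."""
    out = []

    def visit(node):
        name, length, children = node
        if not children:
            return values[name], length
        (x1, v1), (x2, v2) = (visit(child) for child in children)
        # contrast between the two descendants, standardized by the
        # square root of the sum of their (adjusted) branch lengths
        out.append((x1 - x2) / math.sqrt(v1 + v2))
        # weighted average at the interior node, and its lengthened branch
        x = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)
        return x, length + v1 * v2 / (v1 + v2)

    visit(tree)
    return out


tree = parse_newick(
    "((((Homo:0.21,Pongo:0.21):0.28,Macaca:0.49):0.13,Ateles:0.62):0.38,Galago:1.00);")
character1 = {"Homo": 4.09434, "Pongo": 3.61092, "Macaca": 2.37024,
              "Ateles": 2.02815, "Galago": -1.46968}
print([round(c, 5) for c in contrasts(tree, character1)])
```

The sign of each contrast depends only on the order in which the two descendants are listed, which is why each contrast has expectation zero.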

The method is explained in the 1985 paper. It assumes a Brownian motion model. This model was introduced by Edwards and Cavalli-Sforza (1964; Cavalli-Sforza and Edwards, 1967) as an approximation to the evolution of gene frequencies. I have discussed (Felsenstein, 1973b, 1981c, 1985d, 1988b) the difficulties inherent in using it as a model for the evolution of quantitative characters. Chief among these is that the characters do not necessarily evolve independently or at equal rates. This program allows one to evaluate this, if there is independent information on the phylogeny. You can compute the variance of the contrasts for each character, as a measure of the variance accumulating per unit branch length. You can also test covariances of characters.

The input file is as described in the continuous characters documentation file above, for the case of continuous quantitative characters (not gene frequencies). Options are selected using a menu:


Continuous character comparative analysis, version 3.69

Settings for this run:
  W        Within-population variation in data?  No, species values are means
  R     Print out correlations and regressions?  Yes
  C                        Print out contrasts?  No
  M                     Analyze multiple trees?  No
  0         Terminal type (IBM PC, ANSI, none)?  ANSI
  1          Print out the data at start of run  No
  2        Print indications of progress of run  Yes

  Y to accept these or type the letter for one to change

Option W makes the program expect not means of the phenotypes in each species, but phenotypes of individual specimens. The details of the input file format in that case are given below. In that case the program estimates the covariances of the phenotypic change, as well as covariances of within-species phenotypic variation. The model used is similar to (but not identical to) that of Lynch (1990). The algorithms used differ from the ones he gives in that paper. They are described in a recent paper (Felsenstein, 2008). When the data include within-species samples, the program still uses contrasts internally, but it does not make sense to write them out to an output file for direct analysis. They are of two kinds, contrasts within species and contrasts between species. The former are affected only by the within-species phenotypic covariation, but the latter are affected by both within- and between-species covariation. Contrast infers these two kinds of covariances and writes the estimates out.

M is similar to the usual multiple data sets input option, but is used here to allow multiple trees to be read from the treefile, not multiple data sets to be read from the input file. In this way you can use bootstrapping on the data that estimated these trees, get multiple bootstrap estimates of the tree, and then use the M option to make multiple analyses of the contrasts and the covariances, correlations, and regressions. In this way (Felsenstein, 1988b) you can assess the effect of the inaccuracy of the trees on your estimates of these statistics.

R allows you to turn off or on the printing out of the statistics. If it is off only the contrasts will be printed out (unless option 1 is selected). With only the contrasts printed out, they are in a simple array that is in a form that many statistics packages should be able to read. The contrasts are rows, and each row has one contrast for each character. Any multivariate statistics package should be able to analyze these (but keep in mind that the contrasts have, by virtue of the way they are generated, expectation zero, so all regressions must pass through the origin). If the W option has been set to analyze within-species as well as between-species variation, the R option does not appear in the menu as the regression and correlation statistics should always be computed in that case.

As usual, the tree file has the default name intree. It should contain the desired tree or trees. These can be either in bifurcating form, or may have the bottommost fork be a trifurcation (it should not matter which of these ways you present the tree). Note that the tree may not contain any multifurcations aside from a trifurcation at the root! If there are any, the program may not work, or may give misleading results.

The tree must, of course, have branch lengths. These cannot be negative. Trees from some distance methods, particularly Neighbor-Joining, are sometimes inferred to have negative branch lengths, so be sure to choose options in those programs that prevent negative branch lengths.

If you have a molecular data set (for example) and also, on the same species, quantitative measurements, here is how you can allow for the uncertainty of your estimate of the tree. Use Seqboot to generate multiple data sets from your molecular data. Then, whichever method you use to analyze it (the relevant ones are those that produce estimates of the branch lengths: Dnaml, Dnamlk, Fitch, Kitsch, and Neighbor -- the latter three require you to use Dnadist to turn the bootstrap data sets into multiple distance matrices), you should use the Multiple Data Sets option of that program. This will result in a tree file with many trees on it. Then use this tree file with the input file containing your continuous quantitative characters, choosing the Multiple Trees (M) option. You will get one set of contrasts and statistics for each tree in the tree file. At the moment there is no overall summary: you will have to tabulate these by hand. A similar process can be followed if you have restriction sites data (using Restml) or gene frequencies data.

The statistics that are printed out include the covariances between all pairs of characters, the regressions of each character on each other (column j is regressed on row i), and the correlations between all pairs of characters. In assessing degrees of freedom it is important to realize that each contrast was taken to have expectation zero, which is known because each contrast could as easily have been computed xi-xj instead of xj-xi. Thus there is no loss of a degree of freedom for estimation of a mean. The degrees of freedom are thus the same as the number of contrasts, namely one less than the number of species (tips). If you feed these contrasts into a multivariate statistics program make sure that it knows that each variable has expectation exactly zero.
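
As an illustration of these statistics, the Python sketch below computes them from a table of contrasts, dividing by the number of contrasts and not subtracting a mean. It is not the program's own code, and the function name contrast_statistics is made up. The contrasts used are the ones printed in the test set output later in this document.

```python
import math


def contrast_statistics(rows):
    """rows[c][i] is contrast c for character i; all have expectation zero."""
    n = len(rows)                                 # degrees of freedom
    m = len(rows[0])
    # sums of products about zero, not about the sample mean
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(m)]
           for i in range(m)]
    # regression of column j on row i, forced through the origin
    reg = [[cov[i][j] / cov[i][i] for j in range(m)] for i in range(m)]
    cor = [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(m)]
           for i in range(m)]
    return cov, reg, cor


test_contrasts = [
    [0.74593, 2.17989],
    [1.58474, 0.71761],
    [1.19293, 0.86790],
    [3.35832, 0.89706],
]
cov, reg, cor = contrast_statistics(test_contrasts)
for name, matrix in (("Covariance", cov), ("Regressions", reg), ("Correlations", cor)):
    print(name, [[round(v, 4) for v in row] for row in matrix])
```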

Within-species variation

With the W option selected, Contrast analyzes data sets with variation within species, using a model like that proposed by Michael Lynch (1990). The method is described in vague terms in my book (Felsenstein, 2004, p. 441). If you select the W option for within-species variation, the data set should have this structure (on the left are the data, on the right my comments):

   10    5                   number of species, number of characters
Alpha        2               name of 1st species, # of individuals
 2.01 5.3 1.5  -3.41 0.3     data for individual #1
 1.98 4.3 2.1  -2.98 0.45    data for individual #2
Gammarus     3               name of 2nd species, # of individuals
 6.57 3.1 2.0  -1.89 0.6     data for individual #1
 7.62 3.4 1.9  -2.01 0.7     data for individual #2
 6.02 3.0 1.9  -2.03 0.6     data for individual #3
...                          (and so on)

The covariances, correlations, and regressions for the "additive" (between-species evolutionary variation) and "environmental" (within-species phenotypic variation) are printed out (the maximum likelihood estimates of each). The program also estimates the within-species phenotypic variation in the case where the between-species evolutionary covariances are forced to be zero. The log-likelihoods of these two cases are compared and a likelihood ratio test (LRT) is carried out. The program prints the result of this test as a chi-square variate, and gives the number of degrees of freedom of the LRT. You have to look up the chi-square value in a table of the chi-square distribution. The A option is available (if the W option is invoked) to let you turn this test off.

The program prints out the log-likelihood of the data under the models with and without between-species variation. It shows the degrees of freedom and chi-square value for a likelihood ratio test of the absence of between-species variation. For the moment the program cannot handle the case where within-species variation is to be taken into account but where only species means are available. (It can handle cases where some species have only one member in their sample).

We hope to fix this soon. We are also on our way to incorporating full-sib, half-sib, or clonal groups within species, so as to do one analysis for within-species genetic and between-species phylogenetic variation.

The data set used as an example below is the example from a paper by Michael Lynch (1990), his characters having been log-transformed. In the case where there is only one specimen per species, Lynch's model is identical to our model of within-species variation (for multiple individuals per species it is not a subcase of his model).


TEST SET INPUT

    5   2
Homo        4.09434  4.74493
Pongo       3.61092  3.33220
Macaca      2.37024  3.36730
Ateles      2.02815  2.89037
Galago     -1.46968  2.30259


TEST SET INPUT TREEFILE

((((Homo:0.21,Pongo:0.21):0.28,Macaca:0.49):0.13,Ateles:0.62):0.38,Galago:1.00);


TEST SET OUTPUT (with all numerical options and option C on)


Continuous character contrasts analysis, version 3.69

   5 Populations,    2 Characters

Name                       Phenotypes
----                       ----------

Homo         4.09434   4.74493
Pongo        3.61092   3.33220
Macaca       2.37024   3.36730
Ateles       2.02815   2.89037
Galago      -1.46968   2.30259


Contrasts (columns are different characters)
--------- -------- --- --------- -----------

   0.74593   2.17989
   1.58474   0.71761
   1.19293   0.86790
   3.35832   0.89706

Covariance matrix
---------- ------

    3.9423    1.7028
    1.7028    1.7062

Regressions (columns on rows)
----------- -------- -- -----

    1.0000    0.4319
    0.9980    1.0000

Correlations
------------

    1.0000    0.6566
    0.6566    1.0000


version 3.696

DOCUMENTATION FOR (0,1) DISCRETE CHARACTER PROGRAMS

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

These programs are intended for the use of morphological systematists who are dealing with discrete characters, or by molecular evolutionists dealing with presence-absence data on restriction sites. One of the programs (Pars) allows multistate characters, with up to 8 states, plus the unknown state symbol "?". For the others, the characters are assumed to be coded into a series of (0,1) two-state characters. For most of the programs there are two other states possible, "P", which stands for the state of Polymorphism for both states (0 and 1), and "?", which stands for the state of ignorance: it is the state "unknown", or "does not apply". The state "P" can also be denoted by "B", for "both".

There is a method invented by Sokal and Sneath (1963) for linear sequences of character states, and fully developed for branching sequences of character states by Kluge and Farris (1969) for recoding a multistate character into a series of two-state (0,1) characters. Suppose we had a character with four states whose character-state tree had the rooted form:

               1 ---> 0 ---> 2
                      |
                      |
                      V
                      3

so that 1 is the ancestral state and 0, 2 and 3 derived states. We can represent this as three two-state characters:

                Old State           New States
                --- -----           --- ------
                    0                  001
                    1                  000
                    2                  011
                    3                  101

The three new states correspond to the three arrows in the above character state tree. Possession of one of the new states corresponds to whether or not the old state had that arrow in its ancestry. Thus the first new state corresponds to the bottommost arrow, which only state 3 has in its ancestry, the second state to the rightmost of the top arrows, and the third state to the leftmost top arrow. This coding will guarantee that the number of times that states arise on the tree (in programs Mix, Move, Penny and Boot) or the number of polymorphic states in a tree segment (in the Polymorphism option of Dollop, Dolmove, Dolpenny and Dolboot) will correctly correspond to what would have been the case had our programs been able to take multistate characters into account. Although I have shown the above character state tree as rooted, the recoding method works equally well on unrooted multistate characters as long as the connections between the states are known and contain no loops.
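
The recoding rule can be sketched in Python as follows. This is an illustration only (the program Factor, mentioned below, is what the package provides for this purpose); the function name recode and the way the character state tree is written down are assumptions of the sketch. Each new two-state character corresponds to one arrow, and a state gets a 1 for every arrow in its ancestry.

```python
def recode(parent, arrows):
    """Recode a multistate character into binary characters.

    parent maps each derived state to the state it arose from;
    arrows lists the (from, to) arrows in the order of the new characters.
    """
    states = set(parent) | set(parent.values())
    codes = {}
    for state in sorted(states):
        ancestry = set()
        s = state
        while s in parent:                # walk back to the ancestral state
            ancestry.add((parent[s], s))
            s = parent[s]
        codes[state] = "".join("1" if a in ancestry else "0" for a in arrows)
    return codes


# The character state tree shown above: 1 ---> 0 ---> 2, and 0 ---> 3.
parent = {0: 1, 2: 0, 3: 0}
# New characters in the order used in the table above: the bottommost arrow,
# the rightmost top arrow, then the leftmost top arrow.
arrows = [(0, 3), (0, 2), (1, 0)]
print(recode(parent, arrows))
```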

However, in the default option of programs Dollop, Dolmove, Dolpenny and Dolboot the multistate recoding does not necessarily work properly, as it may lead the program to reconstruct nonexistent state combinations such as 010. An example of this problem is given in my paper on alternative phylogenetic methods (1979).

If you have multistate character data where the states are connected in a branching "character state tree" you may want to do the binary recoding yourself. Thanks to Christopher Meacham, the package contains a program, Factor, which will do the recoding itself. For details see the documentation file for Factor.

We now also have the program Pars, which can do parsimony for unordered character states.

COMPARISON OF METHODS

The methods used in these programs make different assumptions about evolutionary rates, probabilities of different kinds of events, and our knowledge about the characters or about the character state trees. Basic references on these assumptions are my 1979, 1981b and 1983b papers, particularly the latter. The assumptions of each method are briefly described in the documentation file for the corresponding program. In most cases my assertions about what are the assumptions of these methods are challenged by others, whose papers I also cite at that point. Personally, I believe that they are wrong and I am right. I must emphasize the importance of understanding the assumptions underlying the methods you are using. No matter how fancy the algorithms, how maximum the likelihood or how minimum the number of steps, your results can only be as good as the correspondence between biological reality and your assumptions!

INPUT FORMAT

The input format is as described in the general documentation file. The input starts with a line containing the number of species and the number of characters.

In Pars, each character can have up to 8 states plus a "?" state. In any character, the first 8 symbols encountered will be taken to represent these states. Any of the digits 0-9, letters A-Z and a-z, and even symbols such as + and -, can be used (and in fact which 8 symbols are used can be different in different characters).

In the other discrete characters programs the allowable states are 0, 1, P, B, and ?. Blanks may be included between the states (i.e. you can have a species whose data is DISCOGLOSS0 1 1 0 1 1 1). It is possible for extraneous information to follow the end of the character state data on the same line. For example, if there were 7 characters in the data set, a line of species data could read "DISCOGLOSS0110111 Hello there".

The discrete character data can continue to a new line whenever needed. The characters are not in the "aligned" or "interleaved" format used by the molecular sequence programs: they have the name and entire set of characters for one species, then the name and entire set of characters for the next one, and so on. This is known as the sequential format. Be particularly careful when you use restriction sites data, which can be in either the aligned or the sequential format for use in Restml but must be in the sequential format for these discrete character programs.
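
To make the sequential format concrete, here is a minimal Python sketch of a reader for it. It is not the input routine of any of the programs, and the function name read_sequential is made up. It assumes that the species name occupies the first 10 characters of the line (as in the DISCOGLOSS example above), skips blanks between states, continues onto following lines as needed, and ignores anything after the last character on a line.

```python
def read_sequential(lines, name_length=10, allowed="01PB?"):
    """Read (0,1) discrete character data in sequential format."""
    it = iter(lines)
    nspecies, nchars = (int(x) for x in next(it).split()[:2])
    data = {}
    for _ in range(nspecies):
        line = next(it)
        name = line[:name_length].strip()
        rest = line[name_length:]
        states = []
        while True:
            for ch in rest:
                if len(states) == nchars:
                    break                 # ignore extraneous text on the line
                if ch.isspace():
                    continue              # blanks between states are allowed
                if ch not in allowed:
                    raise ValueError(
                        "bad state %r in species %s, character %d"
                        % (ch, name, len(states) + 1))
                states.append(ch)
            if len(states) == nchars:
                break
            rest = next(it)               # data continue on the next line
        data[name] = "".join(states)
    return data


example = [
    "    2    7",
    "DISCOGLOSS0 1 1 0 1 1 1",
    "ALPHA     01P0",
    "1?1 Hello there",
]
print(read_sequential(example))
```

ALPHA is a hypothetical second species added to show continuation onto a new line.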

For Pars the discrete character data can be in either Sequential or Interleaved format; the latter is the default.

Errors in the input data will often be detected by the programs, and this will cause them to issue an error message such as 'BAD OUTGROUP NUMBER: ' together with information as to which species, character, or in this case outgroup number is the incorrect one. The program will then terminate; you will have to look at the data and figure out what went wrong and fix it. Often an error in the data causes a lack of synchronization between what is in the data file and what the program thinks is to be there. Thus a missing character may cause the program to read part of the next species name as a character and complain about its value. In this type of case you should look for the error earlier in the data file than the point about which the program is complaining.

OPTIONS GENERALLY AVAILABLE

Specific information on options will be given in the documentation file associated with each program. However, some options occur in many programs. Options are selected from the menu in each program.

INFORMATION IN THE OUTPUT

If you select the proper menu option, a table of the number of events required in each character can also be printed, to help in reconstructing the placement of changes on the tree.

This table may not be obvious at first. A typical example looks like this:

 steps in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       2   2   2   2   1   1   2   2   1
   10!   1   2   3   1   1   1   1   1   1   2
   20!   1   2   2   1   2   2   1   1   1   2
   30!   1   2   1   1   1   2   1   3   1   1
   40!   1

The numbers across the top and down the side indicate which character is being referred to. Thus character 23 is column "3" of row "20" and has 1 step in this case.

I cannot emphasize too strongly that the fact that the tree diagram which the program prints out contains a particular branch DOES NOT NECESSARILY MEAN THAT WE HAVE EVIDENCE THAT THE BRANCH IS OF NONZERO LENGTH. In some of the older programs, the procedure which prints out the tree cannot cope with a trifurcation, nor can the internal data structures used in some of my programs. Therefore, even when we have no resolution and a multifurcation, successive bifurcations may be printed out, although some of the branches shown will in fact be of zero length. To find out which, you will have to work out character by character where the placements of the changes on the tree are, under all possible ways that the changes can be placed on that tree.

In Pars, Mix, Penny, Dollop, and Dolpenny the trees will be (if the user selects the option to see them) accompanied by tables showing the reconstructed states of the characters in the hypothetical ancestral nodes in the interior of the tree. This will enable you to reconstruct where the changes were in each of the characters. In some cases the state shown in an interior node will be "?", which means that either 0 or 1 would be possible at that point. In such cases you have to work out the ambiguity by hand. A unique assignment of locations of changes is often not possible in the case of the Wagner parsimony method. There may be multiple ways of assigning changes to segments of the tree with that method. Printing only one would be misleading, as it might imply that certain segments of the tree had no change, when another equally valid assignment would put changes there. It must be emphasized that all these multiple assignments have exactly equal numbers of total changes, so that none is preferred over any other.

I have followed the convention of having a "." printed out in the table of character states of the hypothetical ancestral nodes whenever a state is 0 or 1 and its immediate ancestor is the same. This has the effect of highlighting the places where changes might have occurred and making it easy for the user to reconstruct all the alternative patterns of the character states in the hypothetical ancestral nodes. In Pars you can, using the menu, turn off this dot-differencing convention and see all states at all hypothetical ancestral nodes of the tree.

On the line in that table corresponding to each branch of the tree will also be printed "yes", "no" or "maybe" as an answer to the question of whether this branch is of nonzero length. If there is no evidence that any character has changed in that branch, then "no" will be printed. If there is definite evidence that one has changed, then "yes" will be printed. If the matter is ambiguous, then "maybe" will be printed. You should keep in mind that all of these conclusions assume that we are only interested in the assignment of states that requires the least amount of change. In reality, the confidence limit on tree topology usually includes many different topologies, and presumably also then the confidence limits on amounts of change in branches are also very broad.

In addition to the table showing numbers of events, a table may be printed out showing which ancestral state causes the fewest events for each character. This will not always be done, but only when the tree is rooted and some ancestral states are unknown. This can be used to infer states of ancestors. For example, if you use the O (Outgroup) and A (Ancestral states) options together, with at least some of the ancestral states being given as "?", then inferences will be made for those characters, as the outgroup makes the tree rooted if it was not already.

In programs Mix and Penny, if you are using the Camin-Sokal parsimony option with ancestral state "?" and it turns out that the program cannot decide between ancestral states 0 and 1, it will fail to even attempt reconstruction of states of the hypothetical ancestors, printing them all out as "." for those characters. This is done for internal bookkeeping reasons -- to reconstruct their changes would require a fair amount of additional code and additional data structures. It is not too hard to reconstruct the internal states by hand, trying the two possible ancestral states one after the other. A similar comment applies to the use of ancestral state "?" in the Dollo or Polymorphism parsimony methods (programs Dollop and Dolpenny) which also can result in a similar hesitancy to print the estimate of the states of the hypothetical ancestors. In all of these cases the program will print "?" rather than "no" when it describes whether there are any changes in a branch, since there might or might not be changes in those characters which are not reconstructed.

For further information see the documentation files for the individual programs.

version 3.696

Distance matrix programs

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

The programs Fitch, Kitsch, and Neighbor are for dealing with data which comes in the form of a matrix of pairwise distances between all pairs of taxa, such as distances based on molecular sequence data, gene frequency genetic distances, amounts of DNA hybridization, or immunological distances. In analyzing these data, distance matrix programs implicitly assume that:

  1. Each distance is measured independently from the others: no item of data contributes to more than one distance.
  2. The distance between each pair of taxa is drawn from a distribution with an expectation which is the sum of values (in effect amounts of evolution) along the tree from one tip to the other. The variance of the distribution is proportional to a power p of the expectation.

These assumptions can be traced in the least squares methods of programs Fitch and Kitsch but it is not quite so easy to see them in operation in the Neighbor-Joining method of Neighbor, where the independence assumption is less obvious.

THESE TWO ASSUMPTIONS ARE DUBIOUS IN MOST CASES: independence will not be expected to be true in most kinds of data, such as genetic distances from gene frequency data. For genetic distance data in which pure genetic drift without mutation can be assumed to be the mechanism of change Contml may be more appropriate. However, Fitch, Kitsch, and Neighbor will not give positively misleading results (they will not make a statistically inconsistent estimate) provided that additivity holds, which it will if the distance is computed from the original data by a method which corrects for reversals and parallelisms in evolution. If additivity is not expected to hold, problems are more severe. A short discussion of these matters will be found in a review article of mine (1984a). For detailed, if sometimes irrelevant, controversy see the papers by Farris (1981, 1985, 1986) and myself (1986, 1988b).

For genetic distances from gene frequencies, Fitch, Kitsch, and Neighbor may be appropriate if a neutral mutation model can be assumed and Nei's genetic distance is used, or if pure drift can be assumed and either Cavalli-Sforza's chord measure or Reynolds, Weir, and Cockerham's (1983) genetic distance is used. However, in the latter case (pure drift) Contml should be better.

Restriction site and restriction fragment data can be treated by distance matrix methods if a distance such as that of Nei and Li (1979) is used. Distances of this sort can be computed in PHYLIP by the program Restdist.

For nucleic acid sequences, the distances computed in Dnadist allow correction for multiple hits (in different ways) and should allow one to analyse the data under the presumption of additivity. In all of these cases independence will not be expected to hold. DNA hybridization and immunological distances may be additive and independent if transformed properly and if (and only if) the standards against which each value is measured are independent. (This is rarely exactly true).

Fitch and the Neighbor-Joining option of Neighbor fit a tree which has the branch lengths unconstrained. Kitsch and the UPGMA option of Neighbor, by contrast, assume that an "evolutionary clock" is valid, according to which the true branch lengths from the root of the tree to each tip are the same: the expected amount of evolution in any lineage is proportional to elapsed time.
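One consequence of the clock assumption is worth spelling out. If every tip is the same distance from the root and the distances are exactly additive, then in any triple of species the two largest pairwise distances are equal (the "three-point condition"). The sketch below checks this for a small matrix. It is only an illustration of the assumption, not something Kitsch or Neighbor does; the function name and the example matrices are hypothetical.

```python
from itertools import combinations


def is_ultrametric(dist, tol=1e-9):
    """Return True if every triple has its two largest distances equal."""
    n = len(dist)
    for i, j, k in combinations(range(n), 3):
        smallest, middle, largest = sorted(
            (dist[i][j], dist[i][k], dist[j][k])
        )
        if abs(largest - middle) > tol:
            return False
    return True


# Hypothetical three-species matrices (square, symmetric, zero diagonal).
clock_like = [[0.0, 1.0, 2.0],
              [1.0, 0.0, 2.0],
              [2.0, 2.0, 0.0]]
not_clock_like = [[0.0, 1.0, 3.0],
                  [1.0, 0.0, 2.0],
                  [3.0, 2.0, 0.0]]

print(is_ultrametric(clock_like), is_ultrametric(not_clock_like))
```

Real distance data will almost never satisfy the condition exactly, so a failure here says nothing about whether a clock is a reasonable approximation.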

The input format for distance data is straightforward. The first line of the input file contains the number of species. There follows species data, starting, as with all other programs, with a species name. The species name is ten characters long, and must be padded out with blanks if shorter. For each species there then follows a set of distances to all the other species (options selected in the programs' menus allow the distance matrix to be upper or lower triangular or square). The distances can continue to a new line after any of them. If the matrix is lower-triangular, the diagonal entries (the distances from a species to itself) will not be read by the programs. If they are included anyway, they will be ignored by the programs, except for the case where one of them starts a new line, in which case the program will mistake it for a species name and get very confused.

For example, here is a sample input matrix, with a square matrix:

     5
Alpha      0.000 1.000 2.000 3.000 3.000
Beta       1.000 0.000 2.000 3.000 3.000
Gamma      2.000 2.000 0.000 3.000 3.000
Delta      3.000 3.000 3.000 0.000 1.000
Epsilon    3.000 3.000 3.000 1.000 0.000

and here is a sample lower-triangular input matrix with distances continuing to new lines as needed:

   14
Mouse     
Bovine      1.7043
Lemur       2.0235  1.1901
Tarsier     2.1378  1.3287  1.2905
Squir Monk  1.5232  1.2423  1.3199  1.7878
Jpn Macaq   1.8261  1.2508  1.3887  1.3137  1.0642
Rhesus Mac  1.9182  1.2536  1.4658  1.3788  1.1124  0.1022
Crab-E.Mac  2.0039  1.3066  1.4826  1.3826  0.9832  0.2061  0.2681
BarbMacaq   1.9431  1.2827  1.4502  1.4543  1.0629  0.3895  0.3930  0.3665
Gibbon      1.9663  1.3296  1.8708  1.6683  0.9228  0.8035  0.7109  0.8132
  0.7858
Orang       2.0593  1.2005  1.5356  1.6606  1.0681  0.7239  0.7290  0.7894
  0.7140  0.7095
Gorilla     1.6664  1.3460  1.4577  1.5935  0.9127  0.7278  0.7412  0.8763
  0.7966  0.5959  0.4604
Chimp       1.7320  1.3757  1.7803  1.7119  1.0635  0.7899  0.8742  0.8868
  0.8288  0.6213  0.5065  0.3502
Human       1.7101  1.3956  1.6661  1.7599  1.0557  0.6933  0.7118  0.7589
  0.8542  0.5612  0.4700  0.3097  0.2712

Note that the name "Mouse" in this matrix must be padded out by blanks to the full length of 10 characters.
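The layout described above (a fixed ten-character name, then distances that may continue onto following lines) can be read with a short routine. This is a minimal sketch, not the input code of Fitch, Kitsch, or Neighbor; the function name `read_lower_triangular` and the four-species sample (a cut-down version of the matrix above) are illustrative assumptions.

```python
def read_lower_triangular(text, name_length=10):
    """Read a lower-triangular matrix without diagonal entries."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n = int(lines[0].split()[0])
    names, rows = [], []
    pos = 1
    for i in range(n):
        line = lines[pos]
        pos += 1
        # The name occupies a fixed-width field, so it may contain blanks.
        names.append(line[:name_length].strip())
        values = [float(tok) for tok in line[name_length:].split()]
        # Species i needs i distances; continuation lines hold only numbers.
        while len(values) < i:
            values.extend(float(tok) for tok in lines[pos].split())
            pos += 1
        rows.append(values[:i])
    # Expand to a full symmetric matrix with a zero diagonal.
    dist = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            dist[i][j] = dist[j][i] = value
    return names, dist


sample = "\n".join([
    "    4",
    "Mouse".ljust(10),
    "Bovine      1.7043",
    "Lemur       2.0235  1.1901",
    "Squir Monk  1.5232  1.2423",
    "  1.3199",
])
names, dist = read_lower_triangular(sample)
print(names)
print(dist[3])
```

Slicing the name by position rather than splitting on blanks is what lets a name such as "Squir Monk" survive intact, which is the reason the format insists on padding.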

In general the distances are assumed to all be present: at the moment there is only one way we can have missing entries in the distance matrix. If the S option (which allows the user to specify the degree of replication of each distance) is invoked, with some of the entries having degree of replication zero, if the U (User Tree) option is in effect, and if the tree being examined is such that every branch length can be estimated from the data, it will be possible to solve for the branch lengths and sum of squares when there is some missing data. You may not get away with this if the U option is not in effect, as a tree may be tried on which the program will calculate a branch length by dividing zero by zero, and get upset.

The present version of Neighbor does allow the Subreplication option to be used and the number of replicates to be in the input file, but it actually does nothing with this information except read it in. It makes use of the average distances in the cells of the input data matrix. This means that you cannot use the S option to treat zero cells. We hope to modify Neighbor in the future to allow Subreplication. Of course the U (User tree) option is not available in Neighbor in any case.

The present versions of Fitch and Kitsch will do much better on missing values than did previous versions, but you will still have to be careful about them. Nevertheless you might (just) be able to explore relevant alternative tree topologies one at a time using the U option when there is missing data.

Alternatively, if the missing values in one cell always correspond to a cell with non-missing values on the opposite side of the main diagonal (i.e., if D(i,j) missing implies that D(j,i) is not missing), then use of the S option will always be sufficient to cope with missing values. When it is used, the missing distances should be entered as if present (any number can be used) and the degree of replication for them should be given as 0.

Note that the algorithm for searching among topologies in Fitch and Kitsch is the same one used in other programs, so that it is necessary to try different orders of species in the input data. The J (Jumble) menu option may be sufficient for most purposes.

The programs Fitch and Kitsch carry out the method of Fitch and Margoliash (1967) for fitting trees to distance matrices. They also are able to carry out the least squares method of Cavalli-Sforza and Edwards (1967), plus a variety of other methods of the same family (see the discussion of the P option below). They can also be set to use the Minimum Evolution method (Nei and Rzhetsky, 1993; Kidd and Sgaramella-Zonta, 1971).

The objective of these methods is to find that tree which minimizes

  \[ \text{Sum of squares} = \sum_{i} \sum_{j} \frac{n_{ij} \, (D_{ij} - d_{ij})^{2}}{D_{ij}^{P}} \]

where D_ij is the observed distance between species i and j and d_ij is the expected distance, computed as the sum of the lengths (amounts of evolution) of the segments of the tree from species i to species j. The quantity n_ij is the number of times each distance has been replicated. In simple cases this is taken to be one, but the user can, as an option, specify the degree of replication for each distance. The distance is then assumed to be a mean of those replicates. The power P is what distinguishes the various methods. For the Fitch-Margoliash method, which is the default method with this program, P is 2.0. For the Cavalli-Sforza and Edwards least squares method it should be set to 0 (so that the denominator is always 1). An intermediate method is also available in which P is 1.0, and any other value of P, such as 4.0 or -2.3, can also be used. This generates a whole family of methods.
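The criterion can be evaluated directly once the observed distances and the distances implied by a tree are in hand. The following is a minimal sketch under stated assumptions: the function name, the argument names, and the three-species numbers are hypothetical, and skipping the diagonal and any cell with zero replicates is a choice made here to avoid dividing by zero, not a description of the programs' internals.

```python
def weighted_sum_of_squares(observed, expected, replicates=None, power=2.0):
    """Sum over i, j of n_ij * (D_ij - d_ij)^2 / D_ij^P."""
    n = len(observed)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a species' distance to itself carries no information
            n_ij = 1 if replicates is None else replicates[i][j]
            if n_ij == 0:
                continue  # a cell with no replicates contributes nothing
            residual = observed[i][j] - expected[i][j]
            total += n_ij * residual ** 2 / observed[i][j] ** power
    return total


# Hypothetical observed distances D and tree-path distances d.
D = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 2.0],
     [2.0, 2.0, 0.0]]
d = [[0.0, 1.0, 2.2],
     [1.0, 0.0, 1.8],
     [2.2, 1.8, 0.0]]

print(weighted_sum_of_squares(D, d, power=0.0))  # unweighted least squares
print(weighted_sum_of_squares(D, d, power=2.0))  # Fitch-Margoliash weighting
```

With P = 2 the same residuals count for less because they occur on the larger distances, which is exactly the difference in weighting between the two methods.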

The P (Power) option is not available in the Neighbor-Joining program Neighbor. Implicitly, in this program P is 0.0 (though it is hard to prove this). The UPGMA option of Neighbor will assign the same branch lengths to the particular tree topology that it finds as will Kitsch when given the same tree and Power = 0.0.

All these methods make the assumptions of additivity and independent errors. The difference between the methods is how they weight departures of observed from expected. In effect, these methods differ in how they assume that the variance of measurement of a distance will rise as a function of the expected value of the distance.

These methods assume that the variance of the measurement error is proportional to the P-th power of the expectation (hence the standard deviation will be proportional to the P/2-th power of the expectation). If you have reason to think that the measurement error of a distance is the same for small distances as it is for large, then you should set P=0 and use the least squares method, but if you have reason to think that the relative (percentage) error is more nearly constant than the absolute error, you should use P=2, the Fitch-Margoliash method. In between, P=1 would be appropriate if the sizes of the errors were proportional to the square roots of the expected distance.

One question which arises frequently is what the units of branch length are in the resulting trees. In general, they are not time but units of distance. Thus if two species have a distance 0.3 between them, they will tend to be separated by branches whose total length is about 0.3. In the case of DNA distances, for example, the unit of branch length will be substitutions per base. (In the case of protein distances, it will be amino acid substitutions per amino acid position.)

OPTIONS

Here are the options available in all three programs. They are selected using the menu of options.

U
the User tree option. The trees in Fitch are regarded as unrooted, and are specified with a trifurcation (three-way split) at their base: e. g.:

((A,B),C,(D,E));

while in Kitsch they are to be regarded as rooted and have a bifurcation at the base:

((A,B),(C,(D,E)));

Be careful not to move User trees from Fitch to Kitsch without changing their form appropriately (you can use Retree to do this). User trees are not available in Neighbor. In Fitch if you specify the branch lengths on one or more branches, you can select the L (use branch Lengths) option to avoid having those branches iterated, so that the tree is evaluated with their lengths fixed.
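Before handing a user tree to Fitch or Kitsch it can be useful to check which form it has. The sketch below counts the subtrees at the base of a Newick string; the function name `root_degree` is hypothetical, and the routine assumes a simple tree without quoted labels or comments.

```python
def root_degree(newick):
    """Count the subtrees at the base of a simple Newick tree."""
    depth = 0
    commas_at_base = 0
    for ch in newick:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 1:
            commas_at_base += 1  # a comma directly inside the outermost parentheses
    return commas_at_base + 1


print(root_degree("((A,B),C,(D,E));"))    # trifurcation, the form Fitch expects
print(root_degree("((A,B),(C,(D,E)));"))  # bifurcation, the form Kitsch expects
```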

P
indicates that you are going to set the Power (P in the above formula). The default value is 2 (the Fitch-Margoliash method). The power, a real number such as 1.0, is prompted for by the programs. This option is not available in Neighbor.

-
indicates that negative segment lengths are to be allowed in the tree (default is to require that all branch lengths be nonnegative). This option is not available in Neighbor.

O
is the usual Outgroup option, available in Fitch and Neighbor but not in Kitsch, nor when the UPGMA option of Neighbor is used.

L
indicates that the distance matrix is input in Lower-triangular form (the lower-left half of the distance matrix only, without the zero diagonal elements).

R
indicates that the distance matrix is input in uppeR-triangular form (the upper-right half of the distance matrix only, without the zero diagonal elements).

S
is the Subreplication option. It informs the program that after each distance will be provided an integer indicating that the distance is a mean of that many replicates. There is no auxiliary information, but the presence of the S option indicates that the data will be in a different form. Each distance must be followed by an integer indicating the number of replicates, so that a line of data looks like this:

Delta      3.00 5  3.21 3  1.84 9

the 5, 3, and 9 being the number of times the measurement was replicated. When the number of replicates is zero, a distance value must still be provided, although its value will not affect the result. This option is not available in Neighbor.
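A line in this form can be split into (distance, replicates) pairs as follows. This is a sketch only: the function name is hypothetical, and it assumes that the whole row sits on one line, whereas the programs also allow distances to continue onto later lines.

```python
def parse_subreplicated_line(line, name_length=10):
    """Split one S-option data line into a name and (distance, replicates) pairs."""
    name = line[:name_length].strip()
    tokens = line[name_length:].split()
    if len(tokens) % 2 != 0:
        raise ValueError("each distance must be followed by a replicate count")
    pairs = [
        (float(tokens[k]), int(tokens[k + 1]))
        for k in range(0, len(tokens), 2)
    ]
    return name, pairs


print(parse_subreplicated_line("Delta      3.00 5  3.21 3  1.84 9"))
```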

G
is the usual Global branch-swapping option. It is available in Fitch and Kitsch but is not relevant to Neighbor.

J
indicates the usual J (Jumble) option for entering species in a random order. In Fitch and Kitsch if you do multiple jumbles in one run the program will print out the best tree found overall.

M
is the usual Multiple data sets option, available in all of these programs. It allows us (when the output tree file is analyzed in Consense) to do a bootstrap (or delete-half-jackknife) analysis with the distance matrix programs.

The numerical options are the usual ones and should be clear from the menu.

Note that when the options L or R are used one of the species, the first or last one, will have its name on an otherwise empty line. Even so, the name should be padded out to full length with blanks. Here is a sample lower-triangular data set.

     5
Alpha      
Beta       1.00
Gamma      3.00 3.00
Delta      3.00 3.00 2.00
Epsilon    3.00 3.00 2.00 1.00
(Note: five blanks should follow the name "Alpha" on its line.)



Be careful, if you are using lower- or upper-triangular matrices, to make the corresponding selection from the menu (L or R). Otherwise the program may get horribly confused: it will still give a result, but that result will be meaningless. With the menu option selected all should be well.

version 3.696

Dnacomp -- DNA Compatibility Program

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program implements the compatibility method for DNA sequence data. For a four-state character without a character-state tree, as in DNA sequences, the usual clique theorems cannot be applied. The approach taken in this program is to directly evaluate each tree topology by counting how many substitutions are needed in each site, comparing this to the minimum number that might be needed (one less than the number of bases observed at that site), and then evaluating the number of sites which achieve the minimum number. This is the evaluation of the tree (the number of compatible sites), and the topology is chosen so as to maximize that number.
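The evaluation described above can be sketched in a few lines. The step counts below come from Fitch's set-intersection counting on a rooted binary tree, which is a standard way to count substitutions and is used here as a stand-in; it is not taken from the Dnacomp source. The function names are hypothetical, gaps and ambiguity codes are ignored, and the sequences and tree are those of the test data set given for this program.

```python
def fitch_steps(tree, states):
    """Return (possible states, steps) for one site on a nested-tuple tree."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


def score_tree(tree, sequences):
    """Return per-site step counts and per-site compatibility flags."""
    n_sites = len(next(iter(sequences.values())))
    steps, compatible = [], []
    for site in range(n_sites):
        column = {name: seq[site] for name, seq in sequences.items()}
        _, site_steps = fitch_steps(tree, column)
        minimum = len(set(column.values())) - 1  # one less than bases observed
        steps.append(site_steps)
        compatible.append(site_steps == minimum)
    return steps, compatible


sequences = {
    "Alpha":   "AACGUGGCCAAAU",
    "Beta":    "AAGGUCGCCAAAC",
    "Gamma":   "CAUUUCGUCACAA",
    "Delta":   "GGUAUUUCGGCCU",
    "Epsilon": "GGGAUCUCGGCCC",
}
tree = (((("Epsilon", "Delta"), "Gamma"), "Beta"), "Alpha")

steps, compatible = score_tree(tree, sequences)
print(steps)
print(sum(compatible), "compatible sites")
```

For this tree the sketch gives 11 compatible sites, with sites 3 and 13 each needing one step more than the minimum, which agrees with the sample output shown for the test data.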

Compatibility methods originated with Le Quesne's (1969) suggestion that one ought to look for trees supported by the largest number of perfectly fitting (compatible) characters. Fitch (1975) showed by counterexample that one could not use the pairwise compatibility methods used in Clique to discover the largest clique of jointly compatible characters.

The assumptions of this method are similar to those of Clique. In a paper in the Biological Journal of the Linnean Society (1981b) I discuss this matter extensively. In effect, the assumptions are that:

  1. Each character evolves independently.
  2. Different lineages evolve independently.
  3. The ancestral base at each site is unknown.
  4. The rates of change in most sites over the time spans involved in the divergence of the group are very small.
  5. A few of the sites have very high rates of change.
  6. We do not know in advance which are the high and which the low rate sites.

That these are the assumptions of compatibility methods has been documented in a series of papers of mine: (1973a, 1978b, 1979, 1981b, 1983b, 1988b). For an opposing view arguing that arguments such as mine are invalid and that parsimony (and perhaps compatibility) methods make no substantive assumptions such as these, see the papers by Farris (1983) and Sober (1983a, 1983b, 1988), but also read the exchange between Felsenstein and Sober (1986).

There is, however, some reason to believe that the present criterion is not the proper way to correct for the presence of some sites with high rates of change in nucleotide sequence data. It can be argued that sites showing more than two nucleotide states, even if those are compatible with the other sites, are also candidates for sites with high rates of change. It might then be more proper to use Dnapars with the Threshold option with a threshold value of 2.

Change from an occupied site to a gap is counted as one change. Reversion from a gap to an occupied site is allowed and is also counted as one change. Note that this in effect assumes that a gap N bases long is N separate events. This may be an overcorrection. When we have nonoverlapping gaps, we could instead code a gap as a single event by changing all but the first "-" in the gap into "?" characters. In this way only the first base of the gap causes the program to infer a change.
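The recoding suggested above is mechanical and can be done before the data are given to the program. A minimal sketch, with a hypothetical function name, assuming each sequence is held as a single string:

```python
import re


def recode_gaps(sequence):
    """Keep the first "-" of each run of gaps and turn the rest into "?"."""
    return re.sub(
        r"-+",
        lambda match: "-" + "?" * (len(match.group()) - 1),
        sequence,
    )


print(recode_gaps("AC---GT-A"))
```

As the text notes, this only makes sense when the gaps do not overlap between species; otherwise the "?" characters would hide real differences in where the gaps begin.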

The input data is standard. The first line of the input file contains the number of species and the number of sites.

Next come the species data. Each sequence starts on a new line, has a ten-character species name that must be blank-filled to be of that length, followed immediately by the species data in the one-letter code. The sequences must either be in the "interleaved" or "sequential" formats described in the Molecular Sequence Programs document. The I option selects between them. The sequences can have internal blanks in the sequence but there must be no extra blanks at the end of the terminated line. Note that a blank is not a valid symbol for a deletion.

The options are selected using an interactive menu. The menu looks like this:


DNA compatibility algorithm, version 3.69

Settings for this run:
  U                 Search for best tree?  Yes
  J   Randomize input order of sequences?  No. Use input order
  O                        Outgroup root?  No, use as outgroup species  1
  W                       Sites weighted?  No
  M           Analyze multiple data sets?  No
  I          Input sequences interleaved?  Yes
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4  Print steps & compatibility at sites  No
  5  Print sequences at all nodes of tree  No
  6       Write out trees onto tree file?  Yes

Are these settings correct? (type Y or the letter for one to change)

The user either types "Y" (followed, of course, by a carriage-return) if the settings shown are to be accepted, or the letter or digit corresponding to an option that is to be changed.

The options U, J, O, W, M, and 0 are the usual ones. They are described in the main documentation file of this package. Option I is the same as in other molecular sequence programs and is described in the documentation file for the sequence programs.

The O (outgroup) option has no effect if the U (user-defined tree) option is in effect. The user-defined trees (option U) fed in must be strictly bifurcating, with a two-way split at their base.

The interpretation of weights (option W) in the case of a compatibility method is that they count how many times the character (in this case the site) is counted in the analysis. Thus a character can be dropped from the analysis by assigning it zero weight. On the other hand, giving it a weight of 5 means that in any clique it is in, it is counted as 5 characters when the size of the clique is evaluated. Generally, weights other than 0 or 1 do not have much meaning when dealing with DNA sequences.

Output is standard: if option 1 is toggled on, the data is printed out, with the convention that "." means "the same as in the first species". Then comes a list of equally parsimonious trees, and (if option 4 is toggled on) a table of the number of changes of state required in each character. If option 5 is toggled on, a table is printed out after each tree, showing for each branch whether there are known to be changes in the branch, and what the states are inferred to have been at the top end of the branch. If the inferred state is a "?" or one of the IUB ambiguity symbols, there will be multiple equally-parsimonious assignments of states; the user must work these out for themselves by hand. A "?" in the reconstructed states means that in addition to one or more bases, a gap may or may not be present. If option 6 is left in its default state the trees found will be written to a tree file, so that they are available to be used in other programs. If the program finds multiple trees tied for best, all of these are written out onto the output tree file. Each is followed by a numerical weight in square brackets (such as [0.25000]). This is needed when we use the trees to make a consensus tree of the results of bootstrapping or jackknifing, to avoid overrepresenting replicates that find many tied trees.

If the U (User Tree) option is used and more than one tree is supplied, the program also performs a statistical test of each of these trees against the one with highest compatibility. If there are two user trees, the test done is one which is due to Kishino and Hasegawa (1989), a version of a test originally introduced by Templeton (1983). In this implementation it uses the mean and variance of weighted compatibility differences between trees, taken across sites. If the two trees' compatibilities are more than 1.96 standard deviations different then the trees are declared significantly different.
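The two-tree comparison can be sketched as a paired-sites calculation. The variance estimate below (n/(n-1) times the sum of squared deviations of the per-site differences) is the usual one for such tests and is an assumption here, as are the function name, the unweighted sites, and the made-up compatibility vectors; the program's own arithmetic may differ in detail.

```python
import math


def paired_sites_test(compat_a, compat_b, critical=1.96):
    """Compare two trees by their per-site compatibilities (1 or 0)."""
    n = len(compat_a)
    diffs = [a - b for a, b in zip(compat_a, compat_b)]
    total = sum(diffs)                      # difference in compatible sites
    mean = total / n
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    sd = math.sqrt(variance)                # standard deviation of the total
    significant = sd > 0 and abs(total) > critical * sd
    return total, sd, significant


# Hypothetical per-site compatibilities of eight sites with two user trees.
tree_a = [1, 1, 1, 1, 0, 1, 1, 1]
tree_b = [1, 0, 1, 0, 0, 1, 0, 1]
print(paired_sites_test(tree_a, tree_b))
```

Here the trees differ by three compatible sites with a standard deviation of about 1.46, so the difference is just over 1.96 standard deviations.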

If there are more than two trees, the test done is an extension of the KHT test, due to Shimodaira and Hasegawa (1999). They pointed out that a correction for the number of trees was necessary, and they introduced a resampling method to make this correction. In the version used here the variances and covariances of the sum of weighted compatibilities of sites are computed for all pairs of trees. To test whether the difference between each tree and the best one is larger than could have been expected if they all had the same expected compatibility, compatibilities for all trees are sampled with these covariances and equal means (Shimodaira and Hasegawa's "least favorable hypothesis"), and a P value is computed from the fraction of times the difference between the tree's value and the highest compatibility exceeds that actually observed. Note that this sampling needs random numbers, and so the program will prompt the user for a random number seed if one has not already been supplied. With the two-tree KHT test no random numbers are used.

In either the KHT or the SH test the program prints out a table of the compatibility of each tree, the differences of each from the highest one, the variance of that quantity as determined by the compatibility differences at individual sites, and a conclusion as to whether that tree is or is not significantly worse than the best one.

The algorithm is a straightforward modification of Dnapars, but with some extra machinery added to calculate, as each species is added, how many base changes are the minimum which could be required at that site. The program runs fairly quickly.

The constants which can be changed at the beginning of the program are: the name length "nmlngth", "maxtrees", the maximum number of trees which the program will store for output, and "maxuser", the maximum number of user trees that can be used in the paired sites test.


TEST DATA SET

    5   13
Alpha     AACGUGGCCAAAU
Beta      AAGGUCGCCAAAC
Gamma     CAUUUCGUCACAA
Delta     GGUAUUUCGGCCU
Epsilon   GGGAUCUCGGCCC

CONTENTS OF OUTPUT FILE (if all numerical options are turned on)


DNA compatibility algorithm, version 3.69

 5 species,  13  sites

Name            Sequences
----            ---------

Alpha        AACGUGGCCA AAU
Beta         ..G..C.... ..C
Gamma        C.UU.C.U.. C.A
Delta        GGUA.UU.GG CC.
Epsilon      GGGA.CU.GG CCC



One most parsimonious tree found:




           +--Epsilon   
        +--4  
     +--3  +--Delta     
     !  !  
  +--2  +-----Gamma     
  !  !  
  1  +--------Beta      
  !  
  +-----------Alpha     

  remember: this is an unrooted tree!


total number of compatible sites is       11.0

steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0|       2   1   3   2   0   2   1   1   1
   10|   1   1   1   3                        

 compatibility (Y or N) of each site with this tree:

      0123456789
     *----------
   0 ! YYNYYYYYY
  10 !YYYN      

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                AABGTSGCCA AAY
   1      2        maybe   .....C.... ...
   2      3         yes    V.KD...... C..
   3      4         yes    GG.A..T.GG .C.
   4   Epsilon     maybe   ..G....... ..C
   4   Delta        yes    ..T..T.... ..T
   3   Gamma        yes    C.TT...T.. ..A
   2   Beta        maybe   ..G....... ..C
   1   Alpha       maybe   ..C..G.... ..T



version 3.696

Dnadist -- Program to compute distance matrix
from nucleotide sequences

© Copyright 1986-2008 by the University of Washington. Written by Joseph Felsenstein. Permission is granted to copy this document provided that no fee is charged for it and that this copyright notice is not removed.

This program uses nucleotide sequences to compute a distance matrix, under four different models of nucleotide substitution. It can also compute a table of similarity between the nucleotide sequences. The distance for each pair of species estimates the total branch length between the two species, and can be used in the distance matrix programs Fitch, Kitsch or Neighbor. This is an alternative to using the sequence data itself in the maximum likelihood program Dnaml or the parsimony program Dnapars.

The program reads in nucleotide sequences and writes an output file containing the distance matrix, or else a table of similarity between sequences. The four models of nucleotide substitution are those of Jukes and Cantor (1969), Kimura (1980), the F84 model (Kishino and Hasegawa, 1989; Felsenstein and Churchill, 1996), and the model underlying the LogDet distance (Barry and Hartigan, 1987; Lake, 1994; Steel, 1994; Lockhart et al., 1994). All except the LogDet distance can be made to allow for unequal rates of substitution at different sites, as Jin and Nei (1990) did for the Jukes-Cantor model. The program correctly takes into account a variety of sequence ambiguities, although in cases where they exist it can be slow.

Jukes and Cantor's (1969) model assumes that there is independent change at all sites, with equal probability. Whether a base changes is independent of its identity, and when it changes there is an equal probability of ending up with each of the other three bases. Thus the transition probability matrix (this is a technical term from probability theory and has nothing to do with transitions as opposed to transversions) for a short period of time dt is:

              To:    A        G        C        T
                   ---------------------------------
               A  | 1-3a      a         a       a
       From:   G  |  a       1-3a       a       a
               C  |  a        a        1-3a     a
               T  |  a        a         a      1-3a

where a is u dt, the product of the rate of substitution per unit time (u) and the length dt of the time interval. For longer periods of time this implies that the probability that two sequences will differ at a given site is:

      \[ p = \frac{3}{4} \left( 1 - e^{-\frac{4}{3} u t} \right) \]

and hence that if we observe p, we can compute an estimate of the branch length ut by inverting this to get

     \[ ut = -\frac{3}{4} \log_e \left( 1 - \frac{4}{3} p \right) \]
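The two Jukes-Cantor formulas are inverses of one another, which a short calculation makes concrete. The function names below are hypothetical; the arithmetic is just the pair of formulas above, and it covers only unambiguous sites.

```python
import math


def jc_proportion(ut):
    """Expected fraction of differing sites after branch length ut."""
    return 0.75 * (1.0 - math.exp(-4.0 * ut / 3.0))


def jc_distance(p):
    """Estimate of ut from an observed fraction p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("the Jukes-Cantor distance is undefined for p >= 3/4")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


ut = jc_distance(0.3)
print(ut)                 # about 0.383 substitutions per site
print(jc_proportion(ut))  # recovers the observed fraction 0.3
```

The estimate exceeds the observed fraction because it allows for sites that have changed more than once, and it has no finite value once p reaches 3/4.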

The Kimura "2-parameter" model is almost as symmetric as this, but allows for a difference between transition and transversion rates. Its transition probability matrix for a short interval of time is:

              To:     A        G        C        T
                   ---------------------------------
               A  | 1-a-2b     a         b       b
       From:   G  |   a      1-a-2b      b       b
               C  |   b        b       1-a-2b    a
               T  |   b        b         a     1-a-2b

where a is u dt, the product of the rate of transitions per unit time (u) and the length dt of the time interval, and b is v dt, the product of half the rate of transversions (i.e., the rate of a specific transversion) and the length dt of the time interval.

The F84 model incorporates different rates of transition and transversion, but also allows for different frequencies of the four nucleotides. It is the model which is used in Dnaml, the maximum likelihood nucleotide sequence phylogenies program in this package. You will find the model described in the document for that program. The transition probabilities for this model are given by Kishino and Hasegawa (1989), and further explained in a paper by me and Gary Churchill (1996).

The LogDet distance allows a fairly general model of substitution. It computes the distance from the determinant of the empirically observed matrix of joint probabilities of nucleotides in the two species. An explanation of it is available in the chapter by Swofford et al. (1996).

The first three models are closely related. The Dnaml model reduces to Kimura's two-parameter model if we assume that the equilibrium frequencies of the four bases are equal. The Jukes-Cantor model in turn is a special case of the Kimura 2-parameter model where a = b. Thus each model is a special case of the ones that follow it, Jukes-Cantor being a special case of both of the others.
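
The nesting of the Jukes-Cantor model inside the Kimura 2-parameter model can be checked directly from the two short-interval matrices given above. The following is a minimal sketch; the function names are hypothetical and the code is not taken from the program.

```python
BASES = "AGCT"          # same row and column order as the matrices above
PURINES = {"A", "G"}

def k2p_matrix(a, b):
    """Short-interval probabilities: a = transition, b = one specific transversion."""
    matrix = {}
    for x in BASES:
        for y in BASES:
            if x == y:
                matrix[x, y] = 1.0 - a - 2.0 * b
            elif (x in PURINES) == (y in PURINES):
                matrix[x, y] = a    # transition: both purines or both pyrimidines
            else:
                matrix[x, y] = b    # transversion
    return matrix

def jc_matrix(a):
    """Jukes-Cantor short-interval probabilities."""
    return {(x, y): (1.0 - 3.0 * a if x == y else a) for x in BASES for y in BASES}

def ts_tv_ratio(a, b):
    """Expected transitions per transversion: one transition (a) against two transversions (2b)."""
    return a / (2.0 * b)

k2p = k2p_matrix(0.01, 0.01)
jc = jc_matrix(0.01)
print(all(abs(k2p[key] - jc[key]) < 1e-12 for key in jc))
print(ts_tv_ratio(0.01, 0.01))
```

With a = b the two matrices coincide, and the expected ratio of transitions to transversions is one half, since each base has one possible transition and two possible transversions.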

The Jin and Nei (1990) correction for variation in rate of evolution from site to site can be adapted to all of the first three models. It assumes that the rate of substitution varies from site to site according to a gamma distribution, with a coefficient of variation that is specified by the user. The user is asked for it when choosing this option in the menu.

Each distance that is calculated is an estimate, from that particular pair of species, of the divergence time between those two species. For the Jukes-Cantor model, the estimate is computed using the formula for ut given above, as long as the nucleotide symbols in the two sequences are all either A, C, G, T, U, N, X, ?, or - (the latter four indicate a deletion or an unknown nucleotide). This estimate is a maximum likelihood estimate for that model. For the Kimura 2-parameter model, with only these nucleotide symbols, formulas special to that model are used instead. These too, in effect, compute the maximum likelihood estimate for that model. In the Kimura case the estimate depends on the observed sequences only through the sequence length and the observed numbers of transition and transversion differences between those two sequences. The calculation in that case is a maximum likelihood estimate and will differ somewhat from the estimate obtained from the formulas in Kimura's original paper. That formula was also a maximum likelihood estimate, but with the transition/transversion ratio estimated empirically, separately for each pair of sequences. In the present case, one overall preset transition/transversion ratio is used, which makes the computations harder but achieves greater consistency between different comparisons.

For the F84 model, or for any of the models where one or both sequences contain at least one of the other ambiguity codes such as Y, R, etc., a maximum likelihood calculation is also done using code which was originally written for Dnaml. Its disadvantage is that it is slow. The resulting distance is in effect a maximum likelihood estimate of the divergence time (total branch length) between the two sequences. However the present program will be much faster than versions earlier than 3.5, because I have sped up the iterations.

The LogDet model computes the distance from the determinant of the matrix of co-occurrence of nucleotides in the two species, according to the formula

$$D = -\frac{1}{4}\left(\ln|F| - \frac{1}{2}\ln\left(f_A^{1} f_C^{1} f_G^{1} f_T^{1} f_A^{2} f_C^{2} f_G^{2} f_T^{2}\right)\right)$$

where F is a matrix whose (i,j) element is the fraction of sites at which base i occurs in one species and base j occurs in the other, and $f_j^{i}$ is the fraction of sites at which species i has base j. The LogDet distance cannot cope with ambiguity codes. It must have completely defined sequences. One limitation of the LogDet distance is that it may be infinite sometimes, if there are too many changes between certain pairs of nucleotides. This can be particularly noticeable with distances computed from bootstrapped sequences.
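
A minimal sketch of this calculation follows, assuming two aligned sequences that contain only A, C, G, and T. The function names are hypothetical and the determinant routine is a generic one, not the program's own code.

```python
import math

BASES = "ACGT"

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def logdet_distance(seq1, seq2):
    """LogDet distance from the matrix F of joint base frequencies."""
    n = len(seq1)
    F = [[0.0] * 4 for _ in range(4)]
    for x, y in zip(seq1, seq2):
        # str.index raises ValueError for ambiguity codes, which are not allowed
        F[BASES.index(x)][BASES.index(y)] += 1.0 / n
    f1 = [sum(row) for row in F]                              # species 1
    f2 = [sum(F[i][j] for i in range(4)) for j in range(4)]   # species 2
    det = determinant(F)
    if det <= 0.0 or min(f1 + f2) <= 0.0:
        raise ValueError("LogDet distance is not finite for this pair")
    log_freqs = sum(math.log(f) for f in f1 + f2)
    return -0.25 * (math.log(det) - 0.5 * log_freqs)

print(logdet_distance("ACGT" * 5, "ACGT" * 4 + "ACGA"))
```

The check on the determinant corresponds to the limitation noted above: when a base is missing from one sequence, or F is otherwise singular, no finite distance exists.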

Note that there is an assumption that we are looking at all sites, including those that have not changed at all. It is important not to restrict attention to some sites based on whether or not they have changed; doing that would bias the distances by making them too large, and that in turn would cause the distances to misinterpret the meaning of those sites that had changed.

For all of these distance methods, the program allows us to specify that "third position" bases have a different rate of substitution than first and second positions, that introns have a different rate than exons, and so on. The Categories option which does this allows us to make up to 9 categories of sites and specify different rates of change for them.

In addition to the four distance calculations, the program can also compute a table of similarities between nucleotide sequences. These values are the fractions of sites identical between the sequences. The diagonal values are 1.0000. No attempt is made to count similarity of nonidentical nucleotides, so that no credit is given for having (for example) different purines at corresponding sites in the two sequences. This option has been requested by many users, who need it for descriptive purposes. It is not intended that the table be used for inferring the tree.
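
The similarity value for a pair of sequences is simple to compute by hand or in code. The sketch below assumes sequences without ambiguity codes or gaps, since how the program treats those symbols in this table is not described here; the function names are hypothetical.

```python
def similarity(seq1, seq2):
    """Fraction of sites at which the two sequences carry the identical symbol."""
    matches = sum(1 for x, y in zip(seq1, seq2) if x == y)
    return matches / len(seq1)

def similarity_table(sequences):
    """All pairwise similarities for a dict mapping species name to sequence."""
    names = list(sequences)
    return {(a, b): similarity(sequences[a], sequences[b]) for a in names for b in names}

table = similarity_table({"Alpha": "AACGTGGCCACAT", "Beta": "AAGGTCGCCACAC"})
print(table["Alpha", "Beta"])   # 10 of the 13 sites are identical
```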

INPUT FORMAT AND OPTIONS

Input is fairly standard, with one addition. As usual the first line of the file gives the number of species and the number of sites.

Next come the species data. Each sequence starts on a new line, has a ten-character species name that must be blank-filled to be of that length, followed immediately by the species data in the one-letter code. The sequences must either be in the "interleaved" or "sequential" formats described in the Molecular Sequence Programs document. The I option selects between them. The sequences can have internal blanks in the sequence but there must be no extra blanks at the end of the terminated line. Note that a blank is not a valid symbol for a deletion -- neither is dot (".").
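
As an illustration of the layout just described, here is a minimal reader for the sequential format. It is a sketch only: it assumes ten-character names, ignores internal blanks, and does not handle the interleaved format or any of the program's error checks. The name `read_sequential` is hypothetical.

```python
SAMPLE = """\
   5   13
Alpha     AACGTGGCCACAT
Beta      AAGGTCGCCACAC
Gamma     CAGTTCGCCACAA
Delta     GAGATTTCCGCCT
Epsilon   GAGATCTCCGCCC
"""

def read_sequential(text):
    """Return {species name: sequence} from sequential-format input."""
    lines = [line for line in text.splitlines() if line.strip()]
    n_species, n_sites = (int(tok) for tok in lines[0].split()[:2])
    sequences = {}
    row = 1
    for _ in range(n_species):
        name = lines[row][:10].strip()              # blank-filled ten-character name
        seq = lines[row][10:].replace(" ", "")      # internal blanks are allowed
        row += 1
        while len(seq) < n_sites:                   # sequence continues on later lines
            seq += lines[row].replace(" ", "")
            row += 1
        if len(seq) != n_sites:
            raise ValueError("wrong number of sites for " + name)
        sequences[name] = seq
    return sequences

print(read_sequential(SAMPLE)["Gamma"])
```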

The options are selected using an interactive menu. The menu looks like this:


Nucleic acid sequence Distance Matrix program, version 3.69

Settings for this run:
  D  Distance (F84, Kimura, Jukes-Cantor, LogDet)?  F84
  G          Gamma distributed rates across sites?  No
  T                 Transition/transversion ratio?  2.0
  C            One category of substitution rates?  Yes
  W                         Use weights for sites?  No
  F                Use empirical base frequencies?  Yes
  L                       Form of distance matrix?  Square
  M                    Analyze multiple data sets?  No
  I                   Input sequences interleaved?  Yes
  0            Terminal type (IBM PC, ANSI, none)?  ANSI
  1             Print out the data at start of run  No
  2           Print indications of progress of run  Yes

  Y to accept these or type the letter for one to change

The user either types "Y" (followed, of course, by a carriage-return) if the settings shown are to be accepted, or the letter or digit corresponding to an option that is to be changed.

The D option selects one of the four distance methods, or the similarity table. It toggles among the five methods. The default method, if none is specified, is the F84 model.

If the G (Gamma distribution) option is selected, the user will be asked to supply the coefficient of variation of the rate of substitution among sites. This is different from the parameters used by Jin and Nei but related to them: their parameter a is also known as "alpha", the shape parameter of the Gamma distribution. It is related to the coefficient of variation by

$$CV = \frac{1}{\sqrt{a}}$$

or

$$a = \frac{1}{(CV)^{2}}$$

(their parameter b is absorbed here by the requirement that time is scaled so that the mean rate of evolution is 1 per unit time, which means that a = b). As we consider cases in which the rates are less variable we should set a larger and larger, as CV gets smaller and smaller.
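
For users who have a value of alpha from another program, the conversion is a one-liner in either direction. The function names below are hypothetical.

```python
import math

def alpha_from_cv(cv):
    """Shape parameter of the Gamma distribution for a given coefficient of variation."""
    return 1.0 / (cv * cv)

def cv_from_alpha(alpha):
    """Coefficient of variation to type into the menu for a given alpha."""
    return 1.0 / math.sqrt(alpha)

print(cv_from_alpha(4.0))    # alpha = 4 corresponds to CV = 0.5
print(alpha_from_cv(0.5))
```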

The F (Frequencies) option appears when the Maximum Likelihood distance is selected. This distance requires that the program be provided with the equilibrium frequencies of the four bases A, C, G, and T (or U). Its default setting is one which may save users much time. If you want to use the empirical frequencies of the bases, observed in the input sequences, as the base frequencies, you simply use the default setting of the F option. These empirical frequencies are not really the maximum likelihood estimates of the base frequencies, but they will often be close to those values (what they are is maximum likelihood estimates under a "star" or "explosion" phylogeny). If you change the setting of the F option you will be prompted for the frequencies of the four bases. These must add to 1 and are to be typed on one line separated by blanks, not commas.
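
When the sequences contain no ambiguity codes, the empirical frequencies are just the pooled proportions of the four bases. The sketch below makes that assumption (it simply skips any other symbol, which is not necessarily what the program does) and uses a hypothetical function name.

```python
from collections import Counter

def empirical_frequencies(sequences):
    """Pooled proportions of A, C, G, T over all sequences; U is counted as T."""
    counts = Counter()
    for seq in sequences:
        counts.update(base for base in seq.upper().replace("U", "T") if base in "ACGT")
    total = sum(counts.values())
    return {base: counts[base] / total for base in "ACGT"}

TEST_DATA = ["AACGTGGCCACAT", "AAGGTCGCCACAC", "CAGTTCGCCACAA",
             "GAGATTTCCGCCT", "GAGATCTCCGCCC"]
for base, freq in empirical_frequencies(TEST_DATA).items():
    print(base, round(freq, 5))
```

Applied to the test data set, this gives 16, 24, 14, and 11 occurrences of A, C, G, and T out of 65, which agrees with the frequencies shown in the sample output.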

The T option in this program does not stand for Threshold, but instead is the Transition/transversion option. The user is prompted for a real number greater than 0.0, as the expected ratio of transitions to transversions. Note that this is not the ratio of the first to the second kinds of events, but the resulting expected ratio of transitions to transversions. The exact relationship between these two quantities depends on the frequencies in the base pools. The default value of the T parameter if you do not use the T option is 2.0.

The C option allows user-defined rate categories. The user is prompted for the number of user-defined rates, and for the rates themselves. These numbers, which must be nonnegative (some could be 0), are defined relative to each other, so that if rates for three categories are set to 1 : 3 : 2.5 this would have the same meaning as setting them to 2 : 6 : 5. The assignment of rates to sites is then made by reading a file whose default name is "categories". It should contain a string of digits 1 through 9. A new line or a blank can occur after any character in this string. Thus the categories file might look like this:

122231111122411155
1155333333444
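
A short sketch of how such a file maps onto per-site rates follows. It assumes that the relative rates are rescaled so that their average over sites is 1; the function names are hypothetical.

```python
def read_categories(text):
    """Category number (1-9) of each site; blanks and new lines are ignored."""
    categories = [int(ch) for ch in text if not ch.isspace()]
    if any(c < 1 for c in categories):
        raise ValueError("categories must be digits 1 through 9")
    return categories

def site_rates(categories, rates):
    """Rate of each site, rescaled so that the mean rate over sites is 1."""
    raw = [rates[c - 1] for c in categories]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]

cats = read_categories("1122 3\n311")
print(site_rates(cats, [1.0, 3.0, 2.5]))
print(site_rates(cats, [2.0, 6.0, 5.0]))   # same relative rates, same result
```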

If both user-assigned rate categories and Gamma-distributed rates are allowed, the program assumes that the actual rate at a site is the product of the user-assigned category rate and the Gamma-distributed rate. This allows you to specify that certain sites have higher or lower rates of change while also letting the program accommodate further variation of rates on top of that. (This may not always make perfect biological sense: it would be more natural to assume some upper bound to the rate, as we have discussed in the Felsenstein and Churchill paper.) Nevertheless you may want to use both types of rate variation.

The L option specifies that the output file is to have the distance matrix in lower triangular form.

The W (Weights) option is invoked in the usual way, with only weights 0 and 1 allowed. It selects a set of sites to be analyzed, ignoring the others. The sites selected are those with weight 1. If the W option is not invoked, all sites are analyzed. The Weights (W) option takes the weights from a file whose default name is "weights". The weights follow the format described in the main documentation file.

The M (multiple data sets) option will ask you whether you want to use multiple sets of weights (from the weights file) or multiple data sets from the input file. The ability to use a single data set with multiple weights means that much less disk space will be used for this input data. The bootstrapping and jackknifing tool Seqboot has the ability to create a weights file with multiple weights. Note also that when we use multiple weights for bootstrapping we can also then maintain different rate categories for different sites in a meaningful way. If you use the multiple data sets option rather than multiple weights, you should not at the same time use the user-defined rate categories option (option C), because the user-defined rate categories could then be associated with the wrong sites. This is not a concern when the M option is used with multiple weights.

The option 0 is the usual one. It is described in the main documentation file of this package. Option I is the same as in other molecular sequence programs and is described in the documentation file for the sequence programs.

OUTPUT FORMAT

As the distances are computed, the program prints on your screen or terminal the names of the species in turn, followed by one dot (".") for each other species for which the distance to that species has been computed. Thus if there are ten species, the first species name is printed out, followed by nine dots, then on the next line the next species name is printed out followed by eight dots, then the next followed by seven dots, and so on. The pattern of dots should form a triangle. When the distance matrix has been written out to the output file, the user is notified of that.

The output file contains on its first line the number of species. The distance matrix is then printed in standard form, with each species starting on a new line with the species name, followed by the distances to the species in order. These continue onto a new line after every nine distances. If the L option is used, the matrix of distances is in lower triangular form, so that only the distances to the other species that precede each species are printed. Otherwise the distance matrix is square with zero distances on the diagonal. In general the format of the distance matrix is such that it can serve as input to any of the distance matrix programs.

If the option to print out the data is selected, the output file will precede the data with more complete information on the input and the menu selections. The output file begins by giving the number of species and the number of characters, and the identity of the distance measure that is being used.

If the C (Categories) option is used a table of the relative rates of expected substitution at each category of sites is printed, and a listing of the categories each site is in.

There will then follow the equilibrium frequencies of the four bases. If the Jukes-Cantor or Kimura distances are used, these will necessarily be 0.25 : 0.25 : 0.25 : 0.25. The output then shows the transition/transversion ratio that was specified or used by default. In the case of the Jukes-Cantor distance this will always be 0.5. The transition-transversion parameter (as opposed to the ratio) is also printed out: this is used within the program and can be ignored. There then follow the data sequences, with the base sequences printed in groups of ten bases along the lines of the Genbank and EMBL formats.

The distances printed out are scaled in terms of expected numbers of substitutions, counting both transitions and transversions but not replacements of a base by itself, and scaled so that the average rate of change, averaged over all sites analyzed, is set to 1.0 if there are multiple categories of sites. This means that whether or not there are multiple categories of sites, the expected fraction of change for very small branches is equal to the branch length. Of course, when a branch is twice as long this does not mean that there will be twice as much net change expected along it, since some of the changes may occur in the same site and overlie or even reverse each other. The branch length estimates here are in terms of the expected underlying numbers of changes. That means that a branch of length 0.26 is 26 times as long as one which would show a 1% difference between the nucleotide sequences at the beginning and end of the branch. But we would not expect the sequences at the beginning and end of the branch to be 26% different, as there would be some overlaying of changes.
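
As a rough check of that last point, assuming the Jukes-Cantor model (an assumption made here only for illustration), the expected fraction of differing sites for a branch of length 0.26 is

$$p = \frac{3}{4}\left(1 - e^{-\frac{4}{3}(0.26)}\right) \approx \frac{3}{4}\left(1 - 0.707\right) \approx 0.22$$

so about 22% of the sites would be expected to differ, rather than 26%.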

One problem that can arise is that two or more of the species can be so dissimilar that the distance between them would have to be infinite, as the likelihood rises indefinitely as the estimated divergence time increases. For example, with the Jukes-Cantor model, if the two sequences differ in 75% or more of their positions then the estimate of divergence time would be infinite. Since there is no way to represent an infinite distance in the output file, the program regards this as an error, issues an error message indicating which pair of species are causing the problem, and stops. It might be that, had it continued running, it would have also run into the same problem with other pairs of species. If the Kimura distance is being used there may be no error message; the program may simply give a large distance value (it is iterating towards infinity and the value is just where the iteration stopped). Likewise some maximum likelihood estimates may also become large for the same reason (the sequences showing more divergence than is expected even with infinite branch length). I hope in the future to add more warning messages that would alert the user to this.

If the similarity table is selected, the table that is produced is not in a format that can be used as input to the distance matrix programs. It has a heading, and the species names are also put at the tops of the columns of the table (or rather, the first 8 characters of each species name are there, the other two characters omitted to save space). There is not an option to put the table into a format that can be read by the distance matrix programs, nor is there one to make it into a table of fractions of difference by subtracting the similarity values from 1. This is done deliberately to make it more difficult to use these values to construct trees. The similarity values are not corrected for multiple changes, and their use to construct trees (even after converting them to fractions of difference) would be wrong, as it would lead to severe conflict between the distant pairs of sequences and the close pairs of sequences.

PROGRAM CONSTANTS

The constants that are available to be changed by the user at the beginning of the program include "maxcategories", the maximum number of site categories, "iterations", which controls the number of times the program iterates the EM algorithm that is used to do the maximum likelihood distance, "namelength", the length of species names in characters, and "epsilon", a parameter which controls the accuracy of the results of the iterations which estimate the distances. Making "epsilon" smaller will increase run times but result in more decimal places of accuracy. This should not be necessary.

The program spends most of its time doing real arithmetic. The algorithm, with separate and independent computations occurring for each pattern, lends itself readily to parallel processing.


TEST DATA SET

   5   13
Alpha     AACGTGGCCACAT
Beta      AAGGTCGCCACAC
Gamma     CAGTTCGCCACAA
Delta     GAGATTTCCGCCT
Epsilon   GAGATCTCCGCCC


CONTENTS OF OUTPUT FILE (with all numerical options on)

(Note that when the options for displaying the input data are turned off, the output is in a form suitable for use as an input file in the distance matrix programs).


Nucleic acid sequence Distance Matrix program, version 3.69

 5 species,  13  sites

  F84 Distance

Transition/transversion ratio =   2.000000

Name            Sequences
----            ---------

Alpha        AACGTGGCCA CAT
Beta         ..G..C.... ..C
Gamma        C.GT.C.... ..A
Delta        G.GA.TT..G .C.
Epsilon      G.GA.CT..G .CC



Empirical Base Frequencies:

   A       0.24615
   C       0.36923
   G       0.21538
  T(U)     0.16923

    5
Alpha       0.000000  0.303900  0.857544  1.158927  1.542899
Beta        0.303900  0.000000  0.339727  0.913522  0.619671
Gamma       0.857544  0.339727  0.000000  1.631729  1.293713
Delta       1.158927  0.913522  1.631729  0.000000  0.165882
Epsilon     1.542899  0.619671  1.293713  0.165882  0.000000

dnainvar

version 3.696

Dnainvar -- Program to compute Lake's and Cavender's
phylogenetic invariants from nucleotide sequences

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program reads in nucleotide sequences for four species and computes the phylogenetic invariants discovered by James Cavender (Cavender and Felsenstein, 1987) and James Lake (1987). Lake's method is also called by him "evolutionary parsimony". I prefer Cavender's more mathematically precise term "invariants", as the method bears somewhat more relationship to likelihood methods than to parsimony. The invariants are mathematical formulas (in the present case linear or quadratic) in the EXPECTED frequencies of site patterns which are zero for all trees of a given tree topology, irrespective of branch lengths. We can consider at a given site that if there are no ambiguities, we could have for four species the nucleotide patterns (considering the same site across all four species) AAAA, AAAC, AAAG, ... through TTTT, 256 patterns in all.

The invariants are formulas in the expected pattern frequencies, not the observed pattern frequencies. When they are computed using the observed pattern frequencies, we will usually find that they are not precisely zero even when the model is correct and we have the correct tree topology. Only as the number of nucleotides scored becomes infinite will the observed pattern frequencies approach their expectations; otherwise, we must do a statistical test of the invariants.

Some explanation of invariants will be found in the above papers, and also in my review article on statistical aspects of inferring phylogenies (Felsenstein, 1988b). Although invariants have some important advantages, their validity also depends on symmetry assumptions that may not be satisfied. In the discussion below suppose that the possible unrooted phylogenies are I: ((A,B),(C,D)), II: ((A,C),(B,D)), and III: ((A,D),(B,C)).

Lake's Invariants, Their Testing and Assumptions

Lake's invariants are fairly simple to describe: the patterns involved are only those in which there are two purines and two pyrimidines at a site. Thus a site with AACT would affect the invariants, but a site with AAGG would not. Let us use (as Lake does) the symbols 1, 2, 3, and 4, with the proviso that 1 and 2 are either both of the purines or both of the pyrimidines; 3 and 4 are the other two nucleotides. Thus 1 and 2 always differ by a transition; so do 3 and 4. Lake's invariants, expressed in terms of expected frequencies, are the three quantities:

(1)      P(1133) + P(1234) - P(1134) - P(1233),

(2)      P(1313) + P(1324) - P(1314) - P(1323),

(3)      P(1331) + P(1342) - P(1341) - P(1332),

He showed that invariants (2) and (3) are zero under Topology I, (1) and (3) are zero under Topology II, and (1) and (2) are zero under Topology III. If, for example, we see a site with pattern ACGC, we can start by setting 1=A. Then 2 must be G. We can then set 3=C (so that 4 is T). Thus its pattern type, making those substitutions, is 1323. P(1323) is the expected probability of the type of pattern which includes ACGC, TGCG, GTAT, etc.
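
The recoding of a site pattern into the 1234 notation can be sketched as follows. The convention assumed here is that the first base is called 1, its transition partner 2, the first base met from the other class 3, and that base's partner 4; with this convention the counts for the test data set agree with the table of symmetrized patterns in the sample output. The function names are hypothetical.

```python
from collections import Counter

PARTNER = {"A": "G", "G": "A", "C": "T", "T": "C"}   # transition partners

def pattern_type(pattern):
    """Recode a site pattern such as 'ACGC' in the 1234 notation."""
    first = pattern[0]
    codes = {first: "1", PARTNER[first]: "2"}
    for base in pattern:
        if base not in codes:            # first base from the other class
            codes[base] = "3"
            codes[PARTNER[base]] = "4"
    return "".join(codes[base] for base in pattern)

def count_pattern_types(sequences):
    """Tally the pattern type at each site; sequences are given in species order."""
    return Counter(pattern_type("".join(column)) for column in zip(*sequences))

TEST_DATA = ["AACGTGGCCAAAT", "AAGGTCGCCAAAC", "CATTTCGTCACAA", "GGTATTTCGGCCT"]
for kind, count in sorted(count_pattern_types(TEST_DATA).items()):
    print(kind, count)
```

Any ambiguity code raises a `KeyError` in this sketch, which mirrors the rule that sites with ambiguities or deletions are left out of the counts.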

Lake's invariants are easily tested with observed frequencies. For example, the first of them is a test of whether there are as many sites of types 1133 and 1234 as there are of types 1134 and 1233; this is easily tested with a chi-square test or, as in this program, with an exact binomial test. Note that with several invariants to test, we risk overestimating the significance of results if we simply accept the nominal 95% levels of significance (Li and Gouy, 1990).
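
A one-sided exact binomial test of this kind can be written in a few lines. The sketch below assumes that, under the null hypothesis, each of the sites counted is equally likely to fall in either of the two sums; it is not the program's own code, and the function name is hypothetical.

```python
from math import comb

def lake_binomial_test(first, second):
    """Probability of a first sum this large or larger when both sums have equal expectation."""
    n = first + second
    return sum(comb(n, k) for k in range(first, n + 1)) / 2 ** n

print(lake_binomial_test(1, 0))
print(lake_binomial_test(0, 0))
```

For the sums 1 and 0 this gives 0.5, and for 0 and 0 it gives 1.0, which are the P values shown for the three trees in the sample output.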

Lake's invariants assume that each site is evolving independently, and that starting from any base a transversion is equally likely to end up at each of the two possible bases (thus, an A undergoing a transversion is equally likely to end up as a C or a T, and similarly for the other three bases from which one could start). Interestingly, Lake's results do not assume that rates of evolution are the same at all sites. The result that the total of 1133 and 1234 is expected to be the same as the total of 1134 and 1233 is unaffected by the fact that we may have aggregated the counts over classes of sites evolving at different rates.

Cavender's Invariants, Their Testing and Assumptions

Cavender's invariants (Cavender and Felsenstein, 1987) are for the case of a character with two states. In the nucleic acid case we can classify nucleotides into two states, R and Y (Purine and Pyrimidine) and then use the two-state results. Cavender starts, as before, with the pattern frequencies. Coding purines as R and pyrimidines as Y, the pattern types are RRRR, RRRY, and so on until YYYY, a total of 16 types. Cavender found quadratic functions of the expected frequencies of these 16 types that were expected to be zero under a given phylogeny, irrespective of branch lengths. Two invariants (called K and L) were found for each tree topology. The L invariants are particularly easy to understand. If we have the tree topology ((A,B),(C,D)), then in the case of two symmetric states, the event that A and B have the same state should be independent of whether C and D have the same state, as the events determining these happen in different parts of the tree. We can set up a contingency table:

                                 C = D         C =/= D
                           ------------------------------
                          |
                   A = B  |   YYYY, YYRR,     YYYR, YYRY,
                          |   RRRR, RRYY      RRYR, RRRY
                          |
                 A =/= B  |   YRYY, YRRR,     YRYR, YRRY,
                          |   RYYY, RYRR      RYYR, RYRY

where "=/=" means "is not equal to." We expect that the events C = D and A = B will be independent. Cavender's L invariant for this tree topology is simply the negative of the crossproduct difference,

      P(A=/=B and C=D) P(A=B and C=/=D) - P(A=B and C=D) P(A=/=B and C=/=D).

One of these L invariants is defined for each of the three tree topologies. They can obviously be tested simply by doing a chi-square test on the contingency table. The one corresponding to the correct topology should be statistically indistinguishable from zero. Again, there is a possible multiple tests problem if all three are tested at a nominal value of 95%.
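
Given a 2 x 2 table of counts laid out as above (rows A = B and A =/= B, columns C = D and C =/= D), the crossproduct form of the L invariant and the usual chi-square statistic without continuity correction can be sketched as follows. The function names are hypothetical, and applying the invariant to counts rather than to expected frequencies is an assumption made for the illustration.

```python
def l_invariant(table):
    """Negative of the crossproduct difference of a 2 x 2 table."""
    (a, b), (c, d) = table
    return c * b - a * d

def chi_square_2x2(table):
    """Chi-square statistic for independence, one degree of freedom."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column of the table is empty")
    return n * (a * d - b * c) ** 2 / denominator

tree_one = [[2, 8], [1, 2]]   # contingency table printed for Tree I in the sample output
print(l_invariant(tree_one), round(chi_square_2x2(tree_one), 5))
```

For that table the sketch gives an invariant of 4 and a chi-square of 0.23111, the values printed for Tree I in the sample output.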

The K invariants are differences between the L invariants. When one of the tables is expected to have crossproduct difference zero, the other two are expected to be nonzero, and also to be equal. So the difference of their crossproduct differences can be taken; this is the K invariant. It is not so easily tested.

The assumptions of Cavender's invariants are different from those of Lake's. One obviously need not assume anything about the frequencies of, or transitions among, the two different purines or the two different pyrimidines. However one does need to assume independent events at each site, and one needs to assume that the Y and R states are symmetric, that the probability per unit time that a Y changes into an R is the same as the probability that an R changes into a Y, so that we expect equal frequencies of the two states. There is also an assumption that all sites are changing between these two states at the same expected rate. This assumption is not needed for Lake's invariants, since expectations of sums are equal to sums of expectations, but for Cavender's it is, since products of expectations are not equal to expectations of products.

It is helpful to have both sorts of invariants available; with further work we may appreciate what other invariants there are for various models of nucleic acid change.

INPUT FORMAT

The input data for Dnainvar is standard. The first line of the input file contains the number of species (which must always be 4 for this version of Dnainvar) and the number of sites.

Next come the species data. Each sequence starts on a new line, has a ten-character species name that must be blank-filled to be of that length, followed immediately by the species data in the one-letter code. The sequences must either be in the "interleaved" or "sequential" formats described in the Molecular Sequence Programs document. The I option selects between them. The sequences can have internal blanks in the sequence but there must be no extra blanks at the end of the terminated line. Note that a blank is not a valid symbol for a deletion.

The options are selected using an interactive menu. The menu looks like this:


Nucleic acid sequence Invariants method, version 3.69

Settings for this run:
  W                       Sites weighted?  No
  M           Analyze multiple data sets?  No
  I          Input sequences interleaved?  Yes
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3      Print out the counts of patterns  Yes
  4              Print out the invariants  Yes

  Y to accept these or type the letter for one to change

The user either types "Y" (followed, of course, by a carriage-return) if the settings shown are to be accepted, or the letter or digit corresponding to an option that is to be changed.

The options W, M and 0 are the usual ones. They are described in the main documentation file of this package. Option I is the same as in other molecular sequence programs and is described in the documentation file for the sequence programs.

OUTPUT FORMAT

The output consists first (if option 1 is selected) of a reprinting of the input data, then (if option 3 is on) tables of observed pattern frequencies and pattern type frequencies. A table will be printed out, in alphabetic order AAAA through TTTT, of all the patterns that appear among the sites and the number of times each appears. This table will be invaluable for computation of any other invariants. There follows another table, of pattern types, using the 1234 notation, in numerical order 1111 through 1234, of the number of times each type of pattern appears. In this computation all sites at which there are any ambiguities or deletions are omitted. Cavender's invariants could actually be computed from sites that have only Y or R ambiguities; this will be done in the next release of this program.

If option 4 is on the invariants are then printed out, together with their statistical tests. For Lake's invariants the two sums which are expected to be equal are printed out, and then the result of a one-tailed exact binomial test which tests whether the difference is expected to be this positive or more. The P level is given (but remember the multiple-tests problem!).

For Cavender's L invariants the contingency tables are given. Each is tested with a one-tailed chi-square test. It is possible that the expected numbers in some categories could be too small for valid use of this test; the program does not check for this. It is also possible that the chi-square could be significant but in the wrong direction; this is not tested in the current version of the program. To check it beware of a chi-square greater than 3.841 but with a positive invariant. The invariants themselves are computed, as the difference of cross-products. Their absolute magnitudes are not important, but which one is closest to zero may be indicative. Significantly nonzero invariants should be negative if the model is valid. The K invariants, which are simply differences among the L invariants, are also printed out without any test on them being conducted. Note that it is possible to use the bootstrap utility Seqboot to create multiple data sets, and from the output from summing all of these get the empirical variability of these quadratic invariants.

PROGRAM CONSTANTS

The constants that are defined at the beginning of the program include "maxsp", which must always be 4 and should not be changed.

The program is very fast, as it has rather little work to do; these methods are just a little bit beyond the reach of hand tabulation. Execution speed should never be a limiting factor.

FUTURE OF THE PROGRAM

In a future version I hope to allow for Y and R codes in the calculation of the Cavender invariants, and to check for significantly negative cross-product differences in them, which would indicate violation of the model. By then there should be more known about invariants for larger numbers of species, and any such advances will also be incorporated.


TEST DATA SET

   4   13
Alpha     AACGTGGCCAAAT
Beta      AAGGTCGCCAAAC
Gamma     CATTTCGTCACAA
Delta     GGTATTTCGGCCT


TEST SET OUTPUT (run with all numerical options turned on)


Nucleic acid sequence Invariants method, version 3.69

 4 species,  13  sites

Name            Sequences
----            ---------

Alpha        AACGTGGCCA AAT
Beta         ..G..C.... ..C
Gamma        C.TT.C.T.. C.A
Delta        GGTA.TT.GG CC.



   Pattern   Number of times

     AAAC         1
     AAAG         2
     AACC         1
     AACG         1
     CCCG         1
     CCTC         1
     CGTT         1
     GCCT         1
     GGGT         1
     GGTA         1
     TCAT         1
     TTTT         1


Symmetrized patterns (1, 2 = the two purines  and  3, 4 = the two pyrimidines
                  or  1, 2 = the two pyrimidines  and  3, 4 = the two purines)

     1111         1
     1112         2
     1113         3
     1121         1
     1132         2
     1133         1
     1231         1
     1322         1
     1334         1

Tree topologies (unrooted): 

    I:  ((Alpha,Beta),(Gamma,Delta))
   II:  ((Alpha,Gamma),(Beta,Delta))
  III:  ((Alpha,Delta),(Beta,Gamma))


Lake's linear invariants
 (these are expected to be zero for the two incorrect tree topologies.
  This is tested by testing the equality of the two parts
  of each expression using a one-sided exact binomial test.
  The null hypothesis is that the first part is no larger than the second.)

 Tree                             Exact test P value    Significant?

   I      1    -     0   =     1       0.5000               no
   II     0    -     0   =     0       1.0000               no
   III    0    -     0   =     0       1.0000               no


Cavender's quadratic invariants (type L) using purines vs. pyrimidines
 (these are expected to be zero, and thus have a nonsignificant
  chi-square, for the correct tree topology)
They will be misled if there are substantially
different evolutionary rate between sites, or
different purine:pyrimidine ratios from 1:1.

  Tree I:

   Contingency Table

      2     8
      1     2

   Quadratic invariant =             4.0

   Chi-square =    0.23111 (not significant)


  Tree II:

   Contingency Table

      1     5
      1     6

   Quadratic invariant =            -1.0

   Chi-square =    0.01407 (not significant)


  Tree III:

   Contingency Table

      1     2
      6     4

   Quadratic invariant =             8.0

   Chi-square =    0.66032 (not significant)




Cavender's quadratic invariants (type K) using purines vs. pyrimidines
 (these are expected to be zero for the correct tree topology)
They will be misled if there are substantially
different evolutionary rate between sites, or
different purine:pyrimidine ratios from 1:1.
No statistical test is done on them here.

  Tree I:              -9.0
  Tree II:              4.0
  Tree III:             5.0
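
The one-sided exact binomial test reported in the table of Lake's linear invariants above can be sketched as follows. This is an illustration only: the function name is hypothetical, and the assumption that the test is P(X >= first part) for a Binomial(first + second, 1/2) variable is inferred from the description printed in the output.

```python
from math import comb


def one_sided_binomial_p(first, second):
    """P(X >= first) for X ~ Binomial(first + second, 1/2).

    Assumed form of the test; the null hypothesis is that the first
    part is no larger than the second.
    """
    n = first + second
    return sum(comb(n, k) for k in range(first, n + 1)) / 2 ** n


# The three rows of the table above: (first part, second part).
for tree, parts in (("I", (1, 0)), ("II", (0, 0)), ("III", (0, 0))):
    print(tree, one_sided_binomial_p(*parts))
```

With so few informative sites in the test data, no row can come close to significance.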


version 3.696

Dnaml -- DNA Maximum Likelihood program

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program implements the maximum likelihood method for DNA sequences. The present version is faster than earlier versions of Dnaml. Details of the algorithm are published in the paper by Felsenstein and Churchill (1996). The model of base substitution allows the expected frequencies of the four bases to be unequal, allows the expected frequencies of transitions and transversions to be unequal, and has several ways of allowing different rates of evolution at different sites.

The assumptions of the present model are:

  1. Each site in the sequence evolves independently.
  2. Different lineages evolve independently.
  3. Each site undergoes substitution at an expected rate which is chosen from a series of rates (each with a probability of occurrence) which we specify.
  4. All relevant sites are included in the sequence, not just those that have changed or those that are "phylogenetically informative".
  5. A substitution consists of one of two sorts of events:
    (a)
    The first kind of event consists of the replacement of the existing base by a base drawn from a pool of purines or a pool of pyrimidines (depending on whether the base being replaced was a purine or a pyrimidine). It can lead either to no change or to a transition.
    (b)
    The second kind of event consists of the replacement of the existing base by a base drawn at random from a pool of bases at known frequencies, independently of the identity of the base which is being replaced. This could lead to no change, to a transition, or to a transversion.

    The ratio of the two purines in the purine replacement pool is the same as their ratio in the overall pool, and similarly for the pyrimidines.

    The ratios of transitions to transversions can be set by the user. The substitution process can be diagrammed as follows: Suppose that you specified A, C, G, and T base frequencies of 0.24, 0.28, 0.27, and 0.21.

    • First kind of event:

      1. Determine whether the existing base is a purine or a pyrimidine.
      2. Draw from the proper pool:

              Purine pool:                Pyrimidine pool:
        
             |               |            |               |
             |   0.4706 A    |            |   0.5714 C    |
             |   0.5294 G    |            |   0.4286 T    |
             | (ratio is     |            | (ratio is     |
             |  0.24 : 0.27) |            |  0.28 : 0.21) |
             |_______________|            |_______________|
        

    • Second kind of event:

      Draw from the overall pool:

      
                    |                  |
                    |      0.24 A      |
                    |      0.28 C      |
                    |      0.27 G      |
                    |      0.21 T      |
                    |__________________|
      

    Note that if the existing base is, say, an A, the first kind of event has a 0.4706 probability of "replacing" it by another A. The second kind of event has a 0.24 chance of replacing it by another A. This rather disconcerting model is used because it has nice mathematical properties that make likelihood calculations far easier. A closely similar, but not precisely identical model having different rates of transitions and transversions has been used by Hasegawa et al. (1985b). The transition probability formulas for the current model were given (with my permission) by Kishino and Hasegawa (1989). Another explanation is available in the paper by Felsenstein and Churchill (1996).
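
The pool probabilities in the diagrams above follow directly from the base frequencies. A minimal Python sketch, with hypothetical function names, that rebuilds the two pools and the "self-replacement" chances for the example frequencies:

```python
PURINES = ("A", "G")
PYRIMIDINES = ("C", "T")


def class_pool(freqs, base):
    """Pool for the first kind of event: only bases in the same class as `base`."""
    members = PURINES if base in PURINES else PYRIMIDINES
    total = sum(freqs[b] for b in members)
    return {b: freqs[b] / total for b in members}


def self_replacement_probs(freqs, base):
    """Chance that each kind of event 'replaces' `base` by itself."""
    return class_pool(freqs, base)[base], freqs[base]


# Base frequencies used in the example above.
freqs = {"A": 0.24, "C": 0.28, "G": 0.27, "T": 0.21}
print({b: round(p, 4) for b, p in class_pool(freqs, "A").items()})
print({b: round(p, 4) for b, p in class_pool(freqs, "C").items()})
print(self_replacement_probs(freqs, "A"))
```

The sketch only covers the composition of the pools; it does not attempt the transition probability formulas of the full model.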

Note the assumption that we are looking at all sites, including those that have not changed at all. It is important not to restrict attention to some sites based on whether or not they have changed; doing that would bias branch lengths by making them too long, and that in turn would cause the method to misinterpret the meaning of those sites that had changed.

This program uses a Hidden Markov Model (HMM) method of inferring different rates of evolution at different sites. This was described in a paper by me and Gary Churchill (1996). It allows us to specify to the program that there will be a number of different possible evolutionary rates, what the prior probabilities of occurrence of each is, and what the average length of a patch of sites all having the same rate is. The rates can also be chosen by the program to approximate a Gamma distribution of rates, or a Gamma distribution plus a class of invariant sites. The program computes the likelihood by summing it over all possible assignments of rates to sites, weighting each by its prior probability of occurrence.

For example, if we have used the C and A options (described below) to specify that there are three possible rates of evolution, 1.0, 2.4, and 0.0, that the prior probabilities of a site having these rates are 0.4, 0.3, and 0.3, and that the average patch length (number of consecutive sites with the same rate) is 2.0, the program will sum the likelihood over all possibilities, but give less weight to those that (say) assign all sites to rate 2.4, or that fail to have consecutive sites that have the same rate.
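
To make the idea of "summing over all possible assignments of rates to sites" concrete, here is a small Python sketch of a standard HMM forward recursion, checked against brute-force enumeration. It is not the program's code. The per-site likelihoods are made-up numbers (in a real run they would come from the tree), the function names are hypothetical, and the transition rule (with chance 1/patch length, draw a fresh rate from the prior probabilities) is taken from the description of the A option.

```python
from itertools import product


def transition_matrix(priors, patch_length):
    """P[i][j]: with chance 1/patch_length a fresh rate is drawn from the priors."""
    lam = 1.0 / patch_length
    k = len(priors)
    return [[(1.0 - lam) * (i == j) + lam * priors[j] for j in range(k)]
            for i in range(k)]


def forward_likelihood(site_liks, priors, patch_length):
    """Sum over all rate assignments using the forward recursion.

    site_liks[s][r] is the likelihood of site s given rate category r.
    """
    trans = transition_matrix(priors, patch_length)
    k = len(priors)
    f = [priors[r] * site_liks[0][r] for r in range(k)]
    for liks in site_liks[1:]:
        f = [liks[j] * sum(f[i] * trans[i][j] for i in range(k))
             for j in range(k)]
    return sum(f)


def brute_force_likelihood(site_liks, priors, patch_length):
    """Same sum, listing every assignment; feasible only for tiny inputs."""
    trans = transition_matrix(priors, patch_length)
    total = 0.0
    for path in product(range(len(priors)), repeat=len(site_liks)):
        p = priors[path[0]] * site_liks[0][path[0]]
        for s in range(1, len(path)):
            p *= trans[path[s - 1]][path[s]] * site_liks[s][path[s]]
        total += p
    return total


# Priors and patch length from the example above (rates 1.0, 2.4, 0.0).
priors = [0.4, 0.3, 0.3]
patch_length = 2.0
# Hypothetical likelihoods for four sites under each of the three rates.
site_liks = [[0.02, 0.01, 0.0], [0.03, 0.05, 0.0],
             [0.10, 0.04, 0.10], [0.01, 0.02, 0.0]]
print(forward_likelihood(site_liks, priors, patch_length))
print(brute_force_likelihood(site_liks, priors, patch_length))
```

The recursion needs work proportional to the number of sites times the square of the number of rates, whereas the enumeration grows exponentially with the number of sites. For long sequences a practical implementation would also need to guard against numerical underflow.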

The Hidden Markov Model framework for rate variation among sites was independently developed by Yang (1993, 1994, 1995). We have implemented a general scheme for a Hidden Markov Model of rates; we allow the rates and their prior probabilities to be specified arbitrarily by the user, or by a discrete approximation to a Gamma distribution of rates (Yang, 1995), or by a mixture of a Gamma distribution and a class of invariant sites.

This feature effectively removes the artificial assumption that all sites have the same rate, and also means that we need not know in advance the identities of the sites that have a particular rate of evolution.

Another layer of rate variation also is available. The user can assign categories of rates to each site (for example, we might want first, second, and third codon positions in a protein coding sequence to be three different categories). This is done with the categories input file and the C option. We then specify (using the menu) the relative rates of evolution of sites in the different categories. For example, we might specify that first, second, and third positions evolve at relative rates of 1.0, 0.8, and 2.7.

If both user-assigned rate categories and Hidden Markov Model rates are allowed, the program assumes that the actual rate at a site is the product of the user-assigned category rate and the Hidden Markov Model regional rate. (This may not always make perfect biological sense: it would be more natural to assume some upper bound to the rate, as we have discussed in the Felsenstein and Churchill paper). Nevertheless you may want to use both types of rate variation.

INPUT FORMAT AND OPTIONS

Subject to these assumptions, the program is a correct maximum likelihood method. The input is fairly standard, with one addition. As usual the first line of the file gives the number of species and the number of sites.

Next come the species data. Each sequence starts on a new line, has a ten-character species name that must be blank-filled to be of that length, followed immediately by the species data in the one-letter code. The sequences must either be in the "interleaved" or "sequential" formats described in the Molecular Sequence Programs document. The I option selects between them. The sequences can have internal blanks in the sequence but there must be no extra blanks at the end of the terminated line. Note that a blank is not a valid symbol for a deletion.
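
For readers preparing input by script, the simplest layout (a header line, then each species on a single line) can be read as in the sketch below. This is an illustrative Python fragment, not part of the package: the function name is hypothetical, and it deliberately ignores interleaved files and sequences that continue over several lines.

```python
def parse_simple_phylip(text):
    """Parse the simplest layout: a header line, then one line per species."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_species, n_sites = (int(tok) for tok in lines[0].split()[:2])
    records = {}
    for line in lines[1:1 + n_species]:
        name = line[:10].rstrip()          # names occupy exactly ten columns
        seq = line[10:].replace(" ", "")   # internal blanks are ignored
        if len(seq) != n_sites:
            raise ValueError(f"{name}: expected {n_sites} sites, found {len(seq)}")
        records[name] = seq.upper()
    return records


# A small data set built so that names are blank-filled to ten characters.
rows = [("Alpha", "AACGTGGCCAAAT"), ("Beta", "AAGGTCGCCAAAC"),
        ("Gamma", "CATTTCGTCACAA"), ("Delta", "GGTATTTCGGCCT")]
sample = "   4   13\n" + "".join(f"{name:<10}{seq}\n" for name, seq in rows)
print(parse_simple_phylip(sample))
```

Padding the names with a format such as `{name:<10}` is an easy way to satisfy the ten-character requirement when writing files.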

The options are selected using an interactive menu. The menu looks like this:


Nucleic acid sequence Maximum Likelihood method, version 3.69

Settings for this run:
  U                 Search for best tree?  Yes
  T        Transition/transversion ratio:  2.0000
  F       Use empirical base frequencies?  Yes
  C                One category of sites?  Yes
  R           Rate variation among sites?  constant rate
  W                       Sites weighted?  No
  S        Speedier but rougher analysis?  Yes
  G                Global rearrangements?  No
  J   Randomize input order of sequences?  No. Use input order
  O                        Outgroup root?  No, use as outgroup species  1
  M           Analyze multiple data sets?  No
  I          Input sequences interleaved?  Yes
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4       Write out trees onto tree file?  Yes
  5   Reconstruct hypothetical sequences?  No

  Y to accept these or type the letter for one to change

The user either types "Y" (followed, of course, by a carriage-return) if the settings shown are to be accepted, or the letter or digit corresponding to an option that is to be changed.

The options U, W, J, O, M, and 0 are the usual ones. They are described in the main documentation file of this package. Option I is the same as in other molecular sequence programs and is described in the documentation file for the sequence programs.

The T option in this program does not stand for Threshold, but instead is the Transition/transversion option. The user is prompted for a real number greater than 0.0, as the expected ratio of transitions to transversions. Note that this is not the ratio of the first to the second kinds of events, but the resulting expected ratio of transitions to transversions. The exact relationship between these two quantities depends on the frequencies in the base pools. The default value of the T parameter if you do not use the T option is 2.0.

The F (Frequencies) option is one which may save users much time. If you want to use the empirical frequencies of the bases, observed in the input sequences, as the base frequencies, you simply use the default setting of the F option. These empirical frequencies are not really the maximum likelihood estimates of the base frequencies, but they will often be close to those values (they are the maximum likelihood estimates under a "star" or "explosion" phylogeny). If you change the setting of the F option you will be prompted for the frequencies of the four bases. These must add to 1 and are to be typed on one line separated by blanks, not commas.
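
A simple way to see what "empirical frequencies" means is to count the bases in the input sequences. The Python sketch below does only that. The function name is hypothetical, and ambiguity codes and gaps are simply skipped here, which is an assumption made for brevity and need not match how the program treats them.

```python
from collections import Counter


def empirical_base_frequencies(sequences):
    """Fraction of A, C, G and T among the unambiguous bases in `sequences`."""
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq.upper() if ch in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return {base: counts[base] / total for base in "ACGT"}


# Two short hypothetical sequences.
print(empirical_base_frequencies(["AAAC", "AGGT"]))
```

The four values sum to 1 and could be typed on one line, separated by blanks, if the F option prompts for frequencies.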

The R (Hidden Markov Model rates) option allows the user to approximate a Gamma distribution of rates among sites, or a Gamma distribution plus a class of invariant sites, or to specify how many categories of substitution rates there will be in a Hidden Markov Model of rate variation, and what are the rates and probabilities for each. By repeatedly selecting the R option one toggles among no rate variation, the Gamma, Gamma+I, and general HMM possibilities.

If you choose Gamma or Gamma+I the program will ask how many rate categories you want. If you have chosen Gamma+I, keep in mind that one rate category will be set aside for the invariant class and only the remaining ones used to approximate the Gamma distribution. For the approximation we do not use the quantile method of Yang (1995) but instead use a quadrature method using generalized Laguerre polynomials. This should give a good approximation to the Gamma distribution with as few as 5 or 6 categories.

In the Gamma and Gamma+I cases, the user will be asked to supply the coefficient of variation of the rate of substitution among sites. This is different from the parameters used by Nei and Jin (1990) but related to them: their parameter a is also known as "alpha", the shape parameter of the Gamma distribution. It is related to the coefficient of variation by

     CV = 1 / a^{1/2}

or

     a = 1 / (CV)^{2}

(their parameter b is absorbed here by the requirement that time is scaled so that the mean rate of evolution is 1 per unit time, which means that a = b). As we consider cases in which the rates are less variable we should set a larger and larger, as CV gets smaller and smaller.
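
The conversion between the coefficient of variation and the shape parameter is easy to script. In the Python sketch below the function names are hypothetical. It relies on the standard fact that a Gamma distribution with shape a and mean 1 has variance 1/a.

```python
from math import sqrt


def alpha_from_cv(cv):
    """Shape parameter a of a mean-1 Gamma distribution with the given CV."""
    if cv <= 0:
        raise ValueError("the coefficient of variation must be positive")
    return 1.0 / cv ** 2


def cv_from_alpha(alpha):
    """Coefficient of variation of a Gamma distribution with shape `alpha`."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 1.0 / sqrt(alpha)


for cv in (2.0, 1.0, 0.5, 0.25):
    print(cv, alpha_from_cv(cv))
```

The loop shows the pattern described above: as CV gets smaller, a gets larger.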

If the user instead chooses the general Hidden Markov Model option, they are first asked how many HMM rate categories there will be (for the moment there is an upper limit of 9, which should not be restrictive). Then the program asks for the rates for each category. These rates are only meaningful relative to each other, so that rates 1.0, 2.0, and 2.4 have the exact same effect as rates 2.0, 4.0, and 4.8. Note that an HMM rate category can have rate of change 0, so that this allows us to take into account that there may be a category of sites that are invariant. Note that the run time of the program will be proportional to the number of HMM rate categories: twice as many categories means twice as long a run. Finally the program will ask for the probabilities of a random site falling into each of these regional rate categories. These probabilities must be nonnegative and sum to 1. The default for the program is one category, with rate 1.0 and probability 1.0 (actually the rate does not matter in that case).

If more than one HMM rate category is specified, then another option, A, becomes visible in the menu. This allows us to specify that we want to assume that sites that have the same HMM rate category are expected to be clustered so that there is autocorrelation of rates. The program asks for the value of the average patch length. This is an expected length of patches that have the same rate. If it is 1, the rates of successive sites will be independent. If it is, say, 10.25, then the chance of change to a new rate will be 1/10.25 after every site. However the "new rate" is randomly drawn from the mix of rates, and hence could even be the same. So the actual observed length of patches with the same rate will be a bit larger than 10.25. As described below, when more than one HMM rate category is in use, the output file will include an estimate of which combination of rate categories contributed most to the likelihood.
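
The effect of redrawing the "new rate" from the mix of rates can be worked out directly. The Python sketch below, with hypothetical function names, assumes exactly the process described above: after each site a redraw happens with chance 1/(patch length), and the redrawn rate is the same one with its prior probability.

```python
def stay_probability(prior, patch_length):
    """Chance that the next site has the same rate as the current one."""
    lam = 1.0 / patch_length            # chance of redrawing the rate
    return (1.0 - lam) + lam * prior    # no redraw, or redraw of the same rate


def expected_run_length(prior, patch_length):
    """Mean length of a run of sites sharing one rate (geometric distribution)."""
    if prior >= 1.0:
        return float("inf")             # a single category never changes
    return 1.0 / (1.0 - stay_probability(prior, patch_length))


# Prior probabilities of the three rates used in the earlier example.
for prior in (0.4, 0.3, 0.3):
    print(prior, expected_run_length(prior, 10.25))
```

Under this assumption the expected run length is the patch length divided by (1 - prior), so it always exceeds the nominal patch length, and by more for rates with higher prior probability.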

Note that the autocorrelation scheme we use is somewhat different from Yang's (1995) autocorrelated Gamma distribution. I am unsure whether this difference is of any importance -- our scheme is chosen for the ease with which it can be implemented.

The C option allows user-defined rate categories. The user is prompted for the number of user-defined rates, and for the rates themselves. These rates must be nonnegative (some could be 0) and are defined relative to each other, so that if rates for three categories are set to 1 : 3 : 2.5 this would have the same meaning as setting them to 2 : 6 : 5. The assignment of rates to sites is then made by reading a file whose default name is "categories". It should contain a string of digits 1 through 9. A new line or a blank can occur after any character in this string. Thus the categories file might look like this:

122231111122411155
1155333333444
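
A categories file of this kind is simple to check before a run. The following Python sketch (hypothetical function names, and made-up rates for the five categories that appear in the example) reads the digits, ignoring blanks and line breaks, and looks up the relative rate for each site.

```python
def read_categories(text):
    """Return the category number (1 through 9) of each site, in order."""
    categories = []
    for ch in text:
        if ch.isspace():
            continue                    # blanks and new lines may occur anywhere
        if ch not in "123456789":
            raise ValueError(f"invalid category character: {ch!r}")
        categories.append(int(ch))
    return categories


def site_rates(categories, rates):
    """Relative rate for each site; category 1 uses rates[0], and so on."""
    return [rates[c - 1] for c in categories]


example = "122231111122411155\n1155333333444\n"
cats = read_categories(example)
rates = [1.0, 3.0, 2.5, 0.5, 2.0]      # hypothetical relative rates
print(len(cats), max(cats))
print(site_rates(cats, rates)[:6])
```

Comparing the number of digits read with the number of sites in the data file, and the largest digit with the number of rates supplied, catches the most common mistakes.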

With the current options R, A, and C the program has gained greatly in its ability to infer different rates at different sites and estimate phylogenies under a more realistic model. Note that Likelihood Ratio Tests can be used to test whether one combination of rates is significantly better than another, provided one rate scheme represents a restriction of another with fewer parameters. The number of parameters needed for rate variation is the number of regional rate categories, plus the number of user-defined rate categories less 2, plus one if the regional rate categories have a nonzero autocorrelation.
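
The parameter count given above, and the usual likelihood ratio statistic, can be written down in a few lines. In this Python sketch the function names and the log likelihood values are hypothetical. The statistic is the standard twice the difference in log likelihoods, to be compared with a chi-square distribution whose degrees of freedom equal the difference in the number of parameters.

```python
def rate_parameter_count(n_hmm_categories, n_user_categories, autocorrelated):
    """Parameters for rate variation, counted as described in the text."""
    return n_hmm_categories + n_user_categories - 2 + (1 if autocorrelated else 0)


def lrt_statistic(lnl_restricted, lnl_general):
    """Twice the gain in log likelihood of the general over the restricted model."""
    return 2.0 * (lnl_general - lnl_restricted)


# One rate everywhere versus three HMM rates with autocorrelation.
df = rate_parameter_count(3, 1, True) - rate_parameter_count(1, 1, False)
print(df, lrt_statistic(-1200.0, -1195.5))
```

The comparison is only valid when the simpler rate scheme is a restriction of the more general one, as the text requires.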

The G (global search) option causes, after the last species is added to the tree, each possible group to be removed and re-added. This improves the result, since the position of every species is reconsidered. It approximately triples the run-time of the program.

The User tree (option U) is read from a file whose default name is intree. The trees can be multifurcating.

If the U (user tree) option is chosen another option appears in the menu, the L option. If it is selected, it signals the program that it should take any branch lengths that are in the user tree and simply evaluate the likelihood of that tree, without further altering those branch lengths. This means that if some branches have lengths and others do not, the program will estimate the lengths of those that do not have lengths given in the user tree. Note that the program Retree can be used to add and remove lengths from a tree.

The U option can read a multifurcating tree. This allows us to test the hypothesis that a certain branch has zero length (we can also do this by using Retree to set the length of that branch to 0.0 when it is present in the tree). By doing a series of runs with different specified lengths for a branch we can plot a likelihood curve for its branch length while allowing all other branches to adjust their lengths to it. If all branches have lengths specified, none of them will be iterated. This is useful to allow a tree produced by another method to have its likelihood evaluated. The L option has no effect and does not appear in the menu if the U option is not used.

The W (Weights) option is invoked in the usual way, with only weights 0 and 1 allowed. It selects a set of sites to be analyzed, ignoring the others. The sites selected are those with weight 1. If the W option is not invoked, all sites are analyzed. The Weights (W) option takes the weights from a file whose default name is "weights". The weights follow the format described in the main documentation file.

The M (multiple data sets) option will ask you whether you want to use multiple sets of weights (from the weights file) or multiple data sets from the input file. The ability to use a single data set with multiple weights means that much less disk space will be used for this input data. The bootstrapping and jackknifing tool Seqboot has the ability to create a weights file with multiple weights. Note also that when we use multiple weights for bootstrapping we can also then maintain different rate categories for different sites in a meaningful way. If you use the multiple data sets option rather than multiple weights, you should not at the same time use the user-defined rate categories option (option C), because the user-defined rate categories could then be associated with the wrong sites. This is not a concern when the M option is used by using multiple weights.

The algorithm used for searching among trees is faster than it was in version 3.5, thanks to using a technique invented by David Swofford and J. S. Rogers. This involves not iterating most branch lengths on most trees when searching among tree topologies. This is of necessity a "quick-and-dirty" search but it saves much time. There is a menu option (option S) which can turn off this search and revert to the earlier search method which iterated branch lengths in all topologies. This will be substantially slower but will also be a bit more likely to find the tree topology of highest likelihood.

OUTPUT FORMAT

The output starts by giving the number of species, the number of sites, and the base frequencies for A, C, G, and T that have been specified. It then prints out the transition/transversion ratio that was specified or used by default. It also uses the base frequencies to compute the actual transition/transversion ratio implied by the parameter.

If the R (HMM rates) option is used a table of the relative rates of expected substitution at each category of sites is printed, as well as the probabilities of each of those rates.

There then follow the data sequences, if the user has selected the menu option to print them out, with the base sequences printed in groups of ten bases along the lines of the Genbank and EMBL formats. The trees found are printed as an unrooted tree topology (possibly rooted by outgroup if so requested). The internal nodes are numbered arbitrarily for the sake of identification. The number of trees evaluated so far and the log likelihood of the tree are also given. Note that the trees printed out have a trifurcation at the base. The branch lengths in the diagram are roughly proportional to the estimated branch lengths, except that very short branches are printed out at least three characters in length so that the connections can be seen.

A table is printed showing the length of each tree segment (in units of expected nucleotide substitutions per site), as well as (very) rough confidence limits on their lengths. If a confidence limit is negative, this indicates that rearrangement of the tree in that region is not excluded, while if both limits are positive, rearrangement is still not necessarily excluded because the variance calculation on which the confidence limits are based results in an underestimate, which makes the confidence limits too narrow.

In addition to the confidence limits, the program performs a crude Likelihood Ratio Test (LRT) for each branch of the tree. The program computes the ratio of likelihoods with and without this branch length forced to zero length. This is done by comparing the likelihoods changing only that branch length. A truly correct LRT would force that branch length to zero and also allow the other branch lengths to adjust to that. The result would be a likelihood ratio closer to 1. Therefore the present LRT will err on the side of being too significant. YOU ARE WARNED AGAINST TAKING IT TOO SERIOUSLY. If you want to get a better likelihood curve for a branch length you can do multiple runs with different prespecified lengths for that branch, as discussed above in the discussion of the L option.

One should also realize that if you are looking not at a previously-chosen branch but at all branches, that you are seeing the results of multiple tests. With 20 tests, one is expected to reach significance at the P = .05 level purely by chance. You should therefore use a much more conservative significance level, such as .05 divided by the number of tests. The significance of these tests is shown by printing asterisks next to the confidence interval on each branch length. It is important to keep in mind that both the confidence limits and the tests are very rough and approximate, and probably indicate more significance than they should. Nevertheless, maximum likelihood is one of the few methods that can give you any indication of its own error; most other methods simply fail to warn the user that there is any error! (In fact, whole philosophical schools of taxonomists exist whose main point seems to be that there isn't any error, that the "most parsimonious" tree is the best tree by definition and that's that).

The log likelihood printed out with the final tree can be used to perform various likelihood ratio tests. One can, for example, compare runs with different values of the expected transition/transversion ratio to determine which value is the maximum likelihood estimate, and what is the allowable range of values (using a likelihood ratio test, which you will find described in mathematical statistics books). One could also estimate the base frequencies in the same way. Both of these, particularly the latter, require multiple runs of the program to evaluate different possible values, and this might get expensive.

If the U (User Tree) option is used and more than one tree is supplied, and the program is not told to assume autocorrelation between the rates at different sites, the program also performs a statistical test of each of these trees against the one with highest likelihood. If there are two user trees, the test done is one which is due to Kishino and Hasegawa (1989), a version of a test originally introduced by Templeton (1983). In this implementation it uses the mean and variance of log-likelihood differences between trees, taken across sites. If the two trees' means are more than 1.96 standard deviations different then the trees are declared significantly different. This use of the empirical variance of log-likelihood differences is more robust and nonparametric than the classical likelihood ratio test, and may to some extent compensate for any lack of realism in the model underlying this program.
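
The two-tree comparison can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the program's code: the function name and the per-site log likelihoods are hypothetical, and the variance of the summed difference is estimated as the number of sites times the sample variance of the per-site differences, which may differ in detail from the estimator the program uses.

```python
from math import sqrt


def kh_test(site_lnl_a, site_lnl_b, critical=1.96):
    """Compare two trees from their per-site log likelihoods.

    Returns (total difference, its estimated standard deviation, significant?).
    """
    if len(site_lnl_a) != len(site_lnl_b) or len(site_lnl_a) < 2:
        raise ValueError("need two equally long lists with at least two sites")
    n = len(site_lnl_a)
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    total = sum(diffs)
    mean = total / n
    var_site = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    sd_total = sqrt(n * var_site)
    return total, sd_total, abs(total) > critical * sd_total


# Hypothetical per-site log likelihoods for two user trees.
tree_a = [-1.0, -0.9, -1.1, -1.0]
tree_b = [-2.0, -2.0, -2.0, -2.0]
print(kh_test(tree_a, tree_b))
```

Because the variance comes from the observed site-to-site differences rather than from the model, a few sites with large differences can dominate the result.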

If there are more than two trees, the test done is an extension of the KHT test, due to Shimodaira and Hasegawa (1999). They pointed out that a correction for the number of trees was necessary, and they introduced a resampling method to make this correction. In the version used here the variances and covariances of the sum of log likelihoods across sites are computed for all pairs of trees. To test whether the difference between each tree and the best one is larger than could have been expected if they all had the same expected log-likelihood, log-likelihoods for all trees are sampled with these covariances and equal means (Shimodaira and Hasegawa's "least favorable hypothesis"), and a P value is computed from the fraction of times the difference between the tree's value and the highest log-likelihood exceeds that actually observed. Note that this sampling needs random numbers, and so the program will prompt the user for a random number seed if one has not already been supplied. With the two-tree KHT test no random numbers are used.

In either the KHT or the SH test the program prints out a table of the log-likelihoods of each tree, the differences of each from the highest one, the variance of that quantity as determined by the log-likelihood differences at individual sites, and a conclusion as to whether that tree is or is not significantly worse than the best one. However the test is not available if we assume that there is autocorrelation of rates at neighboring sites (option A) and is not done in those cases.

The branch lengths printed out are scaled in terms of expected numbers of substitutions, counting both transitions and transversions but not replacements of a base by itself, and scaled so that the average rate of change, averaged over all sites analyzed, is set to 1.0 if there are multiple categories of sites. This means that whether or not there are multiple categories of sites, the expected fraction of change for very small branches is equal to the branch length. Of course, when a branch is twice as long this does not mean that there will be twice as much net change expected along it, since some of the changes occur in the same site and overlie or even reverse each other. The branch length estimates here are in terms of the expected underlying numbers of changes. That means that a branch of length 0.26 is 26 times as long as one which would show a 1% difference between the nucleotide sequences at the beginning and end of the branch. But we would not expect the sequences at the beginning and end of the branch to be 26% different, as there would be some overlaying of changes.
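
The gap between branch length and observed difference can be illustrated with a much simpler model than the one this program uses. The Python sketch below uses the Jukes-Cantor formula (equal base frequencies, no transition/transversion bias) purely as a stand-in, so its numbers will not match those implied by Dnaml's own model. The function name is hypothetical.

```python
from math import exp


def jc_expected_difference(branch_length):
    """Expected fraction of differing sites under the Jukes-Cantor model."""
    return 0.75 * (1.0 - exp(-4.0 * branch_length / 3.0))


for t in (0.01, 0.26, 1.0):
    print(t, round(jc_expected_difference(t), 4))
```

Under this stand-in model a branch of length 0.01 shows very nearly a 1% difference, while a branch of length 0.26 shows roughly 22% rather than 26%, because some changes fall on sites that have already changed.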

Confidence limits on the branch lengths are also given. Of course a negative value of the branch length is meaningless, and a confidence limit overlapping zero simply means that the branch length is not necessarily significantly different from zero. Because of limitations of the numerical algorithm, branch length estimates of zero will often print out as small numbers such as 0.00001. If you see a branch length that small, it is really estimated to be of zero length. Note that versions 2.7 and earlier of this program printed out the branch lengths in terms of expected probability of change, so that they were scaled differently.

Another possible source of confusion is the existence of negative values for the log likelihood. This is not really a problem; the log likelihood is not a probability but the logarithm of a probability. When it is negative it simply means that the corresponding probability is less than one (since we are seeing its logarithm). The log likelihood is maximized by being made more positive: -30.23 is worse than -29.14.

At the end of the output, if the R option is in effect with multiple HMM rates, the program will print a list of what site categories contributed the most to the final likelihood. This combination of HMM rate categories need not have contributed a majority of the likelihood, just a plurality. Still, it will be helpful as a view of where the program infers that the higher and lower rates are. Note that the use in this calculation of the prior probabilities of different rates, and the average patch length, gives this inference a "smoothed" appearance: some other combination of rates might make a greater contribution to the likelihood, but be discounted because it conflicts with this prior information. See the example output below to see what this printout of rate categories looks like. A second list will also be printed out, showing for each site which rate accounted for the highest fraction of the likelihood. If the fraction of the likelihood accounted for is less than 95%, a dot is printed instead.

Option 3 in the menu controls whether the tree is printed out into the output file. This is on by default, and usually you will want to leave it this way. However for runs with multiple data sets such as bootstrapping runs, you will primarily be interested in the trees which are written onto the output tree file, rather than the trees printed on the output file. To keep the output file from becoming too large, it may be wisest to use option 3 to prevent trees being printed onto the output file.

Option 4 in the menu controls whether the tree estimated by the program is written onto a tree file. The default name of this output tree file is "outtree". If the U option is in effect, all the user-defined trees are written to the output tree file.

Option 5 in the menu controls whether ancestral states are estimated at each node in the tree. If it is in effect, a table of ancestral sequences is printed out (including the sequences in the tip species which are the input sequences). In that table, if a site has a base which accounts for more than 95% of the likelihood, it is printed in capital letters (A rather than a). If the best nucleotide accounts for less than 50% of the likelihood, the program prints out an ambiguity code (such as M for "A or C") for the set of nucleotides which, taken together, account for more than half of the likelihood. The ambiguity codes are listed in the sequence programs documentation file. One limitation of the current version of the program is that when there are multiple HMM rates (option R) the reconstructed nucleotides are based on only the single assignment of rates to sites which accounts for the largest amount of the likelihood. Thus the assessment of 95% of the likelihood, in tabulating the ancestral states, refers to 95% of the likelihood that is accounted for by that particular combination of rates.
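
The printing rule for a single site at a single node can be sketched as follows. This is only an illustration of the rule described above, not the program's own code: the function name `site_symbol`, the dictionary of likelihood fractions, and the choice to print ambiguity codes in lower case are all assumptions made for the example.

```python
# Standard IUPAC ambiguity codes for sets of nucleotides.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AC"): "M", frozenset("AG"): "R", frozenset("AT"): "W",
    frozenset("CG"): "S", frozenset("CT"): "Y", frozenset("GT"): "K",
    frozenset("ACG"): "V", frozenset("ACT"): "H",
    frozenset("AGT"): "D", frozenset("CGT"): "B",
    frozenset("ACGT"): "N",
}

def site_symbol(fractions):
    """Return the symbol for one site.

    fractions: dict mapping each base to its fraction of the likelihood
    (hypothetical input; the four values should sum to 1).
    """
    ranked = sorted(fractions, key=fractions.get, reverse=True)
    best = ranked[0]
    if fractions[best] > 0.95:
        return best.upper()          # well supported: capital letter
    if fractions[best] >= 0.5:
        return best.lower()          # best base, but below 95%
    # Best base is below 50%: collect bases until they jointly exceed half.
    chosen, total = set(), 0.0
    for base in ranked:
        chosen.add(base)
        total += fractions[base]
        if total > 0.5:
            break
    return IUPAC[frozenset(chosen)].lower()   # lower case is an assumption
```

For instance, fractions of 0.45 for G, 0.45 for T, and 0.05 each for A and C give the code for "G or T".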

PROGRAM CONSTANTS

The constants defined at the beginning of the program include "maxtrees", the maximum number of user trees that can be processed. It is small (100) at present to save some further memory but the cost of increasing it is not very great. Other constants include "maxcategories", the maximum number of site categories, "namelength", the length of species names in characters, and three others, "smoothings", "iterations", and "epsilon", that help "tune" the algorithm and define the compromise between execution speed and the quality of the branch lengths found by iteratively maximizing the likelihood. Reducing iterations and smoothings, and increasing epsilon, will result in faster execution but a worse result. These values will not usually have to be changed.

The program spends most of its time doing real arithmetic. The algorithm, with separate and independent computations occurring for each pattern, lends itself readily to parallel processing.

PAST AND FUTURE OF THE PROGRAM

This program, which in version 2.6 replaced the old version of Dnaml, is not derived directly from it but instead was developed by modifying Contml, with which it shares many of its data structures and much of its strategy. It was speeded up by two major developments, the use of aliasing of nucleotide sites (version 3.1) and pretabulation of some exponentials (added by Akiko Fuseki in version 3.4). In version 3.5 the Hidden Markov Model code was added and the method of iterating branch lengths was changed from an EM algorithm to direct search. The Hidden Markov Model code slows things down, especially if there is autocorrelation between sites, so this version is slower than version 3.4. Nevertheless we hope that the sacrifice is worth it.

One change that is needed in the future is to put in some way of allowing for base composition of nucleotide sequences in different parts of the phylogeny.


TEST DATA SET

   5   13
Alpha     AACGTGGCCAAAT
Beta      AAGGTCGCCAAAC
Gamma     CATTTCGTCACAA
Delta     GGTATTTCGGCCT
Epsilon   GGGATCTCGGCCC


CONTENTS OF OUTPUT FILE (with all numerical options on)

(It was run with HMM rates having gamma-distributed rates approximated by 5 rate categories, with coefficient of variation of rates 1.0, and with patch length parameter = 1.5. Two user-defined rate categories were used, one for the first 6 sites, the other for the last 7, with rates 1.0 : 2.0. Weights were used, with sites 1 and 13 given weight 0, and all others weight 1.)


Nucleic acid sequence Maximum Likelihood method, version 3.69

 5 species,  13  sites

    Site categories are:

             1111112222 222


    Sites are weighted as follows:

             01111 11111 110


Name            Sequences
----            ---------

Alpha        AACGTGGCCA AAT
Beta         ..G..C.... ..C
Gamma        C.TT.C.T.. C.A
Delta        GGTA.TT.GG CC.
Epsilon      GGGA.CT.GG CCC



Empirical Base Frequencies:

   A       0.23636
   C       0.29091
   G       0.25455
  T(U)     0.21818


Transition/transversion ratio =   2.000000


Discrete approximation to gamma distributed rates
 Coefficient of variation of rates = 1.000000  (alpha = 1.000000)

State in HMM    Rate of change    Probability

        1           0.264            0.522
        2           1.413            0.399
        3           3.596            0.076
        4           7.086            0.0036
        5          12.641            0.000023

Expected length of a patch of sites having the same rate =    1.500


Site category   Rate of change

        1           1.000
        2           2.000



  +Beta      
  |  
  |                                                  +Epsilon   
  |   +----------------------------------------------3  
  1---2                                              +-Delta     
  |   |  
  |   +--Gamma     
  |  
  +-Alpha     


remember: this is an unrooted tree!

Ln Likelihood =   -58.41388

 Between        And            Length      Approx. Confidence Limits
 -------        ---            ------      ------- ---------- ------

     1          Alpha             0.32320     (     zero,     0.93246) **
     1          Beta              0.02699     (     zero,     0.49959)
     1             2              0.65789     (     zero,     2.29501)
     2             3              7.11637     (     zero,    20.73855) **
     3          Epsilon           0.00006     (     zero,     0.52703)
     3          Delta             0.30602     (     zero,     0.83268) **
     2          Gamma             0.43465     (     zero,     2.10073)

     *  = significantly positive, P < 0.05
     ** = significantly positive, P < 0.01

Combination of categories that contributes the most to the likelihood:

             1122121111 111

Most probable category at each site if > 0.95 probability ("." otherwise)

             .......... ...

Probable sequences at interior nodes:

  node       Reconstructed sequence (caps if > 0.95)

    1        .AgGTCGCCA AA.
 Beta        AAGGTCGCCA AAC
    2        .AkkTcGtCA cA.
    3        .GGATCTCGG CC.
 Epsilon     GGGATCTCGG CCC
 Delta       GGTATTTCGG CCT
 Gamma       CATTTCGTCA CAA
 Alpha       AACGTGGCCA AAT

dnamlk

version 3.696

Dnamlk -- DNA Maximum Likelihood program
with molecular clock

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program implements the maximum likelihood method for DNA sequences under the constraint that the trees estimated must be consistent with a molecular clock. The molecular clock is the assumption that the tips of the tree are all equidistant, in branch length, from its root. This program is indirectly related to Dnaml. Details of the algorithm are not yet published, but many aspects of it are similar to Dnaml, and these are published in the paper by Felsenstein and Churchill (1996). The model of base substitution allows the expected frequencies of the four bases to be unequal, allows the expected frequencies of transitions and transversions to be unequal, and has several ways of allowing different rates of evolution at different sites.
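
The clock constraint can be illustrated with a small check that every tip lies at the same distance from the root. This is a sketch only: the tuple representation of a tree `(name, branch_length, children)`, the function names, and the branch lengths are hypothetical and are not taken from the program.

```python
import math

def tip_depths(node, depth=0.0):
    """Yield (tip name, root-to-tip distance) for every tip below node."""
    name, length, children = node
    depth += length
    if not children:
        yield name, depth
    for child in children:
        yield from tip_depths(child, depth)

def is_clocklike(root, rel_tol=1e-9):
    """True if all tips are equidistant (in branch length) from the root."""
    depths = [d for _, d in tip_depths(root)]
    return all(math.isclose(d, depths[0], rel_tol=rel_tol) for d in depths)

# Hypothetical rooted trees: (name, branch length, list of children)
clock_tree = ("root", 0.0, [
    ("Alpha", 0.3, []),
    ("n1", 0.1, [("Beta", 0.2, []), ("Gamma", 0.2, [])]),
])
nonclock_tree = ("root", 0.0, [
    ("Alpha", 0.3, []),
    ("n1", 0.1, [("Beta", 0.2, []), ("Gamma", 0.5, [])]),
])

print(is_clocklike(clock_tree), is_clocklike(nonclock_tree))
```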

The assumptions of the model are:

  1. Each site in the sequence evolves independently.
  2. Different lineages evolve independently.
  3. There is a molecular clock.
  4. Each site undergoes substitution at an expected rate which is chosen from a series of rates (each with a probability of occurrence) which we specify.
  5. All relevant sites are included in the sequence, not just those that have changed or those that are "phylogenetically informative".
  6. A substitution consists of one of two sorts of events:
    (a)
    The first kind of event consists of the replacement of the existing base by a base drawn from a pool of purines or a pool of pyrimidines (depending on whether the base being replaced was a purine or a pyrimidine). It can lead either to no change or to a transition.
    (b)
    The second kind of event consists of the replacement of the existing base by a base drawn at random from a pool of bases at known frequencies, independently of the identity of the base which is being replaced. This could lead to no change, to a transition, or to a transversion.

    The ratio of the two purines in the purine replacement pool is the same as their ratio in the overall pool, and similarly for the pyrimidines.

    The ratios of transitions to transversions can be set by the user. The substitution process can be diagrammed as follows: Suppose that you specified A, C, G, and T base frequencies of 0.24, 0.28, 0.27, and 0.21.

    • First kind of event:

      1. Determine whether the existing base is a purine or a pyrimidine.
      2. Draw from the proper pool:

              Purine pool:                Pyrimidine pool:
        
             |               |            |               |
             |   0.4706 A    |            |   0.5714 C    |
             |   0.5294 G    |            |   0.4286 T    |
             | (ratio is     |            | (ratio is     |
             |  0.24 : 0.27) |            |  0.28 : 0.21) |
             |_______________|            |_______________|
        

    • Second kind of event:

      Draw from the overall pool:

      
                    |                  |
                    |      0.24 A      |
                    |      0.28 C      |
                    |      0.27 G      |
                    |      0.21 T      |
                    |__________________|
      

    Note that if the existing base is, say, an A, the first kind of event has a 0.4706 probability of "replacing" it by another A. The second kind of event has a 0.24 chance of replacing it by another A. This rather disconcerting model is used because it has nice mathematical properties that make likelihood calculations far easier. A closely similar, but not precisely identical model having different rates of transitions and transversions has been used by Hasegawa et al. (1985b). The transition probability formulas for the current model were given (with my permission) by Kishino and Hasegawa (1989). Another explanation is available in the paper by Felsenstein and Churchill (1996).
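
The pool fractions in the diagrams follow directly from the base frequencies. The sketch below recomputes them for the frequencies 0.24, 0.28, 0.27, and 0.21 used above; the names `pool` and `replacement_probs` are made up for the illustration and do not correspond to anything in the program.

```python
freqs = {"A": 0.24, "C": 0.28, "G": 0.27, "T": 0.21}
PURINES, PYRIMIDINES = "AG", "CT"

def pool(bases, freqs):
    """Renormalize the overall frequencies within one group of bases."""
    total = sum(freqs[b] for b in bases)
    return {b: freqs[b] / total for b in bases}

purine_pool = pool(PURINES, freqs)          # A : G in the ratio 0.24 : 0.27
pyrimidine_pool = pool(PYRIMIDINES, freqs)  # C : T in the ratio 0.28 : 0.21

def replacement_probs(existing, kind):
    """Distribution of the replacement base for event kind 1 or 2."""
    if kind == 1:
        # First kind: draw from the pool matching the existing base.
        group = PURINES if existing in PURINES else PYRIMIDINES
        return pool(group, freqs)
    # Second kind: draw from the overall pool, ignoring the existing base.
    return dict(freqs)

print({b: round(p, 4) for b, p in purine_pool.items()})
print({b: round(p, 4) for b, p in pyrimidine_pool.items()})
```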

Note the assumption that we are looking at all sites, including those that have not changed at all. It is important not to restrict attention to some sites based on whether or not they have changed; doing that would bias branch lengths by making them too long, and that in turn would cause the method to misinterpret the meaning of those sites that had changed.

This program uses a Hidden Markov Model (HMM) method of inferring different rates of evolution at different sites. This was described in a paper by me and Gary Churchill (1996). It allows us to specify to the program that there will be a number of different possible evolutionary rates, what the prior probabilities of occurrence of each is, and what the average length of a patch of sites all having the same rate is. The rates can also be chosen by the program to approximate a Gamma distribution of rates, or a Gamma distribution plus a class of invariant sites. The program computes the likelihood by summing it over all possible assignments of rates to sites, weighting each by its prior probability of occurrence.

For example, if we have used the C and A options (described below) to specify that there are three possible rates of evolution, 1.0, 2.4, and 0.0, that the prior probabilities of a site having these rates are 0.4, 0.3, and 0.3, and that the average patch length (number of consecutive sites with the same rate) is 2.0, the program will sum the likelihood over all possibilities, but give less weight to those that (say) assign all sites to rate 2.4, or that fail to have consecutive sites that have the same rate.
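
The sum over all assignments of rates to sites need not be done by enumeration. The sketch below, which is an illustration and not the program's code, uses the prior probabilities 0.4, 0.3, 0.3 and patch length 2.0 from the example, together with made-up per-site likelihoods, and shows that a site-by-site (forward) recursion gives the same total as brute-force enumeration. It assumes the transition scheme described under the A option: after each site a new rate is drawn with probability 1/(patch length), from the prior mix of rates.

```python
from itertools import product

priors = [0.4, 0.3, 0.3]     # prior probabilities of the rates 1.0, 2.4, 0.0
patch_length = 2.0
lam = 1.0 / patch_length     # chance of drawing a new rate after each site

def transition(i, j):
    """Probability that the next site has rate j given this site has rate i."""
    return (1.0 - lam) * float(i == j) + lam * priors[j]

# Hypothetical likelihoods of the data at each site, given each rate.
site_like = [
    [0.020, 0.011, 0.001],
    [0.015, 0.018, 0.000],
    [0.030, 0.022, 0.040],
    [0.028, 0.019, 0.040],
]

def brute_force(site_like):
    """Sum over every assignment of rates to sites, weighted by its prior."""
    total = 0.0
    for assignment in product(range(len(priors)), repeat=len(site_like)):
        p = priors[assignment[0]]
        for prev, cur in zip(assignment, assignment[1:]):
            p *= transition(prev, cur)
        for site, cat in enumerate(assignment):
            p *= site_like[site][cat]
        total += p
    return total

def forward(site_like):
    """Same sum, accumulated one site at a time."""
    k = len(priors)
    f = [priors[i] * site_like[0][i] for i in range(k)]
    for like in site_like[1:]:
        s = sum(f)
        f = [like[j] * ((1.0 - lam) * f[j] + lam * priors[j] * s)
             for j in range(k)]
    return sum(f)

print(brute_force(site_like), forward(site_like))
```

The recursion does work proportional to the number of sites times the number of rates, whereas the enumeration grows exponentially with the number of sites.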

The Hidden Markov Model framework for rate variation among sites was independently developed by Yang (1993, 1994, 1995). We have implemented a general scheme for a Hidden Markov Model of rates; we allow the rates and their prior probabilities to be specified arbitrarily by the user, or by a discrete approximation to a Gamma distribution of rates (Yang, 1995), or by a mixture of a Gamma distribution and a class of invariant sites.

This feature effectively removes the artificial assumption that all sites have the same rate, and also means that we need not know in advance the identities of the sites that have a particular rate of evolution.

Another layer of rate variation also is available. The user can assign categories of rates to each site (for example, we might want first, second, and third codon positions in a protein coding sequence to be three different categories). This is done with the categories input file and the C option. We then specify (using the menu) the relative rates of evolution of sites in the different categories. For example, we might specify that first, second, and third positions evolve at relative rates of 1.0, 0.8, and 2.7.

If both user-assigned rate categories and Hidden Markov Model rates are allowed, the program assumes that the actual rate at a site is the product of the user-assigned category rate and the Hidden Markov Model regional rate. (This may not always make perfect biological sense: it would be more natural to assume some upper bound to the rate, as we have discussed in the Felsenstein and Churchill paper). Nevertheless you may want to use both types of rate variation.

INPUT FORMAT AND OPTIONS

Subject to these assumptions, the program is a correct maximum likelihood method. The input is fairly standard, with one addition. As usual the first line of the file gives the number of species and the number of sites.

Next come the species data. Each sequence starts on a new line, has a ten-character species name that must be blank-filled to be of that length, followed immediately by the species data in the one-letter code. The sequences must either be in the "interleaved" or "sequential" formats described in the Molecular Sequence Programs document. The I option selects between them. The sequences can have internal blanks in the sequence but there must be no extra blanks at the end of the terminated line. Note that a blank is not a valid symbol for a deletion.
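
A minimal reader for this layout, handling only the simple case in which every sequence fits on a single line, might look like the sketch below. The function name `read_phylip_simple` is hypothetical, and the example data are the five-species test data set; files whose sequences continue over several lines or blocks would need more work than is shown here.

```python
def read_phylip_simple(text):
    """Parse input in which each sequence fits on one line."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_species, n_sites = (int(tok) for tok in lines[0].split())
    seqs = {}
    for line in lines[1:1 + n_species]:
        name = line[:10].strip()                   # ten-character, blank-filled name
        seq = line[10:].replace(" ", "").upper()   # internal blanks are allowed
        if len(seq) != n_sites:
            raise ValueError(f"{name}: expected {n_sites} sites, got {len(seq)}")
        seqs[name] = seq
    if len(seqs) != n_species:
        raise ValueError("fewer species than the first line promises")
    return seqs

example = """\
   5   13
Alpha     AACGTGGCCAAAT
Beta      AAGGTCGCCAAAC
Gamma     CATTTCGTCACAA
Delta     GGTATTTCGGCCT
Epsilon   GGGATCTCGGCCC
"""
seqs = read_phylip_simple(example)
print(seqs["Alpha"])
```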

The options are selected using an interactive menu. The menu looks like this:


Nucleic acid sequence
   Maximum Likelihood method with molecular clock, version 3.69

Settings for this run:
  U                 Search for best tree?  Yes
  T        Transition/transversion ratio:  2.0
  F       Use empirical base frequencies?  Yes
  C   One category of substitution rates?  Yes
  R           Rate variation among sites?  constant rate
  G                Global rearrangements?  No
  W                       Sites weighted?  No
  J   Randomize input order of sequences?  No. Use input order
  M           Analyze multiple data sets?  No
  I          Input sequences interleaved?  Yes
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4       Write out trees onto tree file?  Yes
  5   Reconstruct hypothetical sequences?  No

Are these settings correct? (type Y or the letter for one to change)

The user either types "Y" (followed, of course, by a carriage-return) if the settings shown are to be accepted, or the letter or digit corresponding to an option that is to be changed.

The options U, W, J, M, and 0 are the usual ones. They are described in the main documentation file of this package. Option I is the same as in other molecular sequence programs and is described in the documentation file for the sequence programs.

The T option in this program does not stand for Threshold, but instead is the Transition/transversion option. The user is prompted for a real number greater than 0.0, as the expected ratio of transitions to transversions. Note that this is not the ratio of the first to the second kinds of events, but the resulting expected ratio of transitions to transversions. The exact relationship between these two quantities depends on the frequencies in the base pools. The default value of the T parameter if you do not use the T option is 2.0.
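
The relationship between the two quantities can be worked out from the two-event model described earlier. The sketch below is a derivation under that description, not code from the program: the first kind of event produces transitions in proportion to pA pG / pR + pC pT / pY (doubled), the second kind produces transitions in proportion to pA pG + pC pT (doubled) and transversions in proportion to pR pY (doubled), where pR and pY are the total purine and pyrimidine frequencies. The function names are hypothetical.

```python
def event_ratio(ttratio, fa, fc, fg, ft):
    """Rate of first-kind events relative to second-kind events that
    yields the requested expected transition/transversion ratio."""
    fr, fy = fa + fg, fc + ft
    numerator = ttratio * fr * fy - fa * fg - fc * ft
    denominator = fa * fg / fr + fc * ft / fy
    if numerator < 0:
        raise ValueError("ratio too small to be reached with these frequencies")
    return numerator / denominator

def expected_ttratio(ratio, fa, fc, fg, ft):
    """Expected transition/transversion ratio for a given event-rate ratio."""
    fr, fy = fa + fg, fc + ft
    transitions = (ratio * 2 * (fa * fg / fr + fc * ft / fy)
                   + 2 * (fa * fg + fc * ft))
    transversions = 2 * fr * fy
    return transitions / transversions

r = event_ratio(2.0, 0.24, 0.28, 0.27, 0.21)
print(round(r, 4), expected_ttratio(r, 0.24, 0.28, 0.27, 0.21))
```

With the base frequencies 0.24, 0.28, 0.27, and 0.21 and the default ratio of 2.0, the first kind of event would occur about 1.52 times as often as the second, which shows that the two ratios are not the same number.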

The F (Frequencies) option is one which may save users much time. If you want to use the empirical frequencies of the bases, observed in the input sequences, as the base frequencies, you simply use the default setting of the F option. These empirical frequencies are not really the maximum likelihood estimates of the base frequencies, but they will often be close to those values (they are maximum likelihood estimates under a "star" or "explosion" phylogeny). If you change the setting of the F option you will be prompted for the frequencies of the four bases. These must add to 1 and are to be typed on one line separated by blanks, not commas.
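
For data without ambiguity codes, empirical frequencies amount to counting bases at the sites being analyzed. The sketch below is a simplified illustration (it skips any symbol other than A, C, G, and T, which the program itself need not do), applied to the five-species test data set with sites 1 and 13 given weight 0. The function name `empirical_frequencies` is an assumption.

```python
from collections import Counter

def empirical_frequencies(seqs, weights=None):
    """Weighted base counts over all sequences, scaled to sum to 1."""
    n_sites = len(next(iter(seqs.values())))
    if weights is None:
        weights = [1] * n_sites
    counts = Counter()
    for seq in seqs.values():
        for base, w in zip(seq.upper(), weights):
            if base in "ACGT":          # simplification: ignore other symbols
                counts[base] += w
    total = sum(counts.values())
    return {b: counts[b] / total for b in "ACGT"}

seqs = {
    "Alpha":   "AACGTGGCCAAAT",
    "Beta":    "AAGGTCGCCAAAC",
    "Gamma":   "CATTTCGTCACAA",
    "Delta":   "GGTATTTCGGCCT",
    "Epsilon": "GGGATCTCGGCCC",
}
weights = [0] + [1] * 11 + [0]          # sites 1 and 13 excluded
freqs = empirical_frequencies(seqs, weights)
print({b: round(f, 5) for b, f in freqs.items()})
```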

The R (Hidden Markov Model rates) option allows the user to approximate a Gamma distribution of rates among sites, or a Gamma distribution plus a class of invariant sites, or to specify how many categories of substitution rates there will be in a Hidden Markov Model of rate variation, and what are the rates and probabilities for each. By repeatedly selecting the R option one toggles among no rate variation, the Gamma, Gamma+I, and general HMM possibilities.

If you choose Gamma or Gamma+I the program will ask how many rate categories you want. If you have chosen Gamma+I, keep in mind that one rate category will be set aside for the invariant class and only the remaining ones used to approximate the Gamma distribution. For the approximation we do not use the quantile method of Yang (1995) but instead use a quadrature method using generalized Laguerre polynomials. This should give a good approximation to the Gamma distribution with as few as 5 or 6 categories.

In the Gamma and Gamma+I cases, the user will be asked to supply the coefficient of variation of the rate of substitution among sites. This is different from the parameters used by Nei and Jin (1990) but related to them: their parameter a is also known as "alpha", the shape parameter of the Gamma distribution. It is related to the coefficient of variation by

     CV = 1 / a^{1/2}

or

     a = 1 / (CV)^{2}

(their parameter b is absorbed here by the requirement that time is scaled so that the mean rate of evolution is 1 per unit time, which means that a = b). As we consider cases in which the rates are less variable we should set a larger and larger, as CV gets smaller and smaller.
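
The conversion between the coefficient of variation and the shape parameter is a one-line calculation in each direction. The function names below are hypothetical.

```python
import math

def alpha_from_cv(cv):
    """Shape parameter of the Gamma distribution: a = 1 / CV^2."""
    return 1.0 / cv ** 2

def cv_from_alpha(alpha):
    """Coefficient of variation of rates: CV = 1 / sqrt(a)."""
    return 1.0 / math.sqrt(alpha)

for cv in (2.0, 1.0, 0.5, 0.25):
    print(cv, alpha_from_cv(cv))
```

Halving the coefficient of variation quadruples the shape parameter, so nearly constant rates correspond to very large values of a.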

If the user instead chooses the general Hidden Markov Model option, they are first asked how many HMM rate categories there will be (for the moment there is an upper limit of 9, which should not be restrictive). Then the program asks for the rates for each category. These rates are only meaningful relative to each other, so that rates 1.0, 2.0, and 2.4 have the exact same effect as rates 2.0, 4.0, and 4.8. Note that an HMM rate category can have rate of change 0, so that this allows us to take into account that there may be a category of sites that are invariant. Note that the run time of the program will be proportional to the number of HMM rate categories: twice as many categories means twice as long a run. Finally the program will ask for the probabilities of a random site falling into each of these regional rate categories. These probabilities must be nonnegative and sum to 1. The default for the program is one category, with rate 1.0 and probability 1.0 (actually the rate does not matter in that case).
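
One way to see why only relative rates matter is to rescale each set of rates so that its mean, weighted by the category probabilities, is 1. The sketch below does this for the two sets of rates mentioned above; the probabilities 0.5, 0.3, and 0.2 and the function name are made up for the illustration.

```python
def normalized_rates(rates, probs):
    """Rescale rates so that their probability-weighted mean is 1."""
    mean = sum(r * p for r, p in zip(rates, probs))
    return [r / mean for r in rates]

probs = [0.5, 0.3, 0.2]                       # hypothetical probabilities
first = normalized_rates([1.0, 2.0, 2.4], probs)
second = normalized_rates([2.0, 4.0, 4.8], probs)
print(first)
print(second)
```

Both calls give the same rescaled rates, since doubling every rate also doubles the mean.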

If more than one category is specified, then another option, A, becomes visible in the menu. This allows us to specify that we want to assume that sites that have the same HMM rate category are expected to be clustered so that there is autocorrelation of rates. The program asks for the value of the average patch length. This is an expected length of patches that have the same rate. If it is 1, the rates of successive sites will be independent. If it is, say, 10.25, then the chance of change to a new rate will be 1/10.25 after every site. However the "new rate" is randomly drawn from the mix of rates, and hence could even be the same. So the actual observed length of patches with the same rate will be a bit larger than 10.25. Note below that if you choose multiple patches, there will be an estimate in the output file as to which combination of rate categories contributed most to the likelihood.
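
The remark that observed patches will be somewhat longer than the value typed in can be made concrete. Under the scheme just described, a site keeps the rate of its neighbor either because no new rate is drawn or because the newly drawn rate happens to be the same one. The sketch below computes the resulting expected run length for a rate with a given prior probability; it is an illustration of the description above, and the function names and the prior of 0.4 are assumptions.

```python
def stay_probability(prior, patch_length):
    """Chance that the next site has the same rate as the current one."""
    lam = 1.0 / patch_length            # chance of drawing a new rate
    return (1.0 - lam) + lam * prior    # no draw, or redrew the same rate

def expected_run_length(prior, patch_length):
    """Mean length of a run of sites sharing this rate (needs prior < 1)."""
    return 1.0 / (1.0 - stay_probability(prior, patch_length))

print(expected_run_length(0.4, 10.25))   # longer than 10.25
print(expected_run_length(0.4, 1.0))     # independent sites
```

With a patch length of 1 the run length depends only on the prior probability, as expected when successive sites are independent.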

Note that the autocorrelation scheme we use is somewhat different from Yang's (1995) autocorrelated Gamma distribution. I am unsure whether this difference is of any importance -- our scheme is chosen for the ease with which it can be implemented.

The C option allows user-defined rate categories. The user is prompted for the number of user-defined rates, and for the rates themselves. These rates cannot be negative, though some can be zero, and they are defined relative to each other, so that if rates for three categories are set to 1 : 3 : 2.5 this would have the same meaning as setting them to 2 : 6 : 5. The assignment of rates to sites is then made by reading a file whose default name is "categories". It should contain a string of digits 1 through 9. A new line or a blank can occur after any character in this string. Thus the categories file might look like this:

122231111122411155
1155333333444
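
Reading such a file is straightforward, since blanks and line breaks carry no meaning. The following sketch (the function name `parse_categories` and its optional length check are assumptions, not part of the program) turns the example above into one category number per site.

```python
def parse_categories(text, n_sites=None):
    """Return a list of category numbers (1-9), ignoring blanks and newlines."""
    cats = []
    for ch in text:
        if ch.isspace():
            continue
        if ch not in "123456789":
            raise ValueError(f"invalid category character {ch!r}")
        cats.append(int(ch))
    if n_sites is not None and len(cats) != n_sites:
        raise ValueError(f"expected {n_sites} categories, found {len(cats)}")
    return cats

categories = parse_categories("122231111122411155\n1155333333444")
print(len(categories), categories[:5])
```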

With the current options R, A, and C the program has gained greatly in its ability to infer different rates at different sites and estimate phylogenies under a more realistic model. Note that Likelihood Ratio Tests can be used to test whether one combination of rates is significantly better than another, provided one rate scheme represents a restriction of another with fewer parameters. The number of parameters needed for rate variation is the number of regional rate categories, plus the number of user-defined rate categories less 2, plus one if the regional rate categories have a nonzero autocorrelation.

The G (global search) option causes, after the last species is added to the tree, each possible group to be removed and re-added. This improves the result, since the position of every species is reconsidered. It approximately triples the run-time of the program.

The User tree (option U) is read from a file whose default name is intree. The trees can be multifurcating. This allows us to test the hypothesis that a given branch has zero length.

If the U (user tree) option is chosen another option appears in the menu, the L option. If it is selected, it signals the program that it should take any branch lengths that are in the user tree and simply evaluate the likelihood of that tree, without further altering those branch lengths. In the case of a clock, if some branches have lengths and others do not, the program does not estimate only the lengths that are missing from the user tree. Instead, if any of the branches lacks a length, the program re-estimates the lengths of all of them. This is done because estimating some and not others is hard in the case of a clock.

The W (Weights) option is invoked in the usual way, with only weights 0 and 1 allowed. It selects a set of sites to be analyzed, ignoring the others. The sites selected are those with weight 1. If the W option is not invoked, all sites are analyzed. The Weights (W) option takes the weights from a file whose default name is "weights". The weights follow the format described in the main documentation file.

The M (multiple data sets) option will ask you whether you want to use multiple sets of weights (from the weights file) or multiple data sets from the input file. The ability to use a single data set with multiple weights means that much less disk space will be used for this input data. The bootstrapping and jackknifing tool Seqboot has the ability to create a weights file with multiple weights. Note also that when we use multiple weights for bootstrapping we can also then maintain different rate categories for different sites in a meaningful way. If you use the multiple data sets option rather than multiple weights, you should not at the same time use the user-defined rate categories option (option C), because the user-defined rate categories could then be associated with the wrong sites. This is not a concern when the M option is used by using multiple weights.

The algorithm used for searching among trees is faster than it was in version 3.5, thanks to using a technique invented by David Swofford and J. S. Rogers. This involves not iterating most branch lengths on most trees when searching among tree topologies. This is of necessity a "quick-and-dirty" search but it saves much time.

OUTPUT FORMAT

The output starts by giving the number of species, the number of sites, and the base frequencies for A, C, G, and T that have been specified. It then prints out the transition/transversion ratio that was specified or used by default. It also uses the base frequencies to compute the actual transition/transversion ratio implied by the parameter.

If the R (HMM rates) option is used a table of the relative rates of expected substitution at each category of sites is printed, as well as the probabilities of each of those rates.

There then follow the data sequences, if the user has selected the menu option to print them out, with the base sequences printed in groups of ten bases along the lines of the Genbank and EMBL formats. The trees found are printed as a rooted tree topology. The internal nodes are numbered arbitrarily for the sake of identification. The number of trees evaluated so far and the log likelihood of the tree are also given. The branch lengths in the diagram are roughly proportional to the estimated branch lengths, except that very short branches are printed out at least three characters in length so that the connections can be seen.

A table is printed showing the length of each tree segment, and the time (in units of expected nucleotide substitutions per site) of each fork in the tree, measured from the root of the tree. I have not attempted to include code for approximate confidence limits on branch points, as I have done for branch lengths in Dnaml, both because of the extreme crudeness of that test, and because the variation of times for different forks would be highly correlated.

The log likelihood printed out with the final tree can be used to perform various likelihood ratio tests. One can, for example, compare runs with different values of the expected transition/transversion ratio to determine which value is the maximum likelihood estimate, and what is the allowable range of values (using a likelihood ratio test, which you will find described in mathematical statistics books). One could also estimate the base frequencies in the same way. Both of these, particularly the latter, require multiple runs of the program to evaluate different possible values, and this might get expensive.

This program makes possible a (reasonably) legitimate statistical test of the molecular clock. To do such a test, run Dnaml and Dnamlk on the same data. If the trees obtained are of the same topology (when considered as unrooted), it is legitimate to compare their likelihoods by the likelihood ratio test. In Dnaml the likelihood has been computed by estimating 2n-3 branch lengths, if there are n tips on the tree. In Dnamlk it has been computed by estimating n-1 branching times (in effect, n-1 branch lengths). The difference in the number of parameters is (2n-3)-(n-1) = n-2. To perform the test take the difference in log likelihoods between the two runs (Dnaml should be the higher of the two, barring numerical iteration difficulties) and double it. Look this up on a chi-square distribution with n-2 degrees of freedom. If the result is significant, the log likelihood has been significantly increased by allowing all 2n-3 branch lengths to be estimated instead of just n-1, and the molecular clock may be rejected.
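
The arithmetic of this test can be sketched as follows. The chi-square tail probability is computed here from the standard recurrence for the incomplete gamma function so that no table is needed; the function names and the two log likelihoods in the example are hypothetical and do not come from any actual run.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2:                        # odd df: start from Q(1/2, y)
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:                             # even df: start from Q(1, y)
        a, q = 1.0, math.exp(-y)
    for _ in range((df - 1) // 2):    # Q(a+1, y) = Q(a, y) + y^a e^-y / Gamma(a+1)
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def clock_test(lnl_dnaml, lnl_dnamlk, n_tips):
    """Return (test statistic, degrees of freedom, P value)."""
    statistic = 2.0 * (lnl_dnaml - lnl_dnamlk)
    df = (2 * n_tips - 3) - (n_tips - 1)          # equals n_tips - 2
    return statistic, df, chi2_sf(statistic, df)

# Hypothetical log likelihoods for a five-species data set.
statistic, df, p = clock_test(-58.0, -61.0, 5)
print(statistic, df, round(p, 4))
```

In this made-up case the doubled difference is 6.0 on 3 degrees of freedom, which is not significant at the 5% level, so the clock would not be rejected.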

If the U (User Tree) option is used and more than one tree is supplied, and the program is not told to assume autocorrelation between the rates at different sites, the program also performs a statistical test of each of these trees against the one with highest likelihood. If there are two user trees, the test done is one which is due to Kishino and Hasegawa (1989), a version of a test originally introduced by Templeton (1983). In this implementation it uses the mean and variance of log-likelihood differences between trees, taken across sites. If the two trees' means are more than 1.96 standard deviations different then the trees are declared significantly different. This use of the empirical variance of log-likelihood differences is more robust and nonparametric than the classical likelihood ratio test, and may to some extent compensate for any lack of realism in the model underlying this program.
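
The two-tree comparison can be sketched from per-site log likelihoods. This is an illustration of the idea only and may differ in detail from what the program computes: the function name `kh_test`, the use of the ordinary sample variance, and the per-site values are all assumptions.

```python
import math

def kh_test(site_lnl_a, site_lnl_b):
    """Compare two trees using their per-site log likelihoods.

    Returns (total difference, its estimated standard deviation,
    whether the difference exceeds 1.96 standard deviations).
    """
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    n = len(diffs)
    total = sum(diffs)
    mean = total / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # per-site sample variance
    sd_total = math.sqrt(n * variance)                        # SD of the summed difference
    return total, sd_total, abs(total) > 1.96 * sd_total

# Hypothetical per-site log likelihoods for three trees.
tree_a = [-1.0, -2.0, -3.0, -4.0]
tree_b = [-1.5, -2.6, -3.5, -4.6]     # consistently worse than tree_a
tree_c = [-1.1, -1.9, -3.2, -3.8]     # sometimes better, sometimes worse

print(kh_test(tree_a, tree_b))
print(kh_test(tree_a, tree_c))
```

Tree b is worse at every site by a similar amount, so the difference is large relative to its standard deviation; tree c differs from tree a in both directions and the summed difference is close to zero.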

If there are more than two trees, the test done is an extension of the KHT test, due to Shimodaira and Hasegawa (1999). They pointed out that a correction for the number of trees was necessary, and they introduced a resampling method to make this correction. In the version used here the variances and covariances of the sum of log likelihoods across sites are computed for all pairs of trees. To test whether the difference between each tree and the best one is larger than could have been expected if they all had the same expected log-likelihood, log-likelihoods for all trees are sampled with these covariances and equal means (Shimodaira and Hasegawa's "least favorable hypothesis"), and a P value is computed from the fraction of times the difference between the tree's value and the highest log-likelihood exceeds that actually observed. Note that this sampling needs random numbers, and so the program will prompt the user for a random number seed if one has not already been supplied. With the two-tree KHT test no random numbers are used.

In either the KHT or the SH test the program prints out a table of the log-likelihoods of each tree, the differences of each from the highest one, the variance of that quantity as determined by the log-likelihood differences at individual sites, and a conclusion as to whether that tree is or is not significantly worse than the best one. However the test is not available if we assume that there is autocorrelation of rates at neighboring sites (option A) and is not done in those cases.

The branch lengths printed out are scaled in terms of expected numbers of substitutions, counting both transitions and transversions but not replacements of a base by itself, and scaled so that the average rate of change, averaged over all sites analyzed, is set to 1.0 if there are multiple categories of sites. This means that whether or not there are multiple categories of sites, the expected fraction of change for very small branches is equal to the branch length. Of course, when a branch is twice as long this does not mean that there will be twice as much net change expected along it, since some of the changes occur in the same site and overlie or even reverse each other. The branch length estimates here are in terms of the expected underlying numbers of changes. That means that a branch of length 0.26 is 26 times as long as one which would show a 1% difference between the nucleotide sequences at the beginning and end of the branch. But we would not expect the sequences at the beginning and end of the branch to be 26% different, as there would be some overlaying of changes.
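
The gap between the branch length and the expected observed difference can be illustrated numerically. The sketch below uses the Jukes-Cantor model with equal base frequencies, which is a simpler stand-in and not the model used by this program, so the numbers are only indicative; the function name is hypothetical.

```python
import math

def expected_difference_jc(branch_length):
    """Expected fraction of sites that differ between the two ends of a
    branch under the Jukes-Cantor model (an assumed stand-in model)."""
    return 0.75 * (1.0 - math.exp(-4.0 * branch_length / 3.0))

for t in (0.01, 0.26, 1.0):
    print(t, round(expected_difference_jc(t), 4))
```

Under that stand-in model a branch of length 0.01 gives very nearly a 1% difference, while a branch of length 0.26 gives a difference of roughly 22% rather than 26%.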

Because of limitations of the numerical algorithm, branch length estimates of zero will often print out as small numbers such as 0.00001. If you see a branch length that small, it is really estimated to be of zero length.

Another possible source of confusion is the existence of negative values for the log likelihood. This is not really a problem; the log likelihood is not a probability but the logarithm of a probability. When it is negative it simply means that the corresponding probability is less than one (since we are seeing its logarithm). The log likelihood is maximized by being made more positive: -30.23 is worse than -29.14.

At the end of the output, if the R option is in effect with multiple HMM rates, the program will print a list of what site categories contributed the most to the final likelihood. This combination of HMM rate categories need not have contributed a majority of the likelihood, just a plurality. Still, it will be helpful as a view of where the program infers that the higher and lower rates are. Note that the use in this calculation of the prior probabilities of different rates, and the average patch length, gives this inference a "smoothed" appearance: some other combination of rates might make a greater contribution to the likelihood, but be discounted because it conflicts with this prior information. See the example output below to see what this printout of rate categories looks like.

A second list will also be printed out, showing for each site which rate accounted for the highest fraction of the likelihood. If the fraction of the likelihood accounted for is less than 95%, a dot is printed instead.

Option 3 in the menu controls whether the tree is printed out into the output file. This is on by default, and usually you will want to leave it this way. However for runs with multiple data sets such as bootstrapping runs, you will primarily be interested in the trees which are written onto the output tree file, rather than the trees printed on the output file. To keep the output file from becoming too large, it may be wisest to use option 3 to prevent trees being printed onto the output file.

Option 4 in the menu controls whether the tree estimated by the program is written onto a tree file. The default name of this output tree file is "outtree". If the U option is in effect, all the user-defined trees are written to the output tree file.

Option 5 in the menu controls whether ancestral states are estimated at each node in the tree. If it is in effect, a table of ancestral sequences is printed out (including the sequences in the tip species which are the input sequences). In that table, if a site has a base which accounts for more than 95% of the likelihood, it is printed in capital letters (A rather than a). If the best nucleotide accounts for less than 50% of the likelihood, the program prints out an ambiguity code (such as M for "A or C") for the set of nucleotides which, taken together, account for more than half of the likelihood. The ambiguity codes are listed in the sequence programs documentation file. One limitation of the current version of the program is that when there are multiple HMM rates (option R) the reconstructed nucleotides are based on only the single assignment of rates to sites which accounts for the largest amount of the likelihood. Thus the assessment of 95% of the likelihood, in tabulating the ancestral states, refers to 95% of the likelihood that is accounted for by that particular combination of rates.
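
The rule for choosing the printed symbol can be sketched as follows. This is a minimal illustration, not the program's code: the function name is hypothetical, the input is assumed to be the share of the likelihood accounted for by each base at one site of one node, and printing ambiguity codes in lower case is an assumption based on the sample output.

```python
# IUPAC ambiguity codes keyed by the sorted set of bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AC": "M", "AG": "R", "AT": "W", "CG": "S", "CT": "Y", "GT": "K",
    "ACG": "V", "ACT": "H", "AGT": "D", "CGT": "B", "ACGT": "N",
}

def ancestral_symbol(fractions):
    """fractions maps each base (A, C, G, T) to its share of the likelihood."""
    ranked = sorted(fractions.items(), key=lambda kv: kv[1], reverse=True)
    best_base, best_share = ranked[0]
    if best_share > 0.95:
        return best_base.upper()      # well supported: capital letter
    if best_share >= 0.50:
        return best_base.lower()      # best base, but below the 95% level
    # Otherwise collect bases, best first, until they exceed half the likelihood.
    chosen, total = [], 0.0
    for base, share in ranked:
        chosen.append(base)
        total += share
        if total > 0.50:
            break
    # Lower case for the ambiguity code is an assumption (see lead-in).
    return IUPAC["".join(sorted(chosen))].lower()

print(ancestral_symbol({"A": 0.40, "C": 0.35, "G": 0.15, "T": 0.10}))
```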

PROGRAM CONSTANTS

The constants defined at the beginning of the program include "maxtrees", the maximum number of user trees that can be processed. It is small (100) at present to save some further memory but the cost of increasing it is not very great. Other constants include "maxcategories", the maximum number of site categories, "namelength", the length of species names in characters, and three others, "smoothings", "iterations", and "epsilon", that help "tune" the algorithm and define the compromise between execution speed and the quality of the branch lengths found by iteratively maximizing the likelihood. Reducing iterations and smoothings, and increasing epsilon, will result in faster execution but a worse result. These values will not usually have to be changed.

The program spends most of its time doing real arithmetic. The algorithm, with separate and independent computations occurring for each pattern, lends itself readily to parallel processing.

PAST AND FUTURE OF THE PROGRAM

This program was developed in 1989 by combining code from Dnapars and from Dnaml. It was speeded up by two major developments, the use of aliasing of nucleotide sites (version 3.1) and pretabulation of some exponentials (added by Akiko Fuseki in version 3.4). In version 3.5 the Hidden Markov Model code was added and the method of iterating branch lengths was changed from an EM algorithm to direct search. The Hidden Markov Model code slows things down, especially if there is autocorrelation between sites, so this version is slower than version 3.4. Nevertheless we hope that the sacrifice is worth it.

One change that is needed in the future is to put in some way of allowing for base composition of nucleotide sequences in different parts of the phylogeny.


TEST DATA SET

   5   13
Alpha     AACGTGGCCAAAT
Beta      AAGGTCGCCAAAC
Gamma     CATTTCGTCACAA
Delta     GGTATTTCGGCCT
Epsilon   GGGATCTCGGCCC


CONTENTS OF OUTPUT FILE (with all numerical options on)

(It was run with HMM rates having gamma-distributed rates approximated by 5 rate categories, with coefficient of variation of rates 1.0, and with patch length parameter = 1.5. Two user-defined rate categories were used, one for the first 6 sites, the other for the last 7, with rates 1.0 : 2.0. Weights were used, with sites 1 and 13 given weight 0, and all others weight 1.)


Nucleic acid sequence
   Maximum Likelihood method with molecular clock, version 3.69

 5 species,  13  sites

    Site categories are:

             1111112222 222


    Sites are weighted as follows:

             01111 11111 110


Name            Sequences
----            ---------

Alpha        AACGTGGCCA AAT
Beta         ..G..C.... ..C
Gamma        C.TT.C.T.. C.A
Delta        GGTA.TT.GG CC.
Epsilon      GGGA.CT.GG CCC



Empirical Base Frequencies:

   A       0.23636
   C       0.29091
   G       0.25455
  T(U)     0.21818


Transition/transversion ratio =   2.000000


Discrete approximation to gamma distributed rates
 Coefficient of variation of rates = 1.000000  (alpha = 1.000000)

State in HMM    Rate of change    Probability

        1           0.264            0.522
        2           1.413            0.399
        3           3.596            0.076
        4           7.086            0.0036
        5          12.641            0.000023

Expected length of a patch of sites having the same rate =    1.500


Site category   Rate of change

        1           1.000
        2           2.000






                                                      +-Epsilon   
  +---------------------------------------------------4  
  !                                                   +-Delta     
--3  
  !                                             +-------Gamma     
  +---------------------------------------------2  
                                                !     +-Beta      
                                                +-----1  
                                                      +-Alpha     


Ln Likelihood =   -58.51728

 Ancestor      Node      Node Height     Length
 --------      ----      ---- ------     ------
 root            3      
   3             4          4.14820      4.14820
   4          Epsilon       4.29769      0.14949
   4          Delta         4.29769      0.14949
   3             2          3.67522      3.67522
   2          Gamma         4.29769      0.62247
   2             1          4.12429      0.44907
   1          Beta          4.29769      0.17340
   1          Alpha         4.29769      0.17340

Combination of categories that contributes the most to the likelihood:

             1122121111 111

Most probable category at each site if > 0.95 probability ("." otherwise)

             .......... ...


Probable sequences at interior nodes:

  node       Reconstructed sequence (caps if > 0.95)

    3        .ayrtykcsr cm.
    4        .GkaTctCgg Cc.
 Epsilon     GGGATCTCGG CCC
 Delta       GGTATTTCGG CCT
    2        .AykTcgtcA ca.
 Gamma       CATTTCGTCA CAA
    1        .AcgTcGCCA AA.
 Beta        AAGGTCGCCA AAC
 Alpha       AACGTGGCCA AAT


version 3.696

Dnamove - Interactive DNA parsimony

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Dnamove is an interactive DNA parsimony program, inspired by Wayne Maddison and David Maddison's marvellous program MacClade, which is written for Macintosh computers. Dnamove reads in a data set which is prepared in almost the same format as one for the DNA parsimony program Dnapars. It allows the user to choose an initial tree, and displays this tree on the screen. The user can look at different sites and the way the nucleotide states are distributed on that tree, given the most parsimonious reconstruction of state changes for that particular tree. The user then can specify how the tree is to be rearranged, rerooted or written out to a file. By looking at different rearrangements of the tree the user can manually search for the most parsimonious tree, and can get a feel for how different sites are affected by changes in the tree topology.

This program uses graphic characters that show the tree to best advantage on some computer systems. Its graphic characters will work best on MSDOS systems or MSDOS windows in Windows, and on any system whose screen or terminal emulates an ANSI standard terminal, such as old Digital VT100 terminals, Telnet programs, or VT100-compatible windows in the X windowing system. For any other screen type (such as Macintosh windows) there is a generic option which does not make use of screen graphics characters. The program will work well in those cases, but the tree it displays will look a bit uglier.

The input data file is set up almost identically to the data files for Dnapars. The code for nucleotide sequences is the standard one, as described in the molecular sequence programs document. The user trees are contained in the input tree file which is used for input of the starting tree (if desired). The output tree file is used for the final tree.

The user interaction starts with the program presenting a menu. The menu looks like this:


Interactive DNA parsimony, version 3.69

Settings for this run:
  O                             Outgroup root?  No, use as outgroup species  1
  W                            Sites weighted?  No
  T                   Use Threshold parsimony?  No, use ordinary parsimony
  I               Input sequences interleaved?  Yes
  U   Initial tree (arbitrary, user, specify)?  Arbitrary
  0        Graphics type (IBM PC, ANSI, none)?  ANSI
  S                  Width of terminal screen?  80
  L                 Number of lines on screen?  24

Are these settings correct? (type Y or the letter for one to change)

The O (Outgroup), W (Weights), T (Threshold), and 0 (Graphics type) options are the usual ones and are described in the main documentation file. The I (Interleaved) option is the usual one and is described in the main documentation file and the molecular sequences programs documentation file. The U (initial tree) option allows the user to choose whether the initial tree is to be arbitrary, interactively specified by the user, or read from a tree file. Typing U causes the program to change among the three possibilities in turn. I would recommend that for a first run, you allow the tree to be set up arbitrarily (the default), as the "specify" choice is difficult to use and the "user tree" choice requires that you have available a tree file with the tree topology of the initial tree, which must be a rooted tree. Its default name is intree. The program will ask you for its name if it looks for the input tree file and does not find one of this name. If you wish to set up some particular tree you can also do that by the rearrangement commands specified below.

The W (Weights) option allows only weights of 0 or 1.

The T (threshold) option allows a continuum of methods between parsimony and compatibility. Thresholds less than or equal to 1.0 do not have any meaning and should not be used: they will result in a tree dependent only on the input order of species and not at all on the data!

The S (Screen width) option allows the width in characters of the display to be adjusted when more than 80 characters can be displayed on the user's screen.

The L (screen Lines) option allows the user to change the height of the screen (in lines of characters) that is assumed to be available on the display. This may be particularly helpful when displaying large trees on terminals that have more than 24 lines per screen, or on workstation or X-terminal screens that can emulate the ANSI terminals with more than 24 lines.

After the initial menu is displayed and the choices are made, the program then sets up an initial tree and displays it. Below it will be a one-line menu of possible commands, which looks like this:

NEXT? (Options: R # + - S . T U W O F C H ? X Q) (H or ? for Help) 

If you type H or ? you will get a single screen showing a description of each of these commands in a few words. Here are slightly more detailed descriptions:

R ("Rearrange").
This command asks for the number of a node which is to be removed from the tree. It and everything to the right of it on the tree is to be removed (by breaking the branch immediately below it). The command also asks for the number of a node below which that group is to be inserted. If an impossible number is given, the program refuses to carry out the rearrangement and asks for a new command. The rearranged tree is displayed: it will often have a different number of steps than the original. If you wish to undo a rearrangement, use the Undo command, for which see below.
#
This command, and the +, - and S commands described below, determine which site has its states displayed on the branches of the trees. The initial tree displayed by the program does not show states of sites. When # is typed, the program does not ask the user which site is to be shown but automatically shows the states of the next site that is not compatible with the tree (the next site that does not perfectly fit the current tree). The search for this site "wraps around" so that if it reaches the last site without finding one that is not compatible with the tree, the search continues at the first site; if no incompatible site is found the current site is shown again, and if no current site is being shown then the first site is shown. The display takes the form of different symbols or textures on the branches of the tree. The state of each branch is actually the state of the node above it. A key of the symbols or shadings used for states A, C, G, T (U) and ? is shown next to the tree. State ? means that more than one possible nucleotide could exist at that point on the tree, and that the user may want to consider the different possibilities, which are usually apparent by inspection.
+
This command is the same as # except that it goes forward one site, showing the states of the next site. If no site has been shown, using + will cause the first site to be shown. Once the last site has been reached, using + again will show the first site.

-
This command is the same as + except that it goes backwards, showing the states of the previous site. If no site has been shown, using - will cause the last site to be shown. Once site number 1 has been reached, using - again will show the last site.
S ("Show").
This command is the same as + and - except that it causes the program to ask you for the number of a site. That site is the one whose states will be displayed. If you give the site number as 0, the program will go back to not showing the states of the sites.
. (dot)
This command simply causes the current tree to be redisplayed. It is of use when the tree has partly disappeared off of the top of the screen owing to too many responses to commands being printed out at the bottom of the screen.

T ("Try rearrangements").
This command asks for the number of a node. The part of the tree at and above that node is removed from the tree. The program tries to re-insert it in each possible location on the tree (this may take some time, and the program reminds you to wait). Then it prints out a summary. For each possible location the program prints out the number of the node to the right of the place of insertion and the number of steps required in each case. These are divided into those that are better than or tied with the current tree. Once this summary is printed out, the group that was removed is reinserted into its original position. It is up to you to use the R command to actually carry out any of the arrangements that have been tried.
U ("Undo").
This command reverses the effect of the most recent rearrangement, outgroup re-rooting, or flipping of branches. It returns to the previous tree topology. It will be of great use when rearranging the tree and when a rearrangement proves worse than the preceding one -- it permits you to abandon the new one and return to the previous one without remembering its topology in detail.
W ("Write").
This command writes out the current tree onto a tree output file. If the file already has been written to by this run of Dnamove, it will ask you whether you want to replace the contents of the file, add the tree to the end of the file, or not write out the tree to the file. The tree is written in the standard format used by PHYLIP (a subset of the Newick standard). It is in the proper format to serve as the User-Defined Tree for setting up the initial tree in a subsequent run of the program. Note that if you provided the initial tree topology in a tree file and replace its contents, that initial tree will be lost.
O ("Outgroup").
This asks for the number of a node which is to be the outgroup. The tree will be redisplayed with that node as the left descendant of the bottom fork. Note that it is possible to use this to make a multi-species group the outgroup (i.e., you can give the number of an interior node of the tree as the outgroup, and the program will re-root the tree properly with that on the left of the bottom fork).
F ("Flip").
This asks for a node number and then flips the two branches at that node, so that the left-right order of branches at that node is changed. This does not actually change the tree topology (or the number of steps on that tree) but it does change the appearance of the tree.
C ("Clade").
When the data consist of more than 12 species (or more than half the number of lines on the screen if this is not 24), it may be difficult to display the tree on one screen. In that case the tree will be squeezed down to one line per species. This is too small to see all the interior states of the tree. The C command instructs the program to print out only that part of the tree (the "clade") from a certain node on up. The program will prompt you for the number of this node. Remember that thereafter you are not looking at the whole tree. To go back to looking at the whole tree give the C command again and enter "0" for the node number when asked. Most users will not want to use this option unless forced to.
H ("Help").
Prints a one-screen summary of what the commands do, a few words for each command.
? ("huh?").
A synonym for H. Same as Help command.
X ("Exit").
Exit from program. If the current tree has not yet been saved into a file, the program will first ask you whether it should be saved.
Q ("Quit").
A synonym for X. Same as the eXit command.

ADAPTING THE PROGRAM TO YOUR COMPUTER AND TO YOUR TERMINAL

As we have seen, the initial menu of the program allows you to choose among three screen types (IBM PC, ANSI, and none). If you want to avoid having to make this choice every time, you can change some of the constants in the file phylip.h to have the terminal type initialize itself in the proper way, and recompile. We have tried to have the default values be correct for PC, Macintosh, and Unix screens. If the setting is "none" (which is necessary on Macintosh MacOS 9 screens), the special graphics characters will not be used to indicate nucleotide states, but only letters will be used for the four nucleotides. This is less easy to look at.

The constants that need attention are ANSICRT and IBMCRT. Currently these are both set to "false" on Macintosh MacOS 9 systems, to "true" on MacOS X and on Unix/Linux systems, and IBMCRT is set to "true" on Windows systems. If your system has an ANSI compatible terminal, you might want to find the definition of ANSICRT in phylip.h and set it to "true", and IBMCRT to "false".

MORE ABOUT THE PARSIMONY CRITERION

This program carries out unrooted parsimony (analogous to Wagner trees) (Eck and Dayhoff, 1966; Kluge and Farris, 1969) on DNA sequences. The method of Fitch (1971) is used to count the number of changes of base needed on a given tree. The assumptions of this method are exactly analogous to those of MIX:

  1. Each site evolves independently.
  2. Different lineages evolve independently.
  3. The probability of a base substitution at a given site is small over the lengths of time involved in a branch of the phylogeny.
  4. The expected amounts of change in different branches of the phylogeny do not vary by so much that two changes in a high-rate branch are more probable than one change in a low-rate branch.
  5. The expected amounts of change do not vary enough among sites that two changes in one site are more probable than one change in another.

That these are the assumptions of parsimony methods has been documented in a series of papers of mine: (1973a, 1978b, 1979, 1981b, 1983b, 1988b). For an opposing view arguing that the parsimony methods make no substantive assumptions such as these, see the papers by Farris (1983) and Sober (1983a, 1983b), but also read the exchange between Felsenstein and Sober (1986).

Change from an occupied site to a deletion is counted as one change. Reversion from a deletion to an occupied site is allowed and is also counted as one change.

Below is a test data set, but we cannot show the output it generates because of the interactive nature of the program.


DATA SET

   5   13
Alpha     AACGUGGCCA AAU
Beta      AAGGUCGCCA AAC
Gamma     CAUUUCGUCA CAA
Delta     GGUAUUUCGG CCU
Epsilon   GGGAUCUCGG CCC

version 3.696

Dnapars -- DNA Parsimony Program

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program carries out unrooted parsimony (analogous to Wagner trees) (Eck and Dayhoff, 1966; Kluge and Farris, 1969) on DNA sequences. The method of Fitch (1971) is used to count the number of changes of base needed on a given tree. The assumptions of this method are analogous to those of MIX:

  1. Each site evolves independently.
  2. Different lineages evolve independently.
  3. The probability of a base substitution at a given site is small over the lengths of time involved in a branch of the phylogeny.
  4. The expected amounts of change in different branches of the phylogeny do not vary by so much that two changes in a high-rate branch are more probable than one change in a low-rate branch.
  5. The expected amounts of change do not vary enough among sites that two changes in one site are more probable than one change in another.

That these are the assumptions of parsimony methods has been documented in a series of papers of mine: (1973a, 1978b, 1979, 1981b, 1983b, 1988b). For an opposing view arguing that the parsimony methods make no substantive assumptions such as these, see the papers by Farris (1983) and Sober (1983a, 1983b, 1988), but also read the exchange between Felsenstein and Sober (1986).

Change from an occupied site to a deletion is counted as one change. Reversion from a deletion to an occupied site is allowed and is also counted as one change. Note that this in effect assumes that a deletion N bases long is N separate events.
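
The counting of changes at a single site can be sketched with the set-based rule of Fitch (1971). This is a minimal illustration for a rooted, bifurcating tree written as nested tuples, not the program's own code; the function name and the tree representation are assumptions. A deletion is simply treated as a fifth state, so that a change to or from it counts as one step.

```python
def fitch(tree, states):
    """Return (state_set, steps) for one site.

    tree   -- a species name (a tip) or a tuple of two subtrees
    states -- dict mapping each species name to its state at this site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch(tree[0], states)
    right_set, right_steps = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # The descendants can share a state: no change needed at this fork.
        return common, left_steps + right_steps
    # No shared state: take the union and count one change.
    return left_set | right_set, left_steps + right_steps + 1

# An arbitrary tree chosen for illustration, with one site's states.
tree = (("Epsilon", "Delta"), ("Gamma", ("Beta", "Alpha")))
site = {"Alpha": "C", "Beta": "G", "Gamma": "T", "Delta": "T", "Epsilon": "G"}
print(fitch(tree, site)[1])   # number of changes required at this site
```

Summing the step counts over all sites gives the length of the tree. The count does not depend on where the root is placed, which is why the method is described as unrooted parsimony.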

Dnapars can handle both bifurcating and multifurcating trees. In doing its search for most parsimonious trees, it adds species not only by creating new forks in the middle of existing branches, but it also tries putting them at the end of new branches which are added to existing forks. Thus it searches among both bifurcating and multifurcating trees. If a branch in a tree does not have any characters which might change in that branch in the most parsimonious tree, it does not save that tree. Thus in any tree that results, a branch exists only if some character has a most parsimonious reconstruction that would involve change in that branch.

It also saves a number of trees tied for best (you can alter the number it saves using the V option in the menu). When rearranging trees, it tries rearrangements of all of the saved trees. This makes the algorithm slower than earlier versions of Dnapars.

The input data is standard. The first line of the input file contains the number of species and the number of sites.

Next come the species data. Each sequence starts on a new line, has a ten-character species name that must be blank-filled to be of that length, followed immediately by the species data in the one-letter code. The sequences must either be in the "interleaved" or "sequential" formats described in the Molecular Sequence Programs document. The I option selects between them. The sequences can have internal blanks in the sequence but there must be no extra blanks at the end of the terminated line. Note that a blank is not a valid symbol for a deletion.
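
A reader for the sequential format might look like the sketch below. It is an illustration under stated assumptions, not the program's input routine: the function name is hypothetical, blank lines are skipped, and a sequence is assumed to continue on following lines until the declared number of sites has been read.

```python
def read_sequential(text):
    """Parse sequential-format data into (n_sites, {name: sequence})."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_species, n_sites = (int(x) for x in lines[0].split()[:2])
    sequences = {}
    pos = 1
    for _ in range(n_species):
        name = lines[pos][:10].strip()            # ten-character, blank-filled name
        seq = lines[pos][10:].replace(" ", "")    # internal blanks are ignored
        pos += 1
        while len(seq) < n_sites:                 # sequence continues on later lines
            seq += lines[pos].replace(" ", "")
            pos += 1
        if len(seq) != n_sites:
            raise ValueError(f"{name}: expected {n_sites} sites, found {len(seq)}")
        sequences[name] = seq
    return n_sites, sequences
```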

The options are selected using an interactive menu. The menu looks like this:


DNA parsimony algorithm, version 3.69

Setting for this run:
  U                 Search for best tree?  Yes
  S                        Search option?  More thorough search
  V              Number of trees to save?  10000
  J   Randomize input order of sequences?  No. Use input order
  O                        Outgroup root?  No, use as outgroup species  1
  T              Use Threshold parsimony?  No, use ordinary parsimony
  N           Use Transversion parsimony?  No, count all steps
  W                       Sites weighted?  No
  M           Analyze multiple data sets?  No
  I          Input sequences interleaved?  Yes
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4          Print out steps in each site  No
  5  Print sequences at all nodes of tree  No
  6       Write out trees onto tree file?  Yes

  Y to accept these or type the letter for one to change

The user either types "Y" (followed, of course, by a carriage-return) if the settings shown are to be accepted, or the letter or digit corresponding to an option that is to be changed.

The S (search) option controls how, and how much, rearrangement is done on the tied trees that are saved by the program. If the "More thorough search" option (the default) is chosen, the program will save multiple tied trees, without collapsing internal branches that have no evidence of change on them. It will subsequently rearrange on all parts of each of those trees. If the "Less thorough search" option is chosen, before saving, the program will collapse all branches that have no evidence that there is any change on that branch. This leads to less attempted rearrangement. If the "Rearrange on one best tree" option is chosen, only the first of the tied trees is used for rearrangement. This is faster but less thorough. If your trees are likely to have large multifurcations, do not use the default "More thorough search" option as it could result in too large a number of trees being saved.

The N option allows you to choose transversion parsimony, which counts only transversions (changes between one of the purines A or G and one of the pyrimidines C or T). This setting is turned off by default.
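
The distinction can be written as a small predicate. The sketch below is only an illustration; the function name is hypothetical, and U is treated as equivalent to T.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}

def is_transversion(a, b):
    """True if a change between bases a and b is a purine/pyrimidine change."""
    a, b = a.upper(), b.upper()
    return (a in PURINES and b in PYRIMIDINES) or (a in PYRIMIDINES and b in PURINES)

print(is_transversion("A", "C"), is_transversion("A", "G"))
```

With transversion parsimony, only changes for which this predicate is true would contribute to the number of steps.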

The Weights (W) option takes the weights from a file whose default name is "weights". The weights follow the format described in the main documentation file, with integer weights from 0 to 35 allowed by using the characters 0, 1, 2, ..., 9 and A, B, ... Z.
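
The mapping from weight characters to integers can be sketched as follows. The function name is hypothetical, and skipping blanks between groups of characters is an assumption; the main documentation file remains the authority on the file layout.

```python
WEIGHT_SYMBOLS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def decode_weights(line):
    """Convert weight characters (0-9, A-Z) to integers from 0 to 35."""
    weights = []
    for ch in line:
        if ch.isspace():
            continue                      # assumption: blanks are separators only
        value = WEIGHT_SYMBOLS.find(ch)
        if value < 0:
            raise ValueError(f"invalid weight character: {ch!r}")
        weights.append(value)
    return weights

print(decode_weights("9AZ"))
```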

The User tree (option U) is read from a file whose default name is intree. The trees can be multifurcating. They must be preceded in the file by a line giving the number of trees in the file.

The options J, O, T, M, and 0 are the usual ones. They are described in the main documentation file of this package. Option I is the same as in other molecular sequence programs and is described in the documentation file for the sequence programs.

The M (multiple data sets option) will ask you whether you want to use multiple sets of weights (from the weights file) or multiple data sets. The ability to use a single data set with multiple weights means that much less disk space will be used for this input data. The bootstrapping and jackknifing tool Seqboot has the ability to create a weights file with multiple weights.

The O (outgroup) option will have no effect if the U (user-defined tree) option is in effect. The T (threshold) option allows a continuum of methods between parsimony and compatibility. Thresholds less than or equal to 1.0 do not have any meaning and should not be used: they will result in a tree dependent only on the input order of species and not at all on the data!
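
The reason a threshold of 1.0 is uninformative can be seen in a short sketch. It assumes the definition given in the main documentation file, as understood here: a site contributes its number of steps or the threshold, whichever is smaller. The function name and the step counts are hypothetical.

```python
def threshold_score(steps_per_site, threshold):
    """Tree score when each site's contribution is capped at the threshold."""
    return sum(min(steps, threshold) for steps in steps_per_site)

# Hypothetical step counts for the same three variable sites on two trees.
tree_a = [1, 2, 3]
tree_b = [2, 1, 1]

for t in (1.0, 2.0, 10.0):
    print(t, threshold_score(tree_a, t), threshold_score(tree_b, t))
```

Every variable site needs at least one step on any tree, so with a threshold of 1.0 each such site contributes exactly 1.0 whatever the topology, and all trees tie. Larger thresholds let the data distinguish between the trees, and a very large threshold recovers ordinary parsimony.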

Output is standard: if option 1 is toggled on, the data is printed out, with the convention that "." means "the same as in the first species". Then comes a list of equally parsimonious trees. Each tree has branch lengths. These are computed using an algorithm published by Hochbaum and Pathria (1997) which I first heard of from Wayne Maddison who invented it independently of them. This algorithm averages the number of reconstructed changes of state over all sites over all possible most parsimonious placements of the changes of state among branches. Note that it does not correct in any way for multiple changes that overlay each other.
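
The dot convention for printing the data can be sketched in a few lines. The function name is hypothetical; the sequences are taken from the five-species test data used in this documentation.

```python
def dot_difference(reference, sequence):
    """Replace each state that matches the first species with a dot."""
    return "".join("." if s == r else s for r, s in zip(reference, sequence))

alpha = "AACGTGGCCAAAT"
beta = "AAGGTCGCCAAAC"
print(dot_difference(alpha, beta))
```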

If option 4 is toggled on, a table of the number of changes of state required in each character is also printed. If option 5 is toggled on, a table is printed out after each tree, showing for each branch whether there are known to be changes in the branch, and what the states are inferred to have been at the top end of the branch. This is a reconstruction of the ancestral sequences in the tree. If you choose option 5, a menu item "." appears which gives you the opportunity to turn off dot-differencing so that complete ancestral sequences are shown. If the inferred state is a "?" or one of the IUB ambiguity symbols, there will be multiple equally-parsimonious assignments of states; the user must work these out for themselves by hand. A "?" in the reconstructed states means that in addition to one or more bases, a deletion may or may not be present. If option 6 is left in its default state the trees found will be written to a tree file, so that they are available to be used in other programs.

If the U (User Tree) option is used and more than one tree is supplied, and the program is not told to assume autocorrelation between the rates at different sites, the program also performs a statistical test of each of these trees against the best one, which here is the tree requiring the fewest steps. If there are two user trees, this is a version of the test proposed by Alan Templeton (1983) and evaluated in a test case by me (1985a). It is closely parallel to a test using log likelihood differences due to Kishino and Hasegawa (1989). It uses the mean and variance of the differences in the number of steps between trees, taken across sites. If the two trees' means are more than 1.96 standard deviations different, then the trees are declared significantly different.
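
The two-tree comparison can be sketched as follows. This is an illustration under assumptions, not the program's code: the function name is hypothetical, the inputs are the numbers of steps each tree requires at each site, and the variance of the summed difference is estimated as n/(n-1) times the sum of squared deviations of the per-site differences.

```python
import math

def two_tree_test(steps_a, steps_b):
    """Compare two trees using per-site differences in numbers of steps."""
    n = len(steps_a)
    diffs = [a - b for a, b in zip(steps_a, steps_b)]
    total = sum(diffs)
    mean = total / n
    # Variance of the summed difference, estimated from site-to-site variation.
    var_sum = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    sd = math.sqrt(var_sum)
    significant = sd > 0 and abs(total) > 1.96 * sd
    return total, sd, significant

# Hypothetical step counts at ten sites for two trees.
print(two_tree_test([1, 1, 2, 1, 1, 1, 2, 1, 1, 1], [1] * 10))
```

In this hypothetical case the trees differ by two steps in total, which is less than 1.96 standard deviations, so they would not be declared significantly different.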

If there are more than two trees, the test done is an extension of the KHT test, due to Shimodaira and Hasegawa (1999). They pointed out that a correction for the number of trees was necessary, and they introduced a resampling method to make this correction. In the version used here the variances and covariances of the sums of steps across sites are computed for all pairs of trees. To test whether the difference between each tree and the best one is larger than could have been expected if they all had the same expected number of steps, numbers of steps for all trees are sampled with these covariances and equal means (Shimodaira and Hasegawa's "least favorable hypothesis"), and a P value is computed from the fraction of times the difference between the tree's value and the lowest number of steps exceeds that actually observed. Note that this sampling needs random numbers, and so the program will prompt the user for a random number seed if one has not already been supplied. With the two-tree KHT test no random numbers are used.

In either the KHT or the SH test the program prints out a table of the number of steps for each tree, the differences of each from the lowest one, the variance of that quantity as determined by the differences of the numbers of steps at individual sites, and a conclusion as to whether that tree is or is not significantly worse than the best one.

Option 6 in the menu controls whether the tree estimated by the program is written onto a tree file. The default name of this output tree file is "outtree". If the U option is in effect, all the user-defined trees are written to the output tree file. If the program finds multiple trees tied for best, all of these are written out onto the output tree file. Each is followed by a numerical weight in square brackets (such as [0.25000]). This is needed when we use the trees to make a consensus tree of the results of bootstrapping or jackknifing, to avoid overrepresenting replicates that find many tied trees.
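A script that post-processes the tree file might separate each tree from its weight along the following lines. This is only a sketch: it assumes one tree per line with the bracketed weight placed just before the closing semicolon, and the names split_tree_and_weight and total_weight are hypothetical.

```python
import re

# Assumed layout: a Newick tree, then an optional "[weight]", then ";".
WEIGHT_RE = re.compile(r"\[([0-9.]+)\]\s*;\s*$")

def split_tree_and_weight(line):
    """Return (newick_without_weight, weight); weight defaults to 1.0."""
    line = line.strip()
    match = WEIGHT_RE.search(line)
    if match is None:
        return line, 1.0
    newick = line[:match.start()] + ";"
    return newick, float(match.group(1))

def total_weight(lines):
    """Sum the weights of all trees written for one replicate."""
    return sum(split_tree_and_weight(l)[1] for l in lines if l.strip())
```

With four tied trees each carrying [0.25000], the weights for that replicate sum to 1, which is what keeps a replicate with many ties from counting more than one with a single best tree.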

The program is a straightforward relative of MIX and runs reasonably quickly, especially with many sites and few species.


TEST DATA SET

 
   5   13
Alpha     AACGUGGCCAAAU
Beta      AAGGUCGCCAAAC
Gamma     CAUUUCGUCACAA
Delta     GGUAUUUCGGCCU
Epsilon   GGGAUCUCGGCCC


CONTENTS OF OUTPUT FILE (if all numerical options are on)


DNA parsimony algorithm, version 3.69

 5 species,  13  sites


Name            Sequences
----            ---------

Alpha        AACGUGGCCA AAU
Beta         ..G..C.... ..C
Gamma        C.UU.C.U.. C.A
Delta        GGUA.UU.GG CC.
Epsilon      GGGA.CU.GG CCC



One most parsimonious tree found:


                                            +-----Epsilon   
               +----------------------------3  
  +------------2                            +-------Delta     
  |            |  
  |            +----------------Gamma     
  |  
  1----Beta      
  |  
  +---------Alpha     


requires a total of     19.000

  between      and       length
  -------      ---       ------
     1           2       0.217949
     2           3       0.487179
     3      Epsilon      0.096154
     3      Delta        0.134615
     2      Gamma        0.275641
     1      Beta         0.076923
     1      Alpha        0.173077

steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0|       2   1   3   2   0   2   1   1   1
   10|   1   1   1   3                        

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                AABGTCGCCA AAY
   1      2         yes    V.KD...... C..
   2      3         yes    GG.A..T.GG .C.
   3   Epsilon     maybe   ..G....... ..C
   3   Delta        yes    ..T..T.... ..T
   2   Gamma        yes    C.TT...T.. ..A
   1   Beta        maybe   ..G....... ..C
   1   Alpha        yes    ..C..G.... ..T


dnapenny

version 3.696

Dnapenny - Branch and bound to find
all most parsimonious trees
for nucleic acid sequence parsimony criteria

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Dnapenny is a program that will find all of the most parsimonious trees implied by your data when the nucleic acid sequence parsimony criterion is employed. It does so not by examining all possible trees, but by using the more sophisticated "branch and bound" algorithm, a standard computer science search strategy first applied to phylogenetic inference by Hendy and Penny (1982). (J. S. Farris [personal communication, 1975] had also suggested that this strategy, which is well-known in computer science, might be applied to phylogenies, but he did not publish this suggestion).

There is, however, a price to be paid for the certainty that one has found all members of the set of most parsimonious trees. The problem of finding these has been shown (Graham and Foulds, 1982; Day, 1983) to be NP-complete, which means that no fast algorithm is known that is guaranteed to solve the problem in all cases, and it is widely believed that none exists (for a discussion of NP-completeness, see the Scientific American article by Lewis and Papadimitriou, 1978). The result is that this program, despite its algorithmic sophistication, is VERY SLOW.

The program should be slower than the other tree-building programs in the package, but usable up to about ten species. Above this it will bog down rapidly, but exactly when depends on the data and on how much computer time you have. IT IS VERY IMPORTANT FOR YOU TO GET A FEEL FOR HOW LONG THE PROGRAM WILL TAKE ON YOUR DATA. This can be done by running it on subsets of the species, increasing the number of species in the run until you either are able to treat the full data set or know that the program will take unacceptably long on it. (Making a plot of the logarithm of run time against species number may help to project run times).

The Algorithm

The search strategy used by Dnapenny starts by making a tree consisting of the first two species (the first three if the tree is to be unrooted). Then it tries to add the next species in all possible places (there are three of these). For each of the resulting trees it evaluates the number of base substitutions. It adds the next species to each of these, again in all possible places. If this process were continued it would simply generate all possible trees, of which there are a very large number even when the number of species is moderate (34,459,425 with 10 species). Actually it does not do this, because the trees are generated in a particular order and some of them are never generated.
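The count quoted above can be checked with a few lines of Python. The function name count_rooted_trees is made up for this sketch; it simply multiplies together the number of places available at each addition.

```python
def count_rooted_trees(n_species):
    """Number of rooted bifurcating trees: 1 * 3 * 5 * ... * (2n - 3)."""
    if n_species < 2:
        return 1
    count = 1
    # Adding species k (k = 3..n) to a rooted tree of k-1 species can be done
    # in 2k - 3 places: on any of its 2k - 4 branches, or below the root.
    for k in range(3, n_species + 1):
        count *= 2 * k - 3
    return count

print(count_rooted_trees(10))  # 34459425
```

The same function gives 3 trees for three species and 15 for four.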

This is because the order in which trees are generated is not quite as implied above, but is a "depth-first search". This means that first one adds the third species in the first possible place, then the fourth species in its first possible place, then the fifth and so on until the first possible tree has been produced. For each tree the number of steps is evaluated. Then one "backtracks" by trying the alternative placements of the last species. When these are exhausted one tries the next placement of the next-to-last species. The order of placement in a depth-first search is like this for a four-species case (parentheses enclose monophyletic groups):

     Make tree of first two species:     (A,B)
          Add C in first place:     ((A,B),C)
               Add D in first place:     (((A,D),B),C)
               Add D in second place:     ((A,(B,D)),C)
               Add D in third place:     (((A,B),D),C)
               Add D in fourth place:     ((A,B),(C,D))
               Add D in fifth place:     (((A,B),C),D)
          Add C in second place:     ((A,C),B)
               Add D in first place:     (((A,D),C),B)
               Add D in second place:     ((A,(C,D)),B)
               Add D in third place:     (((A,C),D),B)
               Add D in fourth place:     ((A,C),(B,D))
               Add D in fifth place:     (((A,C),B),D)
          Add C in third place:     (A,(B,C))
               Add D in first place:     ((A,D),(B,C))
               Add D in second place:     (A,((B,D),C))
               Add D in third place:     (A,(B,(C,D)))
               Add D in fourth place:     (A,((B,C),D))
               Add D in fifth place:     ((A,(B,C)),D)

Among these fifteen trees you will find all of the four-species rooted trees, each exactly once (the parentheses each enclose a monophyletic group). As displayed above, the backtracking depth-first search algorithm is just another way of producing all possible trees one at a time. The branch and bound algorithm consists of this with one change. As each tree is constructed, including the partial trees such as (A,(B,C)), its number of steps is evaluated. In addition a prediction is made as to how many steps will be added, at a minimum, as further species are added.
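The plain enumeration, before any bounding is applied, can be sketched in Python as follows. Trees are represented as nested pairs, and the names placements, all_trees and canonical are invented for this example. The order in which the "places" are visited here is arbitrary and does not match the listing above, but the same set of trees results.

```python
def placements(tree, new):
    """Yield every rooted tree obtained by attaching `new` at one place in `tree`.

    A tree is either a species name (str) or a pair (left, right).
    """
    if isinstance(tree, tuple):
        left, right = tree
        # Attach somewhere inside the left or the right subtree ...
        for sub in placements(left, new):
            yield (sub, right)
        for sub in placements(right, new):
            yield (left, sub)
    # ... or on the branch leading to this subtree (the root, at top level).
    yield (tree, new)

def all_trees(species):
    """Depth-first generation of all rooted bifurcating trees."""
    def extend(tree, remaining):
        if not remaining:
            yield tree
            return
        for bigger in placements(tree, remaining[0]):
            yield from extend(bigger, remaining[1:])
    yield from extend((species[0], species[1]), list(species[2:]))

def canonical(tree):
    """Order-independent form, so that ((A,B),C) equals (C,(B,A))."""
    if isinstance(tree, str):
        return tree
    return frozenset(canonical(child) for child in tree)
```

Calling list(all_trees("ABCD")) yields 15 trees, and comparing their canonical forms confirms that no tree appears twice.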

This is done by counting how many sites which are invariant in the data up to the most recent species added will ultimately show variation when further species are added. Thus if 20 sites vary among species A, B, and C and their root, and if tree ((A,C),B) requires 24 steps, then if there are 8 more sites which will be seen to vary when species D is added, we can immediately say that no matter how we add D, the resulting tree can have no fewer than 24 + 8 = 32 steps. The point of all this is that if a previously-found tree such as ((A,B),(C,D)) required only 30 steps, then we know that there is no point in even trying to add D to ((A,C),B). We have computed the bound that enables us to cut off a whole line of inquiry (in this case five trees) and avoid going down that particular branch any farther.
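A simplified version of this bound might look like the following Python sketch. The helper names extra_steps_bound and can_prune are hypothetical, and the sketch assumes unambiguous states only (no IUB ambiguity codes), which is a simplification of what the program has to handle.

```python
def extra_steps_bound(sequences, placed, remaining):
    """Count sites that are invariant among the `placed` species but will
    vary once the `remaining` species are added. Each such site must add
    at least one step, however the remaining species are placed."""
    n_sites = len(next(iter(sequences.values())))
    extra = 0
    for site in range(n_sites):
        placed_states = {sequences[name][site] for name in placed}
        later_states = {sequences[name][site] for name in remaining}
        if len(placed_states) == 1 and not later_states <= placed_states:
            extra += 1
    return extra

def can_prune(steps_so_far, extra, best_so_far):
    """True if no completion of this partial tree can tie or beat the best."""
    return steps_so_far + extra > best_so_far

print(can_prune(24, 8, 30))  # True: 32 steps at best, worse than 30
```

A partial tree whose bound merely equals the best length found so far is not pruned, since the aim is to find all of the tied shortest trees.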

The branch-and-bound algorithm thus allows us to find all most parsimonious trees without generating all possible trees. How much of a saving this is depends strongly on the data. For very clean (nearly "Hennigian") data, it saves much time, but on very messy data it will still take a very long time.

The algorithm in the program differs from the one outlined here in some essential details: it investigates possibilities in the order of their apparent promise. This applies to the order of addition of species, and to the places where they are added to the tree. After the first two-species tree is constructed, the program tries adding each of the remaining species in turn, each in the best possible place it can find. Whichever of those species adds (at a minimum) the most additional steps is taken to be the one to be added next to the tree. When it is added, it is added in turn to places which cause the fewest additional steps to be added. This sounds a bit complex, but it is done with the intention of eliminating regions of the search of all possible trees as soon as possible, and lowering the bound on tree length as quickly as possible. This process of evaluating which species to add in which order goes on the first time the search makes a tree; thereafter it uses that order.

The program keeps a list of all the most parsimonious trees found so far. Whenever it finds one that has fewer steps than these, it clears out the list and restarts it with that tree. In the process the bound tightens and fewer possibilities need be investigated. At the end the list contains all the shortest trees. These are then printed out. It should be mentioned that the program Clique for finding all largest cliques also works by branch-and-bound. Both problems are NP-complete but for some reason Clique runs far faster. Although worst-case behavior is bad for both programs, those worst cases occur far more frequently in parsimony problems than in compatibility problems.
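The bookkeeping for this list can be illustrated with a short Python sketch. The function record_tree is hypothetical; its maxtrees argument stands in for the storage limit of the same name described under Controlling Run Times, and how the program itself handles trees beyond that limit is assumed here.

```python
def record_tree(best_trees, best_steps, tree, steps, maxtrees=100):
    """Update the list of shortest trees; return the (possibly new) bound."""
    if best_steps is None or steps < best_steps:
        best_trees.clear()          # a shorter tree makes the old list obsolete
        best_trees.append(tree)
        return steps
    if steps == best_steps and len(best_trees) < maxtrees:
        best_trees.append(tree)     # a tie: keep it, up to maxtrees
    return best_steps
```

Each time the returned bound drops, more partial trees can be cut off early, which is why finding a good tree quickly matters so much for run time.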

Controlling Run Times

Among the quantities available to be set from the menu of Dnapenny, two (howoften and howmany) are of particular importance. As Dnapenny goes along it will keep count of how many trees it has examined. Suppose that howoften is 100 and howmany is 1000, the default settings. Every time 100 trees have been examined, Dnapenny will print out a line saying how many multiples of 100 trees have now been examined, how many steps the most parsimonious tree found so far has, how many trees with that number of steps have been found, and a very rough estimate of what fraction of all trees have been looked at so far.

When the number of these multiples printed out reaches the number howmany (say 1000), the whole algorithm aborts and prints out that it has not found all most parsimonious trees, but prints out what it has gotten so far anyway. These trees need not be any of the most parsimonious trees: they are simply the most parsimonious ones found so far. By setting the product (howoften times howmany) large you can make the algorithm less likely to abort, but then you risk getting bogged down in a gigantic computation. You should adjust these constants so that the program cannot go beyond examining the number of trees you are reasonably willing to pay for (or wait for). In their initial setting the program will abort after looking at 100,000 trees. Obviously you may want to adjust howoften in order to get more or fewer lines of intermediate notice of how many trees have been looked at so far. Of course, in small cases you may never even reach the first multiple of howoften, and nothing will be printed out except some headings and then the final trees.
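The interplay of the two quantities can be sketched in Python. Only the names howoften and howmany come from the program; the function search_with_reports, its report argument, and the wording of the progress line are invented for illustration.

```python
def search_with_reports(trees, howoften=100, howmany=1000, report=print):
    """Examine trees one by one; return (number examined, finished?)."""
    examined = 0
    reports = 0
    for _tree in trees:
        examined += 1
        if examined % howoften == 0:
            reports += 1
            report(f"{reports} groups of {howoften} trees examined")
            if reports >= howmany:
                return examined, False   # gave up: the search is incomplete
    return examined, True
```

With the default values the loop gives up after 100 x 1000 = 100,000 trees, while a search that examines fewer than 100 trees finishes without printing any progress line.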

The indication of the approximate percentage of trees searched so far will be helpful in judging how much farther you would have to go to get the full search. Actually, since that fraction is the fraction of the set of all possible trees searched or ruled out so far, and since the search becomes progressively more efficient, the approximate fraction printed out will usually be an underestimate of how far along the program is, sometimes a serious underestimate.

A constant at the beginning of the program that affects the result is "maxtrees", which controls the maximum number of trees that can be stored. Thus if maxtrees is 25, and 32 most parsimonious trees are found, only the first 25 of these are stored and printed out. If maxtrees is increased, the program does not run any slower but requires a little more intermediate storage space. I recommend that maxtrees be kept as large as you can, provided you are willing to look at an output with that many trees on it! Initially, maxtrees is set to 100 in the distribution copy.

Method and Options

The counting of the length of trees is done by an algorithm nearly identical to the corresponding algorithms in Dnapars, and thus the remainder of this document will be nearly identical to the Dnapars document.

This program carries out unrooted parsimony (analogous to Wagner trees) (Eck and Dayhoff, 1966; Kluge and Farris, 1969) on DNA sequences. The method of Fitch (1971) is used to count the number of changes of base needed on a given tree. The assumptions of this method are exactly analogous to those of Dnapars:

  1. Each site evolves independently.
  2. Different lineages evolve independently.
  3. The probability of a base substitution at a given site is small over the lengths of time involved in a branch of the phylogeny.
  4. The expected amounts of change in different branches of the phylogeny do not vary by so much that two changes in a high-rate branch are more probable than one change in a low-rate branch.
  5. The expected amounts of change do not vary enough among sites that two changes in one site are more probable than one change in another.

That these are the assumptions of parsimony methods has been documented in a series of papers of mine: (1973a, 1978b, 1979, 1981b, 1983b, 1988b). For an opposing view arguing that the parsimony methods make no substantive assumptions such as these, see the papers by Farris (1983) and Sober (1983a, 1983b), but also read the exchange between Felsenstein and Sober (1986).

Change from an occupied site to a deletion is counted as one change. Reversion from a deletion to an occupied site is allowed and is also counted as one change. Note that this in effect assumes that a deletion N bases long is N separate events.
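The counting of changes on a given tree can be illustrated with a bare-bones Python version of the Fitch (1971) procedure. This is a sketch under simplifying assumptions: states are single unambiguous symbols, a deletion ("-") is treated as just another state so that it costs one change, and the names fitch_steps and tree_length are hypothetical. The program itself also handles ambiguity codes, weights and thresholds.

```python
def fitch_steps(tree, states):
    """Return (state set at the top of `tree`, steps needed below it).

    `tree` is a species name or a pair (left, right); `states` maps each
    species name to its state at one site ("A", "C", "G", "T" or "-").
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_steps + right_steps
    # No state in common: one change is needed on one of the two branches.
    return left_set | right_set, left_steps + right_steps + 1

def tree_length(tree, sequences):
    """Total steps over all sites for aligned `sequences` (name -> string)."""
    n_sites = len(next(iter(sequences.values())))
    total = 0
    for site in range(n_sites):
        column = {name: seq[site] for name, seq in sequences.items()}
        total += fitch_steps(tree, column)[1]
    return total
```

Applied to the test data set below and the first tree in its output, this gives 1, 1, 1, 2, 2 and 1 steps at the six sites, for a total of 8.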

The input data is standard. The first line of the input file contains the number of species and the number of sites. If the Weights option is being used, there must also be a W in this first line to signal its presence. There are only two options requiring information to be present in the input file, W (Weights) and U (User tree). All options other than W (including U) are invoked using the menu.

Next come the species data. Each sequence starts on a new line, has a ten-character species name that must be blank-filled to be of that length, followed immediately by the species data in the one-letter code. The sequences must either be in the "interleaved" or "sequential" formats described in the Molecular Sequence Programs document. The I option selects between them. The sequences can have internal blanks in the sequence but there must be no extra blanks at the end of the terminated line. Note that a blank is not a valid symbol for a deletion.
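For readers who prepare or check input files with a script, a minimal Python reader for the sequential format might look like this. It is a sketch only: the function name read_sequential is made up, interleaved files are not handled, and anything after the first two numbers on the first line is ignored.

```python
def read_sequential(text):
    """Parse a sequential-format file into {name: sequence}."""
    lines = [line for line in text.splitlines() if line.strip()]
    n_species, n_sites = (int(field) for field in lines[0].split()[:2])
    sequences = {}
    row = 1
    for _ in range(n_species):
        name = lines[row][:10].strip()        # names occupy exactly 10 columns
        data = lines[row][10:].replace(" ", "")
        row += 1
        while len(data) < n_sites:            # sequence continues on later lines
            data += lines[row].replace(" ", "")
            row += 1
        if len(data) != n_sites:
            raise ValueError(f"{name}: expected {n_sites} sites, got {len(data)}")
        sequences[name] = data
    return sequences
```

Internal blanks are dropped before the length check, so a sequence written as "AACGUGGCCA AAU" is read as 13 sites.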

The options are selected using an interactive menu. The menu looks like this:


Penny algorithm for DNA, version 3.69
 branch-and-bound to find all most parsimonious trees

Settings for this run:
  H        How many groups of  100 trees:  1000
  F        How often to report, in trees:   100
  S           Branch and bound is simple?  Yes
  O                        Outgroup root?  No, use as outgroup species  1
  T              Use Threshold parsimony?  No, use ordinary parsimony
  W                       Sites weighted?  No
  M           Analyze multiple data sets?  No
  I          Input sequences interleaved?  Yes
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4          Print out steps in each site  No
  5  Print sequences at all nodes of tree  No
  6       Write out trees onto tree file?  Yes

Are these settings correct? (type Y or the letter for one to change)

The user either types "Y" (followed, of course, by a carriage-return) if the settings shown are to be accepted, or the letter or digit corresponding to an option that is to be changed.

The options O, T, W, M, and 0 are the usual ones. They are described in the main documentation file of this package. Option I is the same as in other molecular sequence programs and is described in the documentation file for the sequence programs.

The T (threshold) option allows a continuum of methods between parsimony and compatibility. Thresholds less than or equal to 1.0 do not have any meaning and should not be used: they will result in a tree dependent only on the input order of species and not at all on the data!

The W (Weights) option allows only weights of 0 or 1.

The options H, F, and S are not found in the other molecular sequence programs. H (How many) allows the user to set the quantity howmany, which we have already seen controls the number of times that the program will report on its progress. F allows the user to set the quantity howoften, which sets how often it will report -- after scanning how many trees.

The S (Simple) option controls a step in Dnapenny which reconsiders the order in which species are added to the tree. Normally the decision as to what species to add to the tree next is made as the first tree is being constructed; that ordering of species is not altered subsequently. This is the Simple setting, and it is the default. Turning the S option off causes the ordering to be continually reconsidered. This will probably result in a substantial increase in run time, but on some data sets of intermediate messiness it may help. It is included in case it might prove of use on some data sets.

Output is standard: if option 1 is toggled on, the data is printed out, with the convention that "." means "the same as in the first species". Then comes a list of equally parsimonious trees, and (if option 4 is toggled on) a table of the number of changes of state required in each character. If option 5 is toggled on, a table is printed out after each tree, showing for each branch whether there are known to be changes in the branch, and what the states are inferred to have been at the top end of the branch. If the inferred state is a "?" or one of the IUB ambiguity symbols, there will be multiple equally-parsimonious assignments of states; the user must work these out for themselves by hand. A "?" in the reconstructed states means that in addition to one or more bases, a deletion may or may not be present. If option 6 is left in its default state the trees found will be written to a tree file, so that they are available to be used in other programs. If the program finds multiple trees tied for best, all of these are written out onto the output tree file. Each is followed by a numerical weight in square brackets (such as [0.25000]). This is needed when we use the trees to make a consensus tree of the results of bootstrapping or jackknifing, to avoid overrepresenting replicates that find many tied trees.
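The "." convention used when the data is printed can be reproduced in a few lines of Python. The function name dot_difference is invented for this sketch, which assumes that all sequences have the same length.

```python
def dot_difference(sequences):
    """Replace states equal to the first sequence's state by "."."""
    first = sequences[0]
    shown = [first]
    for seq in sequences[1:]:
        shown.append("".join("." if s == f else s for s, f in zip(seq, first)))
    return shown

print(dot_difference(["AAGAAG", "AAGGGG", "GGAAAG"]))
# ['AAGAAG', '...GG.', 'GGA...']
```

The first sequence is always shown in full, since it is the reference against which the others are compared.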


TEST DATA SET

    8    6
Alpha1    AAGAAG
Alpha2    AAGAAG
Beta1     AAGGGG
Beta2     AAGGGG
Gamma1    AGGAAG
Gamma2    AGGAAG
Delta     GGAGGA
Epsilon   GGAAAG


CONTENTS OF OUTPUT FILE (if all numerical options are on)


Penny algorithm for DNA, version 3.69
 branch-and-bound to find all most parsimonious trees

 8 species,   6  sites

Name         Sequences
----         ---------

Alpha1       AAGAAG
Alpha2       ......
Beta1        ...GG.
Beta2        ...GG.
Gamma1       .G....
Gamma2       .G....
Delta        GGAGGA
Epsilon      GGA...



requires a total of              8.000

     9 trees in all found




  +--------------------Alpha1    
  !  
  !        +-----------Alpha2    
  !        !  
  1  +-----4        +--Epsilon   
  !  !     !  +-----6  
  !  !     !  !     +--Delta     
  !  !     +--5  
  +--2        !     +--Gamma2    
     !        +-----7  
     !              +--Gamma1    
     !  
     !              +--Beta2     
     +--------------3  
                    +--Beta1     

  remember: this is an unrooted tree!


steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0|       1   1   1   2   2   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                AAGAAG
   1   Alpha1       no     ......
   1      2         no     ......
   2      4         no     ......
   4   Alpha2       no     ......
   4      5         yes    .G....
   5      6         yes    G.A...
   6   Epsilon      no     ......
   6   Delta        yes    ...GGA
   5      7         no     ......
   7   Gamma2       no     ......
   7   Gamma1       no     ......
   2      3         yes    ...GG.
   3   Beta2        no     ......
   3   Beta1        no     ......





  +--------------------Alpha1    
  !  
  !        +-----------Alpha2    
  !        !  
  1  +-----4  +--------Gamma2    
  !  !     !  !  
  !  !     +--7     +--Epsilon   
  !  !        !  +--6  
  +--2        +--5  +--Delta     
     !           !  
     !           +-----Gamma1    
     !  
     !              +--Beta2     
     +--------------3  
                    +--Beta1     

  remember: this is an unrooted tree!


steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0|       1   1   1   2   2   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                AAGAAG
   1   Alpha1       no     ......
   1      2         no     ......
   2      4         no     ......
   4   Alpha2       no     ......
   4      7         yes    .G....
   7   Gamma2       no     ......
   7      5         no     ......
   5      6         yes    G.A...
   6   Epsilon      no     ......
   6   Delta        yes    ...GGA
   5   Gamma1       no     ......
   2      3         yes    ...GG.
   3   Beta2        no     ......
   3   Beta1        no     ......





  +--------------------Alpha1    
  !  
  !        +-----------Alpha2    
  !        !  
  1  +-----4     +-----Gamma2    
  !  !     !  +--7  
  !  !     !  !  !  +--Epsilon   
  !  !     +--5  +--6  
  +--2        !     +--Delta     
     !        !  
     !        +--------Gamma1    
     !  
     !              +--Beta2     
     +--------------3  
                    +--Beta1     

  remember: this is an unrooted tree!


steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0|       1   1   1   2   2   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                AAGAAG
   1   Alpha1       no     ......
   1      2         no     ......
   2      4         no     ......
   4   Alpha2       no     ......
   4      5         yes    .G....
   5      7         no     ......
   7   Gamma2       no     ......
   7      6         yes    G.A...
   6   Epsilon      no     ......
   6   Delta        yes    ...GGA
   5   Gamma1       no     ......
   2      3         yes    ...GG.
   3   Beta2        no     ......
   3   Beta1        no     ......





  +--------------------Alpha1    
  !  
  1  +-----------------Alpha2    
  !  !  
  !  !        +--------Gamma2    
  +--2        !  
     !  +-----7     +--Epsilon   
     !  !     !  +--6  
     !  !     +--5  +--Delta     
     +--4        !  
        !        +-----Gamma1    
        !  
        !           +--Beta2     
        +-----------3  
                    +--Beta1     

  remember: this is an unrooted tree!


steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0|       1   1   1   2   2   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                AAGAAG
   1   Alpha1       no     ......
   1      2         no     ......
   2   Alpha2       no     ......
   2      4         no     ......
   4      7         yes    .G....
   7   Gamma2       no     ......
   7      5         no     ......
   5      6         yes    G.A...
   6   Epsilon      no     ......
   6   Delta        yes    ...GGA
   5   Gamma1       no     ......
   4      3         yes    ...GG.
   3   Beta2        no     ......
   3   Beta1        no     ......





  +--------------------Alpha1    
  !  
  !  +-----------------Alpha2    
  1  !  
  !  !              +--Epsilon   
  !  !        +-----6  
  +--2        !     +--Delta     
     !  +-----5  
     !  !     !     +--Gamma2    
     !  !     +-----7  
     +--4           +--Gamma1    
        !  
        !           +--Beta2     
        +-----------3  
                    +--Beta1     

  remember: this is an unrooted tree!


steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0|       1   1   1   2   2   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                AAGAAG
   1   Alpha1       no     ......
   1      2         no     ......
   2   Alpha2       no     ......
   2      4         no     ......
   4      5         yes    .G....
   5      6         yes    G.A...
   6   Epsilon      no     ......
   6   Delta        yes    ...GGA
   5      7         no     ......
   7   Gamma2       no     ......
   7   Gamma1       no     ......
   4      3         yes    ...GG.
   3   Beta2        no     ......
   3   Beta1        no     ......





  +--------------------Alpha1    
  !  
  !  +-----------------Alpha2    
  1  !  
  !  !           +-----Gamma2    
  !  !        +--7  
  +--2        !  !  +--Epsilon   
     !  +-----5  +--6  
     !  !     !     +--Delta     
     !  !     !  
     +--4     +--------Gamma1    
        !  
        !           +--Beta2     
        +-----------3  
                    +--Beta1     

  remember: this is an unrooted tree!


steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0|       1   1   1   2   2   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                AAGAAG
   1   Alpha1       no     ......
   1      2         no     ......
   2   Alpha2       no     ......
   2      4         no     ......
   4      5         yes    .G....
   5      7         no     ......
   7   Gamma2       no     ......
   7      6         yes    G.A...
   6   Epsilon      no     ......
   6   Delta        yes    ...GGA
   5   Gamma1       no     ......
   4      3         yes    ...GG.
   3   Beta2        no     ......
   3   Beta1        no     ......





  +--------------------Alpha1    
  !  
  !              +-----Alpha2    
  1  +-----------2  
  !  !           !  +--Beta2     
  !  !           +--3  
  +--4              +--Beta1     
     !  
     !        +--------Gamma2    
     !        !  
     +--------7     +--Epsilon   
              !  +--6  
              +--5  +--Delta     
                 !  
                 +-----Gamma1    

  remember: this is an unrooted tree!


steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0|       1   1   1   2   2   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                AAGAAG
   1   Alpha1       no     ......
   1      4         no     ......
   4      2         no     ......
   2   Alpha2       no     ......
   2      3         yes    ...GG.
   3   Beta2        no     ......
   3   Beta1        no     ......
   4      7         yes    .G....
   7   Gamma2       no     ......
   7      5         no     ......
   5      6         yes    G.A...
   6   Epsilon      no     ......
   6   Delta        yes    ...GGA
   5   Gamma1       no     ......





  +--------------------Alpha1    
  !  
  !              +-----Alpha2    
  1  +-----------2  
  !  !           !  +--Beta2     
  !  !           +--3  
  !  !              +--Beta1     
  +--4  
     !           +-----Gamma2    
     !        +--7  
     !        !  !  +--Epsilon   
     +--------5  +--6  
              !     +--Delta     
              !  
              +--------Gamma1    

  remember: this is an unrooted tree!


steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0|       1   1   1   2   2   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                AAGAAG
   1   Alpha1       no     ......
   1      4         no     ......
   4      2         no     ......
   2   Alpha2       no     ......
   2      3         yes    ...GG.
   3   Beta2        no     ......
   3   Beta1        no     ......
   4      5         yes    .G....
   5      7         no     ......
   7   Gamma2       no     ......
   7      6         yes    G.A...
   6   Epsilon      no     ......
   6   Delta        yes    ...GGA
   5   Gamma1       no     ......





  +--------------------Alpha1    
  !  
  !              +-----Alpha2    
  1  +-----------2  
  !  !           !  +--Beta2     
  !  !           +--3  
  !  !              +--Beta1     
  +--4  
     !              +--Epsilon   
     !        +-----6  
     !        !     +--Delta     
     +--------5  
              !     +--Gamma2    
              +-----7  
                    +--Gamma1    

  remember: this is an unrooted tree!


steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0|       1   1   1   2   2   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                AAGAAG
   1   Alpha1       no     ......
   1      4         no     ......
   4      2         no     ......
   2   Alpha2       no     ......
   2      3         yes    ...GG.
   3   Beta2        no     ......
   3   Beta1        no     ......
   4      5         yes    .G....
   5      6         yes    G.A...
   6   Epsilon      no     ......
   6   Delta        yes    ...GGA
   5      7         no     ......
   7   Gamma2       no     ......
   7   Gamma1       no     ......



version 3.696

Penny - Branch and bound to find
all most parsimonious trees

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Penny is a program that will find all of the most parsimonious trees implied by your data. It does so not by examining all possible trees, but by using the more sophisticated "branch and bound" algorithm, a standard computer science search strategy first applied to phylogenetic inference by Hendy and Penny (1982). (J. S. Farris [personal communication, 1975] had also suggested that this strategy, which is well-known in computer science, might be applied to phylogenies, but he did not publish this suggestion).

There is, however, a price to be paid for the certainty that one has found all members of the set of most parsimonious trees. The problem of finding these has been shown (Graham and Foulds, 1982; Day, 1983) to be NP-complete, which means that no fast algorithm is known that is guaranteed to solve the problem in all cases, and it is widely believed that none exists (for a discussion of NP-completeness, see the Scientific American article by Lewis and Papadimitriou, 1978). The result is that this program, despite its algorithmic sophistication, is VERY SLOW.

The program should be slower than the other tree-building programs in the package, but useable up to about ten species. Above this it will bog down rapidly, but exactly when depends on the data and on how much computer time you have. IT IS VERY IMPORTANT FOR YOU TO GET A FEEL FOR HOW LONG THE PROGRAM WILL TAKE ON YOUR DATA. This can be done by running it on subsets of the species, increasing the number of species in the run until you either are able to treat the full data set or know that the program will take unacceptably long on it. (Making a plot of the logarithm of run time against species number may help to project run times).
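One way to make the projection suggested above is to fit a straight line to the logarithm of run time against the number of species and extend it. The following is only a sketch: the function name project_run_time and the timings are hypothetical, standing in for measurements you would make on subsets of your own data, and real run times need not follow a clean exponential.

```python
import math

def project_run_time(species_counts, run_times, target_species):
    """Fit log(run time) = a + b * species by least squares and extrapolate."""
    logs = [math.log(t) for t in run_times]
    n = len(species_counts)
    mean_x = sum(species_counts) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in species_counts)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(species_counts, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept + slope * target_species)

# Hypothetical timings (seconds) that double with each added species.
counts = [5, 6, 7, 8]
times = [2.0, 4.0, 8.0, 16.0]
projected = project_run_time(counts, times, 10)
print(f"projected run time for 10 species: {projected:.1f} s")
```

If the points on the plot curve upward rather than lying on a line, the projection will be too optimistic.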

The Algorithm

The search strategy used by Penny starts by making a tree consisting of the first two species (the first three if the tree is to be unrooted). Then it tries to add the next species in all possible places (there are three of these). For each of the resulting trees it evaluates the number of steps. It adds the next species to each of these, again in all possible places. If this process were to continue it would simply generate all possible trees, of which there are a very large number even when the number of species is moderate (34,459,425 with 10 species). Actually it does not do this, because the trees are generated in a particular order and some of them are never generated.
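The figure quoted for 10 species can be checked directly. Each new species can be attached to any branch of a rooted tree, including the one below the root, so the number of rooted bifurcating trees is the product 1 x 3 x 5 x ... x (2n - 3). A small sketch (the function name rooted_tree_count is ours, not something in the program):

```python
def rooted_tree_count(n_species):
    """Number of rooted bifurcating trees: 1 * 3 * 5 * ... * (2n - 3)."""
    count = 1
    for places in range(3, 2 * n_species - 2, 2):
        count *= places  # places available when the next species is added
    return count

for n in (3, 4, 10):
    print(n, rooted_tree_count(n))
```

For 3, 4, and 10 species this gives 3, 15, and 34,459,425 trees.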

Actually the order in which trees are generated is not quite as implied above, but is a "depth-first search". This means that first one adds the third species in the first possible place, then the fourth species in its first possible place, then the fifth and so on until the first possible tree has been produced. Its number of steps is evaluated. Then one "backtracks" by trying the alternative placements of the last species. When these are exhausted one tries the next placement of the next-to-last species. The order of placement in a depth-first search is like this for a four-species case (parentheses enclose monophyletic groups):

     Make tree of first two species     (A,B)
          Add C in first place     ((A,B),C)
               Add D in first place     (((A,D),B),C)
               Add D in second place     ((A,(B,D)),C)
               Add D in third place     (((A,B),D),C)
               Add D in fourth place     ((A,B),(C,D))
               Add D in fifth place     (((A,B),C),D)
          Add C in second place     ((A,C),B)
               Add D in first place     (((A,D),C),B)
               Add D in second place     ((A,(C,D)),B)
               Add D in third place     (((A,C),D),B)
               Add D in fourth place     ((A,C),(B,D))
               Add D in fifth place     (((A,C),B),D)
          Add C in third place     (A,(B,C))
               Add D in first place     ((A,D),(B,C))
               Add D in second place     (A,((B,D),C))
               Add D in third place     (A,(B,(C,D)))
               Add D in fourth place     (A,((B,C),D))
               Add D in fifth place     ((A,(B,C)),D)
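The listing above can be reproduced by a short recursive sketch. Trees are represented here as nested Python tuples, and the function names placements, all_trees, and show are ours; this illustrates the depth-first order only and is not the program's own code.

```python
def placements(tree, new):
    """Yield each tree made by attaching `new` beside one subtree of `tree`."""
    if isinstance(tree, tuple):
        left, right = tree
        for t in placements(left, new):
            yield (t, right)
        for t in placements(right, new):
            yield (left, t)
    yield (tree, new)  # attach at the base of this subtree

def all_trees(tree, remaining):
    """Depth-first search: place the next species, then recurse on the rest."""
    if not remaining:
        yield tree
        return
    for candidate in placements(tree, remaining[0]):
        yield from all_trees(candidate, remaining[1:])

def show(tree):
    """Write a nested tuple in the parenthesis notation used above."""
    if isinstance(tree, tuple):
        return "(" + ",".join(show(part) for part in tree) + ")"
    return tree

trees = [show(t) for t in all_trees(("A", "B"), ["C", "D"])]
for t in trees:
    print(t)
```

The sketch visits the five placements of D in the same order as the listing, but it tries the three placements of C in a different order (the placement at the base of the tree comes last rather than first). The set of trees produced is the same.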

Among these fifteen trees you will find all of the four-species rooted bifurcating trees, each exactly once (the parentheses each enclose a monophyletic group). As displayed above, the backtracking depth-first search algorithm is just another way of producing all possible trees one at a time. The branch and bound algorithm consists of this with one change. As each tree is constructed, including the partial trees such as (A,(B,C)), its number of steps is evaluated. In addition a prediction is made as to how many steps will be added, at a minimum, as further species are added.

This is done by counting how many binary characters which are invariant in the data up to the species most recently added will ultimately show variation when further species are added. Thus if 20 characters vary among species A, B, and C and their root, and if tree ((A,C),B) requires 24 steps, then if there are 8 more characters which will be seen to vary when species D is added, we can immediately say that no matter how we add D, the resulting tree can have no fewer than 24 + 8 = 32 steps. The point of all this is that if a previously-found tree such as ((A,B),(C,D)) required only 30 steps, then we know that there is no point in even trying to add D to ((A,C),B). We have computed the bound that enables us to cut off a whole line of inquiry (in this case five trees) and avoid going down that particular branch any farther.

The branch-and-bound algorithm thus allows us to find all most parsimonious trees without generating all possible trees. How much of a saving this is depends strongly on the data. For very clean (nearly "Hennigian") data, it saves much time, but on very messy data it will still take a very long time.
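The bound described above can be put together with a depth-first search in a few dozen lines. What follows is a simplified sketch under stated assumptions, not the code in Penny: it handles only Wagner parsimony on characters coded 0 and 1, counts steps with the Fitch procedure, adds species in input order, and treats the tree as unrooted by holding the first species at the base. All function names are ours. The data are the test data set given later in this document.

```python
DATA = {
    "Alpha1": "110110", "Alpha2": "110110", "Beta1": "110000",
    "Beta2": "110000", "Gamma1": "100110", "Delta": "001001",
    "Epsilon": "001110",
}

def fitch(tree, data):
    """Return (state sets per character, steps) for a rooted binary tree."""
    if isinstance(tree, str):
        return [{state} for state in data[tree]], 0
    (lsets, lsteps), (rsets, rsteps) = fitch(tree[0], data), fitch(tree[1], data)
    sets, steps = [], lsteps + rsteps
    for a, b in zip(lsets, rsets):
        if a & b:
            sets.append(a & b)
        else:
            sets.append(a | b)
            steps += 1  # no shared state: one change is needed here
    return sets, steps

def placements(tree, new):
    """Yield each tree made by attaching `new` beside one subtree of `tree`."""
    if isinstance(tree, tuple):
        for t in placements(tree[0], new):
            yield (t, tree[1])
        for t in placements(tree[1], new):
            yield (tree[0], t)
    yield (tree, new)

def unseen_variation(placed, data):
    """Characters invariant among `placed` species but variable overall."""
    n_chars = len(next(iter(data.values())))
    count = 0
    for i in range(n_chars):
        so_far = {data[name][i] for name in placed}
        overall = {states[i] for states in data.values()}
        if len(so_far) == 1 and len(overall) > 1:
            count += 1
    return count

def branch_and_bound(data):
    names = list(data)
    first, rest = names[0], names[1:]
    best = {"steps": None, "trees": [], "examined": 0}

    def search(subtree, placed, remaining):
        tree = (first, subtree)
        steps = fitch(tree, data)[1]
        best["examined"] += 1
        if not remaining:
            if best["steps"] is None or steps < best["steps"]:
                best["steps"], best["trees"] = steps, [tree]
            elif steps == best["steps"]:
                best["trees"].append(tree)
            return
        bound = steps + unseen_variation(placed, data)
        if best["steps"] is not None and bound > best["steps"]:
            return  # no completion of this partial tree can tie the best
        for candidate in placements(subtree, remaining[0]):
            search(candidate, placed + [remaining[0]], remaining[1:])

    search(rest[0], [first, rest[0]], rest[1:])
    return best

result = branch_and_bound(DATA)
print(result["steps"], len(result["trees"]), result["examined"])
```

On the test data this sketch finds three trees of 8 steps, in agreement with the test set output shown below. A partial tree is discarded only when its bound exceeds the best length found so far, so trees tied for best are never lost.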

The algorithm in the program differs from the one outlined here in some essential details: it investigates possibilities in the order of their apparent promise. This applies to the order of addition of species, and to the places where they are added to the tree. After the first two-species tree is constructed, the program tries adding each of the remaining species in turn, each in the best possible place it can find. Whichever of those species adds (at a minimum) the most additional steps is taken to be the one to be added next to the tree. When it is added, it is added in turn to places which cause the fewest additional steps to be added. This sounds a bit complex, but it is done with the intention of eliminating regions of the search of all possible trees as soon as possible, and lowering the bound on tree length as quickly as possible.

The program keeps a list of all the most parsimonious trees found so far. Whenever it finds one that has fewer steps than these, it clears out the list and restarts the list with that tree. In the process the bound tightens and fewer possibilities need be investigated. At the end the list contains all the shortest trees. These are then printed out. It should be mentioned that the program Clique for finding all largest cliques also works by branch-and-bound. Both problems are NP-complete but for some reason Clique runs far faster. Although the worst-case behavior of both programs is bad, those worst cases occur far more frequently in parsimony problems than in compatibility problems.

Controlling Run Times

Among the quantities available to be set at the beginning of a run of Penny, two (howoften and howmany) are of particular importance. As Penny goes along it will keep count of how many trees it has examined. Suppose that howoften is 100 and howmany is 1000, the default settings. Every time 100 trees have been examined, Penny will print out a line saying how many multiples of 100 trees have now been examined, how many steps the most parsimonious tree found so far has, how many trees with that number of steps have been found, and a very rough estimate of what fraction of all trees have been looked at so far.

When the number of these multiples printed out reaches the number howmany (say 1000), the whole algorithm aborts and prints out that it has not found all most parsimonious trees, but prints out what it has got so far anyway. These trees need not be any of the most parsimonious trees: they are simply the most parsimonious ones found so far. By setting the product (howoften times howmany) large you can make the algorithm less likely to abort, but then you risk getting bogged down in a gigantic computation. You should adjust these constants so that the program cannot go beyond examining the number of trees you are reasonably willing to wait for. In their initial setting the program will abort after looking at 100,000 trees. Obviously you may want to adjust howoften in order to get more or fewer lines of intermediate notice of how many trees have been looked at so far. Of course, in small cases you may never even reach the first multiple of howoften and nothing will be printed out except some headings and then the final trees.
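The interplay of howoften and howmany can be pictured with a small counter. This is a sketch of the behavior described above, not the program's code: the function monitored_search is ours, and an ordinary iterable stands in for the stream of trees the search would examine.

```python
def monitored_search(trees, howoften=100, howmany=1000):
    """Examine trees, report every `howoften`, abort after `howmany` reports."""
    examined = 0
    reports = 0
    for _ in trees:
        examined += 1
        if examined % howoften == 0:
            reports += 1
            print(f"{reports:6d} groups of {howoften} trees examined")
            if reports >= howmany:
                # Limit reached. For simplicity this sketch reports an
                # abort even if no further trees happened to remain.
                return examined, False
    return examined, True  # every tree was examined

print(monitored_search(range(250)))
print(monitored_search(range(1000), howoften=10, howmany=5))
```

The first call prints two progress lines and finishes; the second stops after 10 x 5 = 50 trees, just as the default settings stop after 100 x 1000 = 100,000.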

The indication of the approximate percentage of trees searched so far will be helpful in judging how much farther you would have to go to get the full search. Actually, since that fraction is the fraction of the set of all possible trees searched or ruled out so far, and since the search becomes progressively more efficient, the approximate fraction printed out will usually be an underestimate of how far along the program is, sometimes a serious underestimate.

A constant at the beginning of the program that affects the result is "maxtrees", which controls the maximum number of trees that can be stored. Thus if "maxtrees" is 25, and 32 most parsimonious trees are found, only the first 25 of these are stored and printed out. If "maxtrees" is increased, the program does not run any slower but requires a little more intermediate storage space. I recommend that "maxtrees" be kept as large as you can, provided you are willing to look at an output with that many trees on it! Initially, "maxtrees" is set to 100 in the distribution copy.
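The bookkeeping for tied trees, together with the "maxtrees" limit, might look like the following. This is a minimal sketch; the class BestTrees and its attribute names are hypothetical and are not taken from the program.

```python
class BestTrees:
    """Keep the shortest trees seen so far, storing at most `maxtrees`."""

    def __init__(self, maxtrees=100):
        self.maxtrees = maxtrees
        self.steps = None   # length of the best trees so far
        self.trees = []     # the stored ties
        self.found = 0      # ties found, including any not stored

    def offer(self, tree, steps):
        if self.steps is None or steps < self.steps:
            # Strictly better: clear the list and restart it with this tree.
            self.steps, self.trees, self.found = steps, [tree], 1
        elif steps == self.steps:
            self.found += 1
            if len(self.trees) < self.maxtrees:
                self.trees.append(tree)

best = BestTrees(maxtrees=2)
for name, length in [("t1", 10), ("t2", 10), ("t3", 10)]:
    best.offer(name, length)
print(best.trees, best.found)
```

With "maxtrees" set to 2, the third tied tree is counted but not stored, which is the situation described above for 32 trees and a limit of 25.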

Methods and Options

The counting of the length of trees is done by an algorithm nearly identical to the corresponding algorithms in Mix, and thus the remainder of this document will be nearly identical to the Mix document. Mix is a general parsimony program which carries out the Wagner and Camin-Sokal parsimony methods in mixture, where each character can have its method specified. The program defaults to carrying out Wagner parsimony.

The Camin-Sokal parsimony method explains the data by assuming that changes 0 --> 1 are allowed but not changes 1 --> 0. Wagner parsimony allows both kinds of changes. (This is under the assumption that 0 is the ancestral state, though the program allows reassignment of the ancestral state, in which case we must reverse the state numbers 0 and 1 throughout this discussion). The criterion is to find the tree which requires the minimum number of changes. The Camin-Sokal method is due to Camin and Sokal (1965) and the Wagner method to Eck and Dayhoff (1966) and to Kluge and Farris (1969).
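The difference between the two criteria shows up even on a four-species tree. The sketch below counts changes for a single character under each method. It assumes a rooted tree written as nested tuples, states coded "0" and "1" with no "?", "P", or "B", and ancestral state 0 for Camin-Sokal; the function names are ours.

```python
def wagner_steps(tree, states):
    """Fitch count for one binary character; both directions allowed."""
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        (a, sa), (b, sb) = visit(node[0]), visit(node[1])
        if a & b:
            return a & b, sa + sb
        return a | b, sa + sb + 1
    return visit(tree)[1]

def camin_sokal_steps(tree, states):
    """Count 0 --> 1 changes when the ancestor is 0 and 1 --> 0 is forbidden."""
    def visit(node):
        # Returns (state at node, number of 0 --> 1 changes below the node).
        if isinstance(node, str):
            return states[node], 0
        (a, sa), (b, sb) = visit(node[0]), visit(node[1])
        if a == "1" and b == "1":
            return "1", 0  # one origin above this node covers both
        # A node with any 0 below it must itself be 0, so each child in
        # state 1 needs its own 0 --> 1 change on the branch leading to it.
        return "0", sa + sb + (a == "1") + (b == "1")
    state, below = visit(tree)
    return below + (state == "1")  # a change below the root if the root is 1

tree = (("A", "B"), ("C", "D"))
states = {"A": "1", "B": "1", "C": "1", "D": "0"}
print(wagner_steps(tree, states), camin_sokal_steps(tree, states))
```

Wagner parsimony explains this character with a single change (a 1 --> 0 change leading to D), while Camin-Sokal parsimony needs two separate origins of state 1.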

Here are the assumptions of these two methods:

  1. Ancestral states are known (Camin-Sokal) or unknown (Wagner).
  2. Different characters evolve independently.
  3. Different lineages evolve independently.
  4. Changes 0 --> 1 are much more probable than changes 1 --> 0 (Camin-Sokal) or equally probable (Wagner).
  5. Both of these kinds of changes are a priori improbable over the evolutionary time spans involved in the differentiation of the group in question.
  6. Other kinds of evolutionary event such as retention of polymorphism are far less probable than 0 --> 1 changes.
  7. Rates of evolution in different lineages are sufficiently low that two changes in a long segment of the tree are far less probable than one change in a short segment.

That these are the assumptions of parsimony methods has been documented in a series of papers of mine: (1973a, 1978b, 1979, 1981b, 1983b, 1988b). For an opposing view arguing that the parsimony methods make no substantive assumptions such as these, see the papers by Farris (1983) and Sober (1983a, 1983b), but also read the exchange between Felsenstein and Sober (1986).

The input for Penny is the standard input for discrete characters programs, described above in the documentation file for the discrete-characters programs. States "?", "P", and "B" are allowed.

The options are selected using a menu:


Penny algorithm, version 3.696
 branch-and-bound to find all most parsimonious trees

Settings for this run:
  X                     Use Mixed method?  No
  P                     Parsimony method?  Wagner
  F        How often to report, in trees:  100
  H        How many groups of  100 trees:  1000
  O                        Outgroup root?  No, use as outgroup species  1
  S           Branch and bound is simple?  Yes
  T              Use Threshold parsimony?  No, use ordinary parsimony
  A   Use ancestral states in input file?  No
  W                       Sites weighted?  No
  M           Analyze multiple data sets?  No
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4     Print out steps in each character  No
  5     Print states at all nodes of tree  No
  6       Write out trees onto tree file?  Yes

Are these settings correct? (type Y or the letter for one to change)

The options X, O, T, A, and M are the usual miXed Methods, Outgroup, Threshold, Ancestral States, and Multiple Data Sets options. They are described in the Main documentation file and in the Discrete Characters Programs documentation file. The O option is only acted upon if the final tree is unrooted.

The option P toggles between the Camin-Sokal parsimony criterion and the Wagner parsimony criterion. Options F and H reset the variables howoften (F) and howmany (H). The user is prompted for the new values. By setting these larger the program will report its progress less often (howoften) and will run longer (howmany times howoften). These values default to 100 and 1000, which allows a search of up to 100,000 trees before the program aborts, but these can be changed. Note that option F in this program is not the Factors option available in some of the other programs in this section of the package.

The A (Ancestral states) option works in the usual way, described in the Discrete Characters Programs documentation file. If the A option is not used, then the program will assume 0 as the ancestral state for those characters following the Camin-Sokal method, and will assume that the ancestral state is unknown for those characters following Wagner parsimony. If any characters have unknown ancestral states, and if the resulting tree is rooted (even by outgroup), a table will be printed out showing the best guesses of which are the ancestral states in each character.

The S (Simple) option alters a step in Penny which reconsiders the order in which species are added to the tree. Normally the decision as to what species to add to the tree next is made as the first tree is being constructed; that ordering of species is not altered subsequently. The S option causes it to be continually reconsidered. This will probably result in a substantial increase in run time, but on some data sets of intermediate messiness it may help. It is included in case it might prove of use on some data sets. The Simple option, in which the ordering is kept the same after being established by trying alternatives during the construction of the first tree, is the default. Continual reconsideration can be selected as an alternative.

The F (Factors) option is not available in this program, as it would have no effect on the result even if that information were provided in the input file.

The final output is standard: a set of trees, which will be printed as rooted or unrooted depending on which is appropriate, and if the user elects to see them, tables of the number of changes of state required in each character. If the Wagner option is in force for a character, it may not be possible to unambiguously locate the places on the tree where the changes occur, as there may be multiple possibilities. A table is available to be printed out after each tree, showing for each branch whether there are known to be changes in the branch, and what the states are inferred to have been at the top end of the branch. If the inferred state is a "?" there will be multiple equally-parsimonious assignments of states; the user must work these out for themselves by hand.

If the Camin-Sokal parsimony method (option P) is invoked and the A option is also used, then the program will infer, for any character whose ancestral state is unknown ("?") whether the ancestral state 0 or 1 will give the fewest state changes. If these are tied, then it may not be possible for the program to infer the state in the internal nodes, and these will all be printed as ".". If this has happened and you want to know more about the states at the internal nodes, you will find it helpful to use Move to display the tree and examine its interior states, as the algorithm in Move shows all that can be known in this case about the interior states, including where there is and is not ambiguity. The algorithm in Penny gives up more easily on displaying these states.

If the A option is not used, then the program will assume 0 as the ancestral state for those characters following the Camin-Sokal method, and will assume that the ancestral state is unknown for those characters following Wagner parsimony. If any characters have unknown ancestral states, and if the resulting tree is rooted (even by outgroup), a table will be printed out showing the best guesses of which are the ancestral states in each character. You will find it useful to understand the difference between the Camin-Sokal parsimony criterion with unknown ancestral state and the Wagner parsimony criterion.

If option 6 is left in its default state the trees found will be written to a tree file, so that they are available to be used in other programs. If the program finds multiple trees tied for best, all of these are written out onto the output tree file. Each is followed by a numerical weight in square brackets (such as [0.25000]). This is needed when we use the trees to make a consensus tree of the results of bootstrapping or jackknifing, to avoid overrepresenting replicates that find many tied trees.
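The weighting of tied trees can be illustrated as follows. This sketch rests on two assumptions of ours that go beyond the description above: that each of n tied trees receives weight 1/n (consistent with the example [0.25000] for four trees and with the stated purpose), and that the weight is written between the tree and its terminating semicolon. The function name and the four-species trees are hypothetical.

```python
def tree_file_lines(newick_trees):
    """Append an equal weight in square brackets to each tied tree."""
    weight = 1.0 / len(newick_trees)  # assumed: weights sum to one
    return [f"{tree}[{weight:.5f}];" for tree in newick_trees]

tied = ["((A,B),(C,D))", "(((A,B),C),D)", "(((A,B),D),C)", "((A,(B,C)),D)"]
for line in tree_file_lines(tied):
    print(line)
```

Because the weights of the trees from one replicate add up to one, a bootstrap replicate that yields four tied trees contributes no more to a consensus than a replicate that yields a single tree.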

At the beginning of the program is a series of constants, which can be changed to help adapt the program to different computer systems. Two are the initial values of howoften and howmany, constants "often" and "many". Constant "maxtrees" is the maximum number of tied trees that will be stored.


TEST DATA SET

    7    6
Alpha1    110110
Alpha2    110110
Beta1     110000
Beta2     110000
Gamma1    100110
Delta     001001
Epsilon   001110
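For readers who want to examine such a data set outside the package, here is a sketch of a reader for the layout shown above. It assumes that the first line gives the numbers of species and characters, that each species occupies one line, and that the name fills the first ten columns; it does not attempt to handle options, weights, or data continued over several lines. The function name parse_discrete is ours.

```python
SAMPLE = """\
    7    6
Alpha1    110110
Alpha2    110110
Beta1     110000
Beta2     110000
Gamma1    100110
Delta     001001
Epsilon   001110
"""

def parse_discrete(text):
    """Read a simple discrete-characters data set into a name -> states dict."""
    lines = [line for line in text.splitlines() if line.strip()]
    n_species, n_chars = (int(field) for field in lines[0].split()[:2])
    data = {}
    for line in lines[1:1 + n_species]:
        name = line[:10].strip()            # assumed ten-column name field
        states = "".join(line[10:].split())  # drop any blanks between states
        if len(states) != n_chars:
            raise ValueError(f"{name}: expected {n_chars} characters")
        data[name] = states
    return data

data = parse_discrete(SAMPLE)
print(len(data), data["Delta"])
```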


TEST SET OUTPUT (with all numerical options turned on)


Penny algorithm, version 3.69
 branch-and-bound to find all most parsimonious trees

 7 species,   6 characters
Wagner parsimony method


Name         Characters
----         ----------

Alpha1       11011 0
Alpha2       11011 0
Beta1        11000 0
Beta2        11000 0
Gamma1       10011 0
Delta        00100 1
Epsilon      00111 0



requires a total of              8.000

    3 trees in all found




  +-----------------Alpha1    
  !  
  !        +--------Alpha2    
--1        !  
  !  +-----4     +--Epsilon   
  !  !     !  +--6  
  !  !     +--5  +--Delta     
  +--2        !  
     !        +-----Gamma1    
     !  
     !           +--Beta2     
     +-----------3  
                 +--Beta1     

  remember: this is an unrooted tree!


steps in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       1   1   1   2   2   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                11011 0
  1    Alpha1       no     ..... .
  1       2         no     ..... .
  2       4         no     ..... .
  4    Alpha2       no     ..... .
  4       5         yes    .0... .
  5       6         yes    0.1.. .
  6    Epsilon      no     ..... .
  6    Delta        yes    ...00 1
  5    Gamma1       no     ..... .
  2       3         yes    ...00 .
  3    Beta2        no     ..... .
  3    Beta1        no     ..... .




  +-----------------Alpha1    
  !  
--1  +--------------Alpha2    
  !  !  
  !  !           +--Epsilon   
  +--2        +--6  
     !  +-----5  +--Delta     
     !  !     !  
     +--4     +-----Gamma1    
        !  
        !        +--Beta2     
        +--------3  
                 +--Beta1     

  remember: this is an unrooted tree!


steps in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       1   1   1   2   2   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                11011 0
  1    Alpha1       no     ..... .
  1       2         no     ..... .
  2    Alpha2       no     ..... .
  2       4         no     ..... .
  4       5         yes    .0... .
  5       6         yes    0.1.. .
  6    Epsilon      no     ..... .
  6    Delta        yes    ...00 1
  5    Gamma1       no     ..... .
  4       3         yes    ...00 .
  3    Beta2        no     ..... .
  3    Beta1        no     ..... .




  +-----------------Alpha1    
  !  
  !           +-----Alpha2    
--1  +--------2  
  !  !        !  +--Beta2     
  !  !        +--3  
  +--4           +--Beta1     
     !  
     !           +--Epsilon   
     !        +--6  
     +--------5  +--Delta     
              !  
              +-----Gamma1    

  remember: this is an unrooted tree!


steps in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       1   1   1   2   2   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                11011 0
  1    Alpha1       no     ..... .
  1       4         no     ..... .
  4       2         no     ..... .
  2    Alpha2       no     ..... .
  2       3         yes    ...00 .
  3    Beta2        no     ..... .
  3    Beta1        no     ..... .
  4       5         yes    .0... .
  5       6         yes    0.1.. .
  6    Epsilon      no     ..... .
  6    Delta        yes    ...00 1
  5    Gamma1       no     ..... .


version 3.696

Dollop -- Dollo and Polymorphism Parsimony Program

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program carries out the Dollo and polymorphism parsimony methods. The Dollo parsimony method was first suggested in print by Le Quesne (1974) and was first well-specified by Farris (1977). The method is named after Louis Dollo since he was one of the first to assert that in evolution it is harder to gain a complex feature than to lose it. The algorithm explains the presence of the state 1 by allowing up to one forward change 0-->1 and as many reversions 1-->0 as are necessary to explain the pattern of states seen. The program attempts to minimize the number of 1-->0 reversions necessary.
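For a single character on a given rooted tree, the Dollo count can be sketched as follows. The one permitted forward change is placed on the branch leading to the smallest group that contains every species with state 1; each largest subgroup inside it whose species all have state 0 then costs one reversion. This is an illustration under simplifying assumptions (states "0" and "1" only, ancestral state 0, trees as nested tuples, reversions counted without the forward change), and the function names are ours rather than the program's.

```python
def tips(tree):
    """List the species at the tips of a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in tips(child)]

def count_losses(node, states):
    """Count the largest subtrees whose tips are all in state 0."""
    if all(states[name] == "0" for name in tips(node)):
        return 1  # one 1 --> 0 reversion on the branch leading here
    if isinstance(node, str):
        return 0  # a tip in state 1 needs no reversion
    return sum(count_losses(child, states) for child in node)

def dollo_reversions(tree, states):
    """Reversions needed for one character under Dollo parsimony."""
    ones = {name for name in tips(tree) if states[name] == "1"}
    if not ones:
        return 0  # state 1 never arose
    node = tree
    # Descend to the smallest group holding every species with state 1.
    while isinstance(node, tuple):
        holders = [child for child in node if ones & set(tips(child))]
        if len(holders) != 1:
            break
        node = holders[0]
    return count_losses(node, states)

tree = ((("A", "B"), "C"), ("D", "E"))
print(dollo_reversions(tree, {"A": "1", "B": "0", "C": "1", "D": "0", "E": "0"}))
print(dollo_reversions(tree, {"A": "1", "B": "0", "C": "0", "D": "1", "E": "0"}))
```

In the first case state 1 arises in the ancestor of A, B, and C and is lost once (in B). In the second, state 1 must arise below the root of the whole tree, and three reversions (in B, C, and E) are needed.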

The assumptions of this method are in effect:

  1. We know which state is the ancestral one (state 0).
  2. The characters are evolving independently.
  3. Different lineages evolve independently.
  4. The probability of a forward change (0-->1) is small over the evolutionary times involved.
  5. The probability of a reversion (1-->0) is also small, but still far larger than the probability of a forward change, so that many reversions are easier to envisage than even one extra forward change.
  6. Retention of polymorphism for both states (0 and 1) is highly improbable.
  7. The lengths of the segments of the true tree are not so unequal that two changes in a long segment are as probable as one in a short segment.

One problem can arise when using additive binary recoding to represent a multistate character as a series of two-state characters. Unlike the Camin-Sokal, Wagner, and Polymorphism methods, the Dollo method can reconstruct ancestral states which do not exist. An example is given in my 1979 paper. It will be necessary to check the output to make sure that this has not occurred.

The polymorphism parsimony method was first used by me, and the results published (without a clear specification of the method) by Inger (1967). The method was independently published by Farris (1978a) and by me (1979). The method assumes that we can explain the pattern of states by no more than one origination (0-->1) of state 1, followed by retention of polymorphism along as many segments of the tree as are necessary, followed by loss of state 0 or of state 1 where necessary. The program tries to minimize the total number of polymorphic characters, where each polymorphism is counted once for each segment of the tree in which it is retained.

The assumptions of the polymorphism parsimony method are in effect:

  1. The ancestral state (state 0) is known in each character.
  2. The characters are evolving independently of each other.
  3. Different lineages are evolving independently.
  4. Forward change (0-->1) is highly improbable over the length of time involved in the evolution of the group.
  5. Retention of polymorphism is also improbable, but far more probable than forward change, so that we can more easily envisage much polymorphism than even one additional forward change.
  6. Once state 1 is reached, reoccurrence of state 0 is very improbable, much less probable than multiple retentions of polymorphism.
  7. The lengths of segments in the true tree are not so unequal that we can more easily envisage retention events occurring in both of two long segments than one retention in a short segment.

That these are the assumptions of parsimony methods has been documented in a series of papers of mine: (1973a, 1978b, 1979, 1981b, 1983b, 1988b). For an opposing view arguing that the parsimony methods make no substantive assumptions such as these, see the papers by Farris (1983) and Sober (1983a, 1983b), but also read the exchange between Felsenstein and Sober (1986).

The input format is the standard one, with "?", "P", "B" states allowed. The options are selected using a menu:


Dollo and polymorphism parsimony algorithm, version 3.69

Settings for this run:
  U                 Search for best tree?  Yes
  P                     Parsimony method?  Dollo
  J     Randomize input order of species?  No. Use input order
  T              Use Threshold parsimony?  No, use ordinary parsimony
  A   Use ancestral states in input file?  No
  W                       Sites weighted?  No
  M           Analyze multiple data sets?  No
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4     Print out steps in each character  No
  5     Print states at all nodes of tree  No
  6       Write out trees onto tree file?  Yes

Are these settings correct? (type Y or the letter for one to change)

The options U, J, T, A, and M are the usual User Tree, Jumble, Threshold, Ancestral States, and Multiple Data Sets options, described either in the main documentation file or in the Discrete Characters Programs documentation file. The A (Ancestral States) option allows implementation of the unordered Dollo parsimony and unordered polymorphism parsimony methods which I have described elsewhere (1984b). When the A option is used the ancestor is not to be counted as one of the species. The O (outgroup) option is not available since the tree produced is already rooted. Since the Dollo and polymorphism methods produce a rooted tree, the user-defined trees required by the U option have two-way forks at each level.

The P (Parsimony Method) option is the one that toggles between polymorphism parsimony and Dollo parsimony. The program defaults to Dollo parsimony.

The T (Threshold) option has already been described in the Discrete Characters programs documentation file. Setting T at or below 1.0 but above 0 causes the criterion to become compatibility rather than Dollo or polymorphism parsimony, although there is no advantage to using this program instead of MIX to do a compatibility method. Setting the threshold value higher brings about an intermediate between the Dollo or polymorphism parsimony methods and the compatibility method, so that there is some rationale for doing that.

Using a threshold value of 1.0 or lower, but above 0, one can obtain a rooted (or, if the A option is used with ancestral states of "?", unrooted) compatibility criterion, but there is no particular advantage to using this program for that instead of MIX. Higher threshold values are of course meaningful and provide intermediates between Dollo and compatibility methods.

The X (Mixed parsimony methods) option is not available in this program. The Factors option is also not available in this program, as it would have no effect on the result even if that information were provided in the input file.

Output is standard: a list of equally parsimonious trees, and, if the user selects menu option 4, a table of the numbers of reversions or retentions of polymorphism necessary in each character. If any of the ancestral states has been specified to be unknown, a table of reconstructed ancestral states is also provided. When reconstructing the placement of forward changes and reversions under the Dollo method, keep in mind that each polymorphic state in the input data will require one "last minute" reversion. This is included in the tabulated counts. Thus if we have both states 0 and 1 at a tip of the tree the program will assume that the lineage had state 1 up to the last minute, and then state 0 arose in that population by reversion, without loss of state 1.

If the user selects menu option 5, a table is printed out after each tree, showing for each branch whether there are known to be changes in the branch, and what the states are inferred to have been at the top end of the branch. If the inferred state is a "?" there may be multiple equally-parsimonious assignments of states; the user must work these out for themselves by hand.

If the A option is used, then the program will infer, for any character whose ancestral state is unknown ("?") whether the ancestral state 0 or 1 will give the best tree. If these are tied, then it may not be possible for the program to infer the state in the internal nodes, and these will all be printed as ".". If this has happened and you want to know more about the states at the internal nodes, you will find it helpful to use Dolmove to display the tree and examine its interior states, as the algorithm in Dolmove shows all that can be known in this case about the interior states, including where there is and is not ambiguity. The algorithm in Dollop gives up more easily on displaying these states.

If the U (User Tree) option is used and more than one tree is supplied, the program also performs a statistical test of each of these trees against the best tree. This test is a version of the test proposed by Alan Templeton (1983), evaluated in a test case by me (1985a). It is closely parallel to a test using log likelihood differences invented by Kishino and Hasegawa (1989), and uses the mean and variance of step differences between trees, taken across characters. If the mean is more than 1.96 standard deviations different then the trees are declared significantly different. The program prints out a table of the steps for each tree, the differences of each from the lowest one, the variance of that quantity as determined by the step differences at individual characters, and a conclusion as to whether that tree is or is not significantly worse than the best one. It is important to understand that the test assumes that all the binary characters are evolving independently, which is unlikely to be true for many suites of morphological characters.
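A minimal sketch of the two-tree comparison, in Python. The text specifies only that the mean and variance of step differences across characters are used, with 1.96 standard deviations as the cutoff; the particular variance estimate below (sample variance of the per-character differences, scaled up to the sum over characters) is an assumption, the function name is hypothetical, and the step counts are illustrative values rather than program output.

```python
import math

def kht_test(steps_a, steps_b, z_crit=1.96):
    """Compare two trees from their per-character step counts.

    Hypothetical helper: returns (total difference, variance of that
    total, whether the difference exceeds z_crit standard deviations).
    """
    diffs = [a - b for a, b in zip(steps_a, steps_b)]
    n = len(diffs)
    total = sum(diffs)
    mean = total / n
    # Sample variance of the per-character differences.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    # Variance of the summed difference, assuming independent characters.
    var_total = n * var
    sd_total = math.sqrt(var_total)
    significant = sd_total > 0 and abs(total) > z_crit * sd_total
    return total, var_total, significant

# Illustrative per-character step counts for two user trees (made up).
tree_a = [1, 1, 2, 1, 2, 1, 1, 2, 1, 2]
tree_b = [0, 1, 1, 0, 1, 1, 0, 1, 0, 1]
print(kht_test(tree_a, tree_b))
```

Note that the independence of characters enters in the line that scales the variance; if characters are correlated, the variance of the total is not simply n times the per-character variance.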

If there are more than two trees, the test done is an extension of the KHT (Kishino-Hasegawa-Templeton) test, due to Shimodaira and Hasegawa (1999). They pointed out that a correction for the number of trees was necessary, and they introduced a resampling method to make this correction. In the version used here the variances and covariances of the sums of steps across characters are computed for all pairs of trees. To test whether the difference between each tree and the best one is larger than could have been expected if they all had the same expected number of steps, numbers of steps for all trees are sampled with these covariances and equal means (Shimodaira and Hasegawa's "least favorable hypothesis"), and a P value is computed from the fraction of times the difference between the tree's value and the lowest number of steps exceeds that actually observed. Note that this sampling needs random numbers, and so the program will prompt the user for a random number seed if one has not already been supplied. With the two-tree KHT test no random numbers are used.

In either the KHT or the SH test the program prints out a table of the number of steps for each tree, the differences of each from the lowest one, the variance of that quantity as determined by the differences of the numbers of steps at individual characters, and a conclusion as to whether that tree is or is not significantly worse than the best one.

If option 6 is left in its default state the trees found will be written to a tree file, so that they are available to be used in other programs. If the program finds multiple trees tied for best, all of these are written out onto the output tree file. Each is followed by a numerical weight in square brackets (such as [0.25000]). This is needed when we use the trees to make a consensus tree of the results of bootstrapping or jackknifing, to avoid overrepresenting replicates that find many tied trees.
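The weighting of tied trees can be illustrated with a short Python sketch. It assumes that each of the tied trees from one run gets an equal share of a total weight of 1, which is what the example value [0.25000] suggests for four tied trees; the function name and the tree strings are hypothetical.

```python
def tie_weights(tied_trees):
    """Pair each tied tree with an equal-share weight label.

    Hypothetical helper: assumes the weights of the trees from one
    run (one bootstrap replicate, say) add up to 1.
    """
    weight = 1.0 / len(tied_trees)
    return [(tree, f"[{weight:.5f}]") for tree in tied_trees]

# Four made-up tied trees from one replicate.
replicate = ["((A,B),(C,D))", "(((A,B),C),D)", "(((A,B),D),C)", "((A,(B,C)),D)"]
for tree, label in tie_weights(replicate):
    print(tree, label)
```

With weights like these, a replicate that finds four tied trees contributes the same total weight to a consensus as a replicate that finds a single tree.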

The algorithm is a fairly simple adaptation of the one used in the program Sokal, which was formerly in this package and has been superseded by Mix. It requires two passes through each tree to count the numbers of reversions.


TEST DATA SET

     5    6
Alpha     110110
Beta      110000
Gamma     100110
Delta     001001
Epsilon   001110


TEST SET OUTPUT (with all numerical options on)


Dollo and polymorphism parsimony algorithm, version 3.69

Dollo parsimony method

 5 species,   6  characters


Name         Characters
----         ----------

Alpha        11011 0
Beta         11000 0
Gamma        10011 0
Delta        00100 1
Epsilon      00111 0



One most parsimonious tree found:




  +-----------Delta     
--3  
  !  +--------Epsilon   
  +--4  
     !  +-----Gamma     
     +--2  
        !  +--Beta      
        +--1  
           +--Alpha     


requires a total of      3.000

 reversions in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       0   0   1   1   1   0            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

root      3         yes    ..1.. .
  3    Delta        yes    ..... 1
  3       4         yes    ...11 .
  4    Epsilon      no     ..... .
  4       2         yes    1.0.. .
  2    Gamma        no     ..... .
  2       1         yes    .1... .
  1    Beta         yes    ...00 .
  1    Alpha        no     ..... .



version 3.696

Dolmove -- Interactive Dollo and Polymorphism Parsimony

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Dolmove is an interactive parsimony program which uses the Dollo and Polymorphism parsimony criteria. It was inspired by Wayne Maddison and David Maddison's marvellous program MacClade, which was written for Macintosh computers. Dolmove reads in a data set which is prepared in almost the same format as one for the Dollo and polymorphism parsimony program Dollop. It allows the user to choose an initial tree, and displays this tree on the screen. The user can look at different characters and the way their states are distributed on that tree, given the most parsimonious reconstruction of state changes for that particular tree. The user then can specify how the tree is to be rearranged, rerooted or written out to a file. By looking at different rearrangements of the tree the user can manually search for the most parsimonious tree, and can get a feel for how different characters are affected by changes in the tree topology.

This program is compatible with fewer computer systems than the other programs in PHYLIP. It can be adapted to PCDOS systems or to any system whose screen or terminals emulate DEC VT100 terminals (such as Telnet programs for logging in to remote computers over a TCP/IP network, VT100-compatible windows in the X windowing system, and any terminal compatible with ANSI standard terminals). For any other screen types, there is a generic option which does not make use of screen graphics characters to display the character states. This will be less effective, as the states will be less easy to see when displayed.

The input data file is set up almost identically to the input file for Dollop.

The user interaction starts with the program presenting a menu. The menu looks like this:


Interactive Dollo or polymorphism parsimony, version 3.69

Settings for this run:
  P                        Parsimony method?  Dollo
  A                    Use ancestral states?  No
  F                 Use factors information?  No
  W                          Sites weighted?  No
  T                 Use Threshold parsimony?  No, use ordinary parsimony
  A      Use ancestral states in input file?  No
  U Initial tree (arbitrary, user, specify)?  Arbitrary
  0      Graphics type (IBM PC, ANSI, none)?  ANSI
  L               Number of lines on screen?  24
  S                Width of terminal screen?  80


Are these settings correct? (type Y or the letter for one to change)

The P (Parsimony Method) option is the one that toggles between polymorphism parsimony and Dollo parsimony. The program defaults to Dollo parsimony.

The T (Threshold), F (Factors), A (Ancestors), and 0 (Graphics type) options are the usual ones and are described in the main documentation page and in the Discrete Characters Program documentation page.

The F (Factors) option is used to inform the program which groups of characters are to be counted together in computing the number of characters compatible with the tree. Thus if three binary characters are all factors of the same multistate character, the multistate character will be counted as compatible with the tree only if all three factors are compatible with it.

The X (miXed methods) option is not available in Dolmove.

The usual W (Weights) option is available in Dolmove. It allows integer weights up to 35, using the symbols 0-9 and A-Z. Increased weight on a character increases both the number of parsimony steps on the character and the contribution it makes to the number of compatibilities.

The T (threshold) option allows a continuum of methods between parsimony and compatibility. Thresholds less than or equal to 0 do not have any meaning and should not be used: they will result in a tree dependent only on the input order of species and not at all on the data!

The U (initial tree) option allows the user to choose whether the initial tree is to be arbitrary, interactively specified by the user, or read from a tree file. Typing U causes the program to change among the three possibilities in turn. I would recommend that for a first run, you allow the tree to be set up arbitrarily (the default), as the "specify" choice is difficult to use and the "user tree" choice requires that you have available a tree file with the tree topology of the initial tree. Its default name is intree. The program will ask you for its name if it looks for the input tree file and does not find one of this name. If you wish to set up some particular tree you can also do that by the rearrangement commands specified below.

The S (Screen width) option allows the width in characters of the display to be adjusted when more than 80 characters can be displayed on the user's screen.

The L (screen Lines) option allows the user to change the height of the screen (in lines of characters) that is assumed to be available on the display. This may be particularly helpful when displaying large trees on terminals that have more than 24 lines per screen, or on workstation or X-terminal screens that can emulate the ANSI terminals with more than 24 lines.

After the initial menu is displayed and the choices are made, the program then sets up an initial tree and displays it. Below it will be a one-line menu of possible commands, which looks like this:

NEXT? (Options: R # + - S . T U W O F C H ? X Q) (H or ? for Help)

If you type H or ? you will get a single screen showing a description of each of these commands in a few words. Here are slightly more detailed descriptions:

R
("Rearrange"). This command asks for the number of a node which is to be removed from the tree. It and everything to the right of it on the tree is to be removed (by breaking the branch immediately below it). The command also asks for the number of a node below which that group is to be inserted. If an impossible number is given, the program refuses to carry out the rearrangement and asks for a new command. The rearranged tree is displayed: it will often have a different number of steps than the original. If you wish to undo a rearrangement, use the Undo command, for which see below.

#
This command, and the +, - and S commands described below, determine which character has its states displayed on the branches of the trees. The initial tree displayed by the program does not show states of characters. When # is typed, the program does not ask the user which character is to be shown but automatically shows the states of the next binary character that is not compatible with the tree (the next character that does not perfectly fit the current tree). The search for this character "wraps around" so that if it reaches the last character without finding one that is not compatible with the tree, the search continues at the first character; if no incompatible character is found the current character is shown, and if no current character is shown then the first character is shown. If the last character has been reached, using + again causes the first character to be shown. The display takes the form of different symbols or textures on the branches of the tree. The state of each branch is actually the state of the node above it. A key of the symbols or shadings used for states 0, 1 and ? is shown next to the tree. State ? means that either state 0 or state 1 could exist at that point on the tree, and that the user may want to consider the different possibilities, which are usually apparent by inspection.

+
This command is the same as # except that it goes forward one character, showing the states of the next character. If no character has been shown, using + will cause the first character to be shown. Once the last character has been reached, using + again will show the first character.

-
This command is the same as + except that it goes backwards, showing the states of the previous character. If no character has been shown, using - will cause the last character to be shown. Once character number 1 has been reached, using - again will show the last character.

S
("Show"). This command is the same as + and - except that it causes the program to ask you for the number of a character. That character is the one whose states will be displayed. If you give the character number as 0, the program will go back to not showing the states of the characters.

. (dot)
This command simply causes the current tree to be redisplayed. It is of use when the tree has partly disappeared off of the top of the screen owing to too many responses to commands being printed out at the bottom of the screen.

T
("Try rearrangements"). This command asks for the name of a node. The part of the tree at and above that node is removed from the tree. The program tries to re-insert it in each possible location on the tree (this may take some time, and the program reminds you to wait). Then it prints out a summary. For each possible location the program prints out the number of the node to the right of the place of insertion and the number of steps required in each case. These are divided into those that are better, tied, or worse than the current tree. Once this summary is printed out, the group that was removed is inserted into its original position. It is up to you to use the R command to actually carry out any of the rearrangements that have been tried.

U
("Undo"). This command reverses the effect of the most recent rearrangement, outgroup re-rooting, or flipping of branches. It returns to the previous tree topology. It will be of great use when rearranging the tree and when a rearrangement proves worse than the preceding one -- it permits you to abandon the new one and return to the previous one without remembering its topology in detail.

W
("Write"). This command writes out the current tree onto a tree output file. If the file already has been written to by this run of Dolmove, it will ask you whether you want to replace the contents of the file, add the tree to the end of the file, or not write out the tree to the file. The tree is written in the standard format used by PHYLIP (a subset of the Newick standard). It is in the proper format to serve as the User-Defined Tree for setting up the initial tree in a subsequent run of the program.

O
("Outgroup"). This asks for the number of a node which is to be the outgroup. The tree will be redisplayed with that node as the left descendant of the bottom fork. The number of steps required on the tree may change on re-rooting. Note that it is possible to use this to make a multi-species group the outgroup (i.e., you can give the number of an interior node of the tree as the outgroup, and the program will re-root the tree properly with that on the left of the bottom fork).

F
("Flip"). This asks for a node number and then flips the two branches at that node, so that the left-right order of branches at that node is changed. This does not actually change the tree topology (or the number of steps on that tree) but it does change the appearance of the tree.

C
("Clade"). When the data consist of more than 12 species (or more than half the number of lines on the screen if this is not 24), it may be difficult to display the tree on one screen. In that case the tree will be squeezed down to one line per species. This is too small to see all the interior states of the tree. The C command instructs the program to print out only that part of the tree (the "clade") from a certain node on up. The program will prompt you for the number of this node. Remember that thereafter you are not looking at the whole tree. To go back to looking at the whole tree give the C command again and enter "0" for the node number when asked. Most users will not want to use this option unless forced to.

H
("Help"). Prints a one-screen summary of what the commands do, a few words for each command.

?
("huh?"). A synonym for H. Same as Help command.

X
("Exit"). Exit from program. If the current tree has not yet been saved into a file, the program will ask you whether it should be saved.

Q
("Quit"). A synonym for X. Same as the eXit command.

OUTPUT

If the A option is used, then the program will infer, for any character whose ancestral state is unknown ("?") whether the ancestral state 0 or 1 will give the fewest changes (according to the criterion in use). If these are tied, then it may not be possible for the program to infer the state in the internal nodes, and many of these will be shown as "?". If the A option is not used, then the program will assume 0 as the ancestral state.

When reconstructing the placement of forward changes and reversions under the Dollo method, keep in mind that each polymorphic state in the input data will require one "last minute" reversion. This is included in the counts. Thus if we have both states 0 and 1 at a tip of the tree the program will assume that the lineage had state 1 up to the last minute, and then state 0 arose in that population by reversion, without loss of state 1.

When Dolmove calculates the number of characters compatible with the tree, it will take the F option into account and count the multistate characters as units, counting a character as compatible with the tree only when all of the binary characters corresponding to it are compatible with the tree.
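The counting rule for factored characters can be sketched in Python as follows. This is only an illustration of the rule stated above; the function name is hypothetical, and representing the factors as one label per binary character (with equal labels marking factors of the same multistate character) is an assumption made for the example.

```python
from collections import defaultdict

def count_compatible_units(factor_labels, binary_compatible):
    """Count multistate characters all of whose binary factors fit the tree.

    factor_labels: one label per binary character; equal labels mean
        the binary characters are factors of the same multistate character.
    binary_compatible: one bool per binary character.
    """
    groups = defaultdict(list)
    for label, ok in zip(factor_labels, binary_compatible):
        groups[label].append(ok)
    # A unit counts only if every one of its factors is compatible.
    return sum(1 for flags in groups.values() if all(flags))

# Six binary characters forming three multistate characters (made up).
labels = "111223"
compatible = [True, True, False, True, True, True]
print(count_compatible_units(labels, compatible))  # 2
```

Here the first multistate character is not counted, because one of its three factors is incompatible with the tree.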

ADAPTING THE PROGRAM TO YOUR COMPUTER AND TO YOUR TERMINAL

As we have seen, the initial menu of the program allows you to choose among three screen types (PCDOS, ANSI, and none). If you want to avoid having to make this choice every time, you can change some of the constants in the file phylip.h to have the terminal type initialize itself in the proper way, and recompile. We have tried to have the default values be correct for PC, Macintosh, and Unix screens. If the setting is "none" (which is necessary on Macintosh MacOS 9 screens), the special graphics characters will not be used to indicate character states; only ordinary text characters will be used. This is less easy to look at.

The constants that need attention are ANSICRT and IBMCRT. Currently these are both set to "false" on Macintosh MacOS 9 systems, to "true" on MacOS X and on Unix/Linux systems, and IBMCRT is set to "true" on Windows systems. If your system has an ANSI compatible terminal, you might want to find the definition of ANSICRT in phylip.h and set it to "true", and IBMCRT to "false".

MORE ABOUT THE PARSIMONY CRITERION

Dolmove uses as its numerical criterion the Dollo and polymorphism parsimony methods. The program defaults to carrying out Dollo parsimony.

The Dollo parsimony method was first suggested in print by Le Quesne (1974) and was first well-specified by Farris (1977). The method is named after Louis Dollo since he was one of the first to assert that in evolution it is harder to gain a complex feature than to lose it. The algorithm explains the presence of the state 1 by allowing up to one forward change 0-->1 and as many reversions 1-->0 as are necessary to explain the pattern of states seen. The program attempts to minimize the number of 1-->0 reversions necessary.

The assumptions of this method are in effect:

  1. We know which state is the ancestral one (state 0).
  2. The characters are evolving independently.
  3. Different lineages evolve independently.
  4. The probability of a forward change (0-->1) is small over the evolutionary times involved.
  5. The probability of a reversion (1-->0) is also small, but still far larger than the probability of a forward change, so that many reversions are easier to envisage than even one extra forward change.
  6. Retention of polymorphism for both states (0 and 1) is highly improbable.
  7. The lengths of the segments of the true tree are not so unequal that two changes in a long segment are as probable as one in a short segment.
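As an illustration of the Dollo criterion, here is a minimal Python sketch that counts the 1-->0 reversions for one character on a given rooted tree. It is not the algorithm used in the program. It assumes that the ancestral state is 0 and that there are no polymorphic or unknown states, and it places the single forward change on the branch leading to the smallest group containing all species with state 1. The function names and the nested-tuple tree representation are hypothetical.

```python
def tips(node):
    """List the species names at or above a node."""
    return [node] if isinstance(node, str) else tips(node[0]) + tips(node[1])

def dollo_losses(tree, states):
    """Count 1-->0 reversions for one binary character.

    tree: nested 2-tuples with species names at the tips.
    states: dict mapping species name to '0' or '1'.
    """
    def has_one(node):
        return any(states[t] == "1" for t in tips(node))

    if not has_one(tree):
        return 0  # state 1 never arose
    # Descend to the smallest group that contains every tip in state 1.
    node = tree
    while not isinstance(node, str) and not (has_one(node[0]) and has_one(node[1])):
        node = node[0] if has_one(node[0]) else node[1]

    def losses(n):
        if not has_one(n):
            return 1  # one reversion on the branch leading to this group
        return 0 if isinstance(n, str) else losses(n[0]) + losses(n[1])

    return losses(node)

data = {"Alpha": "110110", "Beta": "110000", "Gamma": "100110",
        "Delta": "001001", "Epsilon": "001110"}
tree = ("Delta", ("Epsilon", ("Gamma", ("Beta", "Alpha"))))
per_character = [dollo_losses(tree, {sp: seq[i] for sp, seq in data.items()})
                 for i in range(6)]
print(per_character, sum(per_character))  # [0, 0, 1, 1, 1, 0] 3
```

The data and tree are the test data set and the tree from the Dollop test output shown earlier in this documentation; the counts agree with the table of reversions in each character given there.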

One problem can arise when using additive binary recoding to represent a multistate character as a series of two-state characters. Unlike the Camin-Sokal, Wagner, and Polymorphism methods, the Dollo method can reconstruct ancestral states which do not exist. An example is given in my 1979 paper. It will be necessary to check the output to make sure that this has not occurred.

The polymorphism parsimony method was first used by me, and the results published (without a clear specification of the method) by Inger (1967). The method was independently published by Farris (1978a) and by me (1979). The method assumes that we can explain the pattern of states by no more than one origination (0-->1) of state 1, followed by retention of polymorphism along as many segments of the tree as are necessary, followed by loss of state 0 or of state 1 where necessary. The program tries to minimize the total number of polymorphic characters, where each polymorphism is counted once for each segment of the tree in which it is retained.

The assumptions of the polymorphism parsimony method are in effect:

  1. The ancestral state (state 0) is known in each character.
  2. The characters are evolving independently of each other.
  3. Different lineages are evolving independently.
  4. Forward change (0-->1) is highly improbable over the length of time involved in the evolution of the group.
  5. Retention of polymorphism is also improbable, but far more probable than forward change, so that we can more easily envisage much polymorphism than even one additional forward change.
  6. Once state 1 is reached, reoccurrence of state 0 is very improbable, much less probable than multiple retentions of polymorphism.
  7. The lengths of segments in the true tree are not so unequal that we can more easily envisage retention events occurring in both of two long segments than one retention in a short segment.

That these are the assumptions of parsimony methods has been documented in a series of papers of mine: (1973a, 1978b, 1979, 1981b, 1983b, 1988b). For an opposing view arguing that the parsimony methods make no substantive assumptions such as these, see the papers by Farris (1983) and Sober (1983a, 1983b), but also read the exchange between Felsenstein and Sober (1986).

Below is a test data set, but we cannot show the output it generates because of the interactive nature of the program.


TEST DATA SET

     5    6
Alpha     110110
Beta      110000
Gamma     100110
Delta     001001
Epsilon   001110

version 3.696

Dolpenny - Branch and bound
to find all most parsimonious trees
for Dollo, polymorphism parsimony criteria

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Dolpenny is a program that will find all of the most parsimonious trees implied by your data when the Dollo or polymorphism parsimony criteria are employed. It does so not by examining all possible trees, but by using the more sophisticated "branch and bound" algorithm, a standard computer science search strategy first applied to phylogenetic inference by Hendy and Penny (1982). (J. S. Farris [personal communication, 1975] had also suggested that this strategy, which is well-known in computer science, might be applied to phylogenies, but he did not publish this suggestion).

There is, however, a price to be paid for the certainty that one has found all members of the set of most parsimonious trees. The problem of finding these has been shown (Graham and Foulds, 1982; Day, 1983) to be NP-complete, which is equivalent to saying that there is no fast algorithm that is guaranteed to solve the problem in all cases (for a discussion of NP-completeness, see the Scientific American article by Lewis and Papadimitriou, 1978). The result is that this program, despite its algorithmic sophistication, is VERY SLOW.

The program should be slower than the other tree-building programs in the package, but useable up to about ten species. Above this it will bog down rapidly, but exactly when depends on the data and on how much computer time you have (it may be more effective in the hands of someone who can let a microcomputer grind all night than for someone who has the "benefit" of paying for time on the campus mainframe computer). IT IS VERY IMPORTANT FOR YOU TO GET A FEEL FOR HOW LONG THE PROGRAM WILL TAKE ON YOUR DATA. This can be done by running it on subsets of the species, increasing the number of species in the run until you either are able to treat the full data set or know that the program will take unacceptably long on it. (Making a plot of the logarithm of run time against species number may help to project run times).
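The suggestion of plotting the logarithm of run time against species number can also be carried out numerically. The Python sketch below fits a least-squares line to the logarithm of the run time and extrapolates it; the function names are hypothetical, and the timings are synthetic values (each extra species triples the time), chosen only to show the calculation and not measured from the program.

```python
import math

def fit_log_runtime(species_counts, seconds):
    """Least-squares line through (n, log(seconds)); returns (intercept, slope)."""
    logs = [math.log(s) for s in seconds]
    n = len(species_counts)
    mean_x = sum(species_counts) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in species_counts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(species_counts, logs))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def projected_seconds(n_species, intercept, slope):
    """Extrapolate the fitted line to a larger number of species."""
    return math.exp(intercept + slope * n_species)

# Synthetic timings, NOT measurements: run time triples with each species.
counts = [5, 6, 7, 8]
times = [0.5, 1.5, 4.5, 13.5]
intercept, slope = fit_log_runtime(counts, times)
print(projected_seconds(10, intercept, slope))  # close to 121.5
```

Real run times depend strongly on the data, so a projection of this kind is only a rough guide.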

The Algorithm

The search strategy used by Dolpenny starts by making a tree consisting of the first two species (the first three if the tree is to be unrooted). Then it tries to add the next species in all possible places (there are three of these). For each of the resulting trees it evaluates the number of losses. It adds the next species to each of these, again in all possible places. If this process were to continue it would simply generate all possible trees, of which there are a very large number even when the number of species is moderate (34,459,425 with 10 species). Actually it does not do this, because the trees are generated in a particular order and some of them are never generated.
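The number of trees quoted above follows from the stepwise addition itself: a rooted bifurcating tree with k-1 species has 2k-3 places where the k-th species can be added (three places for the third species, five for the fourth, and so on), so the number of trees is the product of these. A short Python check, with a hypothetical function name:

```python
def num_rooted_bifurcating_trees(n_species):
    """Number of rooted bifurcating trees with labeled tips."""
    count = 1
    for k in range(3, n_species + 1):
        count *= 2 * k - 3  # places where the k-th species can be added
    return count

for n in (3, 4, 10):
    print(n, num_rooted_bifurcating_trees(n))
# 3 3
# 4 15
# 10 34459425
```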

Actually the order in which trees are generated is not quite as implied above, but is a "depth-first search". This means that first one adds the third species in the first possible place, then the fourth species in its first possible place, then the fifth and so on until the first possible tree has been produced. Its number of steps is evaluated. Then one "backtracks" by trying the alternative placements of the last species. When these are exhausted one tries the next placement of the next-to-last species. The order of placement in a depth-first search is like this for a four-species case (parentheses enclose monophyletic groups):

     Make tree of first two species     (A,B)
          Add C in first place     ((A,B),C)
               Add D in first place     (((A,D),B),C)
               Add D in second place     ((A,(B,D)),C)
               Add D in third place     (((A,B),D),C)
               Add D in fourth place     ((A,B),(C,D))
               Add D in fifth place     (((A,B),C),D)
          Add C in second place     ((A,C),B)
               Add D in first place     (((A,D),C),B)
               Add D in second place     ((A,(C,D)),B)
               Add D in third place     (((A,C),D),B)
               Add D in fourth place     ((A,C),(B,D))
               Add D in fifth place     (((A,C),B),D)
          Add C in third place     (A,(B,C))
               Add D in first place     ((A,D),(B,C))
               Add D in second place     (A,((B,D),C))
               Add D in third place     (A,(B,(C,D)))
               Add D in fourth place     (A,((B,C),D))
               Add D in fifth place     ((A,(B,C)),D)

Among these fifteen trees you will find all of the four-species rooted bifurcating trees, each exactly once (the parentheses each enclose a monophyletic group). As displayed above, the backtracking depth-first search algorithm is just another way of producing all possible trees one at a time. The branch and bound algorithm consists of this with one change. As each tree is constructed, including the partial trees such as (A,(B,C)), its number of losses (or retentions of polymorphism) is evaluated.
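The depth-first generation of trees by adding one species at a time can be sketched in Python as follows. This is an illustration rather than the program's code: the function names and the nested-tuple representation are hypothetical, and the order in which the places are visited is not claimed to match the listing above. It does confirm that four species give fifteen distinct rooted trees.

```python
def placements(tree, species):
    """Yield every tree made by attaching `species` to one branch of `tree`."""
    yield (tree, species)  # attach on the branch below this (sub)tree
    if not isinstance(tree, str):
        left, right = tree
        for new_left in placements(left, species):
            yield (new_left, right)
        for new_right in placements(right, species):
            yield (left, new_right)

def all_trees(species):
    """Depth-first generation of all rooted bifurcating trees."""
    def extend(tree, remaining):
        if not remaining:
            yield tree
            return
        for bigger in placements(tree, remaining[0]):
            yield from extend(bigger, remaining[1:])
    yield from extend((species[0], species[1]), list(species[2:]))

def canonical(tree):
    """Form that ignores the left-right order of branches."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(sorted(canonical(child) for child in tree)) + ")"

trees = list(all_trees("ABCD"))
print(len(trees), len({canonical(t) for t in trees}))  # 15 15
```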

The point of this is that if a previously-found tree such as ((A,B),(C,D)) required fewer losses than a partial tree such as ((A,C),B) already does, then we know that there is no point in even trying to add D to ((A,C),B), since adding a species cannot reduce the number of losses. We have computed the bound that enables us to cut off a whole line of inquiry (in this case five trees) and avoid going down that particular branch any farther.

The branch-and-bound algorithm thus allows us to find all most parsimonious trees without generating all possible trees. How much of a saving this is depends strongly on the data. For very clean (nearly "Hennigian") data, it saves much time, but on very messy data it will still take a very long time.
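The bounding step can be added to a depth-first search in a few lines. The Python sketch below is a simplified illustration under stated assumptions, not the program's algorithm: it adds species in input order rather than in order of apparent promise, it uses the Dollo criterion with ancestral state 0 and no polymorphic or unknown states, and all function names are hypothetical. A partial tree is abandoned as soon as it needs more losses than the best complete tree found so far.

```python
def tips(node):
    return [node] if isinstance(node, str) else tips(node[0]) + tips(node[1])

def dollo_losses(tree, states):
    """1-->0 reversions for one character, with the single gain placed
    below the smallest group holding all tips in state 1."""
    def has_one(n):
        return any(states[t] == "1" for t in tips(n))
    if not has_one(tree):
        return 0
    node = tree
    while not isinstance(node, str) and not (has_one(node[0]) and has_one(node[1])):
        node = node[0] if has_one(node[0]) else node[1]
    def losses(n):
        if not has_one(n):
            return 1
        return 0 if isinstance(n, str) else losses(n[0]) + losses(n[1])
    return losses(node)

def total_losses(tree, data):
    n_chars = len(next(iter(data.values())))
    present = tips(tree)
    return sum(dollo_losses(tree, {sp: data[sp][i] for sp in present})
               for i in range(n_chars))

def placements(tree, species):
    yield (tree, species)
    if not isinstance(tree, str):
        for new_left in placements(tree[0], species):
            yield (new_left, tree[1])
        for new_right in placements(tree[1], species):
            yield (tree[0], new_right)

def canonical(tree):
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(sorted(canonical(c) for c in tree)) + ")"

def branch_and_bound(data):
    names = list(data)
    best = {"score": None, "trees": []}

    def extend(tree, remaining):
        score = total_losses(tree, data)
        if best["score"] is not None and score > best["score"]:
            return  # bound: no completion of this tree can be among the best
        if not remaining:
            if best["score"] is None or score < best["score"]:
                best["score"], best["trees"] = score, []  # restart the list
            best["trees"].append(tree)
            return
        for bigger in placements(tree, remaining[0]):
            extend(bigger, remaining[1:])

    extend((names[0], names[1]), names[2:])
    return best["score"], best["trees"]

data = {"Alpha": "110110", "Beta": "110000", "Gamma": "100110",
        "Delta": "001001", "Epsilon": "001110"}
score, trees = branch_and_bound(data)
print(score, len(trees))
```

Trees that tie the current best are kept rather than cut off, so that all most parsimonious trees are collected. On the five-species test data set used elsewhere in this documentation the smallest number of losses found is 3.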

The algorithm in the program differs from the one outlined here in some essential details: it investigates possibilities in the order of their apparent promise. This applies to the order of addition of species, and to the places where they are added to the tree. After the first two-species tree is constructed, the program tries adding each of the remaining species in turn, each in the best possible place it can find. Whichever of those species adds (at a minimum) the most additional steps is taken to be the one to be added next to the tree. When it is added, it is added in turn to places which cause the fewest additional steps to be added. This sounds a bit complex, but it is done with the intention of eliminating regions of the search of all possible trees as soon as possible, and lowering the bound on tree length as quickly as possible.

The program keeps a list of all the most parsimonious trees found so far. Whenever it finds one that has fewer losses than these, it clears out the list and restarts the list with that tree. In the process the bound tightens and fewer possibilities need be investigated. At the end the list contains all the shortest trees. These are then printed out. It should be mentioned that the program Clique for finding all largest cliques also works by branch-and-bound. Both problems are NP-complete but for some reason Clique runs far faster. Although their worst-case behavior is bad for both programs, those worst cases occur far more frequently in parsimony problems than in compatibility problems.

Controlling Run Times

Among the quantities available to be set at the beginning of a run of Dolpenny, two (howoften and howmany) are of particular importance. As Dolpenny goes along it will keep count of how many trees it has examined. Suppose that howoften is 100 and howmany is 1000, the default settings. Every time 100 trees have been examined, Dolpenny will print out a line saying how many multiples of 100 trees have now been examined, how many steps the most parsimonious tree found so far has, how many trees with that number of steps have been found, and a very rough estimate of what fraction of all trees have been looked at so far.

When the number of these multiples printed out reaches the number howmany (say 1000), the whole algorithm aborts and prints out that it has not found all most parsimonious trees, but prints out what it has got so far anyway. These trees need not be any of the most parsimonious trees: they are simply the most parsimonious ones found so far. By setting the product (howoften X howmany) large you can make the algorithm less likely to abort, but then you risk getting bogged down in a gigantic computation. You should adjust these constants so that the program cannot go beyond examining the number of trees you are reasonably willing to pay for (or wait for). In their initial setting the program will abort after looking at 100,000 trees. Obviously you may want to adjust howoften in order to get more or fewer lines of intermediate notice of how many trees have been looked at so far. Of course, in small cases you may never even reach the first multiple of howoften and nothing will be printed out except some headings and then the final trees.
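The interplay of howoften and howmany can be pictured with a small Python sketch. The class and method names are hypothetical and the wording of the progress line is invented for the example; only the counting logic follows the description above.

```python
class ProgressMonitor:
    """Counts examined trees, reports every `howoften` trees, and signals
    an abort after `howmany` reports (hypothetical helper, not PHYLIP code)."""

    def __init__(self, howoften, howmany):
        self.howoften = howoften
        self.howmany = howmany
        self.examined = 0

    def tree_examined(self, best_steps, n_best):
        """Record one examined tree; return True if the search should stop."""
        self.examined += 1
        if self.examined % self.howoften != 0:
            return False
        reports = self.examined // self.howoften
        print(f"{reports * self.howoften} trees examined, best so far "
              f"{best_steps} steps ({n_best} trees)")
        return reports >= self.howmany

monitor = ProgressMonitor(howoften=100, howmany=1000)
print(monitor.howoften * monitor.howmany)  # 100000 trees before aborting
```

A search loop would call tree_examined once per tree and stop, keeping the trees found so far, as soon as it returns True.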

The indication of the approximate percentage of trees searched so far will be helpful in judging how much farther you would have to go to get the full search. Actually, since that fraction is the fraction of the set of all possible trees searched or ruled out so far, and since the search becomes progressively more efficient, the approximate fraction printed out will usually be an underestimate of how far along the program is, sometimes a serious underestimate.

A constant that affects the result is "maxtrees", which controls the maximum number of trees that can be stored. Thus if "maxtrees" is 25, and 32 most parsimonious trees are found, only the first 25 of these are stored and printed out. If "maxtrees" is increased, the program does not run any slower but requires a little more intermediate storage space. I recommend that "maxtrees" be kept as large as you can, provided you are willing to look at an output with that many trees on it! Initially, "maxtrees" is set to 100 in the distribution copy.
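The bookkeeping for tied trees and the effect of "maxtrees" can be sketched as follows. This is a minimal Python illustration under the assumption that each tree arrives with its count of losses already computed; the function keep_best and the tree labels are hypothetical, and the real program is written in C.

```python
def keep_best(scored_trees, maxtrees=100):
    """Keep the trees tied for the fewest losses, storing at most maxtrees.

    scored_trees yields (tree, losses) pairs in the order they are found.
    """
    best = None
    kept = []
    for tree, losses in scored_trees:
        if best is None or losses < best:
            best = losses          # strictly better: the bound tightens
            kept = [tree]          # clear the list and restart it
        elif losses == best and len(kept) < maxtrees:
            kept.append(tree)      # a tie, stored only while room remains
    return best, kept


# 32 tied trees with maxtrees = 25: only the first 25 are stored.
ties = [("tree%d" % i, 3) for i in range(32)]
best, kept = keep_best(ties, maxtrees=25)
```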

Methods and Options

The counting of the length of trees is done by an algorithm nearly identical to the corresponding algorithms in Dollop, and thus the remainder of this document will be nearly identical to the Dollop document. The Dollo parsimony method was first suggested in print by Le Quesne (1974) and was first well-specified by Farris (1977). The method is named after Louis Dollo since he was one of the first to assert that in evolution it is harder to gain a complex feature than to lose it. The algorithm explains the presence of the state 1 by allowing up to one forward change 0-->1 and as many reversions 1-->0 as are necessary to explain the pattern of states seen. The program attempts to minimize the number of 1-->0 reversions necessary.
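To make the counting concrete, here is a minimal Python sketch of how reversions could be counted for one character on a fixed rooted tree. It is not the algorithm used in the program's C source. It assumes that only states 0 and 1 occur (no "?", "P", or "B"), that the ancestral state is 0, and that the tree is given as nested tuples of species names; the names tips and dollo_losses are hypothetical. The single forward change is placed just below the smallest subtree containing every species with state 1, and each maximal all-0 subtree inside it costs one reversion.

```python
def tips(node):
    """List the species names at or above a node (a name or a tuple of nodes)."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in tips(child)]


def dollo_losses(tree, states):
    """Count 1-->0 reversions for one character; states maps name -> 0 or 1."""
    def has_one(node):
        return any(states[name] == 1 for name in tips(node))

    if not has_one(tree):
        return 0                      # state 1 never arises
    node = tree
    # Walk up to the smallest subtree holding every species with state 1.
    while not isinstance(node, str):
        with_one = [child for child in node if has_one(child)]
        if len(with_one) != 1:
            break
        node = with_one[0]

    def losses(n):
        if not has_one(n):
            return 1                  # a whole all-0 subtree: one reversion
        if isinstance(n, str):
            return 0
        return sum(losses(child) for child in n)

    return losses(node)


# Test data set from this document, scored on the first tree of the test output.
data = {"Alpha1": "110110", "Alpha2": "110110", "Beta1": "110000",
        "Beta2": "110000", "Gamma1": "100110", "Delta": "001001",
        "Epsilon": "001110"}
tree = ("Delta", ("Epsilon", ("Gamma1",
        ("Alpha2", (("Beta2", "Beta1"), "Alpha1")))))
per_char = [dollo_losses(tree, {n: int(s[i]) for n, s in data.items()})
            for i in range(6)]
print(per_char, sum(per_char))   # [0, 0, 1, 1, 1, 0] 3
```

The per-character counts agree with the "reversions in each character" table in the test output, and their sum with the reported total of 3.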

The assumptions of this method are in effect:

  1. We know which state is the ancestral one (state 0).
  2. The characters are evolving independently.
  3. Different lineages evolve independently.
  4. The probability of a forward change (0-->1) is small over the evolutionary times involved.
  5. The probability of a reversion (1-->0) is also small, but still far larger than the probability of a forward change, so that many reversions are easier to envisage than even one extra forward change.
  6. Retention of polymorphism for both states (0 and 1) is highly improbable.
  7. The lengths of the segments of the true tree are not so unequal that two changes in a long segment are as probable as one in a short segment.

That these are the assumptions is established in several of my papers (1973a, 1978b, 1979, 1981b, 1983). For an opposing view arguing that the parsimony methods make no substantive assumptions such as these, see the papers by Farris (1983) and Sober (1983a, 1983b), but also read the exchange between Felsenstein and Sober (1986).

One problem can arise when using additive binary recoding to represent a multistate character as a series of two-state characters. Unlike the Camin-Sokal, Wagner, and Polymorphism methods, the Dollo method can reconstruct ancestral states which do not exist. An example is given in my 1979 paper. It will be necessary to check the output to make sure that this has not occurred.

The polymorphism parsimony method was first used by me, and the results published (without a clear specification of the method) by Inger (1967). The method was published by Farris (1978a) and by me (1979). The method assumes that we can explain the pattern of states by no more than one origination (0-->1) of state 1, followed by retention of polymorphism along as many segments of the tree as are necessary, followed by loss of state 0 or of state 1 where necessary. The program tries to minimize the total number of polymorphic characters, where each polymorphism is counted once for each segment of the tree in which it is retained.

The assumptions of the polymorphism parsimony method are in effect:

  1. The ancestral state (state 0) is known in each character.
  2. The characters are evolving independently of each other.
  3. Different lineages are evolving independently.
  4. Forward change (0-->1) is highly improbable over the length of time involved in the evolution of the group.
  5. Retention of polymorphism is also improbable, but far more probable than forward change, so that we can more easily envisage much polymorphism than even one additional forward change.
  6. Once state 1 is reached, reoccurrence of state 0 is very improbable, much less probable than multiple retentions of polymorphism.
  7. The lengths of segments in the true tree are not so unequal that we can more easily envisage retention events occurring in both of two long segments than one retention in a short segment.

That these are the assumptions of parsimony methods has been documented in a series of papers of mine: (1973a, 1978b, 1979, 1981b, 1983b, 1988b). For an opposing view arguing that the parsimony methods make no substantive assumptions such as these, see the papers by Farris (1983) and Sober (1983a, 1983b), but also read the exchange between Felsenstein and Sober (1986).

The input format is the standard one, with "?", "P", "B" states allowed. Most of the options are selected using a menu:


Penny algorithm for Dollo or polymorphism parsimony, version 3.69
 branch-and-bound to find all most parsimonious trees

Settings for this run:
  P                     Parsimony method?  Dollo
  H        How many groups of  100 trees:  1000
  F        How often to report, in trees:  100
  S           Branch and bound is simple?  Yes
  T              Use Threshold parsimony?  No, use ordinary parsimony
  A                 Use ancestral states?  No
  W                       Sites weighted?  No
  M           Analyze multiple data sets?  No
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4     Print out steps in each character  No
  5     Print states at all nodes of tree  No
  6       Write out trees onto tree file?  Yes

Are these settings correct? (type Y or the letter for one to change)

The P option toggles between the Polymorphism parsimony method and the default Dollo parsimony method.

The options T, A, and M are the usual Threshold, Ancestral States, and Multiple Data Sets options. They are described in the Main documentation file and in the Discrete Characters Programs documentation file.

Options F and H reset the variables howoften (F) and howmany (H). The user is prompted for the new values. By setting these larger the program will report its progress less often (howoften) and will run longer (howmany times howoften). These values default to 100 and 1000, which allows a search of up to 100,000 trees before the program aborts, but these can be changed. Note that option F in this program is not the Factors option available in some of the other programs in this section of the package.

The use of the A option allows implementation of the unordered Dollo parsimony and unordered polymorphism parsimony methods which I have described elsewhere (1984b). When the A option is used the ancestor is not to be counted as one of the species. The O (outgroup) option is not available since the tree produced is already rooted.

Setting T at or below 1.0 but above 0 causes the criterion to become compatibility rather than polymorphism parsimony, although there is no advantage to using this program instead of Penny to do a compatibility method. Setting the threshold value higher brings about an intermediate between the Dollo or polymorphism parsimony methods and the compatibility method, so that there is some rationale for doing that.

Using a threshold value of 1.0 or lower, but above 0, one can obtain a rooted (or, if the A option is used with ancestral states of "?", unrooted) compatibility criterion, but there is no particular advantage to using this program for that instead of MIX. Higher threshold values are of course meaningful and provide intermediates between Dollo and compatibility methods.

The S (Simple) option alters a step in Dolpenny which reconsiders the order in which species are added to the tree. Normally the decision as to what species to add to the tree next is made as the first tree is being constructed; that ordering of species is not altered subsequently. Turning the S option off causes the ordering to be continually reconsidered. This will probably result in a substantial increase in run time, but on some data sets of intermediate messiness it may help. It is included in case it might prove of use on some data sets. The Simple option, in which the ordering is kept the same after being established by trying alternatives during the construction of the first tree, is the default. Continual reconsideration can be selected as an alternative.

The Factors option is not available in this program, as it would have no effect on the result even if that information were provided in the input file.

The output format is also standard. It includes a rooted tree and, if the user selects option 4, a table of the numbers of reversions or retentions of polymorphism necessary in each character. If any of the ancestral states has been specified to be unknown, a table of reconstructed ancestral states is also provided. When reconstructing the placement of forward changes and reversions under the Dollo method, keep in mind that each polymorphic state in the input data will require one "last minute" reversion. This is included in the tabulated counts. Thus if we have both states 0 and 1 at a tip of the tree the program will assume that the lineage had state 1 up to the last minute, and then state 0 arose in that population by reversion, without loss of state 1.

A table is available to be printed out after each tree, showing for each branch whether there are known to be changes in the branch, and what the states are inferred to have been at the top end of the branch. If the inferred state is a "?" there will be multiple equally-parsimonious assignments of states; the user must work these out for themselves by hand.

If the A option is used, then the program will infer, for any character whose ancestral state is unknown ("?") whether the ancestral state 0 or 1 will give the best tree. If these are tied, then it may not be possible for the program to infer the state in the internal nodes, and these will all be printed as ".". If this has happened and you want to know more about the states at the internal nodes, you will find it helpful to use Dolmove to display the tree and examine its interior states, as the algorithm in Dolmove shows all that can be known in this case about the interior states, including where there is and is not ambiguity. The algorithm in Dolpenny gives up more easily on displaying these states.

If option 6 is left in its default state the trees found will be written to a tree file, so that they are available to be used in other programs. If the program finds multiple trees tied for best, all of these are written out onto the output tree file. Each is followed by a numerical weight in square brackets (such as [0.25000]). This is needed when we use the trees to make a consensus tree of the results of bootstrapping or jackknifing, to avoid overrepresenting replicates that find many tied trees.

At the beginning of the program are a series of constants, which can be changed to help adapt the program to different computer systems. Two are the initial values of howoften and howmany, constants "often" and "many". Constant "maxtrees" is the maximum number of tied trees that will be stored.


TEST DATA SET

    7    6
Alpha1    110110
Alpha2    110110
Beta1     110000
Beta2     110000
Gamma1    100110
Delta     001001
Epsilon   001110


TEST SET OUTPUT (with all numerical options turned on)


Penny algorithm for Dollo or polymorphism parsimony, version 3.69
 branch-and-bound to find all most parsimonious trees

 7 species,   6 characters
Dollo parsimony method


Name         Characters
----         ----------

Alpha1       11011 0
Alpha2       11011 0
Beta1        11000 0
Beta2        11000 0
Gamma1       10011 0
Delta        00100 1
Epsilon      00111 0



requires a total of              3.000

    3 trees in all found




  +-----------------Delta     
  !  
--2  +--------------Epsilon   
  !  !  
  +--3  +-----------Gamma1    
     !  !  
     +--6  +--------Alpha2    
        !  !  
        +--1     +--Beta2     
           !  +--5  
           +--4  +--Beta1     
              !  
              +-----Alpha1    


 reversions in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       0   0   1   1   1   0            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

root      2         yes    ..1.. .
  2    Delta        yes    ..... 1
  2       3         yes    ...11 .
  3    Epsilon      no     ..... .
  3       6         yes    1.0.. .
  6    Gamma1       no     ..... .
  6       1         yes    .1... .
  1    Alpha2       no     ..... .
  1       4         no     ..... .
  4       5         yes    ...00 .
  5    Beta2        no     ..... .
  5    Beta1        no     ..... .
  4    Alpha1       no     ..... .





  +-----------------Delta     
  !  
--2  +--------------Epsilon   
  !  !  
  +--3  +-----------Gamma1    
     !  !  
     +--6        +--Beta2     
        !  +-----5  
        !  !     +--Beta1     
        +--4  
           !     +--Alpha2    
           +-----1  
                 +--Alpha1    


 reversions in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       0   0   1   1   1   0            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

root      2         yes    ..1.. .
  2    Delta        yes    ..... 1
  2       3         yes    ...11 .
  3    Epsilon      no     ..... .
  3       6         yes    1.0.. .
  6    Gamma1       no     ..... .
  6       4         yes    .1... .
  4       5         yes    ...00 .
  5    Beta2        no     ..... .
  5    Beta1        no     ..... .
  4       1         no     ..... .
  1    Alpha2       no     ..... .
  1    Alpha1       no     ..... .





  +-----------------Delta     
  !  
--2  +--------------Epsilon   
  !  !  
  +--3  +-----------Gamma1    
     !  !  
     !  !        +--Beta2     
     +--6     +--5  
        !  +--4  +--Beta1     
        !  !  !  
        +--1  +-----Alpha2    
           !  
           +--------Alpha1    


 reversions in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       0   0   1   1   1   0            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

root      2         yes    ..1.. .
  2    Delta        yes    ..... 1
  2       3         yes    ...11 .
  3    Epsilon      no     ..... .
  3       6         yes    1.0.. .
  6    Gamma1       no     ..... .
  6       1         yes    .1... .
  1       4         no     ..... .
  4       5         yes    ...00 .
  5    Beta2        no     ..... .
  5    Beta1        no     ..... .
  4    Alpha2       no     ..... .
  1    Alpha1       no     ..... .



version 3.696

Drawtree and Drawgram

Written by Joseph Felsenstein and James McGill.
© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Drawtree and Drawgram are interactive tree-plotting programs. They read a tree description from a file, let you interactively make various settings, and then either write a plot of the tree to a file in some graphical file format or plot the tree on a laser printer, plotter, or dot matrix printer. In most cases you can preview the resulting tree. This allows you to modify the tree until you like the result, then plot the result. Drawtree plots unrooted trees and Drawgram plots rooted cladograms and phenograms. With a plot to a file whose format is one acceptable to publishers, both programs can produce fully publishable results.

These programs are descended from PLOTGRAM and PLOTREE, written by Christopher Meacham in 1984 and contributed to PHYLIP. I have incorporated his code for fonts and his plotter drivers, and in Drawtree have used some of his code for drawing unrooted trees. In both programs I have also included some plotter driver code by David Swofford, Julian Humphries and George D.F. "Buz" Wilson, to all of whom I am very grateful. Mostly, however, they consist of my own code and that of my programmers. The font files are printable-character recodings of the public-domain Hershey fonts, recoded by Christopher Meacham. The Java interface for the programs was created by Jim McGill.

This document will describe the features common to both programs. The documents for Drawtree and Drawgram describe the particular choices you can make in each of those programs. The Appendix to this documentation file contains some pieces of C code that can be inserted to make the program handle another plotting device -- the plotters by Calcomp.

A Short Introduction

To use Drawtree and Drawgram, you must have

(1)
The executable of the program, including, if necessary, the Java archive file for the Java interface version.

(2)
A tree file. Trees are described in the nested-parenthesis Newick notation used throughout PHYLIP and standardized in an informal meeting of program authors in Durham, New Hampshire in June, 1986. Trees for both programs may be either bifurcating or multifurcating, and may either have or not have branch lengths. Tree files produced by the PHYLIP programs are in this form. There is further description of the tree file format later in this document.

(3)
In some cases you may also need a font file. However, if you are using the version of Drawgram or Drawtree that has a Java front end, you do not need these font files and can skip this item. There are six font files distributed with PHYLIP: these consist of three Roman, two Italic, and one Cyrillic font, all from the public-domain Hershey Fonts, and in Text Only (ASCII/ISO) readable form. These are, respectively, files font1 through font6. The programs are set up to read a font file named fontfile. This does not exist, so the program may ask you for the font file name (in which case you simply need to type one of font1 through font6). It may be useful to make a copy of your favorite font file and call that fontfile -- the program will then read it and not ask you for the name of the font file. The details of font representation need not concern you. The six fonts are, respectively, a one- and a two-stroke sans-serif Roman font, a three-stroke serifed Roman font, a two- and a three-stroke serifed Italic font, and a two-stroke Cyrillic font for the Russian language. If this is not clear just try them all. Note that for many printers built-in fonts such as Times-Roman and Courier can be used too. The Hershey fonts were created by Dr. A. V. Hershey of the U. S. National Bureau of Standards in the late 1960s. They may be freely distributed except that they may not be distributed in the original format used by the U. S. National Technical Information Service. Our format is different from the NTIS one. See Appendix 2, below, if you need a detailed discussion of the format.

(4)
A way to view the final plot file on your computer. These days this is widely available. Or you need a plotting device such as a laser printer. You also need some way of viewing the preview of the plot, if you want to make changes in the settings for plotting the tree. The Java front end for Drawgram or Drawtree provides the previews. The programs work with most printers available today. In particular they can produce plot files in many major formats, including Postscript, and they work with Postscript-compatible laser printers and with laser printers compatible with the PCL printer language also found in Hewlett-Packard laser printers and inkjet printers. The Postscript format is used within the PDF format, and it can also be imported into Microsoft Word documents. The programs can also produce the PICT format (which was developed for the MacDraw drawing program on Apple Macintosh systems), the file formats for the freeware X-windows drawing programs xfig and idraw, Windows Bitmap (.BMP) file format, the PCX file format for the PC Paintbrush painting program, and the X Bitmap format for X-windows. Some common raster graphics file formats such as .JPG, .PNG, and .GIF are not supported, but software to convert some of our file formats to these is widely available. Drawgram and Drawtree also support some older and now far less common options such as pen plotters including Hewlett-Packard models, dot matrix printers including models by Epson and Apple, and graphics terminals from DEC and Tektronix. They also support the graphics file format for the free ray-tracing (3-dimensional rendering) programs POV and rayshade, and, strangest and most wonderful of all, the Virtual Reality Markup Language (VRML) which is a file format that is used by some freely-available virtual reality programs. You can choose the plotting devices from a menu at run time. There are places in the source code for the program where you can insert code for a new plotter, should you want to do that.

Once you have all these pieces, the programs should be fairly self explanatory, particularly if you can preview your plots so that you can discover the meaning of the different options by trying them out.

Running the programs

Previewing the plot

Once you have an executable version of the appropriate program (say Drawgram), and a file called (say) intree with the tree in it, and if necessary a font file (say font2 which you have copied as a file called fontfile), all you do is run the Drawgram program. It should automatically read the tree file and any font file needed, and will allow you to change the graphics device. Then it will let you see the options it has chosen, and ask you if you want to change these. Once you have modified those that you want to, you can tell it to accept those. The version of the program that has a Java interface will then allow you to preview the tree on the computer screen.

Making the final plot in the Java GUI version

After you are done previewing the tree, the program will want to know whether you are ready to plot the tree. In the Java GUI version of the programs, you press the Create Plot File button when you want to produce the final plot.

Making the final plot in the character-mode menu version

In the character-mode menu-driven versions of the programs, options can be changed but previewing does not occur. Plotting will occur after you close the menu by making the Y (yes) choice when you are asked whether you want to accept the plot as is. If you say no, it will once again allow you to change options, as many times as you want. If you say yes, then it will write a file called (say) plotfile. If you then copy this file to your printer or plotter, it should result in a beautifully plotted tree. You may need to change the filename to have the file format recognized by your operating system (for example, you may want to change plotfile to plotfile.ps if the file is in Postscript format).

Altering the figure in a drawing program

If you don't want to print the file immediately, but want to edit the figure first, you should have chosen an output format that is readable by a draw program. Postscript format is readable by drawing programs such as Adobe Illustrator, Canvas, Freehand, and Coreldraw, and can be displayed by the Unix utilities Ghostscript and Ghostview. It can also be imported into word processors such as Microsoft Word as a figure. The PICT format was created for earlier Macintosh drawing programs such as MacDraw, and can be read by some other drawing programs and word processors. A widely-available bitmap drawing editor is GIMP (the GNU Image Manipulation Program). On Windows systems bitmap drawing editors such as Paint can read Windows Bitmap files. We have provided output formats here for Xfig and Idraw drawing programs available on Linux or Unix systems.

Drawing programs can be used to add branch length numbers (something too hard for us to do automatically in these programs) and to make scale bars. Another use is as a way of printing out the trees, as most drawing programs are set up to print out their figures.

Ready to run?

Having read the above, you may be ready to run the program. Below you will find more information about representation of trees in the tree file, on the different kinds of graphics devices supported by this program, and on how to recompile these programs.

Trees

The Newick Standard for representing trees in computer-readable form makes use of the correspondence between trees and nested parentheses, noticed in 1857 by the famous English mathematician Arthur Cayley. If we have this rooted tree:

                         A                 D
                          \         E     /
                           \   C   /     /
                            \  !  /     /
                             \ ! /     /
                        B     \!/     /
                         \     o     /
                          \    !    /
                           \   !   /
                            \  !  /
                             \ ! /
                              \!/
                               o
                               !
                               !

then in the tree file it is represented by the following sequence of printable characters, starting at the beginning of the file:

(B,(A,C,E),D);

The tree ends with a semicolon. Everything after the semicolon in the input file is ignored, including any other trees. The bottommost node in the tree is an interior node, not a tip. Interior nodes are represented by a pair of matched parentheses. Between them are representations of the nodes that are immediately descended from that node, separated by commas. In the above tree, the immediate descendants are B, another interior node, and D. The other interior node is represented by a pair of parentheses, enclosing representations of its immediate descendants, A, C, and E.
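The correspondence between trees and nested parentheses can be shown with a short Python sketch. It assumes, purely for illustration, that a tree is held as nested lists in which a string is a tip and a list is an interior node; the function name to_newick is hypothetical.

```python
def to_newick(node):
    """Write a node as its name (tip) or as its parenthesized descendants."""
    if isinstance(node, str):
        return node
    return "(" + ",".join(to_newick(child) for child in node) + ")"


# The rooted tree drawn above: the bottommost node has descendants
# B, another interior node (with descendants A, C, and E), and D.
tree = ["B", ["A", "C", "E"], "D"]
newick = to_newick(tree) + ";"     # the tree ends with a semicolon
print(newick)                      # (B,(A,C,E),D);
```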

Tips are represented by their names. A name can be any string of printable characters except blanks, colons, semicolons, parentheses, and square brackets. In the programs a maximum of 20 characters are allowed for names: this limit can easily be increased by recompiling the program and changing the constant declaration for "MAXNCH" in phylip.h.

Because you may want to include a blank in a name, it is assumed that an underscore character ("_") stands for a blank; any of these in a name will be converted to a blank when it is read in. Any name may also be empty: a tree like

(,(,,),);

is allowed. Trees can be multifurcating at any level (while in many of the programs multifurcations of user-defined trees are not allowed or restricted to a trifurcation at the bottommost level, these programs do not make any such restriction).
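The naming rules above can be sketched as a small Python check. This is an illustration only, not code from the programs: the functions name_is_legal and display_name are hypothetical, the limit of 20 mirrors the "MAXNCH" constant mentioned above, and the comma is excluded as an extra assumption because it separates nodes in the tree description.

```python
MAXNCH = 20                       # name length limit described in the text
FORBIDDEN = set(" :;()[]")        # blanks, colons, semicolons, parentheses, brackets
FORBIDDEN.add(",")                # assumption: commas separate nodes, so exclude them


def name_is_legal(name):
    """True if a name (possibly empty) obeys the rules described above."""
    return len(name) <= MAXNCH and all(
        ch.isprintable() and ch not in FORBIDDEN for ch in name)


def display_name(name):
    """Convert underscores to blanks, as is done when a name is read in."""
    return name.replace("_", " ")


print(display_name("sea_lion"))   # sea lion
```

An empty name passes the check, which matches the statement that a tree such as (,(,,),); is allowed.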

Branch lengths can be incorporated into a tree by putting a real number, with or without decimal point, after a node and preceded by a colon. This represents the length of the branch immediately below that node. Thus the above tree might have lengths represented as:

(B:6.0,(A:5.0,C:3.0,E:4.0):5.0,D:11.0);

These programs will be able to make use of this information only if lengths exist for every branch, except the one at the bottom of the tree.
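A reader for this subset of the notation, together with the check that every branch except the bottom one carries a length, might look like the following Python sketch. It is not the parser used in Drawgram or Drawtree; the names parse_newick and lengths_complete and the (name, length, children) tuples are assumptions made for the example. Unlike the programs, it keeps any interior node names it meets.

```python
def parse_newick(text):
    """Parse the first tree in text into nested (name, length, children) tuples."""
    end = text.find(";")
    if end < 0:
        raise ValueError("tree must end with a semicolon")
    # Everything after the semicolon is ignored; blanks and newlines are dropped.
    text = "".join(text[:end + 1].split())
    pos = 0

    def node():
        nonlocal pos
        children = []
        if text[pos] == "(":                  # interior node
            pos += 1
            children.append(node())
            while text[pos] == ",":
                pos += 1
                children.append(node())
            if text[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1
        start = pos
        while text[pos] not in ",():;":       # the name, possibly empty
            pos += 1
        name = text[start:pos].replace("_", " ")
        length = None
        if text[pos] == ":":                  # optional branch length
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return (name, length, children)

    root = node()
    if text[pos] != ";":
        raise ValueError("unexpected text before the semicolon")
    return root


def lengths_complete(root):
    """True if every branch except the one below the root has a length."""
    def ok(n, is_root):
        _, length, children = n
        if not is_root and length is None:
            return False
        return all(ok(child, False) for child in children)
    return ok(root, True)


tree = parse_newick("(B:6.0,(A:5.0,C:3.0,E:4.0):5.0,D:11.0);")
```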

The tree starts on the first line of the file, and can continue to subsequent lines. It is best to proceed to a new line, if at all, immediately after a comma. Blanks can be inserted at any point except in the middle of a species name or a branch length.

The above description is of a subset of the Newick Standard. For example, interior nodes can have names in that standard, but if any are included the present programs will omit them.

To help you understand this tree representation, here are some trees in the above form:

((raccoon:19.19959,bear:6.80041):0.84600,((sea_lion:11.99700,
seal:12.00300):7.52973,((monkey:100.85930,cat:47.14069):20.59201,
weasel:18.87953):2.09460):3.87382,dog:25.46154);

(Bovine:0.69395,(Gibbon:0.36079,(Orang:0.33636,(Gorilla:0.17147,(Chimp:0.19268, Human:0.11927):0.08386):0.06124):0.15057):0.54939,Mouse:1.21460);

(Bovine:0.69395,(Hylobates:0.36079,(Pongo:0.33636,(G._Gorilla:0.17147, (P._paniscus:0.19268,H._sapiens:0.11927):0.08386):0.06124):0.15057):0.54939, Rodent:1.21460);

();

((A,B),(C,D));

(Alpha,Beta,Gamma,Delta,,Epsilon,,,);

The Newick standard is based on a standard invented by Christopher Meacham for his programs PLOTREE and PLOTGRAM. The Newick Standard was adopted June 26, 1986 by an informal committee that met during the Society for the Study of Evolution meetings in Durham, New Hampshire and consisted of James Archie, William H.E. Day, Wayne Maddison, Christopher Meacham, F. James Rohlf, David Swofford, and myself. A web page describing it will be found at http://evolution.gs.washington.edu/phylip/newicktree.html.

Plot file formats

When the programs run they have a menu which allows you to set (on its option P) the final plotting device, and another menu which allows you to set the type of preview screen. The choices for previewing are a subset of those available for plotting, and they can be different (the most useful combination will usually be a previewing graphics screen with a hard-copy plotter or a drawing program graphics file format).

In the Java interface the "Final plot file type" menu gives you the choices

In the non-Java menu-driven interface the plotting device menu looks like this:

Which plotter or printer will the tree be drawn on?
(many other brands or models are compatible with these)

   type:       to choose one compatible with:

        L         Postscript printer file format
        M         PICT format (for drawing programs)
        J         HP Laserjet PCL file format
        W         MS-Windows Bitmap
        F         FIG 2.0 drawing program format
        A         Idraw drawing program format
        Z         VRML Virtual Reality Markup Language file
        P         PCX file format (for drawing programs)
        K         TeKtronix 4010 graphics terminal
        X         X Bitmap format
        V         POVRAY 3D rendering program file
        R         Rayshade 3D rendering program file
        H         Hewlett-Packard pen plotter (HPGL file format)
        D         DEC ReGIS graphics (VT240 terminal)
        E         Epson MX-80 dot-matrix printer
        C         Prowriter/Imagewriter dot-matrix printer
        O         Okidata dot-matrix printer
        B         Houston Instruments plotter
        U         other: one you have inserted code for
 Choose one:

Here are the choices, with some comments on each:

Postscript printer file format. This means that the program will generate a file containing Postscript commands as its plot file. This can be printed on any Postscript-compatible laser printer, and can be incorporated into Microsoft Word documents or into PDF documents. The page size is assumed to be 8.5 by 11 inches, but as plotting is within this limit A4 metric paper should work well too. This is the best quality output option. For this printer the menu options in Drawgram and Drawtree that allow you to select one of the built-in fonts will work. The programs default to Times-Roman when this plotting option is in effect. I have been able to use fonts Courier, Times-Roman, and Helvetica. The others have eluded me for some reason known only to those who really understand Postscript. The font name is written into the file, so any name that works there is possible.

PICT format (for drawing programs). This file format is read by many drawing programs (an early example was MacDraw). It has support for some fonts, though if fonts are used the species names can only be drawn horizontally or vertically, not at other angles in between. The control over line widths is a bit rough also, so that some lines at different angles may turn out to be different widths when you do not want them to be. If you are working on a Mac OS X system and have not been able to persuade it to print a Postscript file, even after adding a .ps extension to the file name, this option may be the best solution, as you could then read the file into a drawing program and then order it to print the resulting screen. The PICT file format has font support, and the default font for this plotting option is set to Times. You can also choose font attributes for the labels such as Bold, Italic, Outline, and Shadowed. PICT files can be read and supported by various drawing programs, but support for it in Adobe Photoshop has recently been dropped. It has been replaced by PDF format as the default graphics file format in Mac OS X.

HP Laserjet PCL file format. Hewlett-Packard's extremely popular line of laser printers has been emulated by many other brands of laser printer, so that for many years this format was compatible with many printers. It was also the default format for many inkjet printers. More recently almost all of these printers have support for the Postscript format, and their support for the PCL format may ultimately disappear. One limitation of the early versions of the PCL command language for these printers was that they did not have primitive operations for drawing arbitrary diagonal lines. This means that they must be treated by these programs as if they were dot matrix printers with a great many dots. This makes output files large, and output can be slow. The user will be asked to choose the dot resolution (75, 150, or 300 dots per inch). The 300 dot per inch setting should not be used if the laser printer's memory is less than 512k bytes. The quality of output is also not as good as it might be so that the Postscript file format will usually produce better results even at the same resolution. I am grateful to Kevin Nixon for inadvertently assisting me by pointing out that on Laserjets one does not have to dump the complete bitmap of a page to plot a tree.

MS-Windows Bitmap. This file format is used by most Windows drawing and paint programs, including Windows Paint which comes with the Windows operating system. It asks you to choose the height and width of the graphic image in pixels. For the moment, the image is set to be a monochrome image which can only be black or white. We hope to change that soon, but note that by pasting the image into a copy of Paint that is set to have a color image of the appropriate size, one can get a version whose color can be changed. Note also that large enough Windows Bitmap files can be used as "wallpaper" images for the background of a desktop.

FIG 2.0 drawing program format. This is the file format of the free drawing program Xfig, available for X-windows systems on Unix or Linux systems. Xfig can be downloaded from these places:

Xfig but may draw them with lines. This often makes the names look rather bumpy. We hope to change this soon.

Idraw drawing program format. Idraw is a free drawing program for X windows systems (such as Unix and Linux systems). Its interface is loosely based on MacDraw, and I find it much more useable than Xfig (almost no one else seems to agree with me). Though it was unsupported for a number of years, it has more recently been actively supported by Scott Johnston, of Vectaport, Inc. (http://www.vectaport.com). He has produced, in his ivtools package, a number of specialized versions of Idraw, and he also distributes the original Idraw as part of it. ivtools is available as a package on the Debian family of Linux distributions, as packages ivtools-bin, libiv-unidraw1, libiv1 and (for development) ivtools-dev. Thus on a Debian-family Linux system such as Ubuntu Linux, Linux Mint, etc. you may simply need to type:

sudo apt-get install ivtools-bin
sudo apt-get install libiv-unidraw1
sudo apt-get install libiv1
in order to install Ivtools.

The Idraw file format that our programs produce can be read into Idraw, and also can be imported into the other Ivtools programs such as Drawtool. The file format saved from Idraw (or which can be exported from the other Ivtools programs) is Postscript, and if one does not print directly from Idraw one can simply send the file to the printer. But the format that we produce is missing some of the header information and will not work directly as a Postscript file. However if you read it into Idraw and then save it (or import it into one of the other Ivtools drawing programs such as Drawtool, and then export it) you will get a Postscript version that is fully useable.

Drawgram and Drawtree have font support in their Idraw file format options. The default font is Times-Bold but you can also enter the name of any other font that is supported by your Postscript printer. Idraw labels can be rotated to any angle. Some of these fonts are directly supported by the Idraw program. There is also a way to install new Postscript Type 1 fonts in the Ivtools programs.

Note that the Idraw drawing program from Ivtools is not related to the drawing program iDraw, which is produced by Indeeo, Inc.

VRML Virtual Reality Markup Language file. This is by far the most interesting plotting file format. VRML files describe objects in 3-dimensional space with lighting on them. A number of freely available "virtual reality browsers" or browser plugins can read VRML files. A list of available virtual reality browsers and browser plugins can be found at http://cic.nist.gov/vrml/vbdetect.html, a site that also automatically detects which VRML plugins are appropriate for your web browser. VRML plugins for your web browser or standalone browsers allow you to wander around looking at the tree from various angles, including from behind! I found VRMLView particularly easy to download -- it is distributed as an executable. It is not particularly fast and somewhat mysterious to use (try your mouse buttons). At the moment our VRML output is unsophisticated. The branches are made of tubes, with spheres at their joints. The tree is made of three-dimensional tubes but is basically flat. Names are made of connected tubes (to get this make sure you use a simple default font such as the Hershey font in file font1). This has the interesting effect that if you (virtually) move around and look at the tree from behind, the names will be backwards. VRML itself has been superseded by a standard called X3D (see http://www.web3d.org/), and we will be moving toward X3D support. Fortunately X3D is backwards compatible with VRML. What's next? Trees whose branches stick out in three dimensions? Animated trees whose forks rotate slowly? A video game involving combat among schools of systematists?

PCX file format (for drawing programs). A bitmap format that was formerly much used on the PC platform, this has been largely superseded by the Windows Bitmap (BMP) format, but it is still useful. This file format is simple and is read by many other programs as well. The user must choose one of three resolutions for the file, 640x480, 800x600, or 1024x768. The file is a monochrome paint file. Our PCX format is correct but is not read correctly by versions of Microsoft Paint (PBrush) that are running on systems that have loaded Word97. The version of the Paint utility provided with Windows 7 also does not support the PCX format. The free image manipulation program GIMP (Gnu Image Manipulation Program) is able to read the PCX format.

The plot devices from here on are only available in the non-Java-interface version of the programs:

Tektronix 4010 graphics terminal. The plot file will contain commands for driving the Tektronix series of graphics terminals. Other graphics terminals were compatible with the Tektronix 4010 and its immediate descendants. There are terminal emulation programs for Macintoshes that emulate Tektronix graphics. On workstations with X windows you can use one option of the "xterm" utility to create a Tektronix-compatible window. On Sun workstations there used to be a Tektronix emulator called "tektool" which could be used to view the trees.

X Bitmap format. This produces an X-bitmap for the X Windows system on Unix or Linux systems, which can be displayed on X screens. You will be asked for the size of the bitmap (e.g., 16x16, or 256x256, etc.). This format cannot be printed out without further format conversion but is usable for backgrounds of windows ("wallpaper"). This can be a very bulky format if you choose a large bitmap. The bitmap is a structure that can actually be compiled into a C program (and thus built in to it), if you should have some reason for doing that.
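As an illustration of what "compiled into a C program" means, here is a minimal sketch. It assumes the usual X bitmap layout: two #define lines giving the width and height, followed by an array of bytes in which each row is padded to a whole number of bytes and the least significant bit of each byte is the leftmost dot. The names demo_width, demo_height and demo_bits, and the two functions, are made up for this example; with a real file you would #include the bitmap file itself in place of the three definitions.

```c
#include <assert.h>
#include <stdio.h>

/* A hypothetical 8x8 bitmap in X bitmap layout (a diagonal line).
   A real program would #include the file written by the plotting
   program instead of these three definitions. */
#define demo_width 8
#define demo_height 8
static unsigned char demo_bits[] = {
  0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80
};

/* Return 1 if the dot at column x, row y is set, 0 otherwise. */
int xbm_pixel(const unsigned char *bits, int width, int x, int y)
{
  int bytesperrow = (width + 7) / 8;   /* rows are padded to whole bytes */

  assert(x >= 0 && x < width && y >= 0);
  return (bits[y * bytesperrow + x / 8] >> (x % 8)) & 1;
}

/* Print the bitmap as text, '#' for a set dot and '.' for a clear one. */
void xbm_print(FILE *out, const unsigned char *bits, int width, int height)
{
  int x, y;

  for (y = 0; y < height; y++) {
    for (x = 0; x < width; x++)
      putc(xbm_pixel(bits, width, x, y) ? '#' : '.', out);
    putc('\n', out);
  }
}
```

Calling xbm_print(stdout, demo_bits, demo_width, demo_height) from your own main function prints the picture one row per line.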

POVRAY 3D rendering program file. This produces a file for the free ray-tracing program POVRay (Persistence of Vision Raytracer), which is available at http://www.povray.org/. It shows a tree floating above a flat landscape. The tree is flat but made out of tubes (as are the letters of the species names). It casts a realistic shadow across the landscape, lit from over the left shoulder of the viewer. You will be asked to confirm the colors of the tree branches, the species names, the background, and the bottom plane. These default to Blue, Yellow, White, and White respectively.

Rayshade 3D rendering program file. The input format for the free ray-tracing program "rayshade" which is available at http://www-graphics.stanford.edu/~cek/rayshade/rayshade.html for many kinds of systems. Rayshade takes files of this format and turns them into color scenes in "raw" raster format (also called "MTV" format after a raytracing program of that name). If you get the Netpbm package (available from http://netpbm.sourceforge.net/projects/netpbm/) and compile it on your system you can use the "mtvtoppm" and "ppmtogif" programs to convert this into the widely-used GIF raster format (the Netpbm package will also allow you to convert into tiff, pcx and many other formats). The resultant image will show a tree floating above a landscape, rendered in a real-looking 3-dimensional scene with shadows and illumination. It is possible to use Rayshade to make two scenes that together are a stereo pair. When producing output for Rayshade you will be asked by Drawgram or Drawtree whether you want to reset the values for the colors you want for the tree, the species names, the background, and the desired resolution.

Hewlett-Packard pen plotter (HPGL file format). This means that the program will generate a file as its plot file which uses the HPGL graphics language. Hewlett-Packard 7470, 7475, and many other plotters are compatible with this. The paper size is again assumed to be 8.5 by 11 inches (again, A4 should work well too). It is assumed that there are two pens, a finer one for drawing names, and the HPGL commands will call for switching between these. Few people have HP plotters these days, but recent Hewlett-Packard printers can emulate an HP plotter, as this feature is included in the PCL5 command language (but not in the PCL4 command language of earlier Hewlett-Packard models).

DEC ReGIS graphics (VT240 terminal). The DEC ReGIS standard is used by the VT240 and VT340 series terminals by DEC (Digital Equipment Corporation). There used to be many graphics terminals that emulate the VT240 or VT340 as well. The DECTerm windows in many versions of Digital's (now Compaq's) DECWindows windowing system also did so. These days DEC ReGIS graphics is rarely seen: it is most likely to be encountered as an option in X11 Xterm windows.

Epson MX-80 dot-matrix printer. This file format is for the dot-matrix printers by Epson (starting with the MX80 and continuing on to many other models), as well as the IBM Graphics printers. The code here plots in double-density graphics mode. Many of the later models are capable of higher-density graphics but not with every dot printed. This density was chosen for reasonably wide compatibility. Many other dot-matrix printers on the market have graphics modes compatible with the Epson printers. I cannot guarantee that the plot files generated by these programs would be compatible with all of these, but they do work on Epsons. They have also worked, in our hands, on IBM Graphics Printers. There used to be many printers that claimed compatibility with these too, but I do not know whether it will work on all of them. If you have trouble with any of these you might consider adding, in the epson option of procedure initplotter, an fprintf statement that writes to plotfile an escape sequence that changes line spacing. As dot matrix printers are rare these days, used mostly to print multipart receipts in business, I suspect this option will not get much testing.
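A statement of that kind might look like the following sketch. It assumes the Epson-style sequence ESC "A" n, which sets the line spacing to n/72 of an inch; the function names are invented for this example and are not part of the programs, and you should check your own printer's manual for the codes it actually accepts (some models may need a further code before the new spacing takes effect).

```c
#include <assert.h>
#include <stdio.h>

/* Build the three-byte sequence ESC 'A' n into buf and return its length.
   n72nds is the line spacing in units of 1/72 inch (assumed range 0..85). */
int epson_linespacing_seq(unsigned char buf[3], int n72nds)
{
  assert(n72nds >= 0 && n72nds <= 85);
  buf[0] = 27;                      /* ESC */
  buf[1] = 'A';
  buf[2] = (unsigned char)n72nds;
  return 3;
}

/* Write the sequence to the plot file with fprintf. */
void epson_set_linespacing(FILE *plotfile, int n72nds)
{
  unsigned char buf[3];

  epson_linespacing_seq(buf, n72nds);
  fprintf(plotfile, "%c%c%c", buf[0], buf[1], buf[2]);
}
```

For graphics printed eight dots deep at 72 dots per inch, a spacing of 8/72 inch (a call with n72nds equal to 8) would make successive rows of graphics touch without overlapping.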

Prowriter/Imagewriter dot-matrix printer. The trading firm C. Itoh distributed this line of dot-matrix printers, which was made by Tokyo Electric (TEC), now a subsidiary of Toshiba. It was also sold by NEC under the product number PC8023. These were 9-pin dot matrix printers. In a slightly modified form they were also the Imagewriter printer sold by Apple for their Macintosh line. The same escape codes seem to work on both machines, the Apple version being a serial interface version. They are not related to the IBM Proprinter, despite the name.

Okidata dot-matrix printer. The ML81, 82, 83 and ML181, 182, 183 line of dot-matrix printers from Okidata had their own graphics codes and those are dealt with by this option. The later Okidata ML190 series emulated IBM Graphics Printers so that you would not want to use this option for them but the option for that printer.

Houston Instruments plotter. The Houston Instruments line of plotters were also known as Bausch and Lomb plotters. The code in the programs for these has not been tested recently; I would appreciate anyone who tries it out telling me whether it works. I do not have access to such a plotter myself, and doubt most users will come across one.

Conversion from these formats to others is also possible. There is a free program NetPBM that interconverts many bitmap formats (see above under Rayshade). A more accessible option will be the free image manipulation program GIMP (Gnu Image Manipulation Program) which can read our Postscript, Windows Bitmap (.BMP), PCX, and X Bitmap formats and can write many raster and vector formats.

Preview of Plots

In the Java GUI version of Drawgram and Drawtree, the graphics capabilities of Java are used for previewing. The programs actually write a Postscript file called JavaPreview.ps, and each time the preview is displayed this is read in and displayed.

Other problems and opportunities

Another problem is adding labels (such as vertical scales and branch lengths) to the plots produced by this program. This may require you to use the Postscript, BMP, PICT, Idraw, Xfig, or PCX file format and use a draw or paint program to add them. GIMP and Adobe Illustrator can do this.

I would like to add more built-in fonts. The fontfiles now have recoded versions of the Hershey fonts. They are legally publicly distributable. Most other font families on the market are not public domain and I cannot afford to license them for distribution. Some people have noticed that the Hershey fonts, which are drawn by a series of straight lines, have noticeable angles in what are supposed to be curves, when they are printed on modern laser printers and looked at closely. This is less a problem than one might think since, fortunately, when scientific journals print a tree it is usually shrunk so small that these imperfections (and often the tree itself) are hard to see!

One more font that could be added from the Hershey font collection would be a Greek font. If Greek users would find that useful I could add it, but my impression is that they publish mostly in English anyway.

Writing Code for a new Plotter, Printer or File Format

The C code of these programs consists of two C programs, "drawgram.c" and "drawtree.c". Each of these uses two other pieces of C code "draw.c", "draw2.c", plus a common header file, "draw.h". All of the graphics commands that are common to both programs will be found in "draw.c" and "draw2.c". The following instructions for writing your own code to drive a different kind of printer, plotter, or graphics file format, require you only to make changes in "draw.c" and "draw2.c". The two programs can then be recompiled.

If you want to write code for other printers, plotters, or vector file formats, this is not too hard. The plotter option "U" is provided as a place for you to insert your own code. Chris Meacham's system was to draw everything, including the characters in the names and all curves, by drawing a series of straight lines. Thus you need only master your plotter's commands for drawing straight lines. In function "plotrparms" you must set up the values of variables "xunitspercm" and "yunitspercm", which are the number of units in the x and y directions per centimeter, as well as variables "xsize" and "ysize" which are the size of the plotting area in centimeters in the x direction and the y direction. A variable "penchange" of a user-defined type is set to "yes" or "no" depending on whether the commands to change the pen must be issued when switching between plotting lines and drawing characters. Even though dot-matrix printers do not have pens, penchange should be set to "yes" for them. In function "plot" you must issue commands to draw a line from the current position (which is at (xnow, ynow) in the plotter's units) to the position (xabs, yabs), under the convention that the lower-left corner of the plotting area is (0.0, 0.0). In functions "initplotter" and "finishplotter" you must issue commands to initialize the plotter and to finish plotting, respectively. If the pen is to be changed an appropriate piece of code must be inserted in function "penchange". The code to print the text needs to be added to the "plottext" function.
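To make the division of labor among these functions concrete, here is a free-standing sketch for a pen plotter that understands HPGL-style commands (IN, SP, PU, PD). It is not the code in "draw.c": the my_ prefix, the pen_status_t type, the argument lists, and the choice of 400 plotter units per centimeter are all assumptions made for this example, and in the real programs these statements would go into the "other" cases of the existing functions.

```c
#include <assert.h>
#include <stdio.h>

typedef enum { penup, pendown } pen_status_t;   /* hypothetical stand-in */

static double xunitspercm, yunitspercm;  /* plotter units per centimeter */
static double xsize, ysize;              /* plotting area in centimeters */
static long xnow, ynow;                  /* current pen position, in units */

/* Corresponds to the work done in "plotrparms": describe the device. */
void my_plotrparms(void)
{
  xunitspercm = 400.0;   /* assumed: 0.025 mm per plotter unit */
  yunitspercm = 400.0;
  xsize = 25.0;
  ysize = 18.0;
}

/* Corresponds to "initplotter": initialize and select the first pen. */
void my_initplotter(FILE *plotfile)
{
  xnow = 0;
  ynow = 0;
  fprintf(plotfile, "IN;SP1;\n");
}

/* Corresponds to "plot": go from (xnow, ynow) to (xabs, yabs), drawing a
   straight line if the pen is down.  (0, 0) is the lower-left corner. */
void my_plot(FILE *plotfile, pen_status_t pen, double xabs, double yabs)
{
  assert(xabs >= 0.0 && yabs >= 0.0);
  xnow = (long)(xabs + 0.5);    /* round to the nearest plotter unit */
  ynow = (long)(yabs + 0.5);
  fprintf(plotfile, "%s%ld,%ld;\n", pen == pendown ? "PD" : "PU", xnow, ynow);
}

/* Corresponds to "finishplotter": lift the pen, go home, put the pen away. */
void my_finishplotter(FILE *plotfile)
{
  fprintf(plotfile, "PU0,0;SP0;\n");
}
```

A line from (1 cm, 1 cm) to (5 cm, 1 cm) would then be a pen-up call to my_plot at (400, 400) followed by a pen-down call at (2000, 400).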

For dot matrix printers and raster graphics matters are a bit more complex. The functions "plotrparms", "initplotter", "finishplotter" and "plot" still respectively set up the parameters for the plotter, initialize it, finish a plot, and plot one line. But now the plotting consists of drawing dots into a two-dimensional array called "stripe". Once the plot is finished this array is printed out. In most cases the array is not as tall as a full plot: instead it is a rectangular strip across it. When the program has finished drawing in the strip, it prints it out and then moves down the plot to the next strip. For example, for Hewlett-Packard Laserjets we have defined the strip as 2550 dots wide and 20 dots deep. When the program goes to draw a line, it draws it into the strip and ignores any part of it that falls outside the strip. Thus the program does a complete plotting into the strip, then prints it, then moves down the diagram by (in this case) 20 dots, then does a complete plot into that strip, and so on.

To work with a new raster or dot matrix format, you will have to define the desired width of a strip ("strpwide"), the desired depth ("strpdeep"), and how many lines of bytes must be printed out to print a strip ("strpdiv"). Procedure "striprint" is the one that prints out a strip, and has special-case code for the different printers and file formats. For file formats, all of which print out a single row of dots at a time, the variable "strpdiv" is not used. The variable "dotmatrix" is set to "true" or "false" in function "plotrparms" according to whether or not "strpdiv" is to be used. Procedure "plotdot" sets a single dot in the array "stripe" to 1 at position (xabs, yabs). The coordinates run from 1 at the top of the plot to larger numbers as we proceed down the page. Again, there is special-case code for different printers and file formats in that function. You will probably want to read the code for some of the dot matrix or file format options if you want to write code for one of them. Many of them have provision for printing only part of a line, ignoring parts of it that have no dots to print.
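The strip-by-strip scheme can be sketched in a few lines. This is only an illustration of the idea, not the code in "draw.c": the demo_ names, the tiny dimensions, the bit order within a byte, and the use of row and column numbers starting at 0 (rather than 1) are all choices made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define STRPWIDE 16   /* dots across a strip (tiny, for illustration) */
#define STRPDEEP 4    /* rows of dots in one strip */
#define PLOTDEEP 8    /* rows of dots in the whole plot */

static unsigned char stripe[STRPDEEP][STRPWIDE / 8];  /* one bit per dot */
static long striptop;   /* plot row that is the top row of this strip */

/* Set the dot at column xabs, row yabs (rows counted downward), but
   only if it falls inside the current strip; otherwise ignore it. */
void demo_plotdot(long xabs, long yabs)
{
  long row = yabs - striptop;

  if (xabs < 0 || xabs >= STRPWIDE || row < 0 || row >= STRPDEEP)
    return;
  stripe[row][xabs / 8] |= (unsigned char)(0x80 >> (xabs % 8));
}

/* Stand-in for a complete plotting pass: one slanted line of dots. */
void demo_drawpicture(void)
{
  long i;

  for (i = 0; i < PLOTDEEP; i++)
    demo_plotdot(2 * i, i);
}

/* Stand-in for "striprint": write the strip as rows of '#' and '.'. */
void demo_striprint(FILE *out)
{
  long row, x;

  for (row = 0; row < STRPDEEP; row++) {
    for (x = 0; x < STRPWIDE; x++)
      putc((stripe[row][x / 8] & (0x80 >> (x % 8))) ? '#' : '.', out);
    putc('\n', out);
  }
}

/* Do a complete plot into each strip in turn, printing as we go. */
void demo_plotall(FILE *out)
{
  for (striptop = 0; striptop < PLOTDEEP; striptop += STRPDEEP) {
    memset(stripe, 0, sizeof stripe);   /* start with a blank strip */
    demo_drawpicture();
    demo_striprint(out);
  }
}
```

The whole picture is drawn once for every strip, which costs time but means that memory is needed for only one strip rather than for the full page.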

I would be happy to obtain the resulting code from you to consider adding it to this listing so we can cover more kinds of plotters, printers, and file formats.


APPENDIX 1. Code to drive some other graphics devices.

These pieces of code are to be inserted in the places reserved for the "U" (other) plotter option. The variables necessary to run this have already been incorporated into the programs.

Calcomp plotters:

Calcomp's industrial-strength plotters were once a fixture of university computer centers but are rarely found now. Just in case you need to use one, this code should work:

Global declarations needed near the front of drawtree.c (the code below uses indices -1 through 14, so the pointer is offset by one element into a 16-element buffer):

Char cchexbuf[16];
Char *cchex = cchexbuf + 1;

Code to be inserted into function plotrparms:

  case 'U':
    plotter = other;
    xunitspercm = 39.37;
    yunitspercm = 39.37;
    xsize = 25.0;
    ysize = 25.0;
    xposition = 12.5;
    yposition = 0.0;
    xoption = center;
    yoption = above;
    rotation = 0.0;
    break;

Code to be inserted into function plot:

Declare these variables at the beginning of the function:

long n, inc, xinc, yinc, xlast, ylast, xrel,
   yrel, xhigh, yhigh, xlow, ylow;
Char quadrant;

and insert this into the switch statement:

  case other:
    if (penstatus == pendown)
      putc('H', plotfile);
    else
      putc('D', plotfile);
    xrel = (long)floor(xabs + 0.5) - xnow;
    yrel = (long)floor(yabs + 0.5) - ynow;
    xnow = (long)floor(xabs + 0.5);
    ynow = (long)floor(yabs + 0.5);
    if (xrel > 0) {
      if (yrel > 0)
        quadrant = 'P';
      else
        quadrant = 'T';
    } else if (yrel > 0)
      quadrant = 'X';
    else
      quadrant = '1';
    xrel = labs(xrel);
    yrel = labs(yrel);
    if (xrel > yrel)
      n = xrel / 255 + 1;
    else
      n = yrel / 255 + 1;
    xinc = xrel / n;
    yinc = yrel / n;
    xlast = xrel % n;
    ylast = yrel % n;
    xhigh = xinc / 16;
    yhigh = yinc / 16;
    xlow = xinc & 15;
    ylow = yinc & 15;
    for (i = 1; i <= n; i++)
      fprintf(plotfile, "%c%c%c%c%c",
              quadrant, cchex[xhigh - 1], cchex[xlow - 1], cchex[yhigh - 1],
              cchex[ylow - 1]);
    if (xlast != 0 || ylast != 0)
      fprintf(plotfile, "%c%c%c%c%c",
              quadrant, cchex[-1], cchex[xlast - 1], cchex[-1],
              cchex[ylast - 1]);
    break;

Code to be inserted into function initplotter:

  case other:
    cchex[-1] = 'C';
    cchex[0] = 'D';
    cchex[1] = 'H';
    cchex[2] = 'L';
    cchex[3] = 'P';
    cchex[4] = 'T';
    cchex[5] = 'X';
    cchex[6] = '1';
    cchex[7] = '5';
    cchex[8] = '9';
    cchex[9] = '/';
    cchex[10] = '=';
    cchex[11] = '#';
    cchex[12] = '"';
    cchex[13] = '\'';
    cchex[14] = '^';
    xnow = 0.0;
    ynow = 0.0;
    fprintf(plotfile, "CCCCCCCCCC");
    break;

Code to be inserted into function finishplotter:

  case other:
    plot(penup, 0.0, yrange + 50.0);
    break;


Appendix 2. Our Hershey font encoding.

The Hershey fonts were digitized fonts created by Dr. A. V. Hershey in the late 1960s when he was working at the U. S. Naval Weapons Laboratory. They were published in U. S. National Bureau of Standards Special Publication No. 424, distributed by the U. S. National Technical Information Service. Legally, it is possible to freely distribute these fonts in any encoding system except the original one used by the U. S. National Technical Information Service, provided that you acknowledge that the original fonts were produced by Dr. Hershey and published by NBS. This is a somewhat odd restriction, but convenient for us. Chris Meacham developed the software we use to read the Hershey fonts, and it uses a simple coding system that he developed. The original Hershey fonts were transformed by him into this encoding system. Six of them are distributed with PHYLIP: three Roman fonts, one unserifed and two serifed, two Italic fonts, one unserifed and one serifed, and a Russian Cyrillic font.

Each font file consists of groups of lines, one for each character. Here are the lines for character "h" in the font #1 in this encoding:

Ch 608 21 19 28
 -1456 1435 -1445 1748 1949 2249 2448 2545 2535 -12935

The group of lines starts with the letter C (for Character). Then follows the character that this font will draw (in this case "h"). It is the byte which, when read by the computer, signals that character. Then there is the number of this character in the original Hershey fonts (608). This is not used by our software.

The Hershey fonts are drawn on a grid of points as a series of lines. The next three numbers (21, 19, and 28) are the height (21), and two widths (19, and 28, which we don't use). Then comes a new line which shows the individual pen moves. When these are negative, they indicate that the pen is to be up when moving; when they are positive, the pen is to be down. They are integers. The last of them is greater than 10,000 in absolute value, and that is the signal to end after that move.

Each number has a final four digits that give the coordinate to which the pen is to move. These are given as (x,y) coordinates. Thus the first number (-1456) indicates the pen is to be up and the plotting is to move to coordinate (14, 56), which is x = 14, y = 56. Then the pen is put down and moved to (14, 35). This draws a line from (14, 56) to (14, 35), in fact the vertical line that forms the back of the "h". Then the pen is picked up and moved to (14, 45). Then there follow a series of moves with pen down to (17, 48), (19, 49), (22, 49), (24, 48), (25, 45), and finally (25, 35). This draws a series of connected line segments that make the arch and right-hand vertical, ending up at the bottom-right of the character. -12935 then signals a pen-up move to (29, 35). This moves to a point where the next character can start, putting in a little "white space".
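The decoding rule just described can be written down in a few lines. The sketch below follows only the description given here, not the font-reading code in the programs; the penmove structure and the function names are invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
  int pendown;   /* 1 = draw to the point, 0 = move there with the pen up */
  int x, y;      /* grid coordinates taken from the final four digits */
  int last;      /* 1 if this is the final move of the character */
} penmove;

/* Decode one number from the pen-move line of a character. */
penmove decode_move(long code)
{
  penmove m;
  long mag = labs(code);

  m.pendown = (code > 0);      /* negative means pen up */
  m.last = (mag >= 10000);     /* a fifth digit marks the final move */
  mag %= 10000;                /* keep the final four digits */
  m.x = (int)(mag / 100);
  m.y = (int)(mag % 100);
  return m;
}

/* Print the moves of one character, stopping after the final one.
   Returns the number of moves printed. */
int print_moves(FILE *out, const long *codes, int ncodes)
{
  int i;

  assert(codes != NULL);
  for (i = 0; i < ncodes; i++) {
    penmove m = decode_move(codes[i]);

    fprintf(out, "%s to (%d, %d)\n", m.pendown ? "draw" : "move", m.x, m.y);
    if (m.last)
      return i + 1;
  }
  return ncodes;
}
```

Given the ten numbers for "h" shown above, print_moves reports a move to (14, 56), a draw to (14, 35), a move to (14, 45), six draws ending at (25, 35), and the final move to (29, 35).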

As you can see, the coding system is quite simple. Does anyone want to draw us some new fonts to add to our repertoire? I have spared you the Gothic, Old English, and Greek Hershey fonts, but perhaps there are some other nice ones people might want to use.



version 3.696

Drawgram

Written by Joseph Felsenstein and James McGill.
© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Drawgram interactively plots a cladogram- or phenogram-like rooted tree diagram, with many options including orientation of tree and branches, style of tree, label sizes and angles, tree depth, margin sizes, stem lengths, and placement of nodes in the tree. Particularly if you can use your computer to preview the plot, you can very effectively adjust the details of the plotting to get just the kind of plot you want.

To understand the working of Drawgram you should first read the Tree Drawing Programs web page in this documentation.

Java Interface

All Phylip programs will get Java interfaces in the 4.0 release. But under some operating systems there are currently serious problems with Drawgram, so it has received its Java interface early as part of the 3.695 bug fix release. We do not anticipate changing this Java interface substantially in the 3.7 release, but don't be surprised if we do.

This new Java interface supersedes the old character-mode menu interface. PHYLIP also contains versions of Drawgram and Drawtree that have the character-mode menu interface. We have kept these available because PHYLIP is used in many places as part of pipelines driven by scripts. Since these scripts do not usually invoke the preview mode of Drawgram, we have disabled the previewing of tree plotting in Drawgram in this release. Previewing is available in the version of Drawgram that has the interactive Java interface.

The Java interface is different from the previous character-mode menu interface; it calls the C code of Drawgram, which is in a dynamic library. Thus, after the previewing is done, the code producing the final plot file should make plots that are indistinguishable from those produced by previous versions of Drawgram.

Java Menu Interface

The Java Drawgram Interface is a modern GUI. It will run only on a machine that has a recent version of Oracle Java installed. This is not a serious limitation because Java is freeware that is universally available.

When you start the Drawgram Java interface it looks similar to the following, which has been edited to generate the plot which follows:

DrawGram Main Control Screen

It has all the usual GUI functionality: input and output file selectors, drop down menu options, data entry boxes and toggles. "Preview" brings up a nearly WYSIWYG preview window that displays the Postscript plot created by the current settings (the fonts used in the previewing window are not the same, but use Serif, SansSerif, and Monospaced fonts that approximate the PostScript fonts that are used in the output plot):

DrawGram Cat Tree

Each time you select "Preview" another preview window is generated, so that multiple previews can be visible. This allows you to compare various display options. When the plot has been fine tuned, clicking "Create Plot File" writes the Postscript file that generated the last Preview to the plot file specified. Note that if there are multiple preview windows open, the most recent one is the one that shows how the tree in the final plot file will look, since it will be plotted using the most recent settings.

All the functionality in the Java GUI is the same as in the equivalent menu item in the character-mode menu interface. To ease the transition, we have kept the text in the Java GUI as close as possible to the description in the character-mode menu interface. So, for example, "S" in the old interface, which has the description "Tree style", has the counterpart "Tree style" in the new interface. The detailed explanations of each label are found below.

Command Line Interface

To understand the working of Drawgram and Drawtree, you should first read the Tree Drawing Programs web page in this documentation.

The Command Line Interface gives the user access to a huge collection of both display systems and output formats (some of them are historical curiosities at this point, but they still work so there is no reason to remove them). It can also be driven by scripting because it is a command line interface. But, as most users have little experience with command line systems, it is a bit daunting.

As with Drawtree, to run Drawgram you need a compiled copy of the program, a font file, and a tree file. The tree file has a default name of intree. The font file has a default name of "fontfile". If there is no file of that name, the program will ask you for the name of a font file (we provide ones that have the names font1 through font6). Once you decide on a favorite one of these, you could make a copy of it and call it fontfile, and it will then be used by default. Note that the program will get confused if the input tree file has the number of trees on the first line of the file, so that number may have to be removed.

Once these choices have been made you will see the central menu of the program, which looks like this:


Rooted tree plotting program version 3.695

Here are the settings:
 0  Screen type (IBM PC, ANSI):  ANSI
 P       Final plotting device:  Postscript printer
 V           Previewing device:  X Windows display
 H                  Tree grows:  Horizontally
 S                  Tree style:  Phenogram
 B          Use branch lengths:  (no branch lengths available)
 L             Angle of labels:  90.0
 R      Scale of branch length:  Automatically rescaled
 D       Depth/Breadth of tree:  0.53
 T      Stem-length/tree-depth:  0.05
 C    Character ht / tip space:  0.3333
 A             Ancestral nodes:  Centered
 F                        Font:  Times-Roman
 M          Horizontal margins:  1.65 cm
 M            Vertical margins:  2.16 cm

 Y to accept these or type the letter for one to change

These are the settings that control the appearance of the tree, which has already been read in. You can accept them as they are by answering Y to the question and pressing the Return or Enter key. To change a setting, simply type the character corresponding to it. You can also answer N, but the program will then just ask you immediately which setting you want to change.
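Because the menu reads single-character answers, a script can supply them on standard input. The following Python sketch builds such an answer text and passes it to the program. It is an illustration under stated assumptions: the executable name "drawgram" and the function names are hypothetical, and options that ask follow-up questions (or a prompt about an existing plot file) would need additional answer lines that are not modeled here.

```python
import subprocess


def menu_keystrokes(changes):
    """Build the answers for the menu: one line per option letter, then Y to accept."""
    lines = [str(c) for c in changes]
    lines.append("Y")
    return "\n".join(lines) + "\n"


def run_drawgram(changes, program="drawgram"):
    """Run the program with the answers on standard input.

    The executable name is an assumption; adjust it to your installation.
    """
    return subprocess.run(
        [program],
        input=menu_keystrokes(changes),
        text=True,
        capture_output=True,
        check=False,
    )
```

For instance, menu_keystrokes(["H"]) yields the text for toggling the direction of growth once and then accepting the settings.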

For a first run in the Java interface version, you might accept these default values and see what the result looks like.

You can resize the preview window, though you may have to ask the system to redraw the preview to see it at the new window size.

Once you are finished looking at the preview, you will want to specify whether the program should make the final plot or change some of the settings. The possible settings are listed below.

When you are ready to produce the final plot file, you should use the button "Create Plot File" (if you are using the Java interface) or you should type Y (if you are using the character-mode menu). In the Java-interface version, the name of the plot file has been set in the dialog box near the top of the Java window. It defaults to plotfile.ps. In the character-mode menu, the file name defaults to plotfile.

If there is already a file of that name, the program will ask you whether you want to Overwrite the file, Append to the file, or Quit (in the character-mode menu version it also gives the option of writing to a new file whose name you will be asked to supply).

THE OPTIONS

Below I will describe the options one by one; you may prefer to skip reading this unless you are puzzled about one of them.

Postscript Font
(In the character-mode menu version, selection F). Allows you to select the name of the font that you will use for the species names. For each of the plot file formats, this will either choose the Postscript font (if they allow Postscript fonts) or the built-in Hershey font that most closely matches it. Please understand that for plot file formats that lack Postscript font support, you will get one of our five Hershey fonts. The plot file types that allow Postscript fonts are (as far as we know): Postscript, FIG 2.0, and Idraw. In the preview of the tree in the Java-interface version, actual Postscript fonts are always used, but with any plot file type other than these three, the font is replaced by the closest Hershey font. The size of the characters in the species names is scaled according to the character heights you have selected in the menu, whether plotter fonts or the Hershey font are used. Note that for some plotter drivers (in particular FIG 2.0 and PICT) Postscript fonts can be used in the final plot file only if the species labels are horizontal or vertical (at angles of 0 degrees or 90 degrees). Otherwise Hershey fonts will be used.

Tree grows:
(In the character-mode menu version, selection H). Whether the tree grows Horizontally or vertically. The horizontal growth will be from left to right. This option is self explanatory. The other options are designed so that when we switch this direction of growth the tree still looks the same, except for orientation and overall size. This option is toggled, that is, when it is chosen the orientation changes, going back and forth between Vertical and Horizontal. The default orientation is Horizontal.

Style of the tree:
(In the character-mode menu version, selection S). There are six styles possible: Cladogram, Phenogram, Curvogram, Eurogram, Swoopogram, and Circular Tree. These are chosen by the letters C, P, V, E, S and O. These take a little explaining.

In spite of the words "cladogram" and "phenogram", there is no implication of the extent to which you consider these diagrams as being genealogies or phenetic clustering diagrams. The names refer to pictorial style, not your own intended final use for the diagram. The six styles can be described as follows (assuming a vertically growing tree):

Cladogram
nodes are connected to other nodes and to tips by straight lines going directly from one to the other. This gives a V-shaped appearance. The default settings if there are no branch lengths are designed to yield a V-shaped tree with a 90-degree angle at the base.

Phenogram
nodes are connected to other nodes and to tips by a horizontal and then a vertical line. This gives a particularly precise idea of horizontal levels.

Curvogram
nodes are connected to other nodes and to tips by a curve which is one fourth of an ellipse, starting out horizontally and then curving upwards to become vertical. This pattern was suggested by Joan Rudd.

Eurogram
so-called because it is a version of cladogram diagram popular in Europe. Nodes are connected to other nodes and to tips by a diagonal line that goes outwards and goes at most one-third of the way up to the next node, then turns sharply straight upwards and is vertical. Unfortunately it is nearly impossible to guarantee, when branch lengths are used, that the angles of divergence of lines are the same.

Swoopogram
this option connects two nodes or a node and a tip using two curves that are actually each one-quarter of an ellipse. The first part starts out vertical and then bends over to become horizontal. The second part, which is at least two-thirds of the total, starts out horizontal and then bends up to become vertical. The effect is that two lineages split apart gradually, then more rapidly, then both turn upwards.

Circular Tree
This is a style introduced by David Swofford in PAUP*. The tree grows outward from a central point, being essentially a Phenogram style tree in polar coordinates. The tips form a 360-degree circle. The "vertical" lines run outward radially from the center, and the "horizontal" lines are arcs of circles centered on it.
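The idea of a Phenogram drawn in polar coordinates can be illustrated with a short Python sketch. It is not the program's own code: the function name is hypothetical, and the choices of starting angle (zero, pointing right) and of counterclockwise direction are assumptions made only for the illustration.

```python
import math


def to_circular(breadth_fraction, depth):
    """Map a Phenogram position to a point of a Circular Tree.

    breadth_fraction: position across the tips, from 0.0 up to 1.0,
                      spread over the full 360-degree circle.
    depth:            distance from the root, used as the radius.
    """
    angle = 2.0 * math.pi * breadth_fraction
    return depth * math.cos(angle), depth * math.sin(angle)
```

A "vertical" Phenogram line (fixed breadth, varying depth) becomes a radial line, and a "horizontal" line (fixed depth, varying breadth) becomes an arc of a circle around the center.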

You should experiment with these and decide which you want -- it depends very much on the effect you want.

Use branch lengths:
(In the character-mode menu version, selection B). Whether the tree has Branch lengths that are being used in the diagram. If the tree that was read in had a full set of branch lengths, it will be assumed as a default that you want to use them in the diagram, but you can specify that they are not to be used. If the tree does not have a full set of branch lengths then this will be indicated, and if you try to use branch lengths the program will refuse to allow you to do so. Note that when you change option B, the node position option A may change as well.

Angle of labels:
(In the character-mode menu version, selection L). The angle of the Labels. The angle is always calculated relative to a vertical tree, whether the tree is actually horizontal or vertical: if the labels are at an angle of 90 degrees, they run parallel to the direction of tree growth. The default value is 90 degrees. The option allows you to choose any angle from 0 to 90 degrees.

Scale of branch length:
(In the character-mode menu version, selection R). How the branch lengths will be recalculated into distances on the output device. Note that when branch lengths have not been provided, there are implicit branch lengths specified by the type of tree being drawn. This option will toggle back and forth between automatic adjustment of branch lengths so that the diagram will just fit into the margins, and you specifying how many centimeters there will be per unit branch length. This is included so that you can plot different trees to a common scale, showing which ones have longer or shorter branches than others. Note that if you choose too large a value for centimeters per unit branch length, the tree will be so big it will overrun the plotting area and may cause failure of the diagram to display properly. Too small a value will cause the tree to be a nearly invisible dot.
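The two modes of this option can be sketched in Python as follows. This is a simplified illustration, not the program's actual computation: the function names are hypothetical, and it ignores the room taken by labels, stem, and margins.

```python
def branch_scale(max_root_to_tip, available_cm, cm_per_unit=None):
    """Return centimeters per unit branch length.

    With cm_per_unit given, that fixed scale is used (for plotting
    several trees to a common scale). Otherwise the scale is chosen
    so that the deepest tip just reaches the available depth.
    """
    if cm_per_unit is not None:
        return cm_per_unit
    if max_root_to_tip <= 0:
        raise ValueError("tree depth must be positive")
    return available_cm / max_root_to_tip


def overruns(max_root_to_tip, available_cm, scale):
    """True if the tree drawn at this scale is deeper than the plotting area."""
    return max_root_to_tip * scale > available_cm
```

With a tree whose deepest tip is 0.5 units from the root and 20 cm available, automatic rescaling gives 40 cm per unit, while a user-chosen 100 cm per unit would overrun the plotting area.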

Depth/breadth of the tree:
(In the character-mode menu version, selection D). The ratio between the depth and the breadth of the tree. It is initially set near 0.5, to approximate a V-shaped tree, but you may want to try a larger value to get a longer and narrower tree. Depth and breadth are described as if the tree grew vertically, so that depth is always measured from the root to the tips (not including the length of the labels).

Stem length/tree depth:
(In the character-mode menu version, selection T). The length of the sTem of the tree as a fraction of the depth of the tree. You may want to either lengthen the stem or remove it entirely by giving a value of zero.

Character ht/tip space:
(In the character-mode menu version, selection C). The Character height, measured as a fraction of the tip spacing. If the labels are rotated to a shallow angle, the character height will be automatically adjusted in hopes of avoiding collision of labels at different tips. This option allows you to change the size of the labels yourself. On output devices where line thicknesses can be varied, the thickness of the tree lines will automatically be adjusted to be proportional to the character height, which is an additional reason you may want to change character height.

Ancestral nodes:
(In the character-mode menu version, selection A). Controls the positions of the ancestral (interior) nodes. This can greatly affect the appearance of the tree. The vertical positions (these descriptions assume a tree growing vertically) are not under your control except insofar as you specify the use or non-use of branch lengths. If you choose to change this option, you can choose among a number of methods in the selection box (in the Java-interface version of the program). In the character-mode menu version you will be asked:

 Should interior node positions:
 be Intermediate between their immediate descendants,
    Weighted average of tip positions
    Centered among their ultimate descendants
    iNnermost of immediate descendants
 or so that tree is V-shaped
 (type I, W, C, N or V):

The five methods (Intermediate, Weighted, Centered, Innermost, and V-shaped) are different horizontal positionings of the interior nodes. It will be helpful to you to try these out and see which you like best. Intermediate places the node halfway between its immediate descendants (horizontally), Weighted places it closer to that descendant who is closer vertically as well, and Centered centers the node below the horizontal positions of the tips that are descended from that node. You may want to choose that option that prevents lines from crossing each other.
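The difference between Intermediate and Centered can be seen in a small Python sketch. It is an illustration only, with several assumptions: the tree is given as nested lists with tip names as strings, tips are spaced one unit apart, "halfway between its immediate descendants" is taken as the midpoint of the first and last child, and Centered is taken as the midpoint between the leftmost and rightmost descendant tips. The function names are hypothetical.

```python
def place(node, next_tip, method):
    """Return (x, leftmost_tip_x, rightmost_tip_x); tips get x = 0, 1, 2, ..."""
    if isinstance(node, str):
        x = float(next_tip[0])
        next_tip[0] += 1
        return x, x, x
    placed = [place(child, next_tip, method) for child in node]
    left, right = placed[0][1], placed[-1][2]
    if method == "intermediate":
        # Halfway between the immediate descendants.
        x = (placed[0][0] + placed[-1][0]) / 2.0
    elif method == "centered":
        # Centered among the tips ultimately descended from this node.
        x = (left + right) / 2.0
    else:
        raise ValueError(method)
    return x, left, right


def root_position(tree, method):
    """Horizontal position of the root under the chosen method."""
    return place(tree, [0], method)[0]
```

For the tree [["A", ["B", "C"]], "D"], the root lands at 1.875 under Intermediate and at 1.5 under Centered.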

V-shaped is another option, one designed, if there are no branch lengths being used, to yield a V-shaped tree of regular appearance. At the moment it can give somewhat weird trees; we intend to make it better in the next release. With branch lengths it will not necessarily make the tree perfectly V-shaped. "Innermost" is the most unusual option: it chooses a center for the tree, and always places interior nodes below the innermost of their immediate descendants. This leads to a tree that has vertical lines in the center, like a tree with a trunk.

If the tree you are plotting has a full set of lengths, then when it is read in, the node position option is automatically set to "intermediate", which is the setting with the least likelihood of lines in the tree crossing. If it does not have lengths the option is set to "V-shaped". If you change the option which tells the program whether to try to use the branch lengths, then the node position option will automatically be reset to the appropriate one of these defaults. This may be confusing if you do not realise that it is happening.

Margins:
(In the character-mode menu version, selection M). The horizontal and vertical margins in centimeters. You can enter new margins (you enter new values for both horizontal and vertical margins, though these need not be different from the old values). For the moment I do not allow you to specify left and right margins separately, or top and bottom margins separately. In a future release I hope to do so.

Final plot file type
(in the character-mode menu version, menu selection P). This allows you to choose the Plotting device or file format. We have discussed the possible choices in the draw programs documentation web page. In the Java version they are Postscript, PICT, PCL, Windows BMP, FIG 2.0, Idraw, VRML, or PCX. In the character-mode menu version there is a longer list of plot file types.

#
(character-mode menu version only) The number of pages per tree. Defaults to one, but if you need a physically large tree you may want to choose a larger number. For example, to make a big tree for a poster, choose a larger number of pages horizontally and vertically (the program will ask you for these numbers), get out your scissors and paste or tape, and go to work.

0
(character-mode menu version only) This is an option that allows you to change the menu window to emulate an ANSI terminal or an IBM PC terminal. Generally you will not want to change this.

I recommend that you try all of these options (particularly if you can preview the trees). It is of particular use to try combinations of the style of tree (option S) with the different methods of placing interior nodes (option A). You will find that a wide variety of effects can be achieved.

Afterword

I would appreciate suggestions for improvements in Drawgram, but please be aware that the source code is already very large and I may not be able to implement all suggestions.

drawtree
version 3.695

Drawtree

Written by Joseph Felsenstein and James McGill.
© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Drawtree interactively plots an unrooted tree diagram, with many options including orientation of tree and branches, label sizes and angles, and margin sizes. Particularly if you can use your computer screen to preview the plot, you can very effectively adjust the details of the plotting to get just the kind of plot you want.

To understand the working of Drawtree you should first read the Tree Drawing Programs web page in this documentation.

Java Interface

All Phylip programs will get Java interfaces in the 4.0 release. But under some operating systems there are currently serious problems with Drawtree, so it has received its Java interface early as part of the 3.695 bug fix release. We do not anticipate changing this Java interface substantially in the 4.0 release, but don't be surprised if we do.

This new Java interface supersedes the old character-mode menu interface. PHYLIP also contains versions of Drawgram and Drawtree that have the character-mode menu interface. We have kept these available because PHYLIP is used in many places as part of pipelines driven by scripts. Since these scripts do not usually invoke the preview mode of Drawtree, we have disabled the previewing of tree plotting in Drawtree in this release. Previewing is available in the version of Drawtree that has the interactive Java interface.

The Java interface is different from the previous character-mode menu interface; it calls the C code of Drawtree, which is in a dynamic library. Thus, after the previewing is done, the code producing the final plot file should make plots that are indistinguishable from those produced by previous versions of Drawtree.

Java Menu Interface

The Java Drawtree Interface is a modern GUI. It will run only on a machine that has a recent version of Oracle Java installed. This is not a serious limitation because Java is freeware that is universally available.

When you start the Drawtree Java interface it looks similar to the following, which has been edited to generate the plot which follows:

DrawTree Main Control Screen

It has all the usual GUI functionality: input and output file selectors, drop down menu options, data entry boxes and toggles. "Preview" brings up a nearly WYSIWYG preview window that displays the Postscript plot created by the current settings:

DrawTree Cat Tree

Each time you select "Preview" another preview window is generated, so that multiple previews can be visible. This allows you to compare various display options. When the plot has been fine tuned, clicking "Create Plot File" writes the Postscript file that generated the last Preview to the plot file specified. Note that if there are multiple preview windows open, the most recent one is the one that shows how the tree in the final plot file will look, since it will be plotted using the most recent settings.

All the functionality in the Java GUI is the same as in the equivalent menu item in the character-mode menu interface. To ease the transition, we have kept the text in the Java GUI as close as possible to the description in the character-mode menu interface. So, for example, "L" in the old interface, which has the helper message "Angle of labels", maps to "Angle of labels" in the new interface. All the detailed explanations of each label are found below.

Command Line Interface

The Command Line Interface gives the user access to a huge collection of both display systems and output formats (some of them are historical curiosities at this point, but they still work so there is no reason to remove them). It can also be driven by scripting because it is a command line interface. But, as most users have little experience with command line systems, it is a bit daunting.

As with Drawgram, to run Drawtree you need a compiled copy of the program, a font file, and a tree file. The tree file has a default name of intree. The font file has a default name of "fontfile". If there is no file of that name, the program will ask you for the name of a font file (we provide ones that have the names font1 through font5). Once you decide on a favorite one of these, you could make a copy of it and call it fontfile, and it will then be used by default.

Once these choices have been made you will see the central menu of the program, which looks like this:


Unrooted tree plotting program version 3.695

Here are the settings:

 0  Screen type (IBM PC, ANSI)?  ANSI
 P       Final plotting device:  Postscript printer
 B          Use branch lengths:  (no branch lengths available)
 L             Angle of labels:  branch points to Middle of label
 R            Rotation of tree:  90.0
 I     Iterate to improve tree:  Equal-Daylight algorithm
 D  Try to avoid label overlap?  No
 S      Scale of branch length:  Automatically rescaled
 C   Relative character height:  0.3333
 F                        Font:  Times-Roman
 M          Horizontal margins:  1.65 cm
 M            Vertical margins:  2.16 cm
 #           Page size submenu:  one page per tree

 Y to accept these or type the letter for one to change

These are the settings that control the appearance of the tree, which has already been read in. You can accept them as they are by answering Y to the question and pressing the Return or Enter key. To change a setting, simply type the character corresponding to it. You can also answer N, but the program will then just ask you immediately which setting you want to change.

For a first run in the Java interface version you might accept these default values and see what the result looks like.

You can resize the preview window, though you may have to ask the system to redraw the preview to see it at the new window size.

Once you are finished looking at the preview, you will want to specify whether the program should make the final plot or change some of the settings. The possible settings are listed below.

When you are ready to produce the final plot file, you should use the button "Create Plot File" (if you are using the Java interface) or you should type Y (if you are using the character-mode menu). In the Java-interface version, the name of the plot file has been set in the dialog box near the top of the Java window. It defaults to plotfile.ps. In the character-mode menu, the file name defaults to plotfile.

If there is already a file of that name, the program will ask you whether you want to Overwrite the file, Append to the file, or Quit (in the character-mode menu version it also gives the option of writing to a new file whose name you will be asked to supply).

THE OPTIONS

Below I will describe the options one by one; you may prefer to skip reading this unless you are puzzled about one of them.

Postscript Font
(In the character-mode menu version, selection F). Allows you to select the name of the font that you will use for the species names. For each of the plot file formats, this will either choose the Postscript font (if they allow Postscript fonts) or the built-in Hershey font that most closely matches it. Please understand that for plot file formats that lack Postscript font support, you will get one of our five Hershey fonts. The plot file types that allow Postscript fonts are (as far as we know): Postscript, FIG 2.0, and Idraw. In the preview of the tree in the Java-interface version, actual Postscript fonts are always used, but with any plot file type other than these three, the font is replaced by the closest Hershey font. The size of the characters in the species names is scaled according to the character heights you have selected in the menu, whether plotter fonts or the Hershey font are used. Note that for some plotter drivers (in particular FIG 2.0 and PICT) Postscript fonts can be used in the final plot file only if the species labels are horizontal or vertical (at angles of 0 degrees or 90 degrees). Otherwise Hershey fonts will be used.

Use branch lengths
(In the character-mode menu version, selection B). Whether the tree has Branch lengths that are being used in the diagram. If the tree that was read in had a full set of branch lengths, it will be assumed as a default that you want to use them in the diagram, but you can specify that they are not to be used. If the tree does not have a full set of branch lengths then this will be indicated, and if you try to use branch lengths the program will refuse to allow you to do so. Note that there is no way to use negative branch lengths, so Drawtree automatically takes their absolute values, and thus will plot a branch that has length -0.1 as if it has length 0.1.

Angle of labels
(In the character-mode menu version, selection L). The angle of the Labels. Initially the branches connected to the tips will point to the middles of the labels. If you want to change the way the labels are drawn, the program will offer you a choice between Middle, Fixed, Radial, and Along as the ways the angles of the labels are to be determined. If you choose Fixed, you will be asked if you want labels to be at some fixed angle. This can be between 90.0 and -90.0 degrees and you can specify that. You may have to try different angles to find one that keeps the labels from colliding: I have not guarded against this. However there are additional options. Middle has the branch connected to that tip point to the midpoint of the label. It puts the label at a fixed angle of 0. Radial indicates that the labels are all aligned so as to point toward the root node of the tree. Along aligns them to have the same angle as the branch connected to that tip. This is particularly likely to keep the labels from colliding, but it may give a misleading impression that the final branch is long.

Angle of tree
(In the character-mode menu version, selection R). The rotation of the tree. This is initially 90.0 degrees. The angle is read out counterclockwise from the right side of the tree, so that increasing this angle will rotate the tree counterclockwise, and decreasing it will rotate it clockwise. The meaning of this angle is explained further under option A. As you rotate the tree, the appearance (and size) may change, but the labels will not rotate if they are drawn at a Fixed angle.

Arc of tree
(In the character-mode menu version, selection A). The Angle through which the tree is plotted. This is by default 360.0 degrees. The tree is in the shape of an old-fashioned hand fan. The tree fans out from its root node, each of the subtrees being allocated part of this angle, a part proportional to how many tips the subtree contains. If the rotation of the tree is (say) 90.0 degrees (the default under option R), the fan starts at +270 degrees and runs clockwise around to -90 degrees (i.e., it starts at the bottom of the plot and runs clockwise around until it returns to the bottom). Thus the center of the fan runs from the root upwards (which is why we say it is rotated to 90.0 degrees). By changing option R we can change the direction of the fan, and by changing option A we can change the width of the fan without changing its center line. If you want the tree to fan out in a semicircle, a value of a bit greater than 180 degrees would be appropriate, as the tree will not completely fill the fan. Note that using either of the iterative improvement methods mentioned below is impossible if the angle is not 360 degrees.
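The relation between the rotation (option R) and the arc (option A) can be written as a two-line calculation. The Python sketch below only restates the arithmetic of the example above; the function name is hypothetical.

```python
def fan_limits(rotation=90.0, arc=360.0):
    """Return (start, end) angles in degrees of the fan.

    The fan is centered on the rotation angle and extends half of the
    arc to either side, running clockwise from start to end.
    """
    half = arc / 2.0
    return rotation + half, rotation - half
```

With the defaults this gives +270 and -90 degrees, as in the text. With an arc of 180 degrees and the same rotation, the fan runs from 180 degrees clockwise to 0 degrees, a semicircle opening upwards.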

Iterate to improve tree
(In the character-mode menu version, selection I). Whether the tree angles will be Iteratively improved. There are three methods available:

no (Equal Arc)
This method, invented by Christopher Meacham in PLOTREE, the predecessor to this program, starts from the root of the tree and allocates arcs of angle to each subtree proportional to the number of tips in it. This continues as one moves out to other nodes of the tree and subdivides the angle allocated to them into angles for each of that node's dependent subtrees. This method is fast, and never results in lines of the tree crossing. However, it may result in rather large empty areas between subtrees. It is the method used to make a starting tree for all three methods, so that the selection "no" leaves us with this tree and does not improve the tree beyond this.
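The allocation of arcs in proportion to the number of tips can be sketched as a short recursion in Python. This is an illustration of the idea, not the code of Drawtree or PLOTREE: the tree is assumed to be given as nested lists with tip names as strings, only the direction assigned to each tip is computed (branch lengths and node coordinates are left out), and the function names are hypothetical.

```python
def count_tips(node):
    """Number of tips in the subtree."""
    if isinstance(node, str):
        return 1
    return sum(count_tips(child) for child in node)


def equal_arc(node, start, end, out=None):
    """Give each subtree a share of the arc proportional to its tip count.

    Returns a dict mapping each tip name to the middle of its own arc.
    """
    if out is None:
        out = {}
    if isinstance(node, str):
        out[node] = (start + end) / 2.0
        return out
    total = count_tips(node)
    span = end - start
    lo = start
    for child in node:
        hi = lo + span * count_tips(child) / total
        equal_arc(child, lo, hi, out)
        lo = hi
    return out
```

For the tree [["A", "B"], "C", "D"] and an arc from 0 to 360 degrees, the subtree containing A and B receives half of the circle, and the four tips are placed at 45, 135, 225 and 315 degrees.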

Equal-Daylight algorithm
This is the default method for this program. It iteratively improves an initial tree by successively going to each interior node, looking at the subtrees (often there are 3 of them) visible from there, and swinging them so that the arcs of "daylight" visible between them are equal. This is not as fast as Equal Arc but should never result in lines crossing. It gives particularly good-looking trees. It will be described in a future paper by me. This method has also been adopted by David Swofford in his program PAUP*.

n-Body algorithm
This assumes that there are electrical charges located along all the branches, and that they repel each other with a force that varies (as electrical repulsion would) as the inverse square of the distance between them. The tree adjusts its shape until the forces balance. This can be computationally slow, and can result in lines crossing. I find the trees inferior to the Equal-Daylight algorithm, but it is often worth a try.

Maximum Iterations
(Not available in the character-mode menu version). This is for the Equal-Daylight algorithm or the n-Body algorithm. It sets how many passes through the tree will be made when trying to achieve a good placement. The more passes, the greater the accuracy of the solution will be, but the slower the program will run.

Regularize the angles
(in the character-mode menu version, selection G). If iterative improvement is not turned on in option I (so that we are employing the Equal Arc method), this option appears in the menu. It controls whether the angles of lines will be "regularized". Regularization is off by default. It takes the angles of the branches coming out from each node, and changes them so that they are "rounded off". This process (which I will not fully describe) will make the lines vertical if they are close to vertical, horizontal if they are close to horizontal, 45 degrees if they are close to that, and so on. It will lead to a tree in which angles look very regular. The size of angle to which they will round off the angles varies with the number of tips on the tree. You may or may not want that. If you are unhappy with the appearance of the tree when using this option, you could try rotating the angle of the tree slightly, as that may cause some branches to change their angle by a large amount, by having the angles be "rounded off" to a different value.
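The "rounding off" of angles can be pictured with the following Python sketch. The actual rule used by the program is not fully described here, so this is only an assumed simplification: every angle is rounded to the nearest multiple of a fixed step, and the step of 45 degrees is a made-up default (in the program the size of the step varies with the number of tips). The function name is hypothetical.

```python
def regularize(angle, step=45.0):
    """Round an angle in degrees to the nearest multiple of step, within 0 to 360."""
    return (round(angle / step) * step) % 360.0
```

Under this assumption a branch at 88 degrees becomes vertical (90 degrees) and one at 50 degrees becomes 45 degrees. A branch near the boundary between two multiples can jump by a whole step after a slight rotation of the tree, which is the effect mentioned above.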

Try to avoid label overlap
(In the character-mode menu version, selection D). Whether the program tries to avoiD overlap of the labels. We have left this off by default, because it is a rather feeble option that is frequently unsuccessful, and often makes the trees look weird. Nevertheless it may be worth a try.

Branch lengths
(In the character-mode menu version, selection S). On what Scale the branch lengths will be translated into distances on the output device. Note that when branch lengths have not been provided, there are implicit branch lengths of 1.0 per branch. This option will toggle back and forth between automatic adjustment of branch lengths so that the diagram will just fit into the margins, and you specifying how many centimeters there will be per unit branch length. This is included so that you can plot different trees to a common scale, showing which ones have longer or shorter branches than others. Note that if you choose too large a value for centimeters per unit branch length, the tree will be so big it will overrun the plotting area and may cause failure of the diagram to display properly. Too small a value will cause the tree to be a nearly invisible dot.

Relative character height
(In the character-mode menu version, selection C). The Character height, measured as a fraction of a quantity which is the horizontal space available for the tree, divided by one less than the number of tips. You need not worry about exactly what this is: you can always change the value (which is initially 0.3333) to make the labels larger or smaller. On output devices where line thicknesses can be varied, the thickness of the tree lines will automatically be adjusted to be proportional to the character height, which is an additional reason you may want to change character height.
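The quantity described above can be written out directly. The Python sketch below just restates that definition; the function name is hypothetical and the use of centimeters as the unit is an assumption.

```python
def character_height(relative_height, horizontal_space_cm, n_tips):
    """Character height: a fraction of (horizontal space / (number of tips - 1))."""
    if n_tips < 2:
        raise ValueError("need at least two tips")
    return relative_height * horizontal_space_cm / (n_tips - 1)
```

For example, with 18 cm of horizontal space, 10 tips and a relative height of 0.5, the characters would be 1 cm high.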

Scale of branch length
(In the character-mode menu version, selection S). How the branch lengths will be recalculated into distances on the output device. Note that when branch lengths have not been provided, there are implicit branch lengths specified by the type of tree being drawn. In the Java interface version, you can enter how many centimeters there will be per unit branch length. In the character-mode version the selection will toggle back and forth between letting you select the scale and automatically rescaling the tree so that the diagram will just fit into the margins. This is included so that you can plot different trees to a common scale, showing which ones have longer or shorter branches than others. Note that if you choose too large a value for centimeters per unit branch length, the tree will be so big it will overrun the plotting area and may cause failure of the diagram to display properly. Too small a value will cause the tree to be a nearly invisible dot.

Margins:
(In the character-mode menu version, selection M). The horizontal and vertical margins in centimeters. You can enter new margins (you enter new values for both horizontal and vertical margins, though these need not be different from the old values). For the moment I do not allow you to specify left and right margins separately, or top and bottom margins separately. In a future release I hope to do so.

Final plot file type
(in the character-mode menu version, menu selection P). This allows you to choose the Plotting device or file format. We have discussed the possible choices in the draw programs documentation web page. In the Java version they are Postscript, PICT, PCL, Windows BMP, FIG 2.0, Idraw, VRML, or PCX. In the character-mode menu version there is a longer list of plot file types.

#
(character-mode menu version only) The number of pages per tree. Defaults to one, but if you need a physically large tree you may want to choose a larger number. For example, to make a big tree for a poster, choose a larger number of pages horizontally and vertically (the program will ask you for these numbers), get out your scissors and paste or tape, and go to work.

O
(character-mode menu version only) This is an option that allows you to change the menu window to emulate an ANSI terminal or an IBM PC terminal. Generally you will not want to change this.

I recommend that you try all of these options (particularly if you can preview the trees). It is of particular use to try trees with different iteration methods (option I) and with regularization (option G). You will find that a variety of effects can be achieved.

Afterword

I would appreciate suggestions for improvements in Drawtree, but please be aware that the source code is already very large and I may not be able to implement all suggestions.

version 3.696

Factor - Program to factor multistate characters.

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program factors a data set that contains multistate characters, creating a data set consisting entirely of binary (0,1) characters that, in turn, can be used as input to any of the other discrete character programs in this package, except for PARS. Besides this primary function, Factor also provides an easy way of deleting characters from a data set. The input format for Factor is very similar to the input format for the other discrete character programs except for the addition of character-state tree descriptions.

Note that this program has no way of converting an unordered multistate character into binary characters. Fortunately, PARS has joined the package, and it enables unordered multistate characters, in which any state can change to any other in one step, to be analyzed with parsimony.

Factor is really for a different case, that in which there are multiple states related on a "character state tree", which specifies for each state which other states it can change to. That graph of states is assumed to be a tree, with no loops in it.

The first line of the input file should contain the number of species and the number of multistate characters. This first line is followed by the lines describing the character-state trees, one description per line. The species information constitutes the last part of the file. Any number of lines may be used for a single species.

FIRST LINE

The first line is free format with the number of species first, separated by at least one blank (space) from the number of multistate characters, which in turn is separated by at least one blank from the options, if present.

OPTIONS

The options are selected from a menu that looks like this:


Factor -- multistate to binary recoding program, version 3.69

Settings for this run:
  A      put ancestral states in output file?  No
  F   put factors information in output file?  No
  0       Terminal type (IBM PC, ANSI, none)?  (none)
  1      Print indications of progress of run  Yes

Are these settings correct? (type Y or the letter for one to change)

The options particular to this program are:

A
Choosing the A (Ancestors) option toggles on and off the setting that causes a line to be written in the ancestors output file that describes the states of the ancestor as indicated by the character-state tree descriptions (see below). If the ancestral state is not specified by a particular character-state tree, a "?" signifying an unknown character state will be written. The multistate characters are factored in such a way that the ancestral state in the factored data set will always be "0".

F
Choosing the F (Factors) option toggles on and off a setting that will cause a factors output file to be written (its default file name is "factors"). The line in this file will indicate to other programs which factors came from the same multistate character. Of the programs currently in the package only Seqboot, Move, and Dolmove use this information.

CHARACTER-STATE TREE DESCRIPTIONS

The character-state trees are described in free format. The character number of the multistate character is given first followed by the description of the tree itself. Each description must be completed on a single line. Each character that is to be factored must have a description, and the characters must be described in the order that they occur in the input, that is, in numerical order.

The tree is described by listing the pairs of character states that are adjacent to each other in the character-state tree. The two character states in each adjacent pair are separated by a colon (":"). If character fifteen has this character state tree for possible states "A", "B", "C", and "D":

                         A ---- B ---- C
                                |
                                |
                                |
                                D

then the character-state tree description would be

                        15  A:B B:C D:B

Note that either symbol may appear first. The ancestral state is identified, if desired, by putting it "adjacent" to a period. If we wanted to root character fifteen at state C:

                         A <--- B <--- C
                                |
                                |
                                V
                                D

we could write

                      15  B:D A:B C:B .:C

Both the order in which the pairs are listed and the order of the symbols in each pair are arbitrary. However, each pair may only appear once in the list. Any symbols may be used for a character state in the input except the character that signals the connection between two states (in the distribution copy this is set to ":"), ".", and, of course, a blank. Blanks are ignored completely in the tree description so that even B:DA:BC:B.:C or B : DA : BC : B. : C would be equivalent to the above example. However, at least one blank must separate the character number from the tree description.
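The equivalence of these spellings can be checked with a small parser. The following is only a sketch in Python, not Factor's own code: the function name `parse_description` is hypothetical, and it assumes every state is a single character and that ":" is the separator, as in the distribution copy.

```python
def parse_description(line):
    """Split one description line into (character number, set of edges, root)."""
    parts = line.split(None, 1)          # the number, then everything else
    number = int(parts[0])
    body = parts[1].replace(" ", "") if len(parts) > 1 else ""
    if len(body) % 3 != 0:
        raise ValueError("expected a run of state:state pairs")
    edges, root = set(), None
    for k in range(0, len(body), 3):
        a, sep, b = body[k:k + 3]
        if sep != ":":
            raise ValueError("pair is not joined by ':'")
        if "." in (a, b):
            root = b if a == "." else a   # the state "adjacent" to the period
        else:
            edges.add(frozenset((a, b)))  # unordered, so A:B equals B:A
    return number, edges, root


variants = ["15  B:D A:B C:B .:C", "15 B:DA:BC:B.:C", "15 B : DA : BC : B. : C"]
parsed = [parse_description(v) for v in variants]
print(parsed[0] == parsed[1] == parsed[2])
```

Storing each pair as an unordered set is what makes the order of symbols within a pair irrelevant, and collecting the pairs in a set does the same for the order of the pairs.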

DELETING CHARACTERS FROM A DATA SET

If no description line appears in the input for a particular character, then that character will be omitted from the output. If the character number is given on the line, but no character-state tree is provided, then the symbol for the character in the input will be copied directly to the output without change. This is useful for characters that are already coded "0" and "1". Characters can be deleted from a data set simply by listing only those that are to appear in the output.

TERMINATING THE LIST OF TREE DESCRIPTIONS

The last character-state tree description should be followed by a line containing the number "999". This terminates processing of the trees and indicates the beginning of the species information.

SPECIES INFORMATION

The format for the species information is basically identical to that used by the other discrete character programs. The first ten character positions are allotted to the species name (this value may be changed by altering the value of the constant nmlngth at the beginning of the program). The character states follow and may be continued to as many lines as desired. There is no current method for indicating polymorphisms. It is possible to either put blanks between characters or not.

There is a method for indicating uncertainty about states. There is one character value that stands for "unknown". If this appears in the input data then "?" is written out in all the corresponding positions in the output file. The character value that designates "unknown" is given in the constant unkchar at the beginning of the program, and can be changed by changing that constant. It is set to "?" in the distribution copy.

OUTPUT

The first line of output will contain the number of species and the number of binary characters in the factored data set. The factored characters will be written for each species in the format required for input by the other discrete programs in the package. The maximum length of the output lines is 80 characters, but this maximum length can be changed prior to compilation.

If the A (Ancestors) option was chosen, an output file whose default name is ancestors will be written with the ancestors information. If F (Factors) was chosen in the menu, an output file whose default name is factors will be written containing the factors information.

ERRORS

The output should be checked for error messages. Errors will occur in the character-state tree descriptions if the format is incorrect (colons in the wrong place, etc.), if more than one root is specified, if the tree contains loops (and hence is not a tree), and if the tree is not connected, e.g.

                             A:B B:C D:E

describes

                  A ---- B ---- C          D ---- E

This "tree" is in two unconnected pieces. An error will also occur if a symbol appears in the data set that is not in the tree description for that character. Blanks at the end of lines when the species information is continued to a new line will cause this kind of error.
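The structural checks described here (more than one root, loops, unconnected pieces) can be sketched with a union-find pass over the pairs. This is an illustration in Python, not Factor's own code: the function name `tree_problems` and the message strings are hypothetical, and single-character states with ":" as the separator are assumed.

```python
def tree_problems(description):
    """List structural problems in a character-state tree description."""
    body = description.replace(" ", "")
    if len(body) % 3 != 0 or any(body[k + 1] != ":" for k in range(0, len(body), 3)):
        return ["badly formed pair"]

    parent = {}                          # union-find forest over the states

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    problems, roots = [], 0
    for k in range(0, len(body), 3):
        a, b = body[k], body[k + 2]
        if "." in (a, b):
            roots += 1
            find(b if a == "." else a)   # make sure the root state is known
            continue
        ra, rb = find(a), find(b)
        if ra == rb:
            problems.append("loop")      # a and b were already joined
        else:
            parent[ra] = rb              # merge the two pieces
    if roots > 1:
        problems.append("more than one root")
    if len({find(x) for x in list(parent)}) > 1:
        problems.append("not connected")
    return problems


print(tree_problems("A:B B:C D:E"))
print(tree_problems("A:B B:C D:B"))
```

A pair whose two states are already in the same piece closes a loop, and more than one piece remaining at the end means the "tree" is not connected.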

CONSTANTS AVAILABLE TO BE CHANGED

At the beginning of the program a number of constants are available to be changed to accommodate larger data sets. These are "maxstates", "maxoutput", "sizearray", "factchar" and "unkchar". The constant "maxstates" gives the maximum number of states per character (set at 20 in the distribution copy). The constant "maxoutput" gives the maximum width of a line in the output file (80 in the distribution copy). The constant "sizearray" must be at least as large as the sum of squares of the numbers of states in the characters. It is initially set to 2000, so that although 20 states are allowed (at the initial setting of maxstates) per character, there cannot be 20 states in all of 100 characters, since that would require 100 x 20 x 20 = 40000.

Particularly important constants are "factchar" and "unkchar", which are not numerical values but characters. Initially set to the colon ":", "factchar" is the character that will be used to separate states in the input of character state trees. It can be changed by changing this constant. (We could have used a hyphen ("-") but didn't because that would make the minus-sign ("-") unavailable as a character state in +/- characters). The constant "unkchar" is the character value in the input data that indicates that the state is unknown. It is set to "?" in the distribution copy. If your computer is one that lacks the colon ":" in its character set or uses a nonstandard character code such as EBCDIC, you will want to change the constant "factchar".

INPUT AND OUTPUT FILES

The input file for the program has the default file name "infile" and the output file, the one that has the binary character state data, has the name "outfile".

----SAMPLE INPUT-----             -----Comments (not part of input file) -----

   4   6                          4 species; 6 characters
1 A:B B:C                         A ---- B ---- C
2 A:B B:.                         B ---> A
4                                 Character 3 deleted; 4 unchanged
5 0:1 1:2 .:0                     0 ---> 1 ---> 2
6 .:# #:$ #:%                     % <--- # ---> $
999                               Signals end of trees
Alpha     CAW00#                  Species information begins
Beta      BBX01%
Gamma     ABY12#
Epsilon   CAZ01$

---SAMPLE OUTPUT-----             -----Comments (not part of output file) -----

    4    8                        4 species; 8 factors
Alpha     11100000                Chars. 1 and 2 come from old number 1
Beta      10001001                Char. 3 comes from old number 2
Gamma     00011100                Char. 4 is old number 4
Epsilon   11101010                Chars. 5 and 6 come from old number 5
                                  Chars. 7 and 8 come from old number 6
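
The recoding in this sample can be reproduced with a short sketch. It is written in Python for illustration and is not Factor's own code. Two points are assumptions that happen to match the sample output but are not confirmed by the text: each pair in a description becomes one binary factor, in the order the pairs are listed, and when no root is given the first state named is treated as the ancestor. The names `state_codes` and `factor` are hypothetical, and the unknown-state character is not handled.

```python
from collections import deque


def state_codes(description):
    """Map each state to its binary code: 1 for every pair crossed on the way from the root."""
    body = description.replace(" ", "")
    edges, root = [], None
    for k in range(0, len(body), 3):
        a, b = body[k], body[k + 2]
        if "." in (a, b):
            root = b if a == "." else a
        else:
            edges.append((a, b))
    if root is None:
        root = edges[0][0]               # ASSUMPTION: first state named acts as ancestor
    codes = {root: [0] * len(edges)}     # the ancestral state is all zeros
    queue = deque([root])
    while queue:                         # breadth-first walk away from the root
        state = queue.popleft()
        for idx, (a, b) in enumerate(edges):
            if state in (a, b):
                other = b if state == a else a
                if other not in codes:
                    codes[other] = codes[state][:]
                    codes[other][idx] = 1
                    queue.append(other)
    return {s: "".join(map(str, c)) for s, c in codes.items()}


# Descriptions from the sample input; character 3 is absent, character 4 is copied.
trees = {1: "A:B B:C", 2: "A:B B:.", 4: "", 5: "0:1 1:2 .:0", 6: ".:# #:$ #:%"}
species = {"Alpha": "CAW00#", "Beta": "BBX01%", "Gamma": "ABY12#", "Epsilon": "CAZ01$"}


def factor(states):
    out = []
    for number, desc in trees.items():
        symbol = states[number - 1]
        out.append(state_codes(desc)[symbol] if desc else symbol)
    return "".join(out)


for name, states in species.items():
    print(f"{name:<10}{factor(states)}")
```

Because a state's code marks every pair on its path from the ancestor, the ancestral state is always coded as zeros, in line with the description of the A option.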

version 3.696

Fitch -- Fitch-Margoliash and Least-Squares Distance Methods

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program carries out Fitch-Margoliash, Least Squares, and a number of similar methods as described in the documentation file for distance methods.

The options for Fitch are selected through the menu, which looks like this:


Fitch-Margoliash method version 3.69

Settings for this run:
  D      Method (F-M, Minimum Evolution)?  Fitch-Margoliash
  U                 Search for best tree?  Yes
  P                                Power?  2.00000
  -      Negative branch lengths allowed?  No
  O                        Outgroup root?  No, use as outgroup species  1
  L         Lower-triangular data matrix?  No
  R         Upper-triangular data matrix?  No
  S                        Subreplicates?  No
  G                Global rearrangements?  No
  J     Randomize input order of species?  No. Use input order
  M           Analyze multiple data sets?  No
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4       Write out trees onto tree file?  Yes

  Y to accept these or type the letter for one to change

Most of the input options (U, P, -, O, L, R, S, J, and M) are as given in the documentation page for distance matrix programs, and their input format is the same as given there. The U (User Tree) option has one additional feature when the N (Lengths) option is used. The N option appears in the menu only if the U (User Tree) option is selected. If N (Lengths) is set to "Yes" then if any branch in the user tree has a branch length, that branch will not have its length iterated. Thus you can prevent all branches from having their lengths changed by giving them all lengths in the user tree, or hold only one length unchanged by giving only that branch a length (such as, for example, 0.00). You may find program Retree useful for adding and removing branch lengths from a tree. This option can also be used to compute the Average Percent Standard Deviation for a tree obtained from Neighbor, for comparison with trees obtained by Fitch or Kitsch.

The D (methods) option allows choice between the Fitch-Margoliash criterion and the Minimum Evolution method (Kidd and Sgaramella-Zonta, 1971; Rzhetsky and Nei, 1993). Minimum Evolution (not to be confused with parsimony) uses the Fitch-Margoliash criterion to fit branch lengths to each topology, but then chooses topologies based on their total branch length (rather than the goodness of fit sum of squares). There is no constraint on negative branch lengths in the Minimum Evolution method; it sometimes gives rather strange results, as it can favor solutions that have large negative branch lengths, as these reduce the total sum of branch lengths!

Another input option available in Fitch that is not available in Kitsch or Neighbor is the G (Global) search option. After the last species is added to the tree, it causes each possible group to be removed and re-added. This improves the result, since the position of every species is reconsidered. It approximately triples the run-time of the program. It is not an option in Kitsch because it is the default and is always in force there. The O (Outgroup) option is described in the main documentation file of this package. The O option has no effect if the tree is a user-defined tree (if the U option is in effect). The U (User Tree) option requires an unrooted tree; that is, it requires that the tree have a trifurcation at its base:

     ((A,B),C,(D,E));

The output consists of an unrooted tree and the lengths of the interior segments. The sum of squares is printed out, and if P = 2.0 Fitch and Margoliash's "average percent standard deviation" is also computed and printed out. This is the sum of squares, divided by N-2, and then square-rooted and then multiplied by 100:

$$ \mathrm{APSD} = \left( \frac{\mathrm{SSQ}}{N-2} \right)^{1/2} \times 100 $$

where N is the total number of off-diagonal distance measurements that are in the (square) distance matrix. If the S (subreplication) option is in force it is instead the sum of the numbers of replicates in all the non-diagonal cells of the distance matrix. But if the L or R option is also in effect, so that the distance matrix read in is lower- or upper-triangular, then the sum of replicates is only over those cells actually read in. If S is not in force, the number of replicates in each cell is assumed to be 1, so that N is n(n-1), where n is the number of species. The APSD gives an indication of the average percentage error. The number of trees examined is also printed out.
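
As a check on these definitions, the sum of squares and APSD can be recomputed from the test data set and the branch lengths printed in the test output on this page. The sketch below is in Python and is not Fitch's own code; the function names are hypothetical. It assumes P = 2, no subreplication, and a full square matrix, so every off-diagonal cell is counted and N = n(n-1).

```python
import math

names = ["Bovine", "Mouse", "Gibbon", "Orang", "Gorilla", "Chimp", "Human"]
obs = [  # observed distances, copied from the test data set
    [0.0000, 1.6866, 1.7198, 1.6606, 1.5243, 1.6043, 1.5905],
    [1.6866, 0.0000, 1.5232, 1.4841, 1.4465, 1.4389, 1.4629],
    [1.7198, 1.5232, 0.0000, 0.7115, 0.5958, 0.6179, 0.5583],
    [1.6606, 1.4841, 0.7115, 0.0000, 0.4631, 0.5061, 0.4710],
    [1.5243, 1.4465, 0.5958, 0.4631, 0.0000, 0.3484, 0.3083],
    [1.6043, 1.4389, 0.6179, 0.5061, 0.3484, 0.0000, 0.2692],
    [1.5905, 1.4629, 0.5583, 0.4710, 0.3083, 0.2692, 0.0000],
]
branches = [  # (node, node, length), copied from the test output
    ("1", "Mouse", 0.76985), ("1", "2", 0.41983), ("2", "3", 0.04986),
    ("3", "4", 0.02121), ("4", "5", 0.03695), ("5", "Human", 0.11449),
    ("5", "Chimp", 0.15471), ("4", "Gorilla", 0.15680), ("3", "Orang", 0.29209),
    ("2", "Gibbon", 0.35537), ("1", "Bovine", 0.91675),
]


def path_lengths(start):
    """Distance along the tree from `start` to every other node."""
    adj = {}
    for a, b, length in branches:
        adj.setdefault(a, []).append((b, length))
        adj.setdefault(b, []).append((a, length))
    dist, stack = {start: 0.0}, [start]
    while stack:
        node = stack.pop()
        for nxt, length in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + length
                stack.append(nxt)
    return dist


def sum_of_squares(power=2.0):
    """Sum over all off-diagonal cells of (Obs - Exp)^2 / Obs^P."""
    ss = 0.0
    for i, a in enumerate(names):
        expected = path_lengths(a)
        for j, b in enumerate(names):
            if i != j:
                ss += (obs[i][j] - expected[b]) ** 2 / obs[i][j] ** power
    return ss


n = len(names)
ss = sum_of_squares()
apsd = math.sqrt(ss / (n * (n - 1) - 2)) * 100.0
print(round(ss, 5), round(apsd, 3))
```

The values should come out close to the 0.01375 and 1.85418 shown in the test output; small differences are expected because the branch lengths are printed to only five decimals.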

The constants available for modification at the beginning of the program are: "smoothings", which gives the number of passes through the algorithm which adjusts the lengths of the segments of the tree so as to minimize the sum of squares, "delta", which controls the size of improvement in sum of squares that is used to control the number of iterations improving branch lengths, and "epsilonf", which defines a small quantity needed in some of the calculations. There is no feature saving multiple trees tied for best, partly because we do not expect exact ties except in cases where the branch lengths make the nature of the tie obvious, as when a branch is of zero length.

The algorithm can be slow. As the number of species rises, so does the number of distances from each species to the others. The run time of this algorithm will thus rise as the fourth power of the number of species, rather than as the third power as do most of the others. Hence it is expected to get very slow as the number of species is made larger.


TEST DATA SET

    7
Bovine      0.0000  1.6866  1.7198  1.6606  1.5243  1.6043  1.5905
Mouse       1.6866  0.0000  1.5232  1.4841  1.4465  1.4389  1.4629
Gibbon      1.7198  1.5232  0.0000  0.7115  0.5958  0.6179  0.5583
Orang       1.6606  1.4841  0.7115  0.0000  0.4631  0.5061  0.4710
Gorilla     1.5243  1.4465  0.5958  0.4631  0.0000  0.3484  0.3083
Chimp       1.6043  1.4389  0.6179  0.5061  0.3484  0.0000  0.2692
Human       1.5905  1.4629  0.5583  0.4710  0.3083  0.2692  0.0000


OUTPUT FROM TEST DATA SET (with all numerical options on)


   7 Populations

Fitch-Margoliash method version 3.69

                  __ __             2
                  \  \   (Obs - Exp)
Sum of squares =  /_ /_  ------------
                                2
                   i  j      Obs

Negative branch lengths not allowed


Name                       Distances
----                       ---------

Bovine        0.00000   1.68660   1.71980   1.66060   1.52430   1.60430
              1.59050
Mouse         1.68660   0.00000   1.52320   1.48410   1.44650   1.43890
              1.46290
Gibbon        1.71980   1.52320   0.00000   0.71150   0.59580   0.61790
              0.55830
Orang         1.66060   1.48410   0.71150   0.00000   0.46310   0.50610
              0.47100
Gorilla       1.52430   1.44650   0.59580   0.46310   0.00000   0.34840
              0.30830
Chimp         1.60430   1.43890   0.61790   0.50610   0.34840   0.00000
              0.26920
Human         1.59050   1.46290   0.55830   0.47100   0.30830   0.26920
              0.00000


  +---------------------------------------------Mouse     
  ! 
  !                                +------Human     
  !                             +--5 
  !                           +-4  +--------Chimp     
  !                           ! ! 
  !                        +--3 +---------Gorilla   
  !                        !  ! 
  1------------------------2  +-----------------Orang     
  !                        ! 
  !                        +---------------------Gibbon    
  ! 
  +------------------------------------------------------Bovine    


remember: this is an unrooted tree!

Sum of squares =     0.01375

Average percent standard deviation =     1.85418

Between        And            Length
-------        ---            ------
   1          Mouse             0.76985
   1             2              0.41983
   2             3              0.04986
   3             4              0.02121
   4             5              0.03695
   5          Human             0.11449
   5          Chimp             0.15471
   4          Gorilla           0.15680
   3          Orang             0.29209
   2          Gibbon            0.35537
   1          Bovine            0.91675



version 3.696

Gendist - Compute genetic distances from gene frequencies

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program computes any one of three measures of genetic distance from a set of gene frequencies in different populations (or species). The three are Nei's genetic distance (Nei, 1972), Cavalli-Sforza's chord measure (Cavalli-Sforza and Edwards, 1967) and Reynolds, Weir, and Cockerham's (1983) genetic distance. These are written to an output file in a format that can be read by the distance matrix phylogeny programs Fitch and Kitsch.

The three measures have somewhat different assumptions. All assume that all differences between populations arise from genetic drift. Nei's distance is formulated for an infinite isoalleles model of mutation, in which there is a rate of neutral mutation and each mutant is to a completely new allele. It is assumed that all loci have the same rate of neutral mutation, and that the genetic variability initially in the population is at equilibrium between mutation and genetic drift, with the effective population size of each population remaining constant.

Nei's distance is:

                                            
$$ D = -\ln\left( \frac{\sum_m \sum_i p_{1mi}\, p_{2mi}}{\left[ \sum_m \sum_i p_{1mi}^2 \right]^{1/2} \left[ \sum_m \sum_i p_{2mi}^2 \right]^{1/2}} \right) $$

where m is summed over loci, i over alleles at the m-th locus, and where

     $p_{1mi}$

is the frequency of the i-th allele at the m-th locus in population 1. Subject to the above assumptions, Nei's genetic distance is expected, for a sample of sufficiently many equivalent loci, to rise linearly with time.
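
A direct transcription of this formula can be tried on the first two populations of the test data set given on this page. The sketch is in Python and is not Gendist's own code; the helper names are hypothetical. It assumes the default input convention, in which one allele is omitted at each locus and its frequency is one minus the sum of the others.

```python
import math


def with_omitted(freqs):
    """Expand one-frequency-per-locus input into full two-allele loci."""
    return [[p, 1.0 - p] for p in freqs]


def nei_distance(pop1, pop2):
    """Nei's distance; each population is a list of loci, each a list of allele frequencies."""
    cross = sum(p * q for l1, l2 in zip(pop1, pop2) for p, q in zip(l1, l2))
    self1 = sum(p * p for locus in pop1 for p in locus)
    self2 = sum(q * q for locus in pop2 for q in locus)
    return -math.log(cross / math.sqrt(self1 * self2))


european = with_omitted([0.2868, 0.5684, 0.4422, 0.4286, 0.3828,
                         0.7285, 0.6386, 0.0205, 0.8055, 0.5043])
african = with_omitted([0.1356, 0.4840, 0.0602, 0.0397, 0.5977,
                        0.9675, 0.9511, 0.0600, 0.7582, 0.6207])

print(round(nei_distance(european, african), 6))
```

The result should agree closely with the European-African entry (0.078002) in the test set output.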

The other two genetic distances assume that there is no mutation, and that all gene frequency changes are by genetic drift alone. However they do not assume that population sizes have remained constant and equal in all populations. They cope with changing population size by having expectations that rise linearly not with time, but with the sum over time of 1/N, where N is the effective population size. Thus if population size doubles, genetic drift will be taking place more slowly, and the genetic distance will be expected to be rising only half as fast with respect to time. Both genetic distances are different estimators of the same quantity under the same model.

Cavalli-Sforza's chord distance is given by

                                                              
$$ D^2 = \frac{4 \sum_m \left[ 1 - \sum_i p_{1mi}^{1/2}\, p_{2mi}^{1/2} \right]}{\sum_m (a_m - 1)} $$

where m indexes the loci, where i is summed over the alleles at the m-th locus, and where a_m is the number of alleles at the m-th locus. It can be shown that this distance always satisfies the triangle inequality. Note that as given here it is divided by the number of degrees of freedom, the sum of the numbers of alleles minus one. The quantity which is expected to rise linearly with amount of genetic drift (sum of 1/N over time) is D squared, the quantity computed above, and that is what is written out into the distance matrix.
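
The chord measure can be sketched in the same way. This is an illustration in Python, not Gendist's own code; the function name is hypothetical, and each population is assumed to be given as a list of loci with all allele frequencies present.

```python
import math


def chord_distance_squared(pop1, pop2):
    """Squared chord distance, divided by the degrees of freedom."""
    total = 0.0
    degrees_of_freedom = 0
    for locus1, locus2 in zip(pop1, pop2):
        total += 1.0 - sum(math.sqrt(p * q) for p, q in zip(locus1, locus2))
        degrees_of_freedom += len(locus1) - 1   # alleles at this locus, minus one
    return 4.0 * total / degrees_of_freedom


# Two populations fixed for different alleles at a single two-allele locus.
print(chord_distance_squared([[1.0, 0.0]], [[0.0, 1.0]]))
```

Two populations with identical frequencies give zero, and two populations fixed for different alleles at one two-allele locus give the maximum value of 4.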

Reynolds, Weir, and Cockerham's (1983) genetic distance is


                              
$$ D^2 = \frac{\sum_m \sum_i \left[ p_{1mi} - p_{2mi} \right]^2}{2 \sum_m \left[ 1 - \sum_i p_{1mi}\, p_{2mi} \right]} $$

where the notation is as before and $D^2$ is the quantity that is expected to rise linearly with cumulated genetic drift.
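
A matching sketch for the Reynolds, Weir, and Cockerham distance follows. It is written in Python for illustration and is not Gendist's own code; the function name is hypothetical. Note that the denominator is zero when both populations are fixed for the same allele at every locus, a case this sketch does not guard against.

```python
def reynolds_distance_squared(pop1, pop2):
    """Reynolds, Weir, and Cockerham distance; populations are lists of loci."""
    numerator = 0.0
    denominator = 0.0
    for locus1, locus2 in zip(pop1, pop2):
        numerator += sum((p - q) ** 2 for p, q in zip(locus1, locus2))
        denominator += 1.0 - sum(p * q for p, q in zip(locus1, locus2))
    return numerator / (2.0 * denominator)


# Two populations fixed for different alleles at a single two-allele locus.
print(reynolds_distance_squared([[1.0, 0.0]], [[0.0, 1.0]]))
```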

Having computed one of these genetic distances, one which you feel is appropriate to the biology of the situation, you can use it as the input to the programs Fitch, Kitsch or Neighbor. Keep in mind that the statistical model in those programs implicitly assumes that the distances in the input table have independent errors. For any measure of genetic distance this will not be true, as bursts of random genetic drift, or sampling events in drawing the sample of individuals from each population, cause fluctuations of gene frequency that affect many distances simultaneously. While this is not expected to bias the estimate of the phylogeny, it does mean that the weighing of evidence from all the different distances in the table will not be done with maximal efficiency. One issue is which value of the P (Power) parameter should be used. This depends on how the variance of a distance rises with its expectation. For Cavalli-Sforza's chord distance, and for the Reynolds et al. distance it can be shown that the variance of the distance will be proportional to the square of its expectation; this suggests a value of 2 for P, which is the default value for Fitch and Kitsch (there is no P option in Neighbor).

If you think that the pure genetic drift model is appropriate, and are thus tempted to use the Cavalli-Sforza or Reynolds et al. distances, you might consider using the maximum likelihood program Contml instead. It will correctly weigh the evidence in that case. Like those genetic distances, it uses approximations that break down as loci start to drift all the way to fixation. Although Nei's distance will not break down in that case, it makes other assumptions about equality of substitution rates at all loci and constancy of population sizes.

The most important thing to remember is that genetic distance is not an abstract, idealized measure of "differentness". It is an estimate of a parameter (time or cumulated inverse effective population size) of the model which is thought to have generated the differences we see. As an estimate, it has statistical properties that can be assessed, and we should never have to choose between genetic distances based on their aesthetic properties, or on the personal prestige of their originators. Considering them as estimates focuses us on the questions which genetic distances are intended to answer, for if there are none there is no reason to compute them. For further perspective on genetic distances, I recommend my own paper evaluating different genetic distances (Felsenstein, 1985c), Reynolds, Weir, and Cockerham (1983), and the material in Nei's book (Nei, 1987).

INPUT FORMAT

The input to this program is standard and is as described in the Gene Frequencies and Continuous Characters Programs documentation file above. It consists of the number of populations (or species), the number of loci, and after that a line containing the numbers of alleles at each of the loci. Then the gene frequencies follow in standard format.

The options are selected using a menu:


Genetic Distance Matrix program, version 3.69

Settings for this run:
  A   Input file contains all alleles at each locus?  One omitted at each locus
  N                        Use Nei genetic distance?  Yes
  C                Use Cavalli-Sforza chord measure?  No
  R                   Use Reynolds genetic distance?  No
  L                         Form of distance matrix?  Square
  M                      Analyze multiple data sets?  No
  0              Terminal type (IBM PC, ANSI, none)?  ANSI
  1            Print indications of progress of run?  Yes

  Y to accept these or type the letter for one to change

The A (All alleles) option is described in the Gene Frequencies and Continuous Characters Programs documentation file. As with Contml, it is the signal that all alleles are represented in the gene frequency input, without one being left out per locus. C, N, and R are the signals to use the Cavalli-Sforza, Nei, or Reynolds et al. genetic distances respectively. The Nei distance is the default, and it will be computed if none of these options is explicitly invoked. The L option is the signal that the distance matrix is to be written out in Lower triangular form. The M option is the usual Multiple Data Sets option, useful for doing bootstrap analyses with the distance matrix programs. It allows multiple data sets, but does not allow multiple sets of weights (since there is no provision for weighting in this program).

OUTPUT FORMAT

The output file simply contains on its first line the number of species (or populations). Each species (or population) starts a new line, with its name printed out first, and then up to nine genetic distances are printed on each line, in the standard format used as input by the distance matrix programs. The output, in its default form, is ready to be used in the distance matrix programs.

CONSTANTS

A constant "epsilong", which can be changed by the user if the program is recompiled, defines a small quantity that is used when checking whether allele frequencies at a locus sum to more than one: if all alleles are input (option A) and the sum differs from 1 by more than epsilong, or if not all alleles are input and the sum is greater than 1 by more than epsilong, the program will see this as an error and stop. You may find this causes difficulties if your gene frequencies have been rounded. I have tried to keep epsilong from being too small to prevent such problems.
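
The two cases of this check can be written out as a short sketch. It is in Python and is not Gendist's own code; the function name is hypothetical, and the tolerance of 0.02 is a placeholder chosen for illustration, not the value of epsilong in the distributed source.

```python
EPSILON_G = 0.02   # hypothetical tolerance; the real value is set in the program source


def frequencies_ok(freqs, all_alleles, eps=EPSILON_G):
    """Apply the sum-to-one check to the frequencies given for one locus."""
    total = sum(freqs)
    if all_alleles:
        # Option A: every allele is listed, so the sum must be close to 1.
        return abs(total - 1.0) <= eps
    # One allele omitted: the listed frequencies must not exceed 1.
    return total <= 1.0 + eps


print(frequencies_ok([0.333, 0.333, 0.333], all_alleles=True))
print(frequencies_ok([0.7, 0.4], all_alleles=False))
```

Frequencies rounded to three decimals, as in the first call, sum to 0.999 and pass only because the tolerance allows for the rounding.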

RUN TIMES

The program is quite fast and the user should effectively never be limited by the amount of time it takes. All that the program has to do is read in the gene frequency data and then, for each pair of species, compute a genetic distance formula. This should require an amount of effort proportional to the total number of alleles over loci, and to the square of the number of populations.

FUTURE OF THIS PROGRAM

The main change that will be made to this program in the future is to add provisions for taking into account the sample size for each population. The genetic distance formulas have been modified by their inventors to correct for the inaccuracy of the estimate of the genetic distances, which on the whole should artificially increase the distance between populations by a small amount dependent on the sample sizes. The main difficulty with doing this is that I have not yet settled on a format for putting the sample size in the input data along with the gene frequency data for a species or population.

I may also include other distance measures, but only if I think their use is justified. There are many very arbitrary genetic distances, and I am reluctant to include most of them.


TEST DATA SET

    5    10
2 2 2 2 2 2 2 2 2 2
European   0.2868 0.5684 0.4422 0.4286 0.3828 0.7285 0.6386 0.0205
0.8055 0.5043
African    0.1356 0.4840 0.0602 0.0397 0.5977 0.9675 0.9511 0.0600
0.7582 0.6207
Chinese    0.1628 0.5958 0.7298 1.0000 0.3811 0.7986 0.7782 0.0726
0.7482 0.7334
American   0.0144 0.6990 0.3280 0.7421 0.6606 0.8603 0.7924 0.0000
0.8086 0.8636
Australian 0.1211 0.2274 0.5821 1.0000 0.2018 0.9000 0.9837 0.0396
0.9097 0.2976


TEST SET OUTPUT

    5
European    0.000000  0.078002  0.080749  0.066805  0.103014
African     0.078002  0.000000  0.234698  0.104975  0.227281
Chinese     0.080749  0.234698  0.000000  0.053879  0.063275
American    0.066805  0.104975  0.053879  0.000000  0.134756
Australian  0.103014  0.227281  0.063275  0.134756  0.000000
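
The first off-diagonal entry of this table can be checked independently. The Python sketch below assumes the output was produced with Nei's genetic distance, and that at each two-allele locus the omitted allele has frequency one minus the frequency given; the function name nei_distance is illustrative and is not taken from the program.

```python
import math


def nei_distance(p1, p2):
    """Nei's genetic distance for two-allele loci.

    p1, p2 -- first-allele frequencies at each locus for two populations;
              the second allele is taken to be 1 minus the first.
    """
    j12 = j11 = j22 = 0.0
    for a, b in zip(p1, p2):
        qa, qb = 1.0 - a, 1.0 - b
        j12 += a * b + qa * qb      # cross products between populations
        j11 += a * a + qa * qa      # homozygosity-type sum, population 1
        j22 += b * b + qb * qb      # homozygosity-type sum, population 2
    return -math.log(j12 / math.sqrt(j11 * j22))


european = [0.2868, 0.5684, 0.4422, 0.4286, 0.3828,
            0.7285, 0.6386, 0.0205, 0.8055, 0.5043]
african = [0.1356, 0.4840, 0.0602, 0.0397, 0.5977,
           0.9675, 0.9511, 0.0600, 0.7582, 0.6207]

print(round(nei_distance(european, african), 6))
```

Under these assumptions the value for the European and African rows should come out very close to the 0.078002 shown in the table.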

ñ\ã4©;®‘§Lðj9Tž3º9“Æ=a«ÿ¼UÿÌ"1#Ž1õÌaõoÏ¿=ÿy«ý9ç០=Ð4Î ²èFR‡Üˆ$ù÷Ya÷ê¯(Ñ C8¦‚¿û(‰:” ñ7ÄA72r»¨‰W½Ä£+ÙéÆ pM:amXÐO˜ƒ÷•ÉCI×8#®É_uNˆøC©6†È‡ƒkS§êu¸r.’ò±/Cjç ’… ¹wsŠ“êõ¹ŸIn”›Z~ å[ä(?Ru’8Ó S¤tºÚ®ø£›Ñõ§2*œ2¡: ¸ºÈJ)oJݪÝèâä±úOéH¦Ô[JGï1Îâhõÿû”9xqOûgÇž{þ­ý÷¾éXÚÿÞõNÙ­ýÉðO¼°:4uZ4ܯ éAÒÆ]Óƒ*˜†N®•Ыé꜔Gu„çÍ5|*OC©.m@‰ÓrVYøƒvÀOg5<ÚÈ «„¶âÕΪ]Ú¡iÊOúµSÁµÊ¤pàC¿âŵ–_å´,z –´ Ä+?g%µCó#ÞHÔO:×AyÍ›8Ò”ÈOeµ<Ê£:Ô&Í›køT^˯ºàÕ|ÔòTYø5xøIS;­þnÀ l'û âN׊{0MëVŵþÀÝêÿ÷YHŇ³=ÿGÞ{ú ŸÏ?­mò„Ô‡/xSXØ0 CÀ0rήÃzä0-çm° CÀ0 C sÎa€Ø¥!`†€!Û˜sÎí°ü CÀ0 0Ì9‡b—†€!`†@n#`Î9·kÀò7 CÀ0Â0çˆ]†€!`¹€9çÜ®Ëß0 CÀCÀœs vi†€!`ä6æœs»,CÀ0 C sÎa€Ø¥!`†€!Û˜sÎí°ü CÀ0 0Ì9‡b—†€!`†@n#òß™´bÓÚÕ²jÑbÙ´j•$îÞI-&–]”pÿå[¹V-©Õ°Tޝ™)µ<ÍŸÊ”B2 C #À¿avï½9ZŠL;gó¢I“¥Y|ué}î¹’P¥jŽn™ÀÊdÖÒ¥2×Õ‹tL9èâÅ‹K÷îÝúÊ£s³CÀ0ò?ümç7ß|“ãÍ´s^>w¾4®REzvè˜ãF[†‘ ƒÄqhÂYæê'3£gþ'µ|ùòR¬X±È™X¬!`'ûöí ý?yN;ÓÎyýŠÒç rÒVË+JZ7j(?ñe”ÜG³á˜K—.}t‚ņ€!`ä™vΉ;wdûTvbb¢ >\*W®,=zôÈÈäàÁƒR¨ÐÑPŒ=ZÖ®]ÑŽ„„éÚµkÄ´ÜŽdôLýd† ¸?ù62 CÀøÜhöH¿Û“nèðáÃé¦g&qݺuÒ¯_?iÞ¼yŽ9ç§Ÿ~Z>ýôSùùçŸ2yÈ!2y²[¿@ݺu‹Y猹™­¦µ³‹Þÿ}_—-Z´8f•Y‘=æÌLÀ0 tÈÎv1lŽHÊ´sF åìbË.B„NF³9Aƒ ’ÚµkGÌï…^mÛ¶y3zõê%… ö#{"*UªQ&'lÎ(ÜèåE²éÖ[o•/¿ÌÜôzVd#Ùbq†€!`ä%²4L‰2BËÎCÁCç† 䤓N’?þñÒ·o_?Ý#}ï½÷|ž¿ýö›OÿÓŸþ$^x¡T¬XQš5k&ãÇ÷é+W®ôé×\sMÈÆ .¸ÀÇíÝ»WÎ:ë,9pà€¬^½:, #¾ÓO?Ý8‹-òr·Ýv›4iÒDj¹×˜Ö¬Y#ãÆ“–-[J¹rå|ü'Ÿ|Ê?½´`ÞY g¥Ã+Ž]ï;†€!ÛäF»˜%çŒa´Ë(7;Žð‘óþýû½ã|ë­·üöÒK/•7Ê]wÝåóÃÁâX_ýu¿6Ü¥KYá6ª]vÙe²uëVQyÖÕ>ÂÈà”uºµhÑ¢Ò¶mÛЈ]yƒgnÑküž={¼®×^{MÐQ¿~}GŸ>}dýúõrõÕWûk¦êgÍšåyÓJSY=ƒ!õ’YŠ‹‹;JtÚ´iÒ¾}{ßÙèØ±£|ýõ×!ž>ø@5jä7U­ZU àÓx‹=ÔSÔ`wà 7øàìïêÖëéû¬ï(ýðÃòñÇK‰%ä7Þ:UÆ óï âè™q“ÁƒG4-\–NlyñÅ}§ê¼óΓÎ;Ë’%KäÎ;ï”ÊÔ©S#ê³HCÀ0òYrÎŒ†p$ê²ã €èU]\WqïSk\Ù²e‰òÎAPƒ BémÚ´ñé8lƒ‚622 Æù ÷£ù¥uŽÄ‡MSﲌܡÇ{LÊ”)#_|±¿^å¾¢–^ZZùk1Ÿ ŸæóÑxvšC·ß~»ŸÖfj—rU¯^=ô•™Hi*ŸNÖQ%–;* ˜fÏž-t~(££g¥/¾øB®¼òJÙ¾}»ŸÞ¦óC‡"œz÷î-wß}·w쌘›6m*ÿú׿äŒ3ÎgxÍ«nõêÕóiKÝWÐÀK;b*À †‘!`ù,œcë‘O<ñ„ß VÊ}cúì³Ïö#= ,ðë”8F°AÂñ³#û¹çžóN4˜m8Ø«bºG´:¯a=òÈ#þs˜L{§—m^Ç›/|ã£aFüA‡Î8HÖúYogêš)gÊ ‘œ3ËÔº&Mš$8R6êEKØ¥¶1‚ç#)¼zÇÚ>y³t`d†@v# mOvëMO_¾pΌ–/_îGb–)eœ Ä(•‘ñŸÿügaØ¡ƒûèt€ØD†ãxê©§|#HÊTõkÖ²wîÜ)wÜq‡LŸ>Ý m¦—–©ÌŽƒP°£zv·3òå•W|nlÃ9ƒ/"œ$ï|Ó!áÕ3â˜QøN7ðpì#FŒ6“±€ÍwÈ7Y|þùç2sæL/.ŠL °©ŽꌺeiíàFµp»6 C 
³„·‹™Õs,rYšÖ>–Œ¢áå5)FCJŒ‚×Ä5J“½#à§Ç†#xqÁFyóÍ7 ;«Y §7ß|ÓˑƎ봈‘b85nÜø(ûà¹öÚkýcb:;ØëJ/-\,\³úùçŸ÷;äx࿆ŒSÄ)ƒ+;§)#رûÎ.-têÔɯÏ›7O~øaÿÎsÆ ½#gtýöÛo‡ŠÈ|Þ%:t¨ Ê꺽23ÍNG€iòÇÝ¿hQ߬;³;ÞÈ0 ü€@–œ3N'­iÌœ'Ø›Á–´ÖùVtzä–\VËP£F4U¤—–¦P `ì D!rKSM`Göõ×_ï_ã]n%6n13À”6£dv͉O±2Âe×6Äf0FºtZ‚zH ïüeq¾:‡ÂFÍèÊJ™S4Ú¯!`‘ˆÔ.FæÌ¾Ø,;çÜlÙ¹}É%—„ÞWÎ>Xò®&ê#+u’ÖMˆÎp‡ª(1s‘…ÿŽ6-=á:ÂeÃÓ¹æçF†€!`OÒjgž™vÎÅÜ(ií–-RÑm¸ÂðH›€Ž§áèf*šéM£ô¢^¨#CÀ0 ¼‰@¦s•Z 2˭Þã6÷ðú Ž!¸£7o‘w­ÖÑ2³¨ê'3¤>3²&c†@~D 7ÚÅL;çZÊÂ)î‹Lî‹O'×=Iª—?z³U~¬¤X.Óºm[eβßd¡{µ¨î©í2eª:ùL ›!`ùÜh3휫»¯bv›{–ºO1þòÍ7²Ïí†6Ê]˜Ê®ì6¡ÕisŠP?F†€!`yL;çÂ… K ÷GU܆œCîÓŠÉÎQå.Üf«8÷Q•Âîk^ÔOfé»ï¾Ë¬¨É†€!`dœŽä ©ßˆÎ}¦Â0 CÀ0²€@g^S΂¼‰†€!`†Àq@ÀœóqÕT†€!`YAÀœsVÐ3YCÀ0 Cà8 `Îù8€j* CÀ0 ¬ `Î9+虬!`†€!p0ç|@5•†€!`†@VÈô{ÎdºiíjYµh±lZµJwïΊ&› ”pß9¯ìþ¡©VÃR9¾f¦4NtÁ˜_ÃÉ”±&d†@ À§©O»÷ÞÈé÷,2íœqÌ‹&M–fñÕ¥÷¹çJB•ª¿kµP® °rãÿ]í¹®^¤ƒdÊAó×Ý»w?â?±s¥0–©!`1€sûû fNS¦óò¹ó¥q•*Ò³CÇœ¶ÙòK:H‡&Le®~23zæïåË—O÷ÿ¯ÓÈÞ¢ CÀÈwìÛ·Ïÿ±SN,ÓÎyýŠÒç rÚ^Ë/ Z»?%ùù‹/£àŒÌR¬X1‰æ¿”#K[¬!`†@VÈ´sNܹ#G¦²¸ïvøá‡¡rvߎÆqtìØÑÿŸ3 IIIòÑGI||¼œuÖY!ÞŒüÕ%±NÄðÁáÑþºK—.>Ÿ`~™Í?bÙÉè™úÉ åäzó®]»„ƒz32 C VÈÉvQ18Ú3iJçÇgÀ‘=É{Ü¿]õë×ï(e•*U’·ÞzK.¸àÙ±c‡çiß¾}ÔÎùé§Ÿ–O?ýT~þùç£tïÝ»7bž0¾úê«rÑE‘_fò?*ÓlŽÈlýËÿ–N:U:wî,LûdDï¿ÿ¾4oÞ\Z´háYzè!yôÑGýÚ6zHÃæiî/H£Õ™Qž–n†@v p,íbv䇎L;g„Y(gÛñ$F±£«·ß~[å¿ÿý¯w’÷ÝwŸôèÑC”[4œ‘Mƒ ’ÚµkGä§\PµjÕä?ÿùϪš5k&¥Ü®hœ éä§yKþG(Íæ‹ÜèåeT„[o½U¾üò÷©vf&®»î:ù׿þåGÏ×^{­íÏDK7 ,9gœQfGhÑ"¬Ž²ˆûÄÓO?Ý‹áiÜ/^,Œrƒ6hxëÖ­2`À™2eŠ05Ψú¥—^’êÕ«Ë9çœããV¯^-'¹ÿ=ž7ož-Z4dR¤ûl?«S«8ä¶mÛúѸ퀴mÛ6¹ä’KBÇ<àËÊ3Ž}ݺuþ:<ÒpÄëׯ—«¯¾ÚcÄÔü¬Y³Ž+V”[´ƒ¬èhÃqî?¡ƒD¨eË–rÇwHåÊ•=–/¿ür%þé§Ÿäâ‹/ö»½ÁeˆW³¨“Ë.»ÌÏ8Щٲe‹Ü~ûí¾³dÉIpÿ Ný…ÓäÉ“¥U«V~ŸAݺuå½÷Þ gñ×éÙ‰Þn¸ÁÛÅNô®]»ÊÊ•+#ê±HCÀ0ÂoÃÓÇu–FÎ8=އqèÔ{Íš5~-RóÁ±>òÈ#G8#xqNãÆ“¹sç ›·FåEÎ;ï<Áy|öÙg~­ó•W^Ö­uÚZ,ÌêÜØ¬Dg@iÆ 2xðàPºæ§üzSB–i÷ûï¿_Îuïã´¹?û쳪5gg/oÿþý¾SQ¡Bùõ×_åÇ”[n¹EN>ùä#^·ÂÙõìÙS®¹æ=›øn»í6¿tðñÇK-÷q”7ÞxÃÏZœþùÒ¸qcyÜ}ðäÒK/•åË—Ë*÷!›pÚ¾}»PoW^y¥|ûí·2lØ08p 4jÔHÚµkw{zvrïðž"ëÙeÊ”‘Þ½{ûz|÷ÝwÐa†€!`Ä Y9«3R}<Îê4+V¬(÷Üs 2įSÒв1‹<•P¹& bG·ÚDZ°`Aȹr­éágÒp(›7oß}÷çÏ/üš©lè±ÇóÎÇ á€ÂóÉîkl¡^2KiM‰ÿýï÷x\uÕUÞ9†ïfçšÞåsÏ=ç÷Ü}÷ÝÂú<£l"Ä™å‰råÊù÷Ùu_¢D‰4Me”LGƒNebäݺuk¿ß -¡HvòjÓèldփΛ9æ´´xCÀG 
­v1œ/;¯³4r¦¡c§nVœAF…ÑÀl¢ÁëÍòàèˆkРg¤§<„!6–Ø82M÷‰©?ªr¥—®ù)¿^³óbÚ–im¦tÙ±öI_j¶Ùr™ew}à@u½#™–f*:HË–-óÓßÁQ;¢éÓ§ÙŽ)¼téRù)­W¯Ò²“÷΋/¾èGÌM›6õ¼3Î8#¨Ö†€!`Ä Y9ÇL) 9í´Óü”õ÷¥,Ö§9&Nœ(¬Q3¥ ñ¾4kÊŒôpœÙILÅòþôÈ‘#…5S¦ßYwņX§ sU[é\,\¸P/eæÌ™R¿~ýÐ5°eV"H3fÌ8Š/˜žQŒ®YÛgýžë¶Abí˜]쯽öš¹;VpÎ쌇øf÷ÚµkiTϦ:ÖîÉ›YÖµ™ÖfsôùçŸûŽ‚¿Hý‰dçˆ#üö±°[·nöíð h6 tH«]LW(‹‰YšÖÎbÞQ‰³NÉÈ)=b=:œ‡©mÖiÜq*lh Ò›o¾éeˆ¾F<\_P6<¿ðkxyo—‡Ätvnô¼‚6g5ÌnhF²L™ë&7>¢Äø¡C‡Ê_ÿúWŸÎ?3¬QC¼rÖ·o_ÿÚ¯•ECÔá /¼à§¤Ù<Æš6ëÎtt ô÷êÕËç«ú"ÙI;÷ٵό¶ñμ‘!`±Š@–œ3‡]¬’nFŠd_Zk—‘x3W£FÌŠfJŽúÈJG ­"ïm3­Ì¬€~ò”ÓºÞŽ±|h„ÝÔŒpy5*h3t”t%Ö€u<\çõ×_/èdö#¨“5épŠd'ZJËNFÞæ˜£EÑø C@H¯]Tžì>gz͹˜›ú]ë>$å†áÙ D~Чõ@½P?ÙALû_~ùåÙ¡ê¸êÈ+vWL¹!`ä2=r®R+Af¹iÅsܾL…cЩÉ|ƒN*ˆŽ–™r¦^¨ŸÌ:x•eêŸàŠuÊ+vÆ:ŽfŸ!`@x»x4GöÇdÚ9×rÿ¼pŠÛä>øqrÝ“¤zù#7\e¿©¦1#ÖmÛ*s–ý& ÝÚpÝSü‚VF²š®N^¯íl†À‰Ž@n´‹™vÎÕÝ·“» 6K-’_ܧ÷¹¿v4Ê]˜Ê®ì6 ÕisŠP?F†€!`yL;çÂ… K ÷!Š*n£Î!÷ªR²sÔF¹‹@·á)Î}\¥°ûD&õ“Yâ3¥F†€!`¹‡@—uò÷!`†€!`ä>yM9÷Í0 CÀ0 C ˆ€9ç 6 CÀ0bsÎ1P f‚!`†€!DÀœs †€!`1€€9ç¨3Á0 CÀ"`Î9ˆ†… CÀ0 @Àœs T‚™`†€!`0çDÆ€!`†@ `Î9*ÁL0 CÀ0‚˜s¢aaCÀ0 C ðßÖ.:Õý»”‘!`†€!`ÄÞ9ïo—¹¿Œ‰˜†€!`†@>CÀ¦µóY…Zq CÀ0ò>æœó~Z CÀ0òæœóY…Zq CÀ0ò>æœó~Z CÀ0òæœóY…Zq CÀ0ò>æœó~Z CÀ0òæœóY…Zq CÀ0ò>þ=ç¼_ +!ÿØ´vµ¬Z´X6­Z%‰»wç¿Z‰ \@ D©RR¹V-©Õ°Tޝ™ D—¥9çèp2.C GÀ1/š4YšÅW—Þçž+ Uªæhþ–™!_X¹qƒÌZºTæºçK:HÌ:hsÎùõ´råi–Ï/«T‘ž:æér˜ñ†@¬!@G—ãЄ ²Ì=g±:z¶5çX»sÌCÀ!°~Å iݨ¡aaÇ ž/ž³X%sαZ3f× @âÎNewéÒE (ñèÖ­›Ì›7ϧ%''ç–ÅŠ;¾Rn½¯^½zòþûïçšM–±!ŒžyÎb•lZ;Vk&í:xð€ŒýðÃÆ¸Â…¥x©ÒÒü´ŽRº|…P|V‡”¸¸ßo©åóçÉÌ”í7ùÍ]úô‘B… g*›’dÜGI…øxi{æYQë·I‰óÁzyĹ¹szÕk×9".³i埑¾Ã‡gÄ"ÿ÷ÿ'ëÖ­ó|?þ¸võz×]wùëx‡S… äÚk¯õÎ1CeÇ‘á‰'žÎ;û–,Y"/½ô’ôïß_N;í4©S§ÎqÌÙTŸ¨:tH,>ú¨k“âÒ„!šç,Máãœð{Kzœ32õ¹‡À¾={ä‘kûe@ùJ•äÞ·Þ”Ó.¸ð¨´c6ôiÿé§òÒÄŸ½èÛ=(ï<úO9xà@HU5·Cò…‰?IÕZ ¡¸h»wìðe8¹}û¨s¸MÁ¼öïÝxþòê«rá€AöL…ÓË?…40iz *ä›êyûí·¥hÑ¢rÉ%—h”?¿óÎ;G\çÆEãÆ¥½«7ˆóyç'5jÔ1cÆÈ7Þ˜&Yžy ƒ“ÅÜ÷ãÆ“·ÞzK®¿þúˆ²Ì:Å2Ù´v,×N6ÛVŦž3ZÿòKé}Ë-²mófy}ÐýÙ’Ë«÷ ’í›6{]»wl—÷{\ª¸˜üÞ7W:]ÐSÖ»W‚^¾ç/™Ê¯t¹ròÐGÊ€Çþ/jù Mi UªVÍÛˆztîÝ+-öcŠ&ÿôâ˜éÙG:Ò“Ó4F© ÞÁ/^¼XÚ´i#÷ß¿TqÍj¹ŽÒ°aÃäð×Íš5“#F¨¨Lž›’çú•+<2÷ºò`Ó5n„ö?7…Û¤eô‰ŸÂ©˜€‹åS_[Jm›7ùü‡Þz«›é Ði ó±aÃyöÙgåõ×_—~øÁë™={¶·ŸÑÔk¯½&¯¼òŠüòË/i™lñ1†@øýŸÞõ“O>) ô@Î=öØQÏÏÏW,“9çX®l¶ís  
¦O“)ß~#Ï»wÏ®]R±ZU·VYÄ;è;Nï*?Žü\âëÕ•çP'ŽúJþäâX7?uм6dˆT­ ýyDÐõº»^³t‰4p#,¨ˆstMN=Uª¹uÄSÏ9G¶nÜ(=¿§\X±² ½e TpN¿•[Ï…öïM”{Îí.KfÍ’nš³låÊ2üùçåãýË;tFÙ#]ZÔm(ªéw7OÜ–Ô5ÖƒÎñ(Ï–µkå”3ºÉšåËeˆíÙµó(›xH#Ñ®mÛœÌE¡ãÍ¿ ñl8óôð8ì&ùùƲË9³3ûö•­8…Ûo÷òá˜?kïÛ6m’d×0DCÚÛ§! ?¢‘ÄÃú/£iÖ|]çéÁô´p”sæÌñ"Œ’™ò»ï¾ûü¨û²Ë.“Ö®3öª›îgƒ×n÷A”OÝÎg÷î»ïFÊ*‡¾AƒÉ-n¶†5qœþR÷ž)ºí²&øœëè±N~÷Ýw £ø—]çQ)›;VúõKY¾av€5wÕÓ˜³ÖݤIéãö>0²ž?¾façG üþOïç[±\ìÈ-V,[l¶e «WËMmÛÉÝÝ{ÈÇndÁå­îF†f¸†m‰]´q£Ö9ZŸ~ºYw£ž[¶x¾• Éέ[äž×^•¯ÜH»F½úr›»ùÙŒTÎ5ˆ;^6…=ñõWr³i×;ùdIt€q.GÍ:,ôãg#d³s´§ž}–üÝmôzú»o¥ÿÃKãvm}:?Lÿ1s†ÊTnuzÊÈ– ¯p#Œ?8‡Þ¥×…²Ü…ßzèaxZ'¹¡EK·n½É˘¶]$Üë:oÏš)·»‘rc7ªZæÞ£ý,uÝÙµ„âZBÏÿ«…ßÔîTù¿Ôâå+U–Kþ|§sÀ¥Üºöv9,ÉR©f Ï»ÔMkC¼îõ¤ÛÀóÕ¿ßò×üqÓžѪ… ½>ø–¥NËVtÃSÀ¦”ˆè3ÂC5)VTƒR0ü•,äRš ¦}w¹Ùާžzʦ™fZû믿ö›ÆÙnݺUÚ¶më×õU•Ú®£ÃHšƒÑv$bDͺóW_}å× /½ôR¯ ÇÊH†©iœó9nYäXˆur:ºžMÄÅz|,e4Þè`ßûô`y&/’9ç¼XkÇÁæÖn:»¼[÷eZ÷q÷z ǯn=±²st§»Æzø¿ž‘«ËòŠó@IDAT7‘·ÜîÞ­[¹£µ·"¡QcfT´ÝXÞûç?¥ýy=¤ŽÑþêFOtÓ˜lšºÙM§óç ÍÝ»­Œ4Ù”ÆHsÚè1òŸÿ,^v¹|áF?ëû-T:môCL#ßÁòæskéßú@G·‘ Ú””´/‚tÚQá‘¶äï)áù?ìF‹W9\¢Ýö»¦œ 5p387Þ£fd‹³eÝùꫯ–ë®»N:¸M| ÝF=v>ãÀÙ\u,„óçu*6š±;›5Á¿þõ¯RÙݼ ƺôUW]u,*åæ›o–Jn_BõêÕýNôÿþ÷¿ÞN]G?&eÆœ§Ð{‰Ž ÷T^uÎÌ&Op=V£ü‹¯6õ(W^â]C8Ì­ñ¥E+Ì—¿_r©_gæý䓚6•¿½ÿžÛ\•âˆuÓ™SÜN]6zñ R;÷‡ ÷½ýoïlÙT5Þ­û²öüžÓ³Á¶žì“¬J¾.^¢„´púäc·K¼Œ7á§/>—§Ü,[ÜFªÒÎQ·ìÒYv ÷kÞB:ºüq7‚˜>¿°rá=ç—Ýt*;ºû6h(õ›7÷ïR/wS£¥ÜÌÀŸÝZè9Ή@á6Å×­çãùa]ø\gÓÚ»d‘(=<ØÅÞ»z¼´8­£¼øÓD/ÎÎó9nð—ÎÖr+•ÿÃ}¯”¹nÚ÷«-›¥L…Š‘² Å}øÔ“òÏ›È>·Û=ÒÚŽ1'ˆQ3kµáS¬٭uñ‚ë¸Y±‡2’›ÕÂó:½Li³ž¬kêÇ"k¼±‹K+Ñ÷³<¼!À+zŒ ÃŸ!xØxÿë¯IßL¾Þ­=™áëìì3çœäò¹ #;œ3&mvkÈ•œc §M«Wù]ÕE‹%±yl§›wkˆº6JL WÁx‚_ ç ¿VçÜî¬3eè÷£%-‘l וÑuFx¤'ŸÙüqÎÿç:.lj ®›j^Œh ^;b¿ÁóîíÛÝÛ‘žÒÙ;ÈmlUçywȱ `¼ùReR66¥U°HŽÞÊ5ß «²Œ3%F’SùhÏiéH+>Z½ðe„Gzº²’?½û¬Œ"Ó³ËÒ üŒ™å:·‘(/<[æœ#ÕœÅÅ<¥Ü{­ÝÝôu÷Q”üHÅÜ«NkÝkaÝ&ºð]Çù±¼V&C »HË1ó«¹ÔH¨àþ¥êHÞq"kVn• £gËÞƒ‡äÌëÛ8]öH¦Ô«C‡ʘ>eËJ§ {…x–Κ)KgΔ¦:HÍú BñÑH’q}$œÃi{æYÑŠe‰²ÄÅ¥?nWøu– ˆRÇ<ýûÑRµ`A9Û=ê”ÃÅqÒß;¾áðaisöYi:è1}èÿ­¬ëå—I‘"ÅBjV/Y,ó&M’x7?¹CÇPü±ÓìÂ)\Oøõ±Øv¢ðšs>Qj:‡Ë¹dæÿdÇ–”éº{{^ …Ýß³ýcÄgÞŠr•+K½æ-rØ¢¼Ý¨—&ÊmIrÖ%§I‰¢E$ÙyæääþgMÉ’ÝQ 
@AwÄIâþ$=|¢._DzÞzZÄ‚ïÞ¹Cz¸QY|:2ì·ßBÚÀ¶Í›äÂÊUädç|^vâxÓ°¡OËøO?•—&þœnVáv…_§+œM‰“¿ù¯”IÜ+Wwž”(ö»3¤>qß>yÿë¯eg‰âò‡î="±È@×šãœøßþóŽœs͵!ž¿õé#?|ö™üÉÕã%úS(>Ú@ÓìÂ)\Oøu´¶H|8gÛv"Õx•µ~ËVÒæŒ3ýQ›ÌôÇLïþR7½ýôÀreÆr‘imY¿Nfü0Nú5k&ç–*åã>ä·iõª4óÞµm« ¾¨·ô®VMzV¬(÷]ÐÓÛ„Ê;ºv•ë[üÞ¡¸±U+¹Ì9¶=»vúï>÷\ŸßA÷ÿËï>ú¨ô?åçüÊÊ­[˧Ï=—j•È‹/òåÆ9FK ¦¬5ó×HÓŽ-äprœìÞ—${ö’½IÉ’˜z&Ž4xàEÙœ¤ÒåÊÉCnD7à±ÿË‘l_½olß´9GòÊJ&¬1ïvëËŒ˜3rÌä¼È ‰ÎºêJýÝû„’¹ÿ~;VŠ—(!ݯ¿., ¦9]ŸÇbç‰ÀkÎùD¨å+c’¬_µJF¾òŠu QÍ)S¨÷÷ê-[6l×]'‡’‡¯¼R~›7×;É´Ò‹†“} Ï%²qõjéÜ»·ìqŽ|ÈïÀ)ï2ÎßáÖÁù¹Ä׫+ ÉÄQ_ÉŸ\S| ÎÑ/™=[L›*ËçÏ“EnêwÝŠ2í»ï…üfŒç¦ãeñŒòÚ!RÕ­'öä?íøº»^“ÚÀn߸I¶mÚä¦¤Ãæ¤Ã ¸^øË©R'^JV*'{’JâdÙëŽm;÷ÊgäÍÛwû8Òàd³BéuX"u°Øäô²qúlJ‡äµûùΣà±ß^oVzú]!3ôÖ[åÁË/—óË——>5kÊ÷ï¿ïeï<ã _Ÿ]½ÂÇTéWÿ~Kú·i#ç–.-—¸uÛ'nºI…ÏÿGäÞž=½Žµ¿-ó©ÔùծδkûØ£V-\(=µªVõ6sftÀ‹ ²‘¨{¿~RÂ9áÙ&Hâî]žåç/¿”Û·KóN¤T™”%´:µ‘êï6'‡]ŠéN7û¬O2Y:{–µÓñ½Ò=·ÏÞqGȼXÅ?d` ˜sÎc–ŸÌ­âÚ3gÈóãÇ˨×ßÝ;wJŸ?Þ&w¾ð‚ÜõÒ‹¾¡ ¡O/-GŒð Ô©çœ-ƒÿóyaâD)[¡B [2À)Ì{†mà|Ûtëæ§H™&m}úé²bÑ"ïFïg\q¹úiäH™4j”s ã´ï¿ò£AërÑEBc­\¸Hvº]¸÷¼öª|åœwÔÝ·/üô“ŒÞ»WJ—OµÇs§ÿ³fáJ)W#^÷ôxï‘­Û÷ÈÝ×_$o>÷¨?î¹¾# Ç /2ȦGÛ7n”›O=5tŒûøã;Î.½KÄNN¥J¾´eÝ:¯§²s¦ MšH7R¨HaŸ–´¿Ä*ä0K_ÿáƒ=ÿ—o¼!»œÃ9³o_Ùê:nÏÞ~»×ÝÀÍ^@EŠ•&® ]gé¹Ûï½»wË%wÜ.E‹—d§}ûçKï§yÇŽ>¯oßydzýì꘺¯ŸšGz²Ñ¤­w¶5L¨í:z‡ä +W4¼È ‰J–.#-ºt‘ÄÄDýÁ‡žetê(ºÇuýü5³Qiuj#Õ_S‡#¤˜ºµ“#êsÿÞD¹çÜî²dÖ,éà¦ç˺%ªáÏ?/ÿë_BÇ&Vñ÷…ʃ?æœó`¥å“k¸žw;™­_¾ÜŸßzèaa½åîÔµ¶ +W¦›æ…?›×¤ì×ÆÛ­ÄJ-7 §`Þó§LõÉ­NïbÓðŠùó¥•sÚåã™êFÊÓÝf¶ø:µ¥±›ºží?Κ5õ3®ìë×»ôºÐ®ç åxZ'¹¡EKa˜YÚ²vƒwྃnÄ|Ð9_wŒùz„lذ&¤rÓ¦µ>NÓáEÙôè€sŒ’ôHtŽM)£‹ò;9§ç‹n½Mžþö[¹îï8ǺQJ”,)ÿü|¤*TX¢Õ_¡JyòÛo\gí%‰¯[WvlÛæGŠ· êq/çÒv¨nۿݨî¹ñ?Hç‹/–:M›x3V.ˆ<òT9÷¼y€ŸÁùqÄH=öÃüù‚7ùsVv¸©w5Gã˜á¡Ã‡ ²i‘:á1ºÍan´?Ýu2˸†®—^êE¢éÔë/Ó®¤?!›]ÇëT·QíïnãßÓß}+ý~X·k+ñ'ÕYüƒeÈKá#ÑÏK–›­y"1<à£ÞzK®¾ï^¹`ÀÙ·g0ʪâF_Üè4­´pZtî䣿OæÏ4ZËçÍ g“`Þu[4÷é §Oñ-œþ«W;é$·1£€œâ¦Qúâ 7"+.íÝ:vÍõåÝÇ—Ínwmý–-ó®ì§P<þ¸Ü঴§ã§`8Ÿ¹uç~$¤ûXÉÉå·-ÉRl¿HÁlüI:œÒ¡ ê!néV·9ÌMvǾÝÉ.ì"ìgýÊ2Èí§.‡¸ÍKÍÚwðéoéf.  
nøC¥Ü:>„®¥Jû°þ0}ýáOú b‰n¦´sRÐ$\T®b%ié¦t§Œ-k—-•iî\ËívnÒ.e$™x†ÉlÞKJJòËŒˆ£¡¸¸8¡ó„lZÔÅmþÂÏuoF|ûÓu\ξâ ßùA&Øá¥³¨D‡W)£úS>Îk—.õ—õÜýQýþö7Žeü½yðÇFÎy°Òò‹É4@J]Üh§ë©u=rÖÑ^ùë½rSÛv~ƒKzi*¯çnô[ÆmLšêØÁ½{¹\-e÷Ž£7`ónÍÈØ4§¹Qñã7Þè_Ý:re7í~ºk¡Ó/é#ûÝZ9 `[7eÞÁ­SpNbÇÖ­Ò©W/Ï3ü_ÏÈÕ›È[< Z·rGkŸÐ¨±??ì¦fÙ¨v,ÂÊW--IÛÖI!U‘¤há‚Òþœ^R½Z ¯“ŸªUã}iðÀ‹ ²™¥Œ:,ª7ØÉÑ8=³aîž³Ïñëì×¹FüÌ+új’D¯¿hH¦`à~ñ‘n£¡¸WŽ ‘Ï¿è÷œzÎ92rÃz¹â/÷¤²D×ÄßÿFÏÿô-e»[žèvYÊèÓGfñ‡eŒUnJ*@ï*RdÒ[aâÝ»Ë~·TòJêîúóRËAtx!:¼Ã\§ã7³ðºÛ71èíûx~Žª¿¦!¦Ô@¥š)÷ÜR7­ ±íIבf­9–ñO5?Ï¢»só\±Ì༆@ù*UÝZóó²{ûùG¿ëd®{çB·¡ç¬¾WJziáådšü‘Ï>•jn“Яã~pÓ›M¥•AEܨ7±yæùÇK 7JþÆ@¥W¯SGÿjTh„vš›®.áv‘ÓÀ17v‡²n#tö5WûóåwÝ-ݯ½Væüôðí®”]±?Ê5ˆá48uSÑæuk}^ÁôWÒø0K'<\¿]}Yñîx9°£¹ßäÅà±€›ßÞQ ”œ~Ñ5ž}»{•ªT 7¥íæ³™5Ý´fƒìY½@Ú_“25®3škí°üý’K}‡LNr¿½ÿ^¨Ã’‘žµË~ó,ìPÿÞ­‹*uw–ê4iê;Di鯩¶=ëLïÖ‰ßzðAycú4ùá“á2Þ½ã;Ñí8Óíð^¾`,q;ë£!>d ³†nÆ£zí:шEÅSÓM‘Ïûa¼´t{,ªºièÃn´ŸáþG£fó:7zŸç:s »¦_‡¼¢X¥F Ù¸ftv›ƒ¤Ú×îä;¼Ì(i‡wÙœÙAÖP8ˆ)3DABß`·4ñÔ€›ýûð¥£îä^9¼ÖÍŠ0‹4Æ­ÕÇ"þÁ2ä¥0s,É‘ʼT³ÕP\³'}ëÖs»GóÃ`íšØí\C>Ô}¡)¯Ñ7"ùâ™ÜôÒÖ5¾¥œ.çF`®× Òd åûƒ‡Ëî݉2Í­Ñ—)—,Þy¥vSŸY%¦á#uX²ªWå³¢Ÿw×˸ÍzE‹¥ÌаùŽÑ]áÂET}Ôç‡ÜzíèaÃäÎgŸ•>W„¢V#u8ÑíW¨à:8½Oï*EÜÈ猓NçœäfVFºm[ÝFÃÓ.¼0[ê0R‡7˜w0Ži0MÃðÐÙ ÿ:[,â¯6祳}!,/Õ–Ù5Œ ØĆ^ iض\3xpÔ#¾¨3Ê!ƵËVÊ÷ÿþB Æ•’¶çvsSî5]¸ Û –âžÓÐ;Ǽnùj÷êÐ8Þ-g_¡ÛÝœCæílX3ýèñ'd¹{§˜)øOÜì‹:ûì*3)SÝ”y w?vi}ŠTwSèá£g3#ægü*kÜH´ÝÙg5û’]öÄ’žœÀ?–Ê-朣AÉx \F€°ë¯’I#G»÷y7¸éå&R¿E#IhXÇ[¶rÑr÷îéB÷Á–ùR¥vUéÐû,©Þ Vè5µ\6?泟ê^ zÛ½TÞ½–ÕÍ|­R6òe§áÔ!¯ùñÝë}ë7H#÷¡Þc®áò„Ö¸wιNÁB÷ÇŪUõß7¯äÞU×W ³Ó–XÓ•øÇZ™3²ÇœsFYº!#и'íÛ/ó&N—¥¿Î“µ‹WÊŽ )kÞe«V’ø RÒô´6nnÑ¢Q‘ª‰Ú ­ÃîÕ¾un=yóÚu²Û}¤*åvtWН.ÕݻܵÝÚ¾ÕaÔ°æKFsÎù²Z­PùÖ/Yÿ=ä>q ñŠQœ[Ÿ,äŽìXcÎÏøÅBÙ¬c¡bÛœ³íÖŽí:2ë #Àùzù­°#xí"6°:ŒÍz‰5«¢ñ.Ö,7{ CÀ0 |Š€9ç|Z±V,CÀ0 ¼‹€9ç¼[wf¹!`†@>EÀœs>­X+–!`†@ÞEÀœsÞ­;³Ü0 C Ÿ"`Î9ŸV¬Ë0 C ï"`Î9ïÖYn†€!O°÷œóiÅZ±ò>›Ö®–U‹˦U«$1Ê¿>Ìû¥¶äGø»Uþi®VÃîÒÓÿÌüXþ̔ɜsfP3Cà8#€c^4i²4sŸtì}î¹’àþ®ÏÈÈ«¬Ü¸Af-]êþ§}²H1EEšsŽ$c1råsçKc÷§=;tÌé¬-?C Û sÉqhÂYæîm=g ±­9gŒ‘q9ŽÀú+¤u£†9ž¯ehO¸§¹·2FÀœsƇ!ã$îÜÕT6ÿ<Ê–-+]ºt‘¥n š3gŽOÏñX††@=soeŒ€MkgŒ‘q¦-[×®¨¹rB‚´>½kÄ´œŠúH*ÄÇKÛ3ÏÊVÓ>µ¾ÁƒK«V­äû—ªÕ«WË£>*7Ýt“Œ;6jÆhdî?îEî¿8÷OiiѱÜÛié8âÓo}N¬Œ¹‚À›ƒ‡ÈœÉnsHjsF7i=¦k„”œ‰6ôiÿé§òÒÄŸÓÍp÷Žòȵýääöí³Ý9“1]rrrD 
úýÑíÔ©“tïÞ=ÄGÚý÷ßïåC‘0Žƒ“Ä;ï¼#ãÆ“·ÞzK®¿þúˆ²ÌòE‡€MkG‡“qe3w¿úŠ<ãFÏE‹—ReÊ„®oæ™lÎíØÔ½zß Ù¾ió± n3£ŒHGzÙÕr¯¬ »oß¾#š*nƒÇ_þò—Püd×AbÔ]ºti©[·®¼÷Þ{>mñâÅrÊ)§ÈC=$ÕªU“Š+ÊwÜ’ûé§Ÿäâ‹/–òåËËI'ädMüꫯ¼N¦ØÏ?ÿ|éÙ³§×;þ|iÛ¶­ôë×O*T¨ ¹Y‡>ø@5jäó¯Zµª 0 ”›6m|'›)Ó°aÃäðehÖ¬™Œ1B³´óq@ Ò}—VÜš5käý÷ß—’%Kʇ~(\GâM«³yÌÏó*Í9çù*Ì›¨ß²•´9ãLt½é‚ †®ë5o!«—,–KÝôöÓÊ• ÊEnêxËúu2ã‡qÒÏ5Ìçº÷&‰ÿaøð饅˜Rﺩ·þÎùôpäÆÖ­åÓçžó)wžq†8p@6ºw‹Éÿ¯ççÏk[æÓ™î¾Ú9“íÚ…«ô×éÙ0ä⋼®ÝQ®¹Ñ¸1zf~DÊœ†ï—_~‘'Ÿ|R:wîìJå›9s¦L›6Mžxâ yê©§ä×_•íÛ·Ëy®|ð.Y²Dî¼óNèðž:uªìß¿_fÏžíÃŒ†^{í5yå•W¼þ•+Wz‡[£F ™;w®üñ”Ûn»MÆŒ#Ë—/—«®ºJzõê%óæÍ.ÎzÛ¶m^ç¬Y³dóæÍòâ‹/úÎÀ 7Üàð† äÙgŸ•×_]~øáÏ‹èÀî=z¼ÄMš4ÉÏ0}jtü¿çÒ»æžã.î:Úœ{ì±£îYîeîi£è0çNÆ•Ã$¹Qßzç G:‡P´X1©Ù ·àþ^½e‹kÈ{\wvûÃW^)¿Í›ëwZiá¦ÏŸ:E^2DªÖNþ<"‡œ3~Ý]¯YºD¸Q$T¤hQirê©Ò¼cGoÇ·nÊúyÔ(Y±h‘ÔOåó‘©?tÒ³aûÆM²mÓ&IvvGC:ò Q ?‚ò}úôñŽ˜QK‡¼óûÇ?þd‘gÜlD‚ël\çp‹wœ$£d¦ï»ï>?Ò¾ì²Ë¤µë¨¼úê«^–Æø¥—^’&Mšyàhý2ÚeMñ9סA×Ýwß-Œd_~ùeùÔ-`#nœ÷ /¼ %J”ÙBÇ'Ü·o_¯—uqFÒ#dFÔ8%:ØÝ¿ILL”|PêÕ«'W\q…ßì¦|vÎ~Âï¹ô®qΣܳAýs:tèQ÷¬ÞÏÙoiþÔhÎ9Ök¾)U×ø¿1s†jåÂE²së¹çµWå«m[¥F½úr›kT .,圣xØÊ/x‹ïü8b¤—ûáGþ|Á€›ÂÕfhà n*xôÞ½Rº|…£d#E0¦AÔF-xò?þøã2Á½?úóÏ?û*SÒíÂFö84%¦°÷:;ØÑÍ™éc¦¶9¹ŽÇîÀ×ȘNVbT„Ã^¶l™´lÙòˆ]à]'fݺuBÞM›6UöVb=\mÁŽ/¾øBêÔ©ãü Aƒ¼~Ê©ÄT;„ǰ°1:~ï·ŒÂIII~¶ƒ®Ãe¨/›Ö޾¾~ßU½Œq9†@ 7bŽ+²ós½›2…Þzèaø ÷³Á´9§EJó Ÿ?tï!]z](?~þEˆ¿®s(ÏŒÿAÊWªà)W±’´t®¦Œ-k—-•iî\«~}iÒîTÙ¶yÓ¼éÙwc”4p¬GjÐXÏUªïìa}8=Š´‡‘0§ªé+Ü;¨Ä­MÝI¯ñAÝÈ1: ÒŒ3;ºuë&Ÿþy( ûûí·Ð5úT'SØL—üñÇÒµkWÁq³†tÎÁo!%.ä Æ[8û ã–D½Gº—³3ü¤ËFÎù©6óaYЏ)m¥ÆíÚúàÕ÷Ý+Ü£|gö,y}ÚTôö¿%½4•×ó¡äC2À6ßž5Snw#åÆn4¶Ì­m~–ºîìÍhý•]Îï£?}Ë@ÙîFÝÝ.»4” ‹ A¹Ü 3U½k×.¿ͨ†éd¦µ¿þúëtMºôÒKeëÖ­Þ±ÒØ25s>çœsüZ3#o¦Õׯ_ÿÿì]|TÅÖÿC ´ÐRÒ»T)R”^TP)Oä=Tx(  ÒôCŸH¤ˆ””^BB ” I R¿s&¹Ëfٖʆœóûíî½wfΜùßr改«\ÞìŽ6Fóæ‰+tVÂìçc<)Ê]ðêNüÓ>*Ê¢œ3‡Ÿ´Îfô×K¶£ì`~ˆï¥,ßóäÆ]òñD¼Û´N‘r0Wf(âúïæâ-ÚXA™¿55¤O#UŽ–‡úµ··G8%-ýüå—¤(¡ý€dA;áø®]ªÿ©ņ|-É0â¬oR2™µ a†ü³z¿y%X!~õÕW*žÌ®pŽ;¿õÖ[f»ª_¿¾Š)~üñÇpqqÁÂgܸq*¬0M¦xI /§á,lVø—æã†ôïÿÎÎÎ(W®œÊÆþûï¿UÌœ_œ"”»Ю…R¥J©kB”sæÏŸ¸µ3¡pÈ!JÓÛ…Æ.\€¥´ÔiæÐa(A‚^ô²Îƒ+ Ì•é‹øú¸ñð;{Ç·o‡—çf8ÓR¡î””ÔéÍ>M;wŠ1¯ ä£Žo¼·ªÕІ²·’©Iм\¥ÊúìtÛ–ä ¤XmÅt“R]𺆙ذä&¬[·îS®ÄË—/ëzäõ¨üa%ÊñeÍål¬Ç£5zï½÷Tf7·ã„-­gk³L—fâ0vƒsrÇ´õ­â´|ÎÛÛ[%°q 
Y‹+k}èÎô÷y‚À¼…lGðÄn=åihKìôÏ—mH™»¤àáÉéfrÁ·Ô_Ðå_ÂiÉ\YÚš@èÝ@8—s3<ŒÛ·P‚¬ºB…‹¨²i”¼›ÖÙŽ¥Lãþzë}Ÿj˜z =2ãñÛÿ¾ÁW#ÿ­’kŒÅWYáÙ"qœ’•<[ÅõêÕÚ5kÔ2-^ÊeÌz¶Å1ˆL)ðò¶ô/¡Z°`Fm4'€Ë Ñ*ˆÉ”€9h“µöéé#¯ÔmKñyqk畳ýœÓµ¢;]¼O+f¦¹2CŒ)f®ãR¡¢RÌÛV®ÀÛJ1—&n#YÚ†SŽy^€=Í£;ôy%œˆÈ˜xD¥~x›q×áºÜÆó:ˆC†bæ[oéªD=¸º¿‚ÿ{g„úi]Am\¢‰+æ7iÂhJ1³(\Æu¸.·1E «1N0{Öþ¦«vhÓ&uüÄŽºc9¹ñÇwß!ðÆ ô}ÿ=/U ÓH¶‘ÿ÷•NŸ'P¨Hl ½—k3Fÿ¾iÜ¡#\è¼}?~‚nœ²aQΖ1²©kf͈ÆÑ½dI¼Ó¨6ÌŸ¯“ïôþ}ú èêà€Á5kbÿúõº2cQáðaçNèJ³ô·=<Àí™nû]Å@ww|;j”âÓ—n¬° »Ø¶rF4i¢ê ×â×ï¾ ¶PØšåúsÞ{_¼þ:^-]ý+TÀ®_~Ñu{íü9ŒzñEt+Q‚,‡˜7fŒ®Œ7è‹§„îôÀb^–d×oljÜÖÈå½wÞ$K‘åú¨{w5fî?äö-ŒíØ‘\Ùñäb¼¥db-ÉjîüLé×Wñ‰ŽL‰ëÁÔöåãþ¸ãsuZÕGR²¢ÅááãDÄÆ%#&õÃÛ|ŒË¸×å6ÜÖ½7ç(W©¼÷î[ÑLóÇ~ˆˆ°0t6 u_le¬Y¶ãs4Å—Ùb.Z¸°Å~¸×å6ÜÖ-7±1-UË‘òí«BEºþ=š6ClTô16ÌK¹‡ùþ¸wçŽúÿî7«×ÄÉ=»ÓÈ௮¾w&R苟|ß²›œÉÔ}kêþà6æ®WSí,ÝW†÷ »ê; ˆÓàÆ¥‹Ü­ˆr¶$[©ÂnÇ¥S¦À•âq#fÌP7ñ2Ú¿C(VžŸôà`õ€MJLÄôÁƒÍÞ çE"¹@[Ó~×ßSû+ޏGD ÉsÉ¢az˜<ŽÅüÑccFÓì¾(þ$—ÛÉ;‘” êó~Tx8:‘’½OrÌ£˜ÓãØL ·–ß¹sx‘¬ž’..X¿`~'+B#Ÿ“Þªm·!Ch,AJÑkeæ~ÍÛ’\l-~Fc¾wû6Úö郇4v3•r† U×ö… ¡vóæÈOÏdJVsç‡Û…ß ÁƒrIø¤¹ÐùóC™Ên(æ\ 㟌Xú<ˆŒÅÆßתOhx´:Æe\‡ërnkŒìí cÂKTÑ¢ÇÑCs?öPœÓ•&TcæÍÕ51õpæ æê:VnÜòõ…Mö*ºº*Ü{K®Ëm¸­9*P BïÞÓñ5ðiß>èS¶,z89aRÏê^2V7³Ç’Œ»7o¢’G-Å*®¾ÖÂH>&¾ÆìííaG2×mÕ ¥é>Ñ'žj÷e¹÷wì€;ÄoJ¿~à<c÷-·7õ\0w½fæ¾zê¾!/GÆÔP.ýóþdÛ ¢œÍ€““Evt+JN6Úm>ŠAD’eÃà{‘”Å:aéØF—ò”½ºuÙrR¬‘èÿÁû»p!Æ}¿H=à´Y¹1¦ŽeÊ`þþýøŒÌ (á,âþ}Ùò§®j²˜—Ÿ=4ãåd´•dýÎ?°méaP¹NmU/àò“‡#óûfÇvêû{¸U­Šˆ¯›Ô²9Å?§¤·owîÀˆéÓáÑìIü‰ûšý÷_ø/yx2ÀJŒ•º%²fÜ&å"g$M&šwyŸþô>Œ’ŽŽ)]ÒyxÎ,X¥h\ÓÉ ¡)gS²š;?Ìtá¡CØM“œâ”ôg-Ýñ @©ònˆy” pl±E7¦ýKðòÜ ·jUáNJóðÖmø/ãØpVO$bhrëR¡¢QÖcæÍC1òà¤É ß'Õê70Z¯(yÆ–ÒXfmòD³N”ÇÃkãF]]ýûÖÜýaîz5×NëÈÔ}exßp2e¹*UT³ëgÏiÍå×¢œ-”SÅÅJ—Ïòù¡£Ol±2ñ Û¢[w´ëÝ 7}|°bÚtŒjÝÿ¢˜ÝLA4ƒfâãœL0žê2˜~@Wª¢`¹^­TEÉî\Ê“’´ãL#"v_ÿöõ7Ný}@ý^:–2Ž{¬U‡#Yù)ÞÉä@.7¦8J< ¼vMmW£%SQ‡â:ujšLq'RÎZ[~ø0ÅÆXVÎ֌۔\¡wU?ÚLŸû¯H.nKdJVsçÇOSåaÁ(BÔ£²˜I_ðgÏ_›|G×$$$PÓʹ.·á¶æhü’ÅÊ:c…Ѿo_´îÙKWÝÜÃÙÜC]Ç Ó׬fk3×aËšÛp[sT°=>ønâé:üšâéÉz“ßÓ{÷Âïüy4éÐA%üqBb#JÔò¿r,„„ÌõiªìÆù”<gºÖ3Cîtj÷J)‰{÷(‹]#ýûÖÜýaîz5×NëÇÔ}¥•ëÿºR…)Èÿ¦ú•/ˈr¶ŒQŽÔ`%ÈJ)üÞ=ßzr£ÝöKqM–¢‡-+È‘³gcÕ¹³MVů_º„dmjVè[“&bÝõkXMVî2Z®1yÕJ“òR=ü‰“þƒÃ^/þç¹`‘rù6ïÒžÁAx㣔äŽüšÅOmí R<øK³2yÛ¹ByþÁ5rk3%$Ä㛑#U [ 
¯öµMäÓã©;hbÚq›’«~Û6Š«Ï‰“ê—­¥›©8èºcY(3XŸLÉjîüè·OÏvrrn„%ãÚÂï~Ê'.)e¤χiå\—Ûp[sÄ|ý6­U•.CÞNSÕÜÃÙÜC= +w’““G.ÛšŒ=|øÐª×å6ÜÖuø-»ëˆ+gÎ`÷/¿êªû?¡¶¶²\IÛö§ pVSÙÊ) *†\Й!¶ÀÙEÎtýBŠÂç £Fú÷­¹ûÃÜõj®Ý“~ŒßïªÜྉK52œÝRžù5€(gÓØäxI=zPòC‡-ÓÙ#1/½„ ”‘ZŠbaM_~ë¿›‹·íÓ[y¢#Ò&kq 0<4?ù¥òhmýš;?\ŸÞ8ù,= a¥]‹#îÁ] }l_0 Ì–]z£\Ù'9WW7uŒË¸×å6Ü6£dîál‘þx’p‹r˜8|c‰´:ÜÆÚÁG´±h±bÊRÖøW­_Omúz?qwûzŸRÇøYM܇I‚Ì{4,õËá¢OzöÄŸMÅqÊ:çI}+Š•k¤ßš»?Ì]¯æÚiý˜ûMsßÄ=Rê\Ÿ­~!ëålN9RkÒªUhNJ8œbË[W¬PÙnôø|ÝZJÌ*‚×)딦.P"×ZžptëVt:ÞŒÒe\)Ö¼Ñá˜9t.’RïEÙÔ 6);»ÈCoßÁª™³”µªÜœÎi“P´Æ=ÉÒ­Ý´)PlëJžz2K™üΞժ˜üeÙ>ýi5ŠÐÃqÝܹjÂІ&CȵYÊȸµ>Ù[1c㔥ììSûöS½RìÉž–³05¥lö(RØ+¾ø÷)QÍ™;?Ü.ðúu˔լ¥ò5]ñ8ÈŠ磊ͻb%0÷O¼7áSõYðëfuŒËTªËm¸mFÉÜÃÙÜC=#ý¹T¬?JÊÓ÷ÂXâÃu¹ ·µ†ÜªVÀÿŽISµ¹³9éêäž½j2ÌâSûö©e?úk£Ó4ÊÄ'â¹PüûÚÙ3™àT¯WÞ v «fÌT|Æ/^ GzW€12w˜»^͵3Öá1ýû†Ÿ1—ÿ9®ªÔMõÔÖ—ý§àijòA½8ÌÓUäHN#À.« zsVs±â%Œvz70Í $ô+±[Ü¥bE]\J¿ÌØvHàmussâ†%âø6[Á Ú[ªj´œcÚŽåÊ©7n­‰ƒé7/Mñ\´UxùÙ¡ªçág·áÎG±:üXæÎÎj‚d­xæÎµ<¸Þ9¯“سæêöHv7ÂÞ”?å%íS¬ÌZJ•H^Îä$ú¥Š¥_øótz»=ê·kj¶;^Þu€‹fmÚˆv}ú¦©»™2º—Ò:oNšcÃK´†-P¦Y4)<¾};îSÆ™r šuíŠIBÑb¡iYØáevWöÀ âáJËð’(Œ ÖoÎV3+æ`²#«±æKíQ^¨cHœLØ™^@¢ÿröþ¼Y³–²â&,þ½ÿ3 þ—}ðù€*Μ@^¡*4A›úËÏ”©Ÿâ•2ä›ÙýÕ´Ê‚3Ý·Ð}Ĺé!^*5ˆ^`ÒŒ&Œsè…-é½—ÌÝæ®WsíÌɯߌ¥°ÂãØGX|䈹&R–Š€¼!L.…<O‚‘EÅ/…à·4=¢x''i?['žÊ–¹¿’‚̇¦”´åàP”ÜÖäð"%•¢šI)³ 4ÁNHLBtt NRz‰RÉè5v0 xËÏèxÌ=œÍ=Ô­íÇxxË8’rìÓþ%ØSx†•3+i}b¥ÌÊ9Ž<žöã>¹ˆ[÷ê•%cäP+çRNÎú]fù6OTtÇ)S0pìØtñ7TÎéjü +ó›õ^«RÓèÍgú¯&~†"Ù|׬œ9³ä‹‘ËNHÈk°zkB3úG7¡B°G Ãkñ!YÓíŸ$»— Ê9—œ(SÐàø,ÇFéïheâ5åv©îÒ¬ˆ1k}=«ß¼0Æg…­ô›;`ål9=7wŒE¤ò¬|•NYéõ\Ž9/Œñ¹}úè7O³žºiêíʧW$›‚€ E¤} dSa“;¾€‹G¢ ýpăؾjuô9q…H!m ½‡‰Ë—ç¹3"è¦M›‚ï¿ÿ^YÏ?ÿü3îܹƒ?þøضm,X€ªU«b9aA#¶´ )=u Ûʾ ä,¢œso›êmÓÂ…d-ÇcÀ˜1J®?—þ “ïAhº»cÎ{ïá‹×_Ç«”=Ü¿Bìúå]ï½{ð&YsÝJ”ÀGÝ»ãÛQ£T›Û·tuô7Nï߇¡/¼€®L.Úýë×ë§ÙŽzpŸöíƒ>eË¢% MêÙaAwUQ/¾ˆ{¤œIö7«×„5®óm+W`D“&èJIVÜ+âëwßEbrŠåÉL£ÂàÃÎTùÛ`Y5JÜìfѸ1º—,‰w5†ùó56˜Ò¯¯Â':2BwÌØ†¡Ûøúõë¨Q£*V¬¨ªó« àâÅ‹8{ö,8A‹Ë™x¢U·n]ìÞýt8!=u3_†ò™¨&‡A ˆrÎx¹½éî_C1RVï̘Žjô@÷9éëΫa%%$ èÖ-üI–XTx8: „ûÁÁ˜7z´*gåùYÿ¸wû6Ú’õ!)Ï%KTVø†ÄŠõ“Þ}F<º†$rÉN<7.]4¬ªÜëcÚ¿/ÏÍp«Vîµpxë6ü—Ž%&& vóæ°··‡]¨ÛªJ»¸<ÅCÿ@àë˜?z b££i"2š,î¢j\'wìÔU;ä(ѺGÜ%W÷T+ÑôÈísâ8–N™×Jî1c†š<,£ý;×üT?á÷Bð€,àäTw´®sƒ ÃõÍ;wVrTT”ªÉ/¹|ù2®]»¦>¼ìJŸXYéRÛ\ßÚºO5Ö;`(Ÿ^‘l ‚@! 
Ê9‹€ÌmlŽlÝŠ`R¬M:uTÊêå7«!ü1w^š¡8–)ƒovlÇ8r©º‘Û”Ýß1ÑQð"Wk$)íæ]^Ƨó]xø0J::¦´%åaH[—-'e‰þ¼±d±û~‘²Ú7Ì{bYjmNïÝ ?ŠŸ6éÐß>¢>(áÉÿÊ k{̼y(FÖzÁB…ðùÚµ¨V¿ÖÔè¯[•ªXyþæSÒ[Û~ýP¹NmU/ಯ®>sþþýøì·ßЀâ¶÷ïãÈ–?‘¹#S³¬|¯ ò~&'bMbÊW«®úYxèvÇÆ¢xéTœt½›ßhÑ¢\]]ѵkW,[¶ ¯¾ú*’’’ðøñcåîæx³>-ZÑ41$v[[×°­ì ‚@Î" Ê9gñ¶™Þ6/^¬dñ;s#)ûw皟ÕþÁÓ$†9’[9?R²‡ÈUËGJ!ôN Ú®Ñ°¡úå:ÉÅmŠ‚nÞTE+¦MG[ÊFß­»Úxª‰ÏñêXÃöíteÚ¶¿î˜µì¾þíëo0œ”ø­ÛàÒ±TÓø¸Ç:•j§(l>P«YSuœÝóé‘»©]ï^¸I2ò8GQ_ÿ¢>9D2–pÅqeGšüpL¹y ø bc.KçG³¨µ>8欹Àµcü›žºúí ·ÉgXGöA sˆrÎ~¹²uxX(N‘•X˜,¬„¸x„QRQ4YÁd²eü÷ÊUºqÙyMàö@IDAT.¤Ûί÷:ÉúmÛ¨ã>'R–ô°»ùæ¥Kºº†© ï­I±îú5¬&KvÙɘ¼j¥aUT­_OóõöÖ•ùz§,*[¥Šî˜µž )—{ó.]à„7>š šê+™@’I#ÿÔq8»¹!=ró$`$e¯:w£çÌŸ¯¯zqg­ôü²²å°­äíøçŸ0…\å?æ¥Uk¦PA…!4ò'·n<º ‚ ”\6¦CG%K·;­Gî”Ϥ¬ëŠ›s²9ҷ蹿ÆsÚ´iDʽeÌóöÂ…•2+Jvu¯Zµ .4¡™>}ºZVåì쬺7nV¯NY"g©®9™ôË åÓ/“mA@È8˜˜|ÐHOÖ°.Ï+Aþð\´Uxi½Ä„i8-ï¹åë‹buqjcãçõÕ.´,H‹e«£c+“-ÞRN)ÊF;ž‘_Žýrܼ`A{“ÍCoÃѵ,ý#ÔÓ¯OÜ¡wá\ÎÍd?æ üHÑö'±wks6ÇŽ ‰3¸oQv½;M–,Qzêòâøö† P2î…A {༜§Ÿ@ÙÓ—p}Î(CJ`ßëñ %Zy²›üáC•aÝŒÖ [Rº®-+ .¶¢³ŠØ¶D.nLVIÜUÌܹ9ËÔ˜bæ6l[£˜Ó[—ë’9ù ëʾ d qkg ·<ߊðW[6£ÛÛo£@Á‚dmº‚“½fÒ+¡Ì!`ëÊÏÖåËúÒZ° Är¶ó+¥¨Z·žZãœ+…¡A@°aÄr¶á“#¢åMlÝ2µuùòæU#£~Þåü¼QO®GÀÖ—*Ùº|¹þ„€(g¹ A@C@bÎ6vBDA€عs§!yY眇O¾ ]AÀöàuÎâÖ¶½ó" ‚€ äqD9çñ @†/‚€ `{ˆr¶½s" ‚€ äqD9çñ @†/‚€ `{ˆr¶½s" ‚€ äqd)U¿dø‚€ hä?rDÛÌ•¿I­ZåJ¹ -ÊÙ*rL<ˆ@Ô†ç¶W´&%%áöíÛˆ{ŽÎ™(ççèdÊPA@È üjVúÏó‚™a“ãmãé?ߟ·×ÊŠrÎñËH:AÀv`Å\¤HÛ0H&Ê9œhæóÀåã¸îˆßPÜŒPƒrt+ ÷ZΨÚÄ ÍÝžÊ(ž –¬ÏÄÄDØÙÙ=Ù,ujIvKím­\”s‘ÄÄìùõ×'ÜòåG‘âôЬ…JµŸÏ¡­øø8ì[»NåË£IÇN{Õê;º¹¡i§ÎëgEÆÌή´¾­•5+ú6ä¡É ßpß°~Nï‡FáØæËñ½‡šuÝðÚëõQ¹†³ãæÕPø\ ÆÑ_OÃÏûZöö€³[ñœQúKEàAh‚oÞDlt4J:;£Ò u`—/s -197ÎGÑâÅáVµZ¶am.Ö|éÒ%lذS§N5ÙXX~ÿýwøùùÁÉÉ Ã‡G¹råLÖÏÊs²ge?9ÅKÞ­EHGGF {ÉRF¹u~ýu|NŠ2'éÁ½`ôr-‹z­^Ä÷‡-g`ò¥—KÔmÙ‹ÍvQ×ÍùèFgÙÒ+kvg8~ÃýìèÓZž¡d!ïYé"ìжs=”ww=«Ó?ûïÜÇÁÝ盈NÛ‚.™¦Ž±›>—påäI¸¸»£Qû—ŒU1{L›`™­”M…Zß¶4‘:µo/ynFrr²nÔ¥HA÷û_8”°|>t 6ÇÆ`ÉÄI¨X«&ú½ÿAiÖí=u µk×~Ê­„  páÂf•óÏ?ÿŒÇ£ÿþêÏ[._¾ŒÏ>û,ë4Á)66>>>ˆiÜØDÜuXÞ­ çË…,ÏïÄ¢ƒ^˜µqÊV¬ˆÝëÖá²÷Élè-÷²üaÒd„‡„Ú슗*…ikÃÈÿûê™ËxÌóì‘ú¼ˆNŽˆŒNDdL<¢R?¼ÍǸŒëp]nc }ûŸQ˜1d(¦öí§<Ö´Ñêðkt»vÚnŽþê÷¡ÆðãÓ]N÷G¶ü‰R..øáX ùl*j6iŒðÐPÛ¶-'DÈ–>Ø þî»ïP¨P!³ü£¢¢àíí>}ú Ý?¬ £É{àïïo¶G@ÜÚÆqÉðÑ‚öö¨×º®½ç÷ß#èÖ-¡ 
ÈÛ~Wñ!¹˜[¾ú*¼÷ìQn¯å§¼qì￱iá"ܺrÅK—Bó®Ý0~éD†ÝÇÈÆMðb´†ô7‚…‹ÃfÏÆËo¾©ú¸vþþ7ò߸qñ"]]Ñ¢{wüwþ|]ÿ qq˜>hŽR%Jàý9sðÒ€ºrs§÷ïÃ\š¥ÑÍåD“Ž‘_~©Ú²UiI.ï½{0gÔ{»{—ðh²•+«Ô÷Gc)ή¼G¸ $‹1¥GÖ5³f)Ëûεkäæ«ŠWÈ}ÖÌÅÇ&ÛV®0еj¨÷KšÅ}ŒZM›*‹2³cv©PSúõ…ïIo¬¼pÞj+êòqÜñ¹ƒ—úuBR²¢¥,a——>%ƒMiþØ¡N«úØ¿q¸­GóJúÕÒlß ÀEò’pPăؾj5z¾ûnš:ævx‚ÅçõY~ßÚDªtÙ²ÏB]Ÿqƒc²ñ!!!A¹´;¾öj4l‡Ò¥uõØS´ï?Èõí¯&TUêÖC+ºÇ™|éypÎë BhYP1º_«Ô«‹6}ûêÚêoÜõ¿‰Äç~P0õå„V½z¡Jô«¤{ÛXÜ–-Ò—_~.4騲e‹Ižl]— ™ÙÍıéJ•*)7ÿf7“ÝÖ¡w±mÙrÔiÙü ¹{ýÊTrG‡áµqcÊ>V¯ŒxEŠËnÑò—7„…%ãy–<º}{eQ¼Mî¡»÷ ÷È‘¨TËq)Eí¹d ‘{¨BxLî˜ù£Ç(E=`Ìh*R._Ž“;v"‰npV켎N¤dïcÞèÑJ@vuM Eîwî^|唤›g=¹ž~§Y®F>¤ ¸m·!CF7Ïœ÷ÞÓŠÌþ†ÝÅ'½û Œúë>l’è¡3}ð`ܸtÑ¢\Qîã³þp0miý\þ'Žcé”)p¥›iÄŒH$žËhÿÎ5?ÂÓ4&7®›ÄÚŒ/ËË“ &KçÂÒ˜™Gø½< !—4+QëÈ÷˜ÊTvC1çRx—€˜ødÄÒçAd,6þ¾V}BãÕ1.ã:\—Ûp[s´iáBu>¤Njþ\úƒ®zHàm5qš‘: 䂉4©äÉ»‘Çvì˜f‚ÅǃOûöAR’=è=©gºæRð»|ò„jûÃäIݶ-ºÑCü?-ZÀïì|ЦÚÙ¬™ qhBðDjD“&èJqÖîñ5M8öjØ7O\y"µaÞ“I)OÐF½ø¢â;˜î³y©cÔxgÇ/»­«7l€èˆHlZ°Ë&O¦ý&51/랢œâ±iÑ"ܽv]YÕEhl<é¾@ÔèˆpÊ[ù q¤Ü›u튅ìqzß~’åjH1ÑQØLÿ(šT5hߎîÏ$l]º <‰ÌjêÙ³':w¶œ‡rÿþ}µ K¿^–©(ǶÍa@Ï ~VÝöJ—qE…š5qûÊUüüåWp «ß­z52¦üðÏ_瘼†‰ålˆH&÷ã)Þ@qŽ9±âeº}õ*)Õ ç2d….?{Z—$²’$…héBÈ;¸yéÈ‚¸ìK7zŠs,SßìØNÿï™OYÜ\Î7çáÍ[JÊ£]ï^*¦ÍÇþøn.<š5MÓ×ì¿ÿÒµ½AüYñ$Àm¥Ye4ÝTÿúü3 ÿbŽSÿã»uWÀáÓ¾PMMÉåµi"iBЮOo|úÓOHB2z99#‚n^FYïH1”¢qM_¿^÷@f\¬‘•ÆL¾WP½AL ¥R›ôœ\¶ë—_LbâV¥*La]“,dk(£cfÞ ²¦‹4uîø vÛöˆyDÉsù“ÕZÎGâ³ÿ @pðUwËo+1séåUáë.1)¥Ê»Áçà4¼ wv“"(FÊáÓéüîOŽ®“U_µn=$ħL ˧NN¸m(]Ÿ²{¯;‹a ânê½cn"eØ·¡µÈÖª¥I«¥ñe¤œcΊªEÖ>'×ñ$áâ±c8¸ÉÞäAcå‘:±deÌT° =½ÔAmó$öôþýJ±ÇF?DÑ)÷ìµ2$“Ïø£[ÒÏŠŠÓD³ô‰÷Ýèô,ÈÖAþJ¤â¥J«ß‚©ëºµð‡˜8Lñ¬H”s6#Ï3d&vkdO.m<,R.ß®ä>»h!¶ýø#ŽŸ@bz§’=Yi.`Þw®b)\#·6»q¾{ï}Ôy±%Z‘ ’©€}AõË_ùôxêšØ`ë{+=Dßš4=É-ÏÖOÊÐCEËD5%Wý¶mWŸ)IpœUË4IJÐ+÷ôÉZYYÉŽ¤¸û¿È¥Í=¶–/S"ÊFеk1Pc˜<ŠŠ¶ˆµ¾<ƶ35fc -KNNÀ°d¦¹Z~R̬œã’ž^–ÃÇ®¥8&(6 <ŠN¦ó”`’ûæÅ‹U™ß™3`wòcš°1¤xÛ¸Åß«mþâ×"j¤¿­Ó~}ŽŸP› ÉŪo³5ìO1KM9kÖ­öÔöy »™23‘27AS̳é‹ã–žß/FµúõÑcÄ5é©G.ûcÛþ[ÂLÅR'×aä…`âq²Ì•œsqjÏ^ÔkÓZYÄWÈ“±gí:£o¾*Cn~¦V=^U^#öر"Ñø«Â |‹ÛZˆcÍhrÀÏO(¹ŽëÕ«g-‹LÕÓúÔ˜hXÚ—¸1ÙH¹—òóÅûä}³z¢lE¢\.G$]œœ€5í70¹WOÌó_5¢–”¨¥‘þ"~V¨LüP⻓™x¤%â$¶¤O’‚ZHîº/^{[–-Cžån‰‡©òvýúß³»—–€?xKÈÝÈ.ÈS©V‘©v|¼>=”KPÜæÄîÝø”,³áõ9‘¦ »»8æó3%™±ÒO­'×ý[´v|-ѨѨ!}©æî×7‡If°¶$Ÿ5cæëâMò ð²;k©´kqÄ=¸ 
~†Ø̇Bó£e—Þ(W6ebÆ|\]ÝÔ1.ã:\—Ûp[cŠSd¡.Z”B<ÂMaNäİ¿W®'62ñC_£˜(ƒØ¡Þ«jý”°/M’4òõ>¥6Ë’;Y#¯6QÔöµríW›´6ïÒžz㣠ªH7iÕë[k£ý›´~CLŽag'ñû \hÂÌîQOšàÚ²¿“‡!ž”®{mÕµ»‡á^„<çqô¯mØóËo8½wŸŠ'§N„ØR½}çRà ÆîÊ/¼  ?3ØËudÛ6üòÕÿé 2‡õ³'½}ŠrN/bê³ÚE —Oع …É]2\ÛoLøÈhK¶JkS¼óY,ŸPòÔ ”ÄÂäwö¬Ñúú9‘áÓŸV£ÝëæÎUг %á 1ó’ýöæ¶™÷Ø… è¡™C‡©¬Þ^”ÃñKÄ.û¼ŒŒ¬Sûö«xXC² ˜ìSÝGM;wBaµâ‹/(ÛôI<Þo.}Üx•àváÈQŒéÐGÉEÛ}èPt"׬9L2ƒµ%¹¬sàõë*Ÿ€ã³ÖRùš®xä‡â…óÑÇÅ‹æ‡]±˜û‹'Þ›ð©ú,øu³:ÆeªÕå6ÜÖmY¼„bÀ1x™ü6§}&’׆ië²¥p&74‡A®_¸ ”ÚÜ>@àMÿ4ìô'Xuéº-Mî÷“dùÍ~çõ9E1i^ZÈ1½ô’¥‰”~ß ”h©Oæ&húõ²c»½Ó€WMøû\V^Nä$±.ggâÌßn´²À¾Haß¾7(ÆÏË­uì€ZäÁ¨P½:.ûG­((_­ºj£Y~j'õ‹ùtyû-Ä>Œ&oÛ Jfº¢xT«W_¿Z¶oo¤çÖ)Z­Ñ`º¦Ѥb2%Ãýõ×_xƒŒ”¢4 |dëg!Ozûd[>ù ª„ž)7c+˜cP!Nhp¤7ñd‡[†—ܸв-vmI>ŽçxRFjš1w2TUNqñ[¾¾Øù(VLJe.Aî¥B…‹Xbi²œ—D°"1F¦0É,ÖÆú²vÌÆÚš;vÎë$ö¬9€º=RRß|žÈG9 @I{¾}ˆ8J£[8™üÙ‰´B1ÿ þNo·GývO'º ¦ÌÔ[”¤¸ôø?¨Ý¬¹®{v±öu-§2ÊWS’"¯Ù]5s&ÅNcQ¿u+rTnê=cao_X- ;@ñT^Šõóee!~>` üIQ°'¨ åLýågÊ=h„3^^jƒöBž€«WðfÍZhNKt¾¥le>'ê%8¤ä9¢ÇÒÄë*¹Ý ߉”Þ¶•+UŒ|õÉËÒ´¾y)Þ»4ýè°ÕÊK yµAqº¯´k‹éôÒ›ŒÞ_:¬ÜxL×9‡ J8¦,+2Öì!y"8[Ûð¾âXu eiy)ÆÚêcOLf^p¢Ï«85éúÈ컵#hâ]’pÏ)âØöºî¢Róo û5…µa=[Ùç—ˆr¶•³ñÉÁ‰-ƒªV£ŒÛêÉñjNjFÖòœ]»Ÿ£‘>Jv9žÂ[æþJÙïùД²šŠ’Ûš^©7/K ¦Ö4ÁN å4ÑÑ18IÙò%JQ†üØÁ(H 53ÄJ&–Î_)ʶ7FÆ&X¬,X9›jcŒ©cæ&RÆú6äcj‚fXOöSÈ*åœÓxZRÎ9-OfûåœY¥½IxIg¡s<Œ×3פĸ·?ýTe”›l”Ë ²kÌ×°kåZb〦]; \å ´ŸÄR,ç$R̼ÎõîÍÛ´>~mGãåá½èå,î¹Q?§åœÓˆïO”³q\ä¨ `S°»ùîÕ[8ê¹÷üƒÉ]\Õëׂ{ÍÊJ΀+7éE4¾´"À‡ÞräŠûtF¹­v‹ÚÔ`E˜gŠ@ ZùQƒ–zfÖ­Óƒ`Ëù*…j")Sþy Vζ“7þ< *c²Ž=²²í1z0.öƵS—ð×OÞˆNy7yIWg¸ÕpG˾íP§uðr/kã•Ù ®°,@@”s€(,ìF€•mzYHý—Z*̯,Õ^NÁkßí()«}2cÎîqA@°QÎÖá$µ›@€•¯RÀOp·‰qˆ¶‹À,xO‚íŽ.÷H&Ê9÷œ+‘TlEày‰Ùf+H9Ä\^B’C@K7‚€ ‚€µˆr¶)©'‚€ 䢜shéFA@°QÎÖ"%õA@B@”s-Ý‚€ Ö" ÊÙZ¤¤ž ‚€ CÈRªZº¬EàðìÙº?«·¶­ÔK¦÷|·ž8ÑVÄ9\‹€(ç\{êDðç~¯q·nÝèï!írÕ±}ûö\%³+Ø*¢œmṏ\yüùó£téÒ(\¸p®ÂàÑ£G`Ù…A óˆrÎ<†ÂAÈRœœœ”b.^¼x–òÍ f,» d™æfCá d)aaaYÊ/'™åfÙs'éK°„€(gKe²}`TªcÇŽ¡R¥JèÚµ«®¼:g¦ñãÇ£jÕª8~ü8˜g¯^½píÚ5]½Ìn“=³<¥½ åœCg½Lùò˜¼zb££á{âþ˜;_ ŽZÍš¢¬{¥LIñäÉ([¹²Õ<|ŽýCsšuébu›œª˜Þ±¤W.¾)¹-¡÷`oÿtÂUv÷Ÿ^y¹~`` ¶mÛ¦3+ÖåË—ÃÙÙ^dù·3˜\p}___ :cÇŽå]={>>>رc\\\ðÍ7ß víÚŠ¿›››®žl‚À³G@ÜÚ9tì BÓNѶwŒ˜9 =FŒP r)i¦Óû÷aè / «ƒ׬‰ýë׫ãAþèîŽycÆ`bè^²$ÞöðйdÇvìˆøøxÜ»uKÕKˆ‹Síøwú Aè^ª”:®ñãBï]»Q’wªÕo 
êòWTø|عº’EÉüY¦Û~WUûoGRrõ¥‡xXÐ]“òr›m+W`D“&Š×÷ŠøúÝwÁÖ:Ó‰]»îg¶.[NŠ0ý?xc.ĸï)kxüù:銒E½ô”7fmòD³NA½^7âý9sÀ–T©2e0¬mM9—! wößá¿óç+…ÊJAKB»rú ´Mku¤öó÷ïÇg¿ý†mÚ âþ}Ùò§®æ·üìi,8pÀ¬¼nUªbåùs˜O‰^miQ¹NmÅ#à²/zz"*"-ºvÁ”ŸÆÂÇá@ž22­-ÿžÞ»~“mÒ¡ƒŠ¿s ¾%Qù_¹‚„͘yóP¬D ¤‰Ìçkצñp{sý[3vKçqá¡C؋⥹;«)„ÎÇ›õ‰­Ýh ‘ÒÇ•RÞºu+üi’åííCÔï×_­«£âÎå)Ô²–p`….$¶…€(çgt>‚n©žËT¬ˆ ²„™VL›Ž¶”©;¾[wµ¬—ðã^«ò#%‹·N‹«ûž€ªgìˉ”©VŸ;S,=”¯_8¯{Ó./§iV‰bqœ)ävŠŒ¼]ž,f»|v¼iV^v_ÿöõ7N.óZ·Á%Šo3ÅÇ=ƽÔñx4m¦Ž±|îä·D¦ÆbØÎçø u¨aû'mÛŸb­%kÇnÍy´F䪲䢢}Š INEºv ‰]ß›7oV3—±{›ãÒÔUåøòlzEè™3g”õÍ1ì¬"CÙ³Š¯ðò¢œŸÑ×bÀõÚ¶Gª2|kÒD¬»~ «Éò\vò&¯Z©“î%ù$!Yí_¿pAý²ÒRÄoeJJJÙNý.`_P·ŸOï­M'vî„Yµ[¶Ô•óF õ«‘?Å"™œ5þ´m¯÷¶*sòz.X¤\ðÍ)ÙÌ38o|4Añ⇶“[yµ@–¬FMCéKšv´Sµ~=uÈ—¬E|½S–•­RE;dþ×HÿÖŽÝ.æ;5_Ê 7˜Bškšk³UÌKª ‰³°'LHÁ\+‹$¯ »·ÿùç¼ÿþûÚaõþî*„ ' ‚€m! Ê9‡ÎGDh˜JÐâXñ0Jæ9NJ² =0{S,¸+@qнäbÿüsÔ§ YIŠïûí·˜:u*\i='ŒqÛ7ÞxCë:Ó¿†²gš¡0ò(ÄL>Hó&dß € ŵx1/eT£&š‘žCK 8ìH·vvÒÌÇKÐÚ×B…Ó&¥©D;WÏœF9Z+ËŠÌ…ÞVJß¿±º|ÌP^­ÞƒÐ•ìU° ½vH÷Ë™Ó÷ÉåíâVAwLÃÚ±è·1Üfë”­÷RNΆE÷­íßÔØ¹se–ð#%ÜŸ¼ †ïÖæ¿c¼E™ûκCËÆ8^­oùkíØ-αgN&Ì*â¸øzNõaÃ²Š¥ðò$œ{”ö Ÿ'a°­A»V4ÿàu©ðtÀÔqÃÑÕhØÈðPš}S 3M%½Sòš²º¹)+~sýX;=1žÚ45ùxª¢‘ÖöojìÌÒ\™‘.Ó2e}²m­bf†œmŠø bÙA¦dÏŽ¾„§ ð<# ÊÙÆÏ®-¹êöÖ[¨AY·Byܬàr³ìyãê’QæD9Ûø™b·ì§?A@È;HBXÞ9×2Ò\‚@n¶>s³ì¹äò1ó¢œóȉ–aærór¤Ü,{î¹BDÒ¼€€(ç¼p–eŒ‚€ ¹ ‰9çªÓ%ÂævÒKj„A ï" ëœóî¹—‘ ‚€ Ø ¼ÎYÜÚ6xbD$A@¼€(ç¼}þeô‚€ 6ˆ€(g<)"’ ‚@ÞF@”sÞ>ÿ2zA@D@”³ žIA o# K©òöù—ÑÛ ‡gÏFn}™ÿsVë‰mUIÈ]ˆrÎ]çK¤Í)Rݺu3úW¶<üÄÄDl߾ݖEÙ\ƒ€(ç\sªDм‚¿Ÿº4ýYáÂ…sÕ=zy·v®:e"¬ # ÊÙ†OŽˆ–7prrRйxñâ¹–]H2€$„eCá d)aaaYÊ/'™åfÙs'éK°„€(gK™)¿és ›X‚Õ3f`ﺵHHˆªvbbÂSDzú@||v®ù '÷ìÎjÖ&ùiãÒúöÞ»Çd].ظh!¼<7éêhíurhãôý «Ø˜‡YÒ£6kq°¦ScÉ`W¯^EûöíQ¬X1T«V §N2ÉÊ\ݹsç¢\¹ri>ÿý·I^é-0&{zyH}A@€¼¾3£Áªi_`xƒ†øßFaùgŸáó7aPÕj¾ c¹nηÝ®n?»6¢#"0cÈPü8ejvu‘†¯þ¸¢￯8ŽíØñññ¸w뺻ÓvNï߇¡/¼€®\³&ö¯_¯ê^>yBÕùaò$ŒnÛÝJ”ÀZ´€ßÙ3ø€¬$ÞÙ¬Ü ¶ZZS}= Q}Íyï=|ñúëx•²‚ûW¨€]¿ü¢ãÍ.ê7kÕRý~Ô½;¾5Jµ ¹} †ãJˆ‹Síøwú Aè^ª”ª« ½wíFIJªV¿ÁSíoPh€ñá>“¾nn ºk+ægjl\fH'víR|»RrÕø®]•¦Š)^AþJ®ycÆ`bè^²$ÞöðÀRˆLÁaJ¿¾ŠgtdD w ]ÃgÏž'ZÕ¨QCU-X° êÖ­‹Ý»ŸcXª{óæM4nÜ\ÏÇÇǰëLïÊži†Â@È£ˆrÎÀ‰w­èŽædÞ¿w¿Ú½œ\0‡ÜÛŽeË¢aª%S£aCÅÙ¾P!ÔnÞáÁÁø¤w„Ño÷aÃDkB§Œ—.’’E)ñµÿûåé\§Es\<~ï6m'RV5éaêsò$lÝËX¹™ê+)!Aõõ'Y_QááèD õ>É4oôh%oÔƒûø¬ÿÜ»}mÉuú‰ç’%ª O6 
™áH³Q.˜'0oé2ï’þžînóš£#4 i84JF³Š~È»¤¿ì‰Íf5F7h*@8Òtp”æ¸o«MJJ’¦šjsš¢4 i<wdÑØiUA‘‚C¡$°SÀ½¤?yút©zæi;èEGŠ ‚2@W`ÞÒ¥òòÑ£Ò×Û $@8 žm@Á2‹Š½»eï+-ìiCG> ³ ªÀ‚ÜïnÙ[·„z Ç!€@”GQe9@ RéóçIKáH—á|¥áh”pœ†ÄJ`Q^¾tuvʉÃͱڂu@`ÂÑ08<…!P°aƒw·ìÒMƒØž=°^€pdýKPMÀ½[¶{Yÿ‘º:ÕJ£¬ Y1fšDÝ–­Z-mÇË©¶6ÝJ§^´ i?B@²W­”þþ~©ÙQbb{ô„€Ò„#¥ÇCq `«€û¶Úä©S¥~W…­ô@`„£ÀèÙ^ «¨Hškk¹[öðL<‹@ÔGQ'eA@ : rs½»e7VUEgAVAG!1q à¿ÀÜœ— µ;wú¿9;"`±áÈâáÓ:¨-à^ÒŸ¶$Kjùž5µEuÆ ŽŒ) !€€IiYÙÞݲO¶¶šÔ½  ´áHéñPØ.Ppÿ}Þݲw<ò¨íô€o„#ߨÙ_ qâDïnÙ‡öìÿdÎ@Q ŽFÅÆI €€ó–.÷mµžînÿ6e', Y<|ZG=Š~È»¤¿ì‰ÍzL•h.@8Ò|€”æ ¸wËž””$M5Õæ7K‡( @8R`”€Œ$0+3SŽÕóÎ÷­ñƒ± ÅÖ—Õ@¨¬_/§»º¤¾¼<*ë± -@8Ú†g@eÜ»e‰—ƒ••ÊÔD!˜*@82u²ô…F ¸wËží¼µVSRbT_4ƒ€Š„#§BM €À éwäIG{»wYÿ OóDI€p%H–Ab-U\$qqqR½m{¬·b}¬ Y=~šG¦$'KrjªìÞö¬NeS+Ú Ž´#€€Í³22¼·Õúz{mf wb*@8Š)/‹#€ÑÈ,*öî–½¯´4º ³ Ž(øP_`AnŽw·ìŠ­[Ô/– ÐT€p¤éà(ìHËΖ£uû¹[¶½/:±á(ÆÀ,D[`éªÕÒóÖ[ÒPUí¥YÂ/@@3Û‹ e\B‚ìzê)Í*§\ô é1'ªD®sîuôñ3œ/¢­xŒ_@ z„£èY² à›ÀŒ…éÒñÚkÒÝÑáÛžl„€-„#[&MŸ `”@ÞºuÒßß/å›7ÕÍ  ‚áH…)P ¦À­sfËS¦H]YY˜gr8Œ$@8Iˆç@E2 äxCƒôtw+Z!e! §áHϹQ5 ËÖ¬õî–½û™gÐ@( Ž¢ˆÉR €€Ÿîݲ'Lýå~nË^/@82~Ä4ˆ& L½ùfyùÈQ“[¤7| ùNΆ €@ô2WJG{»œ8ܽEY ËG–¿hôÈ^µR✛BVlݪw#T€B„#…†A) €@¸ÓRRÄ}k­vgi¸§r< !@8†‡@]ä.—“'NH§óö? ¹á(rCV@¸óþû½ýK7=hlŽ€)„#S&I `­€{·ì„ÄDùUÃAk hh Ž¢©ÉZ €@@·Ìœ)¿Ü¿_Î;ß·ÆD&@8ŠÌ³@%–®\%gzz¤¡ªJ‰z(G:OÚ@? ,]³Ú»¤¿Š¯á5@Ä„£ˆ Y^`âäÉâ¾µ¶oçNÞZ ~T ¹áHóR> pI =/ONwuIÓÞšKñ_…áhhœ‚¨(wï½^Y{v«X5! áH›QQ( 0¼€{·l÷’þ¦½{‡?g@`XÂѰ<<‰è%pÛ¢EòòÑ£òî;ïèU8Õ" áH¡aP  ©À’»>)çúúd_éÎH—â|¬¸ÎÚÎi0P`ŧ7Èõ ÷ÃÙü €ÀèG£sã,@@9ƒ•UòŸoŸ–k®¹Fþô§?É›¯¿.Çêë½:oüèGeFzºr5S* 𶚊S¡&@`Ÿ˜?Oþ塇äTÛoeü 7ˆˆ¼ø¢üøk_“[gÏÅŠœ‚€„#;çN× ` À¸ñãåCÎÿMû›”î>rS²Üpãò1c‹Æ/î_¦øAÀT‘©“¥/@à -ßÿ¾¬¾õVéîè³gÎÈ×V­òþëþ^ºi“üû×þQ¾þç{%=û“ŸÈ½Î_›þãŸþÉ;çðŸoÐ}êwò½/|Av;_Sò£¯|õŠø'fŽÌ˜#] €¥?&?ú‡¯xÿW¿«bàñ{¾üeùùsò¶síþ?þQns>ƒt}B‚lûÉO¥ãäIùëM“Ã55â~vi~N®¼ÓÝ-ŸuÂÑg¾ñMyö§?õÖét>Ç4caºdÉ‹?Н*Ð哸@¶IÓ¤@À(Üðiɺë.Ïâ±ÿó¯RS²ÃûýÚk¯•µ÷w²ñÛß–Y‹åÎOÚ{|_Ù ò?ü¡¤¤¥ÉÊÏ}Î{ìõW^‘k¯»ø?×;oÕ½óÖ[Þã© ÈÞçž“ÄÇ{üþƒsÛ€ëââ¼çø0E€¿™2Iú@ .È{ï]°¸ðÞ{Þc—XûÅ/ÊѺ:yó7$qâDïáϘ!;}ÔûýÎ_–šk^ºtøUÿ}þÿ="ïžyG–®ZéìóÞUÏó&ŽL˜"= €Ž@õöçämç/<îÛ]ÿùvw)ý‹»œ«×ÚÏ¿øä®½[R,0»ûï¿$-²næLùóÖ[êÂ…ÒPYé¬ñ¶¼vâe9°k—üÎyÛíô›oz-Ú÷ ²éÛß‘ñ&È–ü``~AÀkœF.4:ÿ? 
phylip-3.697/doc/images/DrawTreeControls.png0000644004732000473200000022263112406201357020534 0ustar joefelsenst_g [PNG image data omitted]
÷ñå—_V^^`OÉ3Ï<öš8™_ÏF笤ù:fœù{Ú´ii~ÚoÅ™úhIQÒ–PÉ¢´‘t¡ðÍ®ݼ¦Åüעň¤èÕ~£Fb,Ý$ÆÍ™­,mÍ"e1ÊÑÍaÁÑØ¹c*U÷Á¢ãÇ0cÇv´%V­^]%éÕÓgÞZ7Íë% –Z³ÔÚ9¯Ÿ7mßWȪâ@8[©6)ã tÃúñǪJÅjÕðY"¯Ï˜®Îo^º”Š?hÞ¡ƒøtÕj´ ë(<$;W®4•Ñcq˜”‘Œ=_}îtÞúëoªþ_?ü ¾û¿>ëæ/Hw­Qsl ¦ èe1âg”ÇmÌ"·æfš|/í©5™©ovßñÍK»é¿õŒùfÌ–™þ£);v©Ö­›|mq(«Fc—±‹{-M†¸,+µ‰'*ë…eÐÓÓO?­(+ÑR´TbDóNŒäÔ×5jÿâÅ‹Ê]¯Ÿdð’ÂÍ”I0÷W/›ž¯¥c}Yó~øøø¨*˜Pßžù˜éórêØÒ5Ù4þ½8;‰%Ý#D7-ºk©Vë´h®¾Ÿ›0¿^¼€%d%ϧõ܉‹™¤rÓí-]=s¶r·|üq¬ À3o¿¥Êñ0åÀ.Þ¦Fø@'—O£†*ëÌÁƒ¦"gRÇ̬ S ®nÉîÀ)òi犪¤k¤x½˜ébŠ«Ýãé±°Eƾ´$ÀôÅËÃÔs¬ŸH‰ð´w­Mõ­ÃFK×ËbÄÏ(Oã•ßAAA`‹MÿáuR&¶tïÞ½«.Ÿ³k52å9à¬Ù¤¹–çÌ™£Ö£Ù"aeÍ–+ ¾ù³ ™ÓÌo„P¬oÿâ~o6OXVž,Æë²µ(ƒƒ´Òãͼ*V¬ˆÉCÀnàôˆ-oîÓ'Ÿ|‚Zþ`—2„ÙBleúÓ²Œ%+,=95þFíóZ7»øyRÂm°›Ÿ•ôãô»OŒdK¯n^Êç€8¾~´Ï„ rm÷DIgóÐñ:VY?ö¦ˆR¾™m¥Òqz<Ýw´FË붇èG©‘~_||²åÉ([’ìJeâ5j&{y«J)ÿôuÐöœÒ´·ûÀ–­øbèPõ9D[Ê’Âl߯Ÿ¾šCÃ)hgRøáýÉØ·q#ØýÝšÖ4ÒcÑ”Az26¥5ôª´uç2YÿÅHy<ùòPŪ-¤‡»Ö&뱉K™yëe1âg”Ǽ§By–<Ž cž¶Ð+¯¼¢\¦ì Õ>š‹›] ¬|y/5¯)/Z´H¹¼™oYº.ø!#XÆnðßÿ]/qóä5lV–-ü÷߃·zé-V.g80Œeáèå±cÇ*%Î ŠÉÞÆ“ŽöNø)ìçõkŽFçusö =pEãÉ[Åx"ðÁhI¦o[ääÂFísT=¯¡2þŒ7/CŒ7Nº™²r`$›•*y2YÃŽ—SÃܬ¤ åÉrâN5ïÜ ;hÍu!Íø;>ó4­EÏļ ñÉP‚.¨žtƒéÏ=÷œ²Yáòº°æ6eËú3šd²åÉ7@=$ï »ºÙ¢ÓÖ9µ|v)[#KV©¾,Eñæ²,_»víRS±…­µËkÐZ;ì5àc¶Ü™Ø¥ÏnfލnР©œÊ¤z÷ó—Á‘mNéÉ©ÉbÔ>ó䇒ðdˆ'zìÙó ÷LèûÄõô²ñDRë/ç1éû¡ÏãÀ4ý9O“ÜJÀÇÆCÛʦï_nê/&í¢‹U(û¾~ %Èâ(\¤¨©ÑÀkWÕ«Ó´5YS†…ƒÐ[Áð mJ®®i£K3ÃÛR]¶ðØr/UÆË‚$ŽIâ-VkÖB šÀLûßf°ždÙúT¶ÌÈh+î–°±Ô{#~Fy–xÙ›¶ì«©ø|ø+*XÆÒš"+¡Œ+ ÞÆÂ´³º±{œ£Ÿ92ºsçÎéŠÈž|°åÛ&‡?Ñd”÷Qóv%[¬étH§@N·ŸŽxN™ÍÛêì!^œ9s&F&.‚ùp>ïß8ï{ |+y¼=ü³£,Ç*‰»;;6k£l•ª©4g—¯êm š2+žæ´´WY‹ š f†·¥ºl±f¥‚NÓ¹”>ت ¹~fd´wKØX’݈ŸQž%^IckR³x3RßZ>}ú˜—¬•É©tžì¢å"v?Û¢ YN¶ô9×9Ø‹-Ò´«!;´3´ŸSc•íòuÁË–&¬,GVýVÝGqw;Qág7ýÝ•\«5iË–PÆ(BÛhüiÛZZË·UqÎÉÁc™©Ÿ•uÙU«­]ÛÓokâONQN·ŸSýÎÎvy'‚%Ò‚mù÷¿g&QÒÎ<:ùD6¶Ôß%w£Pæ(GÞ~‘ýãxÅOcâQn]‡ËR;/" ÖÌLÿ4Oòï…7ÎL¢¤ytD6AÀªÖ®…3¼ÇžæÖÀ§:*:è)vˆ E,C€•ª£è&=öøÄÅK8CÛï|Z¶pÛ,áã¸^g‰xÂTlE "íaçGË^ hñ7l@ íu´°‹»,EòWkö øwãÌ$ÑÝÎ<:"› `'q´—ž÷r'PD¾£žn§R\pz Ðö4z(?`ɵ«ÓÊËÑÝbI;íðˆ`‚€ýð Ç™o:ö÷HjùÙ‚•¿Ç_z/‚€ àĈ’vâÁÑA@ò7¢¤ó÷øKïA@œQÒN<8"š ‚@þF@”tþé½ ‚€# JÚ‰GDA@ÈßȬü=þÒû<‚Àî/¾È’k8~4é#ãÇ;‚•ðò¢¤óÝK‡ó"üV§®]»Òë=]œª{ ô´ ôô3!A@È¢¤3†›Ôœ ~™¿è>»^µhkçcbbÔ‹>l-/åA 5¢¤Sã!g‚@®D L™2JA/^ÜéägÙ„A cHàXÆp“Z‚€S!BïÅ5¢XzžwNQz²å”\Ò® 
%íd£ƒM?ý¨>ÿmø;•t[–/SéÛ~ÿ-Uº½'qq±ŠÏ-›í­š-å­É—Ÿeík¼­µe ;ˆ±öŽ\KìV¬XZµj¥Ê:wîÚ·obô6 ___:tÈ”o”g*dÇ‘lv°‘¢‚@¾D@”´“ {ÄíP|ôÎ=‚Qdeñùð-¨Êû_Wå?~öYM ŒòI•Æîb{ù™˜¤˜Ë7¶cGÄÑû‘ƒ®]3µqxû6 ©_]<<0ˆ\¹ÛÿøCÕ¾~þœ*óõˆ*½O¥J ¸ k8™ó¾Ck»zl˜iDèm¼Û§7zÓ¤¦;AMèÑ]ñä¼Ð[Áª½i¯½†Ÿ~OR„u¿*Uð¿¥K9[ÑOŸ~Š—|Ýhü†6mŠ3’qçÌ÷úöQõ#ï„'ÎàK.eVÄ£GÆgŸ}–Š+[×ÌU³fM•îJïÕmР6oÞ £¼TLì8±$›Õ¥¨ ¯%í¤ÃïY±Ê‘‚9´m‘„Óûö¡FãÆ( skú_ºˆ£Ç :2ýÇŒFá¢îø“¬¢7aÐĉ §[`ýÂ…X=w~üøÜ£í0/|ô!âiïj)¼›7Uïc¢¢Õùò¯¾Feºq×{¨%NR{Ú·@’¡)¿0÷äRÄÇŧªÏLnݸ¡Ò’’a/?ó!0—¯f“&ªˆyê¶l‰°À@LêÕ!ôÝí…Hý™2h.:‰Xê#÷mõwß¡p‘"¨Bý¹m'sÞäNÕ7žtŒiÿ(v®^ƒJ¾>ð®S»iâô:¥ñRDb|2Œ{DX: ˆÛ$×tRŽL~û÷aÞ{ï¡üÞxùã‘@“ùt~ãÂy•ŒÐà`$Q2C–öGÏœ9S›æJòÂ… j»–¾=VÚ0ÊÓ—·çØ’löÔ—²‚@~F@”´~]R–Wüüpð›Š&íS¯ÉVªîƒEÇaÆŽíhÛ·/ªÕ««zsõôu/†wú èá_¿6QQQxgÁ|”ö*kµÇµIO ¥>”nìLɽüÑo¿áõÓÕùÍK—Ô·­ÿÅoä´i`k¯y¦Åü×¢Åj) ß¨‘;kÆÍ™­,mÍ3ÀòñgÁÑØIËF8™ó.P(õ Ðá­[qžÖm›uè€9»÷¨OS ¸ºrö,v¤XïÜž'É6uã’e*‘ ™Ç+*2l™3]=swn‡à­yßc=Yæ•}“=%³þù›iQ¼´§*—ÿ‚iRÀëÑzrwwG$MöŒòôååX²QÒÙƒs†ZiúhÜ»wó'MRõ¢'Jé‰ÝÚ˾œŠ5ƨGÚàÔ¿ÿ©ì¸Ø{ê» ­!6iÛFW«]mzöÒWOsìU¹²Jó —-“vîI )ž¬@=%&&šNõÇZ¢VßV~Z½ô¾._VE~4mÉ[ðf×nê<ðêUSUö¸H~úVz8™*Y8ðÛ·_¥ê'HÚ1O 4ò$WxAP§¼,ÁKc÷ÉÖ®WO\¦²,ï§—h¼ØMîH²'8«ÉAKz §`ªU«Â(O_Þžc{d³‡¯”ò¢¤x”~ò %»š=h]¸n«‡SI»zælåÖmùøãX€gÞ~Kåk7ÅCÛ¶RàÙN"ëðò™3øí›oRÕ7?qusSIšK];·V.Ž”FQw´CÓ·V?=~¦ FìæO™ÔiÑ\•|nÂxüzñ–7a>­«O\¼ÈÄÁ\Ý¥‡=ËÄ[«£}û4j¨Ï<¨%áÌÁäíJªW7¥¹)l:.¨{4'O†Ósµ;ŠÑä¨CñOÂJݺ´©b6ðZt ¹äãÉU¯Ñ•+WÔV,£<­¬| ‚@ö! 
J:û°¶»%v‰–M±bkк¬fjŒâã“-[¶pÙâ[7Êâ5jvµ~6dˆz$ãä¥?ì~xÿ}ðV®Ì’WÅJŠßÅ'T@Ö·£FÁÿò•̲5¬ïFˆ°[·ð3A=LAj<ñغ|9ŽïÚ…ïh­œ×Ï‘kZ#ý:¨N\^Ï;žÖ´õÔ”Üܥ˖Å-[ñÅСêÃq<.íûõÓµxüÇ7ßâ¹:u±°¯Ù´ }šªrÞµë¨ï)´†ý,y9²"pÌ¢@”Ø©S'ñ=‘âXQÏ›7AAAèKK&FyÖø¥—n¾&ž^yÉûˆ’¾…SÕM±5«^ÈÇ£n󿨱r%&Ñ>×ú­Z©ìó½ûÕ°á¼v=)º»ãSOcð»“HqGâ#:Îl7ò<Ýàùå ÿ÷ÒPœ;r¼N›•Ô¼s'DKvᇪ૱³f"2,Ÿ y'÷îEOŠjïLûÈ-‘N\^Ï;„‚§ôäQ¢$fîÜÊd5o 5þu´fÏkõ_¬_G•âú¢Ÿ÷&ºŒ{öbL‡ŽØ»nºÑä©Ó³É²ú_¼ˆ«´¾Íh™!Í{b Vš¼íjñâÅ(K)S¦€ƒÌx–Qž-¼-•±G6Kõ%MÈÏð"ZÒ.³äg@rcßy}“×A]]“ÝÕÙÕ‡{1ш¾{¥ÊxeK“Áׯ¡)’ÂEî=^»Š²´–ª­ b„“%Þæ¼ØÚe¯EFû{ë¦?Ø ‘tžn?²ìíyv7?àäEÂ{ÓV=s2Ê3/ktÎkßüij…/$ö!À17©CYí«/¥£ˆí¬‘•¥^afe[Ì»l•ªiš(_5­‚IS(%Á'K¼Íù°UÊ*Í2eÄZe«Ù’‚f~FyœoeD6{øKYA /# îî¼<ºÒ·|ƒ€3+Bg–-ß\ ÒÑ\‹€(é\;t"¸ ‚@^G@”t^aé_¾@À™­Ug–-_\ÒÉ\€(é\=|"¼ Œ¯!;+9³lΊ™È%hˆ’ÖoA@AÀÉèn'GÈ(›6mÊhU©'NŠ€ì“vÒ±A@ò7¼OZÜÝùûÞ ‚€ 81¢¤xpD4A@ü€(éü=þÒ{A@'F@”´Žˆ&‚€ ¿%¿Ç_z/‚€ àĈ’vâÁÑA@ò7¢¤ó÷øKïA@œQÒN<8"š ‚@þF@”tþé½ ‚€# uâÁÑG pçvîܾÈðpÄÜR,‹s‡GÉ’(áéIŸ2Žh&WóŒrõðåiáå± Ù0¼‰HÂÅ£GSµäâê R%Q¶R•TéYq’”€KǎýxqTòñµ»‰ÌÖO¯AƧ ÀÔN‰¨TÝ'½j9–o’3ƒxf—à÷b¢|ý:î……¡léÒ(G ¹$ÉÌ RÜÁ¡¡(\ªÊV©‚ÂEŠf—hNÓŽ`ä4C!‚X@€ *–´`{ëXh‘­wÚè>l\]Ý,æ;"1>&¹ýªµk¡ïÈQv³Ìl}£OìÙsGŽ Ïk#£pbLøÜY)+ñpTŸYùø_¼ú‘×­]¥Š—HźliORÜž‹¸ƒ‹¤È¹l%«ŠÚÿÄq$&$ÀË×E<’=3 »q‘·nÁÃË ¥*gý„3U'2yâhŒx²éO“ñBnn¨P·žIº»äɽv žÕªÁ½d)Sº¶ kÒ¶ ä 2ÅK—€7ƪOï×F \Õ*¸zú Î>ì ²†M¡"…Ñcø0´êÞÝá lûíwDݹãp¾ù![Ь Öª•FAë±aåÍe¸,×±F!—.aM4ü²ÌT$Ž&;fÏÁÞE‹‘””dJÏ-ŽÆ()>^a´sîwð?~ÌÃÍ“§TzðÙ³¦49lE@,i[‘r@¹Bd-ëݸA×®ƒ?Qäzdºyå2vüþ;n¢¤W´îÙÕëÕWyþ—.bÛo¿áέx׫«Ö/=†þ¤ô™þøæ[°¥ÜñégÔù_‹!øê5<7ù]unþïÌ¡ƒ8¶s—º1#÷rõ† ЦOD’ tÕ¬ÙðiÜWýüp/:}GÆ®•«P‘,­ŠTÃþMqêßÿÌYâ¹w'ÁÅ¥a?ô•ÖΛ‡ÄÄD„aÉ”)¦¾$ÄÅcËòe8Ov÷bh׿Ÿ #Œô¼ùØZã¨O˧~ß&iö..;Fd´¥þû6j¬Øá]¨PÚŸ‘\‡wlÇùÃGrÓž*¢~«‡QÿáVªM?ýˆ›—.cà„wàæVD¥eö¯¯F“»n:p¥ñH¸Œ¹»>Mk×!רë=ñ®9 ÿ'ès•4Äñ?ÿĽÈHÔîØ¥«TM¯§ÊÏ Œô<øÇ(_·.\ ¹ê“åX°±¤í†,ãb¢îbÿÿþ§”Ü?k×àð¶­(@LíDTdÖÌ™ƒº¹6nߎ\‹‰X7o>BoƒÝr’B #eV§e r GãЖ­#7c)9þðqdø}‹424L¥!1­…†-dÅÞ‹A‹.]P¨°ɲþçÏ#¬æÅü Ѻy™ŠQ´X±þaªóÅh Ó«Jex‘'À…—‹ 0ì‡9rìI`r¥ö+‘•±`ºqáb¨ Û´UO[~ùE¥a¤ èþö‘ܶZyR¿U+j'”&ˇôðÖ5£ä ö¿Ž+V¢dO5 HLˆWç¡·SêFª1O²0NæíØz‚r´íáø8›>\–ëp]KÄ“¯ƒª¬C¤€BhBya÷+S t–Û”¾yêT¬|ë-lüüsÜòˆ0¹Ø×ãÅ)G¶6Û¡å׎ͻ>FÙ³gçº"ôÈSOQ6>¡îLJ>G-yò÷-ãOw"")ž…-Œ´2µ££mD;Z÷èIÅýKú'ŽÎâm‹¯{Ñ1Š¥ë!!t7:šÚ?ù$½8ùsÕçxÐkÄËô 
,WmEñBÏ—û&h‡>œyו¨\…*6oNѼ Ù2}†Z6æqªi‹40ßá2ªòBcs/4£x¡ ßõ!^¬äçÅJÃ>½ ¦áƒ|¼X±V§»î[ÂèîÝ»´háBÚ°v­úà÷Lø9€øõe‹/§¶l¥Û!WS4á>qn9“.îÛO…üJ±‚]^Ò6¶^û´…sìÛÂÌsÈuš(R¢õýÏh%lv¬\E‘×oPÞä œ>á>>A³.ÈB äË?h|Ë”¡K)}\‰‰vâý-iª#\mÙB§xaÁ”¯PR ƒ4*Æ4"®­QTÄmZ=gÅÅÆÒãÞçèà$s§­vX+Ëü¾~KOÞ¤ˆã„¸8r¦lGÚX HQSÕyòæSç÷Yxh>r{xk™mñU¶JUªÑ¸Ü@¹6Ç/ œZ åN®S+ÇUÇ6á*X€¸?õgK̦ò ¯-jÀöê±ct—]4š6%¿ê5LɃx<űP«ß«'›À;R©ZµhóÔit–š¥ëÔQé"®†p€YjĽh9ŒëcÌT°N,aÈî[¼Ñè6køW»n]uËQŒ²³ué¡þýhûì9´ñ hÔH+’®Ÿ;O7.QÆ£Õðáêþö9³Óãt•] ¥Ù $è!­GÃÍç0ƒ(T˜ñv34”ŽíÞCëøAE\—ä‰Ô²ÛãT³Y3Š»wOM²ùy/«¦á†\¼¨Ò@…q´¨FÙsäP§0Uk­ÜسG™³ë¶nE­yB=Ådã’¥&s3òÁÔm m®š5K™×Ûò¤]¹n=SR[í0%2;17õfÏ™Ô$ÓLà8w¦lGÚ˜#׃áŸ-Ç£R©ŠQYÃ[=Ô}Ùâ }ÕìñÇ©i—.t…Ý 'Yƒ‚9ÿÈ®]Ô¸c']).ô¼Ðü—™øhÍ»$/VÊsú‹<Ön^¾¬c!Ýö•‘äã¦ÅŠV¿Å£Œî[p9à^/A‹Ž`„´¥9–¤ ÷+Gް&þ`ñ‹(oP‰ªUÔ_%«VUB:âÚ5*mº+'‚@f)A$ChÛ·]äU:´á#;wP•† Õ$pbï>*êçÇAOU°Q·†qPW]ÂË'ÎràκïÒ ðw“ͨ`:kã>¹sS !X*üʺÉ?xk¤™È¡9‡_¾B‡YÓÅòÂ@#Íô¬]ëØG~%˜ò³ŽÅ‚¼@uêDj×¶Ú½0×ÊÃbàkå®ªÆØ"gÊv¤Öê‚ÂÞæùlñÃ'@5›4¦ºmÛRIÆ Bº([U@À‹Dý»*p,7["¡Ms,3„<È›V*š¼Ð¬Ë Íò¼ÐŒçñ„1–šX¬ÔæÅJM^¬„ñb%ˆ…|ãpŽýÚ5xÜd4Y¨jµªtøŸC&mº0ïÆÀ==9ƒQCx Tš³VF¡R¥Ôé  íiçùØí%$˜#ð@}0"×nEr§AƒTðSC¨<:øiй¥ö _æí ;´WZ*ÌÎÝÙ4V¬”gS£¯*_3É̈—¢€šóĈ2Ö~·Íª—XX>ا©辪7iBe«TQÚˆä.þFÐuîŽÐ­ä|ÅGx’=ºk·úDó­òr4¶µvX*»r½ºjÁ±cÅJ“™ÙR:Üs¦ìô´Ñ¼õ<Úâ«NËVT¯MkºÄýùó—STÄz}ÖZ9ذà1·&èËwö}78ˆ0{6ÇÞH‹<È›V*¼@»È Í.Ð1޽XÿßÏ(œ…ñy¶üñáD:±nesw’©›¨@ɤÅJZëLk>K!²ïŽhÿH'õé?`  ŒÔêp#ÝZ˜©§â•*RÞMq™wfü³l™ú\9Ì/â8ŒÒ"$˜# ,bÛ호Ì3ɵ{€¿fqp}|÷nÒ¥¨JýêöÒ)_²Éû² :ÒÒ%päp<ûˆõuÞãèUìFÀ—;ȼÖê¸Ã/ÔÈÃÑň v”-;-mtoK¼Úâ Qà°z¸›®‡†ö5×e3j^¶°Ø3y«hdÖzœ>MÅÙÔ_œšl"“aînË{ýKÕ¨™"éÅýûèЯË(–ƒä|¸O+6oÆ>êÞ*ÍÁ_~VÂé.ï‹ÏÇZjv“4ì×/EþŒº°…‘æbÑãæF ñq´ìõ1짯Ní8(„@±õŸ~FlAiÆ‹ðòMšÒëá´{Á·tfž{<Öô™ÁT˜·ç zðÆ1ÒzD |þÃ'Ÿ(37öB#8'„ƒO*±¼;¿­LÈõde¼ñÚÒ ')û¥kp¤1~ä ½ÐÁµIñùI~ãØ]Ö$ØBãŠE[ /,óꚨO#DëßZ¦ÝÏÈ£0Š‹åÈqÞj™)>ùŒ[êJ3"¤Ó ]Æg¼y-”þÙº•#œ¯«-/ˆnо½[_'šñ­4NYoX'‚ÙÌ\€·µU`ór~øº¯‹þÒ0ßÞá=»ØÍÅ®ìUÏoöúPãôˆë9Œ\©”èZDH»O)M0 ÔÂV²k Ïo+ξfüÉ^ÈŠbs4þ\ã:ÿùFÎ8r>€·â²¹íÎ0s#‚‘‹€”b܆€i·A+ ™„¶åÝæ—ÔÀƒ×Ïâ k D7ã_Ñ ñK7 s¤9‚¦lí‹ÏüÖ¸‡ÁÈ=¸J©®A@„´kp”RC#ÿ+^3‹ýºZy6~…k¶ì9(;ïw…ÚÐ8Àœ`äH’$Àv<œ6ÃÙ“ AÀ@çP/ˆIÚ®çŠ2=­ ÁÈÓzÔsÚãøFJÏi³´DA@ȈÎÝ$L ‚€ x#"¤½±×¥Í‚€ YÒY¢›„IA@oD@„´7öº´YA K B:Kt“0)‚€ àˆöÆ^—6 ‚€ d DHg‰n&A@¼ÒÞØëÒfA@,€é,ÑM¤ ‚€7" BÚ{]Ú,‚€ 
%wwg‰n&´#pùìºræ ]»t™"o\W,VœJ–+KeªT¡²•«¤½pÉyro0;LAát#ø¶jU1ÿÂPÝ—*5ò§Mý=¤¥ÒŒ¬†@6f8q{bbVãÛ¥üî߸˜ÁË,Áÿ³Û°Ýß9{3!!žrä0κ(..–6/YBÅüý©qÇN›shë ãÿ$nÓ·/åÍ—?EGòkîDFÐΕ+©TÅŠT¯uí¶CGgêq¨Àt$2/öšyûß³‡¢‚.QòT»ReªÀ} ºÀãýع³tòb(Gµš7§‚…‹Ø+Ò㞇GÒžU'),ðU«ãO5kûQ…ª¾ªN‡Ó‰c¡têh0•¨^’š÷¬A¾þ=iq¿ªLî›”Ú÷jA…Š£ˆ¨ŠˆŽ£ÈäÎqÏi‘ÇÅß½§0šÐ¯mäß•F;V¬P÷÷­[§Ý’£ à0"¤ª*õP£Õ';V.ÁK*L¦˜àÞéÝ‹z•*EÝŠ§qÝ»Ñõ«©òj7~ÿv kÔˆ:,HýØ<ùù /VîoòÛ¿|y ¿šd®¿O‰4´^=z®A}¹­zb"#iÖØ7iÙÔiZ5´ï¯¿TQϘÎ):"ÒôÌ‘k|jy#oݤ×:uTí\£Ú²Y{¤Î‡Ô®M P=èÖµ0ºÆfV×i¢—©/äÊùù©Å!ˆö>H‹<Èk‰||òÐlqÍ|íuB¬ÂÆÅ‹É¯lY=õµÉÚ‚ù~øäöÐCj±ò|Æ´lÚƒÅ!žg$î9C%+øS~ß"t'6ž¢ã)†?7#bhùÏKÔ'üV”º‡gHƒ´Èƒ¼¶(gΜ¼h¾ªp²”ÎÖ‚ÒRz¹ç݈v¢ÿK²=ÿßC4}ëVêöâp%°·­X©JØ´x‰:vþBªGN™B¹rå¢"%KÒDÖÿeM ‚´ zø¨ñiØ®ò[-hðc}{ä0M㉱ OðjÕTu ¤/½Dùòç§MKV÷þøæuì÷ŸÑthÓ&§êÙÎ]‘·oS³ÎÒ„¤;wRÖþ%[|jec ¦mÙBïñ_Ÿ}t·oÜ ]«£5óæ³°Ž ¾¯Œ¤Wg̠׿ž©‹^Ë×Ê0?Úk§-Œ©Wßïæu;Òæ|l˜{ð}Âc¥IÇŽJûܶ|¹*jÆŽ´!&† -f^tš¯CxñS- ñ]ãǸnÝUÿ¼Æý„E#ÈÖB ¾ë¹&° ûè#µ€œÇ×W¬hîÖxpÕý+AT¤Œ?EßW‚8&ŽèÆ­;4fhoúfÚ'êóÆÐ¾êžA€#-ò ¯-ªR¿>•,S†þà| ÷»žì-(õiå\"¤eXƒÎH¦"Å}•9sø0s”ìþ ¨og©Ù¤©ÝOìݧÒ4h×Ö”V;¿xâ„éžv³öâÏ'³»>½Òª5Gì&¹ÅÅÞcÿyajÊfé ¼st×NeŽ¬Âæn˜ð­çkä “LöÙÙÀæ}GÉŸZåk&-0p]½Icu;ìò% a ´àÉ„`‰1]ºªëÐdžÔ…•/{í´…‘#õêûÝœGÚÀn` ªÕ,)¸êGU»‹ns샦E;" ‘š6ò ¯-3{-Q‚¢ÙªÓ®wojÕ½‡)¹­O›ÅAA§›7æÎ¡ßÙåcŒh*ÌÍ'׃C)/·ãng}¢n÷Œ½¥­rå™w" BÚ‰~÷1óï=>ìy•ûË—FÐ-ž„Ú?Ñßziìç&Ž UªWWS™Ú›>ú(­ ¡cßPIà;õa8hÒ°ÔäùØÐ¡êÚÙzŠû—Qù‚Øl¯Qð¹$Ÿ»vmëhOäÅ‚F£‹Ç«S_¶PÔHØO{‹–rš…l9˜·ÿî[-¹Õ£#í´†‘#õš÷»žGÚ 2â@çŽ&ç6»‹ïS,›Ú£££éÎ;}y×Aã¯×º•Jòè3ƒS$µµàiÆ‹®¶={Ð^„b!6‚›Ïñ¢Ñò™A‰‰ñtþz"½ItöFÒ'ö~Ò\ÏîiÏ‘y×µïÿokì@§þù‡6,úÉ”ÜÞ‚Ò”PNdDH;1räHù#nÇQœE8èk/[ÁÕmøp«¥ùøøÐ­ðpúñÓO9Ȧ…ÒFöóÖ®IÏ?¯>7o¦ò,ãí@=8‚½ÓÀAv«w´–0JO½`Ì‘6W©[—ßøu–¾ûècÕ–1³fQ1¿¤4X*`¹@0¡«¨D¹²t†ƒÿ4K‹#å"-ò oZÉÖ‚ç×ÿ}EOרI Þ{ª6lÀŸ†ªš€ê5ÒZ]ºò•©æG÷BÎPÁ<Ùø“ƒ æËN9òóÖËE+éå7ÞQŸé?­R÷ðL¥á´Èƒ¼Ž?¿<±!zrdA©O/ç‚–Ù^ÿƱô ƒ   K—Ò«S§R_Þjc‹à{-äëËgyMÉ AAC†ÛÁ4ˆ@®\¹|ì%MõÜ™za~ƒÍê%üÓ6a;ÂgXðe%¨,½-ôR ‹r&?nªÆØ¸áL;Í‹IO½–ÚŒ-\«V£&¼@›Â/Aÿ+]Úío;Ë®‚S[¶Ò@ŽUð+Z”½,ll×ùEõí† ÊÚþbÖò«ñnl;´EØ6¶•ƒà>Y±œÚöê"é*ŽŸËu‹ ñ>ü‡y·4JÐ'C†ÐÞµk鯵käËQòM˜¿qìÎÐüõ) róÅámûiã[©N÷þT‚ƒÁräÈÆ‹ÆlsBTØ'Éúp›·`%°—"ñ>ù$ìJ0ýíê8¸ÕkÛ8‡÷b¢©¿È¤n˦—ÁÒòTµê|þ<½1ëkêÉ®±‹'OÐûýú«@Qüö+ÖªEï.ú‘ª6HZ¸¤*Xnx-òƱtt=öÆ.™ô9]`#LÊ¿ðžX½ðMGÑ’ÕC0ÒÕ¬8 
;y+^1½ØâÖik=A8CHDz¿’wÜà­zô \9sé“¥éÜÖ‚{úõo-KSéÌŒVõ/&²Qc€+P ›³Ù°Èx$‰hΨƒq‹O¸OQQÑ´Ÿ_JR¨H"õxuK0JÏ‚2Í—ìYyãX:: [, ùç×^ôåúu" Ó¥§f-ÀZl~CWóÇÏÐ&BÈ"îá GCø†ð67døÖõÜÃ3¤AZäq…€FcýÊXÕ3[@ƒ?´³EŸö”=[ýóÛ2º{ã2ÌŸ L¤"…x÷pŽ{x†4H‹<®ÂnG,hàWÈ{s·÷ö½´Üƒ@ V8›gñªÏ»!¡T÷'ctÞ§ºÂ&çSlý ä­`yJùQ-о0û&o1ô`hLMFWO_¢Ý+7е‹¡lv®IUêUçm‡Tš SèÌá@~¡Î *YÞZôêD¥«–ó*ŒL`ÉI¦ æîL]*2¡Ø»÷[Ý®r€ZxðUŠâ½É îë_šJWªDåÙ'ê“'·W  £ã;ÐÙƒÇ)øtÝMÚ+^ØÏ—ü«På‡jQ­V¼£Œ­R‹%DH[BEî †ü¯PÂkbµ?ÂÈÓ7û sòÇUæÛ¬ ›`”•{Ïsy‡6Ο{.ÎÒ2A S€V‚øÁ¦‚LåLj• FFìá È>i‚€ ‚€A!mÐŽ¶A@Ò2A@ƒ" BÚ #l ‚€ "¤e ‚€ E@„´A;FØA@DHËA@ Š€ì“6hÇ[‚@ZÀ¿‹]:ušÂ.]¢èäÿOK9’GðdðŸîø§½rÕª¦ùßþ2 Ò…´Ô#¸èS»ÿ¦ÚüºÏ^ü7üÝB‚€  k¡tøìYþßú¿‰Z¡µéÔý'w,‰À…c'¨ÿF·-³$ÿ´ Q`‹OÂöítŽ7%üËfTÕN×#>i§!“ ‚€1¹x‘V¯fLæ„+AÀ€à÷‚ß‘I„´‘{Gxœ@ :â¶S&î=z]bÿµ3´hÑ":|ø°3YÒV_çqþW¯lülj‰‰i.wß¾}”'O•_žæ3!cVå; ²Z%´iünŒL"¤Ü;Ä[BB<­ÿáûŸ¤í«VÒÅ“'2ˆƒ”ÕÄÅÅ*^lÚ˜ò•+-ýþ¬¤pým`Òêv”W×sò Äû÷ï?¸°svçÎÚ¼y3U¬X‘fÍše'uÊÇ/¿ü2ݺu+åM7_éë,V¬=óÌ3JP»¹Z)> #@ãÆ#m‘3¿[å¸ë™ø¤Ý…l*7†'ìžb‘ãNO>Iï/Ybñ™»nFݼ©ø©Û²5êÐÑn5Q·o«ôuš7§Æ;ÙMŸÞK§|I[—-£¯wîâÿgvŽ×ôÖm/?&$kfΜ~îÐLóçÏOƒ ¢o¿ý–>ýôS{Eæy©R¥háÂ…†áGÉâã“ÆŽÖ†1‚…è‚ hèС³Á"ctMÚè=”ü•ð÷§¯wl§™Û·Ñ'Ë—Q)Þ¢°aéR:y`raüªæŒO·Â É(44K=Ã?üð5mÚ”úöíKÙ'·iÓ&Óã£GRåÊ•M×8iÙ²%ýöÛoÔ¥KŠŽŽ¦'žx‚ èA;vì >}úPÑ¢E•fŽItúôijÔ¨½ýöÛT’ÚÊñxZÊãé½÷ÞS×µkצ+V¨´øúé§Ÿ¨zõêT°`Aòóó£áÇ«gæuž9s†L‹‘ýû÷Ss^ )RDñùÇØ-Ó”Àìä…^ Ñ£G›îÏúõë«6šn&Ÿœ8q‚7nLC† !h÷Kx1û÷ßSƒ T*UªD?²U „rž{î9…pzøá‡)(((¹$"W´ÁZݦJ²ø‰¥1míÞ•+WÔøÄBtñâÅ„kKi­-h•i#õF&ó’‹ý“u[µ¦z­ÛPÛÞ}¨\rR^ÞSxùÌiêÏã—#FРjÕ¨7 ôë!Wé÷oÐ0žˆ;óÄÚ/ }Γ\BbÝ Sé§°iôÖÆ物oÙ²ôWòÄŽ¦ž=r˜F´hA] ¢AU«ÒTÝäˆçñ±±4qà@êÊ“/êÞò미íÚ²™†°è̼ƒ_-¯#|Átý ð5¶kWÕfÔvù½Ú¡›¸ãèûqq<‚lñúÃ'ŸÐ°‡¢®… Óó Ò²iÓLm˜Ð§·*'ÊE~1LDЦ¡u˜´JCBBþLhuêÔ¡Zµj¥0yÇÄĨIMKãåË—)22’~þùgÊ—/ÍŸ?Ÿú÷ï¯M·nݨL™2tìØ1zå•WhäÈ‘´qãFºwïøæÁÃÌ™3©^½zôØcQ›6m ‰W_}•FðïXcq²víZ…éS§ÔzçwT™®hÜÖêNÅx½a>žm]Ož<™²gÏNyóæUÇÏ>û,Õï¿ü^ŒN"¤ÞCÈß-žlFµkG£Ú¶¥Á5kÒ¾ ©'k3å«× Ø»w)„ÓÊÙ³)7Ü”e¡z'Âi£FS ¿4£ßèQ”;o>ú'ïýëÖÓ}HëHž@:²°½ÉwÔ(Õ¢{1ÑôFç.t†'¹<±.Q‚~>~þßÿL->±ÿ€ÊÛ…ý×Y°@à;BX<¼Ý³]çúº>û,ÝçãD6ëž?~Ì._‘7oÐ{}ûÑ5Jmzõ¢;,<Ñf´¹*kI ŸÜ¹©&k¢ÙsäP×Öx=±o/Í0üÊа>¢.c__9{Få»u-Œn†…Q"óè Ò´L@æ­üo¾ù† ñc ˜¼×­[GðSÛ#äሠ3h¿9ƒi¼ððç…Û˜1c²ÞÏýùçŸ+ÍÂZø| 4õ´v´KhóÐJAм¡BКש$mذ¢xüA¨Cû;v¬˜è+[eêËП÷ë×O]®ZµJaèÅãÀšYõ î<¾Á?ÒÁ 
Ö,æÌ™£‚ÔÀç2v“`‘3,¬ W´»µºU%ðe>žm]CH¯Y³Fáã”)SRý´ßŠÑ¡yà¤2:§ÂŸÛˆcÍ èäI5Á@ƒ.³ÉòFhˆ©î’<Ïÿ÷åÈ–$œ¾em87¯VÃØœt5¦ Ö‚NR•daVŒ'ÛÉëÖRvÊFX»Âóè¨HÚ¹j5…_½Jm{öP>oÜûå_Q&SÔ5éÏ?LyÏsùîX Ø¢5óæSTD=÷þ{4ôƒi/×?¦KWZ6u ýð•Õ_ÛØüÁ‹Š¶½zÒ;ßO÷)‘z÷¥Û7nÀfI#ùǾlÆ *ÂíšÈšýM~)¸Xâ5âúuõ<(ðUa³ésçPÍfÍX°%ýôf°©Ø•á€ÉË–Z2´_4hÉðMCv†Î;§ÌÁz!ÓøLÅÀì ‚é8ZxÁĽzõjµ`€V³7ø²§é9r„ªò‚  Mkäl™ðÛ÷ìÙSù¼¡™B€nݺU+.Õé5×ÀY~94x­}Zb,^ è±€Æ Ö‹ÿñ‚´[f\Ñ[uk|dõ£½± oBö¿hÛF'ÒFï¡ ä¯OÚKÏŸ7Õó3´Û>þ„º¿˜ä,â& aÖ^üùdDÍB± kC ¸Ø{¦2Šq4¨›{A±¼æ T™(_‚4äÝwÕ¹&øŠóä¦åÅkü@1¬‰ÙÒ!.¨´ >œHøhÊZ™FÖø ¿¬’h3ê/Çã6›gm‘5^›ñâ ‘m¼(Ñø©ÄôW[·PQß¶ŠLÓ3LNwÙêaIHfüáG…P€ † Ð&êï¾ûN i˜r!@¡%ƒ¬EsC{…¦¢§C‡Q•*UL·ôk¦›|¢ŸtaÚž;w®2mÃ_‹<ЄõiôyµshÜð©£½ÚBš;|íÐlÓRæK/½D0{×Y–]4ð1[#Ô©Õ ,°Ø¸Ê‹OíxýÛÜ+Áû￯LÞïòXdi>{eêËÑŸ7iÒ„J—.M|ð2cëŸÙ:lj/¾øB-p`ª‡¹Clb´\Y•DHgÕžsß¼Z²¯p|îôÕèÿ¨ZšsÀFÚÄ‚kVud¿Ü‰½ûff|Ôö¨%ÿ€ Yïg¿÷Œ×^£žx’V³ Ñiòöʰö¼-GCÛÄѶGøµ³ß|‹^hÜ„²±GõÚµ¥B¨¶Íœï°É{h½ú„-^z‚iþûyÛ¬¶èW6á?]£&-àˆæª ð§¡JÀ~~ðFš«ÇT¡6¾–/_NšßUŸ ÷Jp\2Žx‚¯ºxñâôË/¿¨ (-}ëÖ­•ð‚Vˆ`)øûÞ|óM•å¼þúëôÔSOiÉ:¾øâ‹äëë«„#¢ÀÿüóOjÁA…šÏZ_§¾@˜Ò§³µgüøñªþQó! o¯L}9æçàÿÚµk„hoG fw,.þûßÿ*ÿ8„=CO?ý4=û쳪=Õ8ˆ[ÈÀ£¶°pElÕí(ÿž”NˆøÇxÎÊBZÌÝž42ÓÙ£¿’÷DçæÀ(˜¯»²‰nÀcéÜÑ#©JïΚïÆÅKh+Oü;ÙäÙ‘£¸/°&væßS¥5¿Q”ßôóÎ÷ é‹á/ÒÒ¯¾¢‚,°[wïFϰ{ÓC(ûÕÓiî¸ñôñg•ÐíÁ“m§ƒTDº­²aÊÿˆ·Ÿ}ñâKtpójÂaa`ú‡}Ç>¬A‚wêH[W¬¤¬i5z¤“­âèÉ×Ç0‡i/Gön[¹Š|y‚¦Ÿ¤ò³O~zÚe]Hv˜×…Å„’F†ÂÞiza’Óӯ싇ƈ(o´puhŽØ¥™z¡¥kÚ7Ò! L À+‚àÉ#ÒZ³æ¿VùK_'xÕ—ƒ¨pDª#^#{eÂ-‚0Õε¼hC[ „ÙÁ ®iúZð°/xŽ 1¸`…Ðó‰gémʰU7ž{ÁU‚ƶ=P?f²p&ng3• V°­ Zq®\‚wœ) [›Š±yQ ¦r&¯½´¡—‚Ô_Òi¾m{éC‚.ÒJí[É/xÊ~óK¼½gýÝ“<bÍ/wž$Ám¯\<¿L¾¥ýIš¦4‹¿˜LÿåEüÉ–|¹¶Nš*ôÀLðÙc¿3¢Ó¹Þ©“íE˜B`Ø&a+œ3·¬,°®Xú=à9”‘ñsçÐ@VDŒHmx±(š´{&‹ñ”Þ¨eh?®nº_¹§Š,Éšàæ_~¥E·rÖlºËQÐg8Ъ kÏzAŸžÝ) µFBsÓkoÚ}9:†&óíì")_´c˜5ú®,Z-QVù­ˆ¶Ô{rÏk€ þïêU*jQâÅØÜùôãÑàäO˜<¼Å)˜·|çHxh –´#óoÞ`’Gð›g `M@kÁ­ø½àwcd!mäÞÞ2Juêª=Ò™Ry:*-ÉVü‘ý£=Œ=Ƙˆ²ª.0HVE@´šž&j–&—â÷‚ß‘I„´‘{Gxœ@¯q ä({ÞˆKu*U¤ÒE‹9‘[’ ÆFÀÚ~û´p}•ß,xôÜy ä7VjÚ$-EdXÒµT$¸Ò¼ÿ¯@=ËÑâ{8šþt!A@HLÜxyS…F~7F&‰î6rïo‚€“ÄñÞõ8~ëÞîª÷;É‚$ @6ÞÆ—ƒ_؃?Ê•óÁK“ŒÆ¸Dw­G„A `Â1ò¤“ÎæIvAÀë7Žy]—KƒA@² "¤³JO Ÿ‚€ ^‡€i¯ëri° ‚@VA@„tVé)áSAÀë!íu]. A@È*ˆÎ*=%| ‚€ xò2¯ëri°'"}×.OlVŠ6ÝoÙ2ŵ\Þ€€ioèei£Ç#€W&–-[V½¯ÛÓ‹? 
ÁEÇzZä=‚€ˆv$I"üi@þ÷«\ü%O£8~{šö§žÖ6i `Òö’ç‚@@^7oÞ,À­ó,¢}–ÿØù²$‡ •À±¬Ô[« `kÿ›k%¹ÍÛ ü'F#W¶Ïhm~[ˆ&m yfû”Hçþý×”&ÉœüÂúÂ%JP‘⾦ûu’˜@ç¡|… ‘ÅJUmŠzL<,Hþ•*§xæÎ Kæàëü‡ö?ÿü39s†Š/NC‡¥Ò¥KÛdãøñã´lÙ2z÷ÝwMévqPÚZþW-= <˜ªV­ª¿åÖsKísk…R¸ `DH¤#²"ñ±÷è÷oXd½NËÔqÀ@‹ÏÜu36æ®â' FuêýòHwUc³Üø»I˜à¿ûŒ|ÅfZW>Ìž=µQìÏ?ÿ$^4Aà®_¿žæÍ›Gï½÷žÕjCø¿u-ZDyòäI‘æâÅ‹T‚^µjÕ2Ý/V,cÿ«ÚRûLÌȉ àÁˆöàÎͨ¦*V”{þyJLLTÿa¼õ—_éè®ÝT¿];ò-íŸQlH=:"##éÀ4aÂ*R¤õíÛ—ÆO¸åË—×¥L:…¶ !žŸÿgל ¼›5kF-e ”94r-¸Òn‡Øó+ÈÁèW.ÀÔÐ#lR½®Lß7®ÓŠ3©RýztâÝ‹‰¡cÇÒ¥S§èð¶íÆ[kò³yºbÝ:ÔºwoŠãçK&A•ÔWÿìáÔ›5»6ü¬r½úªŽ›×Bió/¿Pè…‹T¨x1ªX§.µìÖÍTB\YЭka}ÇÔ2˜ÄûCÙ)›º÷Ì»(;kwÑ,n†\åüÁt+,ŒµÙRê9Ò?þÜóêü"kyx«´ñÈ›·¨FãFÊç{Gwî$_~‘‡F*ïÐç’ó§k—.+¡›“5~=Ý‹ŽQ—×YC,Κbû'Ÿ¤åÊ)Ïþ—…§åz .BÖø/Áåè)pÿ~^lÜ¥Ö½zR£©ûu:ŽìØAm{õ¦^#^Ö'wÙy l[ÒüÓÎFIÇÆÆR£Fè±ÇSZöÕ«Wé‹/¾ ­[·ÒÃ?¬/^ÎAÀ ˆv¨ÞVda6>þOv­Y£´Ûƒ7R-öe‚б­ hD…Ú²…N±‹‰ºÃÑØUhØ(RT;¥DH»[¯-¹d¹$­öúÕ0uktbÏeήۺ•Ò0O±Ü¸diŠ·JåÈõ`hfËñÀ+“¿paUÌuÖÜAØò´sÕj®P£†º—=guÄ—-3)„m³Ç§¦]ºÐ6µŸÜ·Ÿ`*?Â[Ž MZ$Xª'žµK˜ãmñ¯1P’…>¨e·Ç©&/XâîÝS&}­Z:W!\oòBÁ|áìF¨[·®SU]aœs\@×®]Mù ì+V¬hº–A@pfB÷Õ!%{81ìwE (–…Ä@k?ز“-ûA›ÈïbAs¿ÌB€M¿ äµGÄyò女‡Ðî?~§HöWŸØ»—Zõìa/kªçÇYcqP³IcªÛ¶-•ä)颼ÝÈ¿J«õäIÖÎá¿k’'ö~~xð  TëöÂ0ª\·žÂ-„#®û¿öjºÇ4A¬5²råÊ*Rû›»ð"d?[-nqì´`ÐÉ“'UÔw©d÷‚–Ïüˆ­V›6mRcõêÕSg§OŸ¦~ýú™'uëµyûÜZ™.ÒꌬÊÊ]öíbËQÄ0_#h«n«Ö„HlsªÞ¤ ²ö||ÏßÈÚkV­8Êû iZ«yzýu^ŽÖîÂ/åØÀûy÷®]§)Ȩ†}ÒÎP–\/kŠçXS<Á|,Z„곰֢ȭÕs?>ÁaþÁŸ¦­ü‚ì)Ï“?Ÿâ„ 9øÛÓ8f©Ýƒ ¢ (ÿ1^:`ÀÊ—/Éu°|ùrjРò5[Ê«ÝC¾'žx‚~ÿýwõb”»wïRŸ>}ÈB‚€ à~Å“¸MbB‚@F#p/&šræÉM9²=0O;ÃÃÈÝ­ùºÉkžQØù $ùÆÍŸY«ÇYþ£"nSBIæzó:Ò{]€·›U¯^Ýâ»»oß¾M…“Ýé©‘áðug´V óz`` Eñ¢BHð&Úà-ŽÞÔ`i«±È–V®ò,”Ö¬©òYÐHh­gùw—€¶§+4êÀ¾ëÌ"[íË,ž¤^A #xà(̈ڤA@A@pÒC% A@ŒE@„tÆâ-µ nAÀÓÍÁžÞ>· )Ô#!íÝ(A@ðDDH{b¯J›A@<‰îöˆn”FDçÏŸAÀÃ!ía*ÍñN"øm`B‚€ àyˆ¹ÛóúTZ$‚€ à!ˆöŽ”f‚€ x"¤=¯O¥E‚€ ‚€iéHi† ‚€ç! 
BÚóúTZ$‚€ à!Ht·‡t¤4ûØ9i’Í?ÙÈêè$ò?õµzë­¬Þ á_pÒNC&ã!€ÿ}îÒ¥‹ú?oãq—>ŽhíÚµé+Dr YÒY´ã„mA@@öìÙ©hÑ¢”'Oým8¿{÷.¡}B‚€7" BÚ{]Úìq/^\ è‚ z\ÛÐ ´OHðFdyê½.mö8®_¿îqmÒ7ÈÓÛ§o«œ zDHëÑÐ'$ÄÓú¾OñÙòë¯tbß^ºO‰º”–OãâbUÞý7XN`á.ê¥%¯…✺¥Õy`ÓF§ò¹+±†EzËOO»´¼Žö¡–Þ†®j›Kåxúôij×®åÏŸŸ*W®LÔgIq¾yófjÓ¦J[«V-:vì˜é¹;ž™ wðÄRûÌ*É,€i+Ýsç}ôÌŸwû÷§áM›Ñ¨Ö­éNd„•œI·£nßVy¿™ð®ÍtÚÃ¥S¾¤QmÛªKgóje¤çuóf¿ï:Æozê²—W…½´öž§§]Îöƒ#u¹²mú¶[òÙŽ7ŽPväÈjÍc¶W¯^ú,)ÎÇŒC~~~´wï^%Ð{ôèazîŽg¦Â<±Ô>³J2A K# >i;Ýç[º4Møñ•êNDÍgÞ¹‹Ö}ÿ=õùŠÜŽ?ž3n<•ªPAe(X¤}¸d1-UÊñ<(¥‹Ìl–;ú!£ÚL¿ÿþ»Е*U¢ùó瓯¯/mÛ¶Ú&/5lÿý÷_:qâ­[·ŽJ”(A“'O¦š5kÊ sù3­j9 ‚€D“¶OîÜÔ¨CGõiÛ«7µéÕS叢އ¶l¦!µkSçhPµj“¸5úýÛ4¬Q#êÌÁ=ýÊÑç/¼@ ‰ ôj‡l⎣k—.Qÿ€Š`ÿ⬱oÒ²©ÓTQo>öõ/_žÂ¯«k˜Û‡ò¿=× ¾2½;ÃÃÙ#‡iD‹Ô¥P!Tµ*M=:»ñ±±4qà@êÊ ð¢o5þ/Ÿ9­Ò~9b„ 7OÂ×C®’µô¨ÐæXÀ„l­}ÖêMÑ ä [í²V~LddŠ~@Q0e?U½ºÂol×®„6§°Ë—LÕZ«ËRÛ&ôé­òGEÜ6åOˉ¹9‚ÁVU¹A¹rå¢:uêІ ©Ý/õë×§sçÎ)´Ç§Ü<îóåËGîx†:œ%óö9›_Ò YÒvzÚó¼wÞ¦9ãÇÑ”—_¦U³fSΜ9©Ë³Ï*AôvÏ^t=4”ºòõ}ÞÏ9qÐ :ü?O+>øü9š6j4ÅDEQ¿Ñ£(wÞ|ôk7û×­§ª ¨dXÔlÚ”ˆ_ÜÂûúÕ«ê~ƒ‡ÛQHPý6gŽºÞµz5afµ‡ Žòp/&šÞèÜ…Î>L-Xðf­é×éÓéçÿýOc“Nì?@‘·nQ—gžáö…¨6ã¡-þcy‹ ø]9{6åæ-@eY0Ü‹‰±Ú^[|˜cq‹±µÖ>Kõ/UÚÔý‰µva1a­üxîO}?DÞ¼AïõíG×._æÅZ/ºÃ‚mF,²4²V—yÛ`½u-Œn²¶šÈu¥‡räÈ‘"ûÙ³gÕ–,ýMíäÅ¥þ>ÎK³Åè÷{¸t†§?þ˜ŠðBÍ]ÏTÁN|™·Ï‰¬’TÈÒˆ¶Ó}·oÜ ï?ý/ýøÙ$Z1kUf­cæÎT¦rZ3o>E±ïûÊHzuÆ zýë™j²Ö4`}Ñþ+Ñ·¬ÅNÛº…ÚôéCjÕTƒNÒÈ)S”¦S¤dIšÈšx6^è©ÇK/Q>þÙ´ôguûo¾QÇ~ÿíÛ–¯`mü*5}¤½¿d }¹~ ›8‘j4ilª®$kÁ“þüƒþ3mš¶ ª¶ø×2#ïüÑô­[m¦·Å‡9|û]Œõõj¼˜­µË™>ܶbE° kúè#ô»;fìÜI…‹KªŠVY«Ë¼m9rä¤;vÐ^Ð,š\ŽVH:0Sí'hÆQ¼H´FÑÑÑÓx™2eh ˜»5rÇ3­l9 ‚€uDH[ÇF=)ŦÌ_ƒ.ÒÐ÷’ªÂXk*ÎZ(äÂu\ðáDj“-éÒU]‡²ÖkN0k/þ|2›©ëÓ+­ZÓñ=«$q±÷Ì“¦º.P¨05íÜ™.œJ+CChÀØ7TÓäƒ7*Ý¿Ÿ*Ÿv£›ÈA“†½@Ѭ =6t¨ºv†ß²eTž³lîÅÇÇÑd6mÂw¬QNŸ\Ú)eÓ½åÉ.ÿœËG÷¶+[éíò¡Ã‘öéë51ovb­]Ž”¯U¯Mkuzbß~uÄVª ì¿5'ku©tº¶™çså5|Ñ¡ì*ˆOÚÖ‡²/^¼¨"·Íëùûï¿iäÈ‘¦ÛðÿV¬X‘ÉÏLɉ ØE@„´]ˆ$;çm*G÷ì¡%_L¦¶l¶†z›lßN³ß|‹^hÜ„nÚô Sò"(ž}—'öîSfj\ÃG òññ¡[ááôã§ŸR<ûxÍ©a»‡©\•*t£pósàÙãÞWIœá¡e·nT paÚ¿a#Íxí5µØX=o…œ?o^]ªk{ü#ƒÞoh+½=>ôX4üq»ëëMŸÎàW¯][*Ä~Ú}|õÂ*‚mZξm±±wUÑ\8Ö±cG*ÄÁãÇW‚zîܹtíÚËO/!IDAT5êÃc´jÕ*Bp¨F´hÑ"úùç$wÊÆiß¾}Ô½{w·P[© àN5ç¨b˜ªAEKú9ÌÒ¾óýBÊËþí¥_}¥­»w£gØmÅŒîËVz{|è±HàE£ëªwøÔü`Êÿhù2‚ äàæ-WP‹ðþc™ÿ×ú¶…_¾BÁUtêÝ×i¼ÖòÚºo²È$'‚Pƒ¹ú»ï¾SQÛ9ö`: bèõ×_§… 
ªó¼pûòË/é]Ø+=Òx>`ÀrÇ3U©“_æís2»$²,p>&n×½dÙ–d2ã¡—‚¨ûû4®5vn†‡)m6W.ŸTI°§O¢¹ó¤ øI•ÐÊ Gy@vÔUŒ}ë^r†lño©{é­ña gÚg‰{÷앱 +gÎ¤ŠØrÇ/º eÿþ%6 ¯¿c·ïµú-µM{–ÖãÆ}ûö%ówwã//qE/,!˜ÄáÆ–-srÇ3ó:,]÷¾lÙ2ª’¼Hµ”Fî žˆbDH{bÏJ›Ü‚ö§¬T™·£§:Í›Ó]~+¶Â5a+È”¿6¸¥NG =ÇÑæ½{÷N%¤ÍoätÒ+8²¾o ¼ i1w{SK[Ó…¬$ÿ]½Šº L9YÓ,Ʀa ~Ì$³ÉÓÍÁžÞ¾Ì?R¿qpÎÖiÜvg‚@† P©N]µG:C*“JAÀëMÚ뇀à xº¦ééíó„1(mp"¤Ýƒ«”*d(ž¾EÉÓÛ—¡ƒE*ËRˆÎRÝ%Ì ‚€ xâ“ö¦Þ–¶z4ëׯ÷èöIãoD@¶`yc¯K›A@ €lÁ2| ƒ‚€ ÞŒ€ø¤½¹÷¥í‚€ †F@„´¡»G˜A@ðfDH{sïKÛA@ €iCw0'‚€ àÍÈ,oî}i»Ç!|™.:MaüÏWÑÉÿUîq” éD _ê_ ËU«J%ü˦³4÷f!í^|¥tA À€>µûoªí_šzuîLüÿáB‚€  k¡tøìY:Æ¿jA†Ô"¤S÷ŸÜ²$Ž %KR·-³$ÿ´ Q`‹OÂöítŽ7FÖ¦Å'Q£BêÜŒ@Èŋ԰z57×"Å žƒ~/øÝ™DH¹w„7AÀ ¢#n;dâÆŸUè?… ¦¶mÛÒY6ÿ¥—-ZD‡No1)ò?~\ñ›˜˜˜â¾¥ wÔo©¹ç@›ÆïÆÈ$æn#÷ŽÞâiãO?YLU—']_Ú¼d ãcãŽ,¦sôf\\lºÊJk~´1GËÃenZ¼˜®]ºLµZ4§ëW®˜ÚšÖúôxت[ŸÎ(ç÷ïßw˜•wÞy‡4h@ tùòeúä“Oè…^ M›69\†¥„/¿ü2ýöÛo–¥ù^±bÅè™gžQ‚Ú^!î¨ß^òܘ`lcœclçÈ‘Ã*“Îün¬âÆ–g?7V(E»{11ôÑ3C,8vÎjÓ§·z^§yót é¨Û·ÓUVZò/ò%m]¶Œ¾Þ¹Ëb?ü mXº”ò,H­{ô u¬ÅimMK}úJìÕ­Ok¤sLLÖ4Μ9üÜ[·nM]ºt1±Žgo¿ý¶Ú¶&4S† <)Uª-\¸0k”ªŒˆ@||¼SlaÌlÞ¼™,X@C‡µ˜7+üª˜»-v]ÖºéË“ØW7¤ø´éÕ“ )B.YLÃ?ûoÖjP2·sƧ[aáVy?±oåΛ—V‡_£qß}ëÒ¶Ú«Û*S™üš¥-ÖÊ•+§„ûÝ»wU²;vPŸ>}¨hÑ¢T±bE5ÑáÊî¹çÔ}<{øá‡)((HåÐŽŽ¦'žx‚`v¶•öïßOÍyY„ÇiË–-é?þPåœ8q‚7nLC† !hÑS§N¥€€UÞéÓ§©~ýú4zôh*Q¢U«VfÍše±~uS¾<KcÚÚ½+lUÃÌŸ??-fk®-¥µ¶ 5h"¤Ôiä%WîÜÔ¨CÇŸ¢ìk‰‰Œ¤YcߤeS§©’o†‡Qžì¦°Iòƒ'Ÿ¤Çy’í[¶,ýŃY£ß¿]@Ã5¢Î¬ö (GŸ³ 4!1A{lõtQ•=•'Ï·ºu£®ìç\£ý³m›Õ<‘7oÐ;½{Q/^dt+^œÆuïF×C®ªô¯vè@qqqlʾ¤Ê…ùZO#Z´ küÃKà4OU©F»Ö¬IÑV}ZíüЖÍ4¤vmêÌ{$ñä¾å×_µG)Žæu¿ùØcЇàóçT:˜ÁŸ®^†7iBŽ´ÛV½ØÚ>‰r‘_ ´ihæŸL¾À$µgÏšgø¶k§„ßVÖpGò*W®\T„yÈ׿Ác£yÏÏBV„÷98® [´fÞ|ÖV#¨ï+#éÕ3¸í3• Ь ú¼æuwñåfá´ÛÚ´x‰:vþ‚:âËZ»íÕ;ƒÍÊX{,X´˜©¬ôœ@3Æä¥M@ú£¾ÜI“&)ís×®]t“ûfä&l;wN™”õþ:˜£¯òb¯W¯^JhCíP¡‚Êc-ØÌVZhÂU«VU -/hÓš …¼råÊÚ£Ghëíx¬hÔˆ-?gΜÑ.åè¡èDz½s, auÑ>¸6σ߉˜»=t°­Y¥Ë—§Í¬yhŸY<ñÚ¢bl^ÎNÙT’l–Åò€†Y{ñç“ih½úôJ«Öt|¿‡).öž::òÀZVv­fÍT–kƒRe=±wŸº× ][Ó3íü"û$]M!¬Õƒ|8‘ðGêcºtUסÉþTuaå«Hq_ªÏVgxkQð¹³´ŸµÀrUªPÍ&MM9¬µ;=õš wâ“LÆ–>úbª0ÿ=ôŠð†OXOÐ\Ož<©¿E‡"ä¹Í„ð_ä½¥»wïVšöàÁƒS¤Õ.l¥-É‹/”¡Ÿ$?ÿüsÓ60m‹˜V–þ¿7Läýûᅧ7íZŽž‰€¥1Þ{ø½D“6z¹?Ÿ<¹M¥f×mMX9}&Á\ÝôÑGieh û†JŸŽ£t‰'Ïû”´ŸõÜÑ£*[q6™šS¥zuÕ­ÀLTç¥8PIêu‘ϨF“ƪȧǽEKYÐ.d‹Á¼ýûh<œY$³ºö¼JöåK#è[Ú?Ñ?E6kívºÞ¥fÎüÉ7nÜ ¹sç*! 
MBúQ+V¬PA^xŽà®öl ÑGƒÃÌî_[i»wï®´œÙ<Þ@ƒ†ðv„¾øâ e /~shí }ýŽ”#i<ìT@\…ö7n\–mèƒ=Y¶ ¸«ˆOòƳVM¦Z|ÔŽLçoó\•‚ö®[§ÌÀ-9 Ìœòä^”£sašŸô|’<ÈþÍ,ÐÛ±¿¿æ-ö•þøé§ôįóuób¾nËæû¯Øß¾ ¦ñš5iã’¥´ûÏ?éýŸQ§ƒR•c^w»~ý¨HñWhï_L±Ý†O‘ÇZ»¡Úªw"û\9jξ½T P’U#EÁ™p@-øð…ÉfÂ×_]ùÁöA#ª8àÙwß}gâ[»àGÆ H&Nœh5m%v³LŸ>]•ûÞ{ï)37oAXt„`ž‡ÆÅxìÌï*™×ïHY’ÆóÀØ}íµ×ÔBn3Òž×Ç^Ù¢î,x6²¿u+ûwr°EGŽ¿ÀfÏ3lNt”ªÔ­KWΜUA`Øg<†ƒŠù•"D–ë iú¶­ô~¿þ´–°0¨X«½»èGìIuãNi+û|ðuð$ûÏ-û(õåZ;G´û«3¦ÓÜqãéã!ÏR!6ñöàÈuKeXª»uÏž´†÷\VãHéòRTe­ÝHd«Þ`öÿq‚ú2Šô&fku⥠#FŒPÛ«°ýIïŸF0üyИ¼¥§_9~ ’wÀo ²•[¹°/SÑ—ƒ—¬è͵x\˜ó¼uëV !___µXÐx0¯_»/GïBÛ±ãAÛJh>†² "pL&nçÕ¨  ! _u®\>Ú-»ÇËgNÓÀªÕ¨ Ö)m °Ë—¨XéÒ©¾,„íGÒðýšÊ)Äqî¬Ô4é%>®*ÛÕ帮ծæLʧ(ÍûËñºÓ³-¾‡ß¯}÷ΧòKbAÀ[€‰»»±*4zˆð»12‰¹ÛȽ#¼ N"Ç{Ýãø-JøãW½ÜI$¹ `x²ñKœrðþé\ü.†\9s–_1w¶k„1A m`Â1ò¤“¶VI.AÀ{pü}Þ‹‘´\A@ÈDHg ìR© ‚€ `Òö1’‚€ ‚@¦ B:S`—JA@ûˆ¶‘¤A@2Ò™»T*‚€ ØG@„´}Œ$… ‚€ )ˆÎØ¥RA@AÀ>"¤íc$)A@LA@„t¦À.• ‚€ ö!m#I!‚€ d "¤3v©TA@°@¦ÿUeXðeºtê4…]ºDÑQQö9–nE _T¢\9*W­*•ð/ëÖºœ-\ÆŠ³ˆyozÇÞÛ÷ζÜÈcmÉÔ¿ªÄ¤{j÷ßTÛ¿4Õ«\™Jú9‹¯¤w1A×BéðÙ³t,ø*UkÑÌ0‚ZÆŠ‹;ÚË“qìáìÂæu¬ ‰™þW•Ž %KR·-]¹•°PÂ'aûv:ÇýcmZÆJzzÕûòÊ8ö¾>Ok‹:V´ödªO:äâEjX½šÆ‹ „úýc’±b”žÈZ|È8ÎZý•™Üm¬hXdªO::âv†š¸ãââhñâÅZÛ)W®\T°`AjÙ²%+VÌtßU'±±±´dÉò÷÷§N:¹ªØ )«KôQ(£ÇÊÎ;iÏž=”={vjݺ55iÒÄmPÄÇÇSΜÎýËiÖøÁýŸ~úÉ"mÛ¶Uã\?ÞÓZ¿Å ÒxÓÛÇqa#ŒÿÐÝ»w©~ýújþÊ‘#‡CÅYëwkãÊ¡B3 ‘ÑÆŠÖdçf-—‹Ž÷ïßwQIŽsçÎ2dHªÄ¾¾¾´`ÁêÞ½{ªgé¹qûömU_óæÍ³œF»3ºlaQ¼\¿~žxâ Ú´iS vhU@¥HèäÅ—_~IË–-£]»v9•Ó‘±œ–ñg‹Ÿ˜˜‹¿0>gÎêÝ»wŠñž–úÁÁÄ5vaÇH¼Xâ7$$„ @[·nMñ¸V­Z´téRªS§NŠû–.,õ»­qe©ŒÌºgÄþÉT!ŽHHH ÄÄÄ é¬ä@Ðl¿ûî;ŠŽŽ¦?ÿüSM0ãÆ£®]»º”)½hÑ"*Uªiu»´7–-b E1V† ¦t›6mèÕW_%¦·ß~[Y`yäi0€0V/ñÑ&j×®ÊÓ¦MS&ïÓ§O« ùÔ©ST±bE9r$Õ¬Y“Êñv¤+W®¨A ³O‘"EÔý_~ùEñÜ­[7ªT©]¾|Y]cbjذ!5jÔˆnݺEo½õ¡­}ì–ÊéСƒÊ§¥CþÊñŽ2p ¬fïÝ»g*KKëŽcF-œ0%w•óçÏÓÊ•+©lÙ²´nÝ:BßöïߟæÏŸ¯ú‚ÈÙñ¾ùöÛo©iÓ¦T´hQ5¶^|ñE‚¹.1v0æ° °6>ÌûØ‘±Œ<iùÃÃéOŸ>ª~~~Ô³gO5¾ñÜ?Z>ÍëÄoHûÀ Ê|¼›×ïhûôõ¦çÜÇqZñ:tèZ –)S†vìØAÏ>û,õèу֮]KÍš5£3gΨÅjPP¯X°juÁ ©aóq`o\iedöшc¿ŸLפÑ1øñg@Ú$ƒ‰qïÞ½Óæ_ýE‘‘‘jÒ‚ÏfDLšsçÎ%LÊЯo߾ʇýôÓOÓúõë•Y¯jÕªÊ_‰ é'L˜@kÖ¬¡£G*K›€¡í@xÔd­”“ûßÿMùóç§Ã‡«ù ua"„ù fs¬ö4 E›]}4êŠÒÝcåàÁƒ ʺuëªñ¨áܪU+ÂtìØ1§ÆG¾|ùèµ×^£Ò¥K+ÍdÅŠ&×J½zõT¿æÎ›7nls|`¦'GƲ–¿-´¾BL˜hÆè?þ XöïßOæühù´z5£Ž´ÑBZ[A¹tKei hÆ0gj„Iò£>R¼€0Ñ 
phylip-3.697/doc/images/InputTree.png0000755004732000473200000013336212406201357017217 0ustar joefelsenst_g
[Image: InputTree.png]
Ô„.‡@¾7ß|SzX’f׎•4¶p¨®¦œýIƒL6ùm8¢g|vá8/Ç‘8,†ÞsÏ=æE5Öä”ß5ñ™Ä G â'§ùÉ ó¶ÊiRÙaF¼Ò!Ñ‘ @Yýío«4\qÅ u9Dˈ˜!³NNÓ?1ôfÉ…A1“4Ó Jž!;1–~¶‚¨Möj­Ø…^Èkšª°‹>¼‹g$ŸàŠ4’0K ùbtHÕ¦…­Ý`RH'ÁA;†¡—]v™éaï„å>æ,WBЋË+M„`BÉiš#ÃÉi;lÍj»Î’Lk—Òø¦Ÿéøƒ½B°ÆÈÝeìçÑ5ºº=Ép”±;ìC° ¡àLt¨2¶U€*æt,q X;H&ª²„DbFò IX¨“úÎÑdê;½EÈ$>ôÊAsúŽC¥=dR×àÄÀ&×qŠÿ«_ý*¼úÛùô$$ÂáÞ”R™ 5Í%ÔÉ7yÜBÍi4,ÂqVˆ[¸ôÌ"j)Á?\*ÓÒPqy.Iã[=N„ãt[]\&ä¤Ù¥óh‘Ò|ZâjÖFjZzÒøa^ø¨–;YøXjÔ¨Qf&P„¦˜: ÒÊÌU©VצÑ|qJéuøý¸1‚ À¹ÄÐnWKÓP Ÿ¯…Á”ËBXåKÓƒy .âMóu¾#P¸ÀI¦4І} ¾fgà ͈žVÀ®©-I\í¢„¯g¸Å‡[5¹m„Ë«¸"a릪-+×Öñ‰:Žå0 ‰4—žqì–oi1.Îmž–~®o åú.lü`Oˆ_PÁÓZ˜4~)·qS*q…1ä‡^F§¥'oEpñ7SpÍ ?Šùò-°ÑŒ<øfß^;Qì¾Nâ\‡µ5.¥hÖ–hšB>dãVPî7dCaL‰¶ùíÙ=ÊÑå`i´#ü*ž†N‡ÐÔë°0Â¥,Ü*¦MM2]•X€a&š M4器$ÛÀü(6Ó˜Upó5N? WÂ3ÜdÊÂ+ý+ƒnîä¥ÉŽT:|±OCuæbîã±_–{çò¬æ A@3l ì„Ð~1}M³.]%ƒùØ!m¶å!m,)…ÌêB©¡ÒÄô*Ï2­£|Œaš%ÌŽo„´2ó“ÔYB<‡qŽzÊ_èlêx(q8籇ŸK©åv^Ê—~Û4… b™?E¬‚¦i6>uƒ{ 8äoiÔÂ:áäs¦Ü«FåçPz‚H¯“X€ÓL…&šòŒk rµùQ£éJ šTnÎæ†J3¸Rš/(¨bˆÑä…•Ž€Fœ&žå8F™Ü-tèak'Ïjf™Ô`D>ö:MØ:Q³_\ŒRS˜H$ÚHØ!UÀ0=¡ª4¾É€?·Êò¤ï“Áz7BZì [â\ÇzŠfŽ^'Í !Ëâ”TFˆX!ľPR¦ãVA ÷C–}ó·4šC•{99àæ@–Ô§NJA¥þÓë<óÌ3ÜLÈi:¤ÄœÃT(ELy&j`œ‹kó£l{°JC\!\½C/‹ý.†g¤È­ÌLw¤‡•_ºa.ÆfñƒÖû°¹VYJZTÍÍ2i˜h ¦)û™økÄ~¨GØsÜŸVüq E𵳑Ú!•X$=–à4¾ `Æ…M3Ï<“Ή-7ñ;¾ÒboÄQ¯c(DˆÜ½³` í!*8³E&̶C§Ù^ŒD—Ï«ÝFZBùÄë2e€¶vfΜÉ~;kM±eÌÄíý´Âl$pú+bÖ–ÉMyâE·`ÅÒ±ÓMÕiÅ•6· ÌÒ¡rš€.Ä Ž‘6 ÌPC9(A5·väWž>}:c>k7ÜpCmw!Ö¢jnÕ6¢ŸëSÇñ æ‚iö…ÃPqû¡üX°-¤+ìèrr_÷jËMçc#5žéLã‡1rZaŸ}öÁe,äÓÁwd#¤Åö:ÌìQìÂ<ö”`LnÄí!†2œÑdÕe‹åòˆUÐP278ß*RgîÝ· "ÀÖvq˜”`˜‹ñ2wã³²ÓN;Eba£±££±‹˜ M„.Q’¬z±²×ÚæG™Çp&‚+ëuØpeGf¬Mgº1ªJ§<2Ub®CjYg‹÷²%©æiö:#`r”€4Gì`‡4¢6÷k©‰éA[??GâÝcÜi öu‚^'Ñá•ØâÃÏa…bBq6Ãvbm/F~€ð5´BÈ0-ñË€–ê õ;íÄ`ÄMcAóÊZ ã-V“XmS¯“XØÒL…Æ5ÃIÔ€¥º´[Ûü(ó•ã?ž] Œ©,&°Èn ­ cI‡î–Õ6††²°i•ŽN1ÌÁQ£9ÿFGef1‹¯æ!PÌùèál>lB1æCa ç£¿ÇiL,[d‡4Ô™F§ÙH íeœšI IDAT&¦…iü´¸"|&ÊÚé©¿¬h[¢9¬rÞš™;³]Æ/¡¼â¨]q« iVš±o©€,n´ÈÒ¨BùÓ(š3–×O(h¬iˆõšX€M…"ŸhÊ3®¡-Í’‘SN9…ZÉœ†ömWz e.cdnÇ„Œ#0ÃJÇÚ °°ÈÆüs= B™säYÍC4Üžiö:i¨FZgŽ?H3ÎÙ!Mky,ai6RC;¤iéIã‡úAõŽ;î9Ð0ÛØ)g1@¹/{%ØmbBÌî&C †Pq´öìbR\I!ÝMEÖ¾ÎùífK4n…‚ÈñvŽZ’ ³e¿㩸UPóM# °4š¦ÊùŽ@1Ä 0û%4è‘uùQÄ5´ªùÑHJ¨Jï¼óSŠø'™lhQa9¼£ ‘JÇLôIÌ*äË<ƒ%;æLEVsÐÃd\ܾpG°CJ KÇL™ùE`,§WòHa`­‹ù}ç°%jýŠÙCÔïÁªy…¿7|Í“.ÀÒhžš]ÌhñÌ  H 
‘ºƒ¶4V§,HÜÊ'§B󣑴Q•XŒ0õÊ(äG*]džY œ«¶$…aã(…¾!MNí çÎE¨!¤Kk‡”VØVüÂXœ®â¶‘EÀPr[¢—_~9G2ŠH‘u ±òY<Ë#-ø;•h±gØJžYΖ\§+tDØnÁ%z9Óh%ª°¯S·í!kçÐGEãÁÏóÅÅœã8Ž€#Ð¥(v®C—ã¶D»T‰ñÌ:Ž€#P •çToWLxÎi°iÖtöýë§èÜ–h1zXGÀpÊä;Z”azðêO}.*“nnK´EHº°#à8e@婽þQL&5×±ë>EÐ qÁ”:·%Z ¼Ö0JnÔ4·QB¢|ä”h4Ÿ–Çò\ÇçD‰ÓLˆ¦ñ¥$G\‘XâzòQÕ_‹ëÐë0¹‰÷:nK´þØž¤ÖC€ã3|íãS3>7é°ÆÈ¥/¥26Êw‹t¥[«wÜqG{m=¢T&DÓLm¦ñÓr”&O£Ï}¦bà¢3 ¦Yp.æbM¾Â‰/Nã[@ˆ´¸B™4=ù„å7í<¦EûlWÌ8ÜvÀ._G‡ìø€™›þ¸qz¦Vrì! ×}cTª°(¸ãÖ[o‡MãÇ%ã„`\‡IIÈ)M™ä&´’hãÞ‹Üj,m÷ÜsÏ„ J¢9·’wÛäñå#Pq}ôQúx𠽦ñ#Áí5Mž{íN:é$Äha¸GŽK}$­ÁIã[Diq…2izò « r¸YŽ›ÎÙõàòI4Ãä2X¸Ÿ…(ö>ŒS$QÀ8­2×am;ÛÑ–h¢ù­¶Ú ƒ²Y`—î0bôÐ~Ä”a¢IG ë„#A ^ðH,EmcTÉcø|ÔQG™ gKs¼îrÈ!L³$À$`“M6áZ­ÄôSSèÀ¸G‘êÏ iÜ‘CEãFç?üᦿ$&D훺ƻô:ÒŸÆ·Ø#D¢|+™MŒ‹ô„¦E¹)Ñ8rZØHvx¥¿a–ÖÁM‹–`_‡ASxšÚÜ éHãЈÓJ¶D‰1Ñ|!W½rk“¢Æ:!%54zHš-~Ä”!×Á`ÕÔ”;ÑÕˆ<ˆ—"ÊmÛþô:Ü‹#0ö‹$Ö.Ͼá†$C÷ÃM‰¬"ÆÓ5…,`ï‡kÐÎ?ÿ|ª 3†K.¹„aײIC©LˆJ[š©Í4¾BÅŸ¡|ë™MLs¢iÑx ÃÆÅ˜ët|Ó¢­2×±^‡)G¯ÓJ¶Ds˜/ŒüHŒ,¸-Š+¡"WHÅùfÊ0ѤcD­¿:¹H,EŒRqm`TicDÈ×t Üm©M¬;ìdp‘ð¬Y³›DDì¶Ñ<øàƒD=t ¤9~G÷ÊøÉéĂ׾†A©²8©L± ƒùmщu/6½7ÜpC!ž~¼ÂƒÑÔµgžy¦Ö8'’˜´°f4ßL‹¦¥-4-šf¢4-¬ý:9rÄlmÓ¢œœ® ?ô:çÿøÛ‘Œåù:eÊlÙæÎ-Ъ¶D#æ I$g"Èp'LpÄè¡y¥ñ%7éhp"$ûÜÆ(ã w9ˆßˆ“C¸E^nKÔàJ«ªi| !ÒäÛË–(·Ä*…wÞy'·vERËWG묳×HFøzÕ8ì ´£-ÑêêjºFn;c‰#;88xqû%Iå*œ¬°Åç:lê°‰Ôél‰6k°ÏmŒ†c+§óDÀm‰º-QŠJZóÚÕvÂ,²a=/,`=ô–Ž9æm–„^!o'°%Z’66rBg n0éÜBPBºÃÚ † v£!Nç‰-HĈ-Yi‰X¤¥¦¸-QN¸3MÀFlø&¢VUÓø‰J`†òíkKôꫯf¦Å ß\Rl©å#&:› E5f"Á\§ØþÉA‰©Ï“IU‰Ïu¬×Á+G¯Ó1m‰æ0Øç6Fó,.–D[œŒRqnK»¥†žÙð5NH¤UÕ4~6¤#ò:õkWçµ±-Q¶'pï½÷wE["øÃbƒ•54ã$Æ–èªÜ\·mÖfFaN½NZØÜ½N´%šÁ¾0³nc4DÃé<p[¢Å Òý÷ß‘†^V¿âÀ¦UÕ4~\ƒ8qùöµ%zÆg0¦5XÒÃ4%×É`CrêÔ©|û‚YlTr×øÙgŸ}Úi§Åsä¶D³˜0v‹|dHu@[¢yìKœ½ñQ.›fncÔ~_'BØSåՌئ•·%‚–ƒN«ªiü4U‰òÁ–èž{îyØa‡ÑHn¾ùæë®»®{峿^”“Äu[¢Ów;81y2Õ3§It(Il£áw@[¢iûBC„nc4Ï‚áb†€Û5( Ü–(+x8޳íGc(ˆ˜Çˆ`ˆY%‡W>·g…MŽþÓy»îº«Äò|v8[¢ì\¹-ÑM6Ù$Þ;šB·1 8îòD ~rÚm‰Ò8º-ÑHaÍKhKt›m¶áZKº–õÖ[oòäÉñ"ÇÚßëÄùp:‚-ÑúK%ÈOˆ>üðÃR)l=¡ë<{bïaãÊ{ƒ¡ .¤0iýšã}܈ƒåW– sìÒù³E¦!@Ä„"³Kó Ç-²]·‡hçRB¹m††’¢sË·(…qåÎ)K 9—–HE@Æm‰_$hÊ;…-Q6Âí³žâsÝa5T±¯3²ˆÔ¹-Ñ"Àó Ž@‰p[¢‰€º-ÑDXÚ‹Yì\§äév[¢%‡ÔvÜ–hù¡;{6«Šü^‡£‡nK´³O¿#à8m†@±s·%Úf?•Gä8Ž@ 
PKì€éºžœ·áÄŽl‰–@žGÀp"PìÓ$…^'¼úSŸ‹rhG¶DK˜VWå8Ž€#ÐÙ¨Ì\•z't>yã<8½Ž]÷)B'§Õñ<õÔSùè)X†ãÿ«/8¸t:2n?4þëð‘P„Ég.|áÏ*K„Ÿöš£Ñ(¹ýЖ¦-’f¾Há{R¾ƒ‰ð;õk±sz&7ñ^§ml‰ºõÏN]øÊ,ñÜ–Ä7j|Ôiù:÷Üs“=óÌ3Æi)ñÇ?þñÕW_mi¨DyÙšäs¾cÛl³Í.¾øb»”%Q¾TÌRÙå&æïÿû¤ÿðÃÓöÆopS_ìs„;ªC¯D:­Ñ 3Ã2 6Ӹ͌ΦśƷ€ù§#vBùòÐCÝzë­ùË_bÌí±Ç 5‡4Ÿ"ñá_¯³5Š¢)/QäÝíkK4Í`øE®[ÿ ÑpºõàJ’¾}ûÞ~ûí÷šð¹ôÓO?mœ–¥µzÛm·‘:’„d®öjiz ×m#Œa*ÃË\æ¿û^tؤþ2×EóÅRè§ÓÖ°šÚâvBO9årÊw÷Ê¿Zdà 4”,ÁIòFm”øƒìÉôk‹-¶À*å7ÞÈå+ð‰‘î–ë{ßûK:’$Á´z\BÚ®¼òÊ41²Àí$¤Ê3*Ÿ9sæ·¿ým®m.¹ýP¬[±ULâ•<=YzzñÅwÜqG^ ðï|ÇðÅB:±Ñg¦•à‰$íµ×^7ÜpƒB%Æ+±xzˆr¤-´JD;¡ü²Ç{,+IÒɸED~É5Ó~McBä(Š¡X»Ð¥éuÂÓжàMǘ–±Ú M†Ñ¹õÏ §[–€hA¸mŒˆn¹å»……Z@CÆØˆÑ:“Œ—^z ÌerÓùöÛo¥Zvd`þô§?e±/‚³gi5(^_è<æ+SwÜqë*4‘zmö¹Ûn»¡™ïíŒ6edÍeûôa,f°¯@–(“tb•Ž“±öùçŸÏ]–'tÒ%—\€[•Œ /¼ž’¾ö…^`¾%fÙŠ¥_Äõ×_M(Xß¡Ë7ß´¢híH”¦×±n&BÐQ§EžKeK4b 0D“JȵiÌ@e© /³NH}òÉ'—1Ib=7´`jpÚÈV“˜ÊÐþÒjS,i‚–y-ã©§žÊ„€æ€³]ò:î¸ãØi€CXÄ ?ñÄ“&MbäÎ܈ûÎm¯/ûï¿?݆ú*†º‘ ÜÉf†ÚJt‰”¼òÊ+ô$€éóƒD1EA?Á\g»í¶cbǺCo&@4šv?v0ét'NœH‚yäBåÐf54wúÍ— –И2žwÞyùŸ]Š4¬Î¡Ð®Ë+•ýд´Se!HÜN¨ú&‘ GxòãÒXY–`ÏwÚi'Ìð0(a c^iEÑÚ‘(ö+Q:›Ƴ‘»×)‰-Ѹ)Àx2BŽÝFšÃ‚a(ï´#Ð"èTXô`Ú±Á°f¥° Zú°Ý­W¦2¬_‰f•F7ÞB°ãÂV]‚˜á3^_ØC¦ýböpÈ!‡°JÖ"³+”Žr±»“X˜1PyYÅä»ÒJ+‰ Ý*t¤M`Ô/"í‡JIäÉL .ÓÂR@Úe©a¨x£¡Q)«”úÉè„âw‡ò¤›M[šPÝ}ÌàƒµJ¿, €ä./Ê•ÎPq:§O<Ñd‹¢ù¶#QuΜ̤"âgLÇX,RÈL^ØkH0˜bVˆmpºhA¡—Ño½õ ܼR»¨!qK‰¦-¸‰ó­4Ê G öÛo?¦̰)ÞœF†ŒSRj”YgKìW—$Ís^­è¦Õ—ƒ:ˆ)µ ‚¹‹ÅØ,1eÊf0ÌÌX‹W®Q£F¡“éƒh©*I}i¶F7›ì¸ é8¼.¸àЈ˄œÄFƒ–-7~]gaJÅBÍ䉼ÒÖè0 VéÞtØ/äwû¡ii Ó™›&_ü¹eÚ×·€^§*Óg»HçߢWö²¬uL[¢–·þiP8Ѿä¹G¶%®Mf²`&Gùà†#³ ü-_°ÛÆÌ‡`NÓ6{ÈO "4M–Œ,ñ¡@Kµ…aE³yÌÐ>Î/ ‡1cÆÄUÌIK[þ K›žüãmUɪS{ý£®ˆÜ–hàyPG ¸ýÐDÙêHä;³ÝhÁ¡—¶I«Ûmœ=–ò@€"\yäÅsÑE¨:§z»bNNs-<~f¨ñÕ´ÑN8Ž€#à8B Ø¹]]"hr†‡£,ÞñD`ñWGÀpJöuŠAãì˜ñ€9ÎÞ¸-Ñb õ°Ž€#à”1ÅÞÃ4ô:P1§ÏEeÒ/Ê;Ïš#à8Ž@K¨d_§¥aByÍu"—~Òñ%—:žüïã ÕæOó)GÚó—wIG “"àvEÃ.­â§ñð!CžÏ€ÌîCš–"½¦ñå˧KÜJ@›™VÌ´ôðéO‹ì¥æˆ¢Ý½Šë€ “›x¯ã¶DÛý§õ´%ìbòiŽïÿùEô¥—^ZÂ4”Ö®(÷¶YÚî½÷^™¨1N+¥²+šf4Ÿ–4y:’Û%.èrÅ87®šñžHÂÒÒ“MRn1àö6>Šâ½ÓN;Mša–ܾm$Í…¼y#NÇ·%J¿˜Ãq7èÅÒøqIç8!믿>“’S*š2©ÛqŠWÈg• ¹zYªøÐ›;rŠWÛ¬†Þ46à P½r¯}<ÉzMã‡aC:Mž;Ù¸Ö Iv©¹@ÛŒŠWnÛ3Mã‡qÙ•6wÞyçꫯz–ž|l’ò›bc‰)¶$Øõ £B-Ì’Û·µÔŠ(àn‚V™ëè®':^ )Kë ã¶‘Ä2æI09ÅÝ ÜÔm3ÜDs‡\*å[¢Lãv¹a‰«Þ$YúKî~î¹ç¸­–ÊÀÄÖÒá“$.™'÷ñ!“˜ ë„#A 
^ðH,EÜçÈÕ[”üýèGR‚X+ÙeX}ÔQGqõN$µñºƒ¦Yc€=SîÈIL›ÙM«øiüHí5QLJnW”éæ/ e\giíŠ&¦‡{Õò·—JÃ, Ófé®-Á¾hÚQ¶àÆkŽ^'n‘_Ó„4úyZ´_.4 HŒ‰vù±u«®ba|¤þ†«éH³i‹ðIRhß0Ñ¢…uˆ /xÄKås,˜ãÄ¢—­´ž]Qz®ÕÁZ˜ÚĺÃÕ¢¶"DsL‹É*b<ýèi{»¢aÅ3’ÆeB:”gðÎ]y¬‘J€pKµsÌT•YÔóÏ?Ï…ß{hWÔ˜azrØ$5y#˜ë œ§ÙFbìžhßÖ‚´=Qyð÷V+&VŠi|_Çz¼HÓ·(ÉY$HÄ, —çÙmŒ¤$niTq¾Ù7Ìa1¢Ü_4K£T[,L»u݃ÜÖ³+ʈ˭1‘`ƒbR›Xwh¶°´+[/“'Ofꓘ~e¶-íŠF*¾¡Æ7‘ç#X°‘w≠UÀëgMŽi"w“[ðЮ¨˜‘ôÀL³IjJD°ôǦΙgžI§e7V°“hß6¶-_«V}àæºm)8Jõ:iÁs÷:qÛˆÒÓ"˃q³€9ì6¦¥³Y¾[ m"È4»œÌ$бîÏ®;QkÛe·àˆ#Ž`]³ÓJ|bÝáÆR¶”¸Â™4›,5c¾„Ù€-²hayo3»¢ñН4¤ñ-…".¯ OÙª)­]QâÕÌ{pk•²ÛlzhÖ&©”pZaŸ}öÁîƒì¼™æŽfTtù²’%±ES9hô.‰¡ðB Ñ+Í6b(ܬåÁD³€²ÆÈ·Û(#KTÙÿP\i³±D>;U EãVMÃd;íDˆ¼´R„]/lØ\tÑE˜˜¢«$Óüµž]ÑsÏ=—±°µK«;´’´•Ÿ~ú)"“¤´ôGrûµÙÚ#xbÅG>Ÿ¦*Q¾•ìŠZ0†½@³Ój^‰éÁ—–et{© #o„éhFE+§ïv°å¹BM3 $:&¶ÝðÓl#â•¿åÁD³€iv97‚Ñ'ÜÍ;W™åd!‡=D‡ÆC¾|õ,Æ¢b¨Ç鮃@¼à%–"¶èérXf9p0HÓvEÎ_xá…=ö˜êiZÝ¡d7‚u–×H[búóüAó¯Ý9&V|äÓøaÕÕ&Êë'(¹]Ñ9sæ(jÖx˜åP*ôʶ¡èÄôà¥Q DÄ^ªB5ûìpFEOýeE»Xu£Žµž-ÑD»ØfÌÈ0Ý6~ò|š†mÖ8–íHÁ]‘ÄON'¼¸]NFBa¢dÒF`:ZÉh=»¢2{ªX¶Ûn;lqŠN¬;x±é½á†J†g<ý0ÃѭjW4ÒΚá4~G°+ŠYX®¬¤_ÇÀ9Ûc†dhW4-ýi6IM „NN‡1sØ·öZÀÉéý‘œØ IDATL‘ß뤨¶ äˆÛFTÁ-‰åÁ¸ÝFÖÐliAÿ—hi4¯P,€¸Ò’þš†@bÁC8^Š(Š,á†zÚÅ®h¼î„I2:ž~óJ$ZÛ®hb¤0;ˆ]Q>ÇÁ _Z"sðÙ§H´—š#H›yÐ뻯ӑm‰Æí6Ú¹”pL‘fi4¯°Ì™B%N;9H,xÈÇK‘ 5míbW4^w,=!Oè§[Û®h$ŸxnŸ´x¬Ž€#à8åŽ@Qû:SþùÂc¯¼gMØ`ìÄm6µW'GÀpEíëÐ嬼Æ84ré »C½ò†÷:|ýÕpG D ð¹ÎÅ·Ü7wÁ¢¡_Í­XvaÞÂE?ýÍõz[öÌþÛi;rè ïFì´#à8Ž@D ðÓ}>ûÐöåî&Ôê+*»m±Áúµ upj2Kë³ÿqμ¾1SÑ㑇ËGÑX4à츗sG0ÕÃ'Ìñ«S:8%¯æ|$¤Ë® š®…æso]dü4‚/¢¸qÎnÅø,†ÃëXB ™¢ãñææ‡âas¤! ˜–LÀpœ³ÝvkeªƒÓE&˜3Á—sç6ýÍ›ýåÜÙs³_Î7wîüìß<þÌ™7ÞüM+pQ(Ò r•:7ïF¥ýÝèØ´†ÝFî}õÕWK’o’G»Ì'#|‹ƒñ‚‹/¾8íŽÄ’DgJÒª¹ äIpñ(·X’~>Ôƒäcg3”O³×IÇPrû¡iiNKC˜Î´ô`®åÐCÅê×{s™4·…¡Dë·î°vE«¸‡md<Õùpæ/ª«¨¨Ív\Ìpêë3 Ls+ø8–WÞêñÏG¥džzê©Ä±Fþ\Òh/0qW¦b§ÏèÙ³g{¥$/v øX’Šùì³ÏbÁš»R®¿þú¸XÇäp2w•þùÏæÎž0…4Á\ÉÌåŠ\“Ê•3\í³ë®»†{ZúZ“öšk1¹bNßðžxâ‰\­ÍÝš\ÕȵÍÛo¿=-;aÓâMã‡Ñ¥É¤¥! 
›–ž³Ï>›Ë“¸<”«/‘ç× C}Ùe—ÑIckuâĉ?øÁèŸäÕÊgásÎj+ëêêkë–Öeÿêêù£÷iÀ,g]öµ±é/SoHÄà‹› #,Lb]‘kn¹ ׿óŠçŒ‡@»q¤>±„·žýPáŇîÜL|Ì1ÇpÁ%´ÀL4 J‚wß}wÆËÔ¾+¯¼2M¬Íì‡Ò‚sC‰W.ôl‘M± BÄ­d?41Í$#1 ðÍ®hŽô°èzì±ÇªË!HåQO¼:¦]ÑJìë„ ÍŸ¦­«ÍÔÔÖW×ÖóÌþ-­¯©«[\[·´®~i=œ,AŸTW_Ï‚DͬùÆÍ†b‘P†«!¸„•AÇý÷ßO¹·Ë¡U9Óhw]&Úm¤²$Ú·M,á­g?4ćÑ©âj}˜Ô2î dk„ûâ°‡Å7.ägÔÏÑÜ­Ioš(“,0Õ`hH@ ¡¢ç¤“NbR…E`LŠ €K¬æñH‘D[h··)t®G‹ìlF…ö:[Ï~h$ÒÈk˜¼Ì®hZz¸¦»ìØ”âQVJO>ùä´¹Úø;¦]ÑÂç:¬¥Í«®[²´~qMíâˆÚ%µÙ×%Kë²]Q¶j ï¡ãYRÇUÝÉVvâfC#¿ÊqÇÇ(ËÐÜ Ê¤2â믎@‡B€Ršh·‘¢›fß6RÂ[Õ~hˆ÷zq¥?Ñѵ<ù䓨jã*^†Æ˜ EŒc„žƒÛš™î06O“¶´fAtžv6##ö:[Ï~h$Þð5’¼Ì®hZz°nŽØ=÷ÜÃ@œ'³U~»P§ÑÙ®hÕ9s2“,¥-!Ø¿YZS]טÎh"Ã#ÿ5à‘ýz'ËÔvNCeMý΋-;€Á p4¾Ì©E'ÚÝ31'‚@¢ÝÆDJp¤„·¶ýPC‰ElŽQ;"n”óQ¬áÐ …òq1ù¶™ýPKLHäig3 Ò–öCÃxC:ž†Ð—UM^ãöLÀŸ4i¦çpì`±wˆ …0¬èŽlW´ðïujë¾ZZ×­¡¶¢©Év8MÙeÆÇFN7eÞ‡n¨[·Úº›;YQ3,¨pþt:)‰vÓltÆó(ÉVµªH§L™Â †™} ÝOÄ<î¨Q£XŸaºc‡¶¹^:.OnŽUó’hS\ôŽùØÙ´„%Úë¤Kk«ëR°Å –åt˜»`¶ë‰'žØcåŸE§Ù茧¹ 쇲zsûí·³âÏ" Gì ƒÂdµãr²÷ƒ¹D±xú9ñj^Œ¶Hiv6;‚ýÐHRí5Íf¨ÙM³gŠùp¶Ðn»í6¬q„š+Æ9¢fjó!Ëg>K(Sø¾=ÉÒšÅuÕ‹k«ÕU/ª]Òôl" —.Ér²œêÅËá¢)§LSÍÖXc Γ°½õöwG "p饗Þzë­¡MwzöØù2‘ˆ3H9N÷²ÙÉN Co¾Ô¡uÖ!%î“1bÄa‡Ætꢋ.z饗Àë;ûì³'}q‚Šïc°“›aûöíK»sÍ5×°©@û… ÜwÞ9uêTÆ×œ<橘wÜq P¤lz'Š%ÆgÆ«yÚ8çM Ï:ë, EB$Etä‘G²ù4aÂvÈþþ÷¿ìi8S΋cº[Žùôp"©œ–fºsÆgØ\'-Þ4~˜ý4™´4ü⿸馛r§çœsÎaÁ–_вÄÄœ€cÌ‡Ž—Ï|B•P&»ƒëóüí–*=úì+¾œ¿˜Õµ¬Š$§7­» ÐçŠÓŽH1æb¶ÈP…kZ¯Œø«#Pf̘1ƒÅ%ÚÜÜùbËrçn#bsæÌQMaçß쿱“´ãŽ;}ô×õ‹³dœ‡ækHØf_éè“"F䨾&F,ZðD1ó¹«yKµÅõ³¤Ï.ÍȘ1cBßÂpàtœïÓÒŸþUÛ?a®ÛžþôÓO¹‚uN†GLÅxât°‚ôð±Z‹£TÓÓŸ|Í#Í”þy8툽»Udø±å¤WOÅJÄ8Ò«Ï1ˆºÜ†£Òþît~ò´Ñiú-ß68³.‡æ¯ý+s“)Ìn&ÁéM‰ñv-QÌäãDîjÞRmqý´6‰v6 Ã!ÒuÅ£kcNZzÒømœ¼Â¢«b_‡ƒÍ¸Q#†ʃ8Ž@ipû¡‰xúw剰tfásŽzOƒ#ÐÅ`ß×ÅAðìw. ÿ^Gùä ŸÅóÌ•q¦sGÀpº8ÅÎuèr8W‘ë;áxÇÅ_GÀp ÿ^GØq>Ý<'„®¦¦†«¶ùÊÚñuGÀpB ÿ^Ç´pD“ ¡ƒÃÑC–›½ã1”œpGÀJîa+Æ1×Ñ éðÉì‡ ÖÔñhµ­°(¸ÓC ñ°iü¸¤sN@u¾²àS|îþÉóâ&nÅNÆ(ƒn©ûb-ÂÌ¡'"YXXî„&_4Œmþj;×Üp–#š¹w›óì-·ÜR úi†Óø–+'¶G ƒÛÅv &ÝÚ–ÄÛËh>F<‹1Š~®Ýä–®U½á†,ï\ÄXܾæeD7jé,Á÷:ÜM@ßP˜ã†%>“~ø¡o¿þ²;®>ÿÑ{o~ðág'_ý‡î¿Û ¸mBW0¦Ó~W_}uù”Æ1ù°;@zÅÌæ| ™3q«÷g‹Ï+Ía˜N=Óô„2Å„å“y©â’!n¬1µL°¸9MŽÓ9™—”n!b øÈ#Ð?Ñ1ãÕ…Êb, Áõ£ Æ\“Ãlê“Mò‹Ópð†'1ÒÝ”f®3æ{u3_:jìŒAߟòî7ëFØ¿ö¶nH%š´¾1Ñ‚aš!Å4>¦ ÿô§?aD—Jýú׿æn"ŽÕa,NQcœŠÛ™HÙæ¾& ¸BóÓrÄî”SNyî¹ç¸°–‹&¢*-y4Üë€á[Mê±A*I …ETîîåf­ýèGÄ âC¸É#n_5¢! 
âtÇA SØ®x5IÙ‹›îÝj«­^|ñE!L-c™#¸ÕD:ÖŠ&‰ÛäÂï+¨‰èä¸L£2tÕN{&ßÌap³¥F<Ãx[Ö €¢„©b‘ ë-¦–ånÂq“ Fq¸Mμ"ýMÇ4Ig1¯•çŒ/&xöŽVÕú¯¼q÷-O¸ïÞNÿøËÅóß»òç=ùì+t} œ£×I4&˜fH1ϯä"QFÜZÈÍíÚbު˹qïwÞÁÔ#føÉ ”€ô7Ø0§.‘‹ˆªÄäÅ™ê´LI\0‚™E®´:í´Óˆ=$üH}Rľj\CÄ鎃@g±%¯&`HÙ£ôFL÷Òzra³>k¬µþCý¦<­×I4MÈ ˆ+â±YÄhçwfjBÏO®ÒøÊ0¿"×îêÂ(f#ƒÄ³¸}Õ¸†x(çt:‹-Ñx5zæi¦·ê³ÿþûóý¸LÎ3´?üðÃ#ªÚ×h܈g$y9^ãaͨB±-„c̓-†ˆž»îº‹ž›Ç#|{íÈ@-‘ÅU§¾^à=lŠ›N…6q•ýõ‘Ó>é¶wßÙï<ó—>›1㓇ŽWZ¯“hÁPSø]~i|¥!üL•9>Å‚?9‹`ì°Ã”~Ž£pA:‹ÂÔl"±’€X>S•˜¼Df¨$M€ãTT&aܾN ƒÄéDûª-Ò×éœ6C SØWá1lšZZp«>l½PÎ1sÈ!‡ÐþÆÍ:´£ÐÜF<Ó²,~>a±’€0 ØE£Ù1#È :O³æaœx\Ùh<µs훈ù K™ã×¹ÿΙg»Oüó=מ|úi‹W¿óîÿoï¼ã­*®x¹bT‚ØÅ Ø£ˆˆ¢€Ø’(ëQŒ£‰ Ø° Ñر<1bok,DQƒX°‹QA‚(X¢A}¾/,3Ž»²÷)÷žßþãÞÙ³gÖÌüö>³fÖ̬ß;ë¬ÓG$ˆ¬ß%¯!À`ÈœƒÕE ÁÌuÈå4V\|@2dï‡vú €#;BT·nÝ`Ô €VeP< A…ew¥øÒ"«é ‰K¥ Ôp¢ ùÍ£pd¹~ñ"‘‘üÄ W ÕÏ%ù3I@5ažB9jæµ<³Ãû2jÔ(~þ°yäÓ3T„4'‰g žþmAyYN†Í¬+z¢ â}™p5€ªšæ6ƒuʘtnÛ|ÿö;éÔÓ6ìÔ¹]ûv½zõ‚—Gq}k$™`‘b\| å0²bÏÂ#ß[ø±wYŒl 1lÀûÓyçÇÎ{„ÅÙìQ‘Õ‹Œ$£™€ý¨~xXHÌ? áé¡íú[¶3ð±ÇLwFJ £®*D ’«±J¸D±=€XÜÏ$LìÕðXóSb­‘=Hùg§{e€‘ÝᇖlšŒx>r~ -Mà·à2²àÄúF¥\6–µßƒH’aÀ`#\dYNH9~‚„p\^–¾€Ô2¢W,Àt9 è9l¯ÀªŸ0Ñq)ÃÈ*œ¬áÄôJµsš½ÂG;Ó3Ïu´š{Wù@úæ›oZì"X/ØhÇÈoìØ±Nû‰0÷±€Ê¦!ðÙ£89.#Bóî³Ï> dM]F¬+ßtÓMN,]"Ó; û.&àó`çE >ù£ $®Èm;§ëRž×IP*AÎ ˆ¬ouqËÇän] .Þ%°/Ø(ÂÀ£È[~E )ÃÕCH82 $œ€"0ëùd±GöËgôçZᲄ%¸G 4ŠaªÊYÛ¸OÝ}þ»ï¾û2Žv2Y²vƒ"Ål€Ñföpºd‘¢ÜS?€aÒ!œOö@Å| ŒS±FàkÀ$¨|àiÜ-r˜Å=-g<íâHT{å,´²e¡u‚öÖÀX ç-À˜j°·’Ã+|š6qưÆ8 ëÜcœ(†QGqDŠqñìa¾Å@‚À-S+ÇÌxÄm¸z‘‘!á\á"Yü¢™†»V¸ø°÷H†‚@ù¹DùI²’iÊÔ©SYït@¹,çwS²Ëå9³‡IN]^ØúÅû1nè ´Ë- ·N1†@“:,l'Ô.B¤`“7–‡<û¦j¨°êPYèÜ™4pȦ²Õ—^µ WU1iøè£°Áb?Dã2ßà/·È$À_¦bL¹˜œ°p0rÜãiç:iª«¼aØ:ŽTŒˆC€…"®¸§Œ¯ÚŠUm¤=¯Ãn1·UÌÇT”n> ! „€!v®ƒÊq_¦ìá™>}ºD! „€028¯ƒñnùúþ°ýŒM/¬[²aT( ! 
„€ðHë‡ Yh‘ÜÅ:û^8ž"Åãc­°B@Ôã‡-ÍÅþ´‡„ý ÄÎiS< \¢\€Û¿ïJSòr>&¥eUŽ@¶¼¢ƒ†ß`EZÍyÕHŸ ÅU†c8tÏ3/Çæ8ß™8Ž“42žMYx%À´)Ê"SòŠÒ.ÎÃâû8¡ˆ†ø(í\­ÃÌÆW9„Ñ:_|ñ•W^™Ì%ÊI]`àg¢k×®Åa‡oWœ-²A>ìÔ¶8Ê%ŠC€UÌõ—_œµâ´–…¡Ó-NZd® )tqŽÀnÈ!8æÀ‘Ldq%Ä .]vú"èú!HÅå<=ð àHâ$Çq‰2lÅ­ ^ð1fÌ—=.•É d«ž¶]z?Ç+ê§4s{mäbÿ±{DCðìгgOÈ#p-ÆY+÷ÈpÇÀëãàT»víŒB…GD"Š£Á.ÍabÀB»‹©‚@:8¸–a‡ƒÊþ…2à4#©?üÐùž  å­à0Ÿƒ ÇþÉ@ƒRÂŒÉôT”-·ÜÒ÷aA|ùÎAT±ì;åd(ãh„˜OÏB¥ù¿ÁBóZzßûNq,~ÜQ9ƹ"N~“¥Åq‰â#1’“4.ådï‚Õl| Gr¡r>×U’Sº‘ÃUó†å'€p’æ8 (Wÿ©…̓*Ë=Z„o‚’Ìu°­ÁpƒfÐ@‘ªÌò•ðýá&„9,„:8í@±ã8ϲ„É ,œøYbõMY„"…@5 IÇ™ç,•'^”#_&¥DÛY{áˆ6òH èËðìg.,æz ²>‘̶ß`\F\–0ÿ `´Žù ¼mÛ¶½üòËÝ¡aÎÜ8iô H€.Äe·f˜|9¨ˆÁêšm·Ý6 pë@wÍ@Ù¼h,óHsÎK—Ò¿ÿñãÇ“1.>O.TŒ@V:³:ŸWÔç TÏÝò*qDé$$tqd¡¶è˜³mà7—‘¶ $hîÞi)3 ŠCš[Ibi' ™ã¤9–R¿!„ñKCP±(H¬LX½ ân}>PFëÔ £¨%Æhf ?qñùs¡2B™xEÜ£8TåuÀEäÞNöXÂP„QÜ”Œ9Ò= 7‡¹Nƒã-É\ÇiR^ñ«†oiæ§8}饗ðÍȈuèÂEaøÈ “Y8ïD1B  `òôÓO3xg¸Í@Õøm´rùœ³ ºtèmø½8ÎÜœô>بQ,HðscDYKf¶õƒ ÑÌuP üÒ±Aò†b 3Ű'9Ašc)õ[A;!F3Ư¸ôfã; "o| XÉHfs>ôKOt ìm¬Lãm:aÃÔ2Vѯ¨Ï=Êh›i3!ÈeµµL¡<æ/Î[ùZìQàoåmzîæu#M)ä¥â¦á| ZÇOŒVÇ‹(C ‹dƒ›…1 Y 8^ Ë«¿B ªÈ“s6B7Ìæ‰5;™¾3€#è>}ú0W ÷„tŠÁþàÁƒ-ï$4’ÙÖ‰Šk 0`X2:´#á@G–œ -Ήð‰'žH+hÂСC™Qaă¯ÓU/2æeàKJÌ’x #€²1Gˆqñ¤É“ •Š‘8Ì+J¤]¼5踨9ó¼sÎ9‡Ý¿æY•†É š‰½ÿËôãÿÊ=šÖ7ó;hïÉ¡Â#¸Û¸o—7Ívû:±³Eú£uÙãæO.B ª`)…úØÒ=úFå"]⌱ö» 7´ÝMî³cóL¦ï Â*È1ÇÀOd}ü,>³-ñV™¸†øs†ä"¤±Ž­Œ"˜2¢ hK²Ö‰ä¥ggE™nÇ–…fÍš$2ãâyD—?*ª=À+Æ¥Â+€›Ž…6D–ŒÏ[_8 1ljˆtIh)¡"‘¬ëPoÞAäÅ#÷SIh^þL‹qÄ õHT˜ÂtœyrÎÆQèb¤¢—da<À™KïÆÐͧæäÄeàȪCf6‘õ1m*–@ˆ²+¼û &d̉XrÒØM‡ò¶² ltö’Ȇ“ŒùDþœ¤¬`I‹ä*µaÅçBõ[Ç+Ê:Ÿ33ºôÌfÐ.¶Ùåp,¨¬ô°E‚A6È9ä]úä&"L¯x&ƒÊ/9eežVÕm9¶P†fbrÍü†£j2Ó¢ÏÂÇú‡@]B "„wN‡é8©Xžœ³¥à¥>Ø«ù­¡oè•°fGÖ‡Hö—²6êƒÏlK¼ÿ 7Ä2² A€‹‰‡EØ6 +ÉüüÃ’sJ[.òÇ?·2Øg$¤l5v£³â$e¢‰b s•Æq¡þX³ï¿ãõ¹GYýbÌ ¥2FNå¸ìŒºwïŽvÇÖÇ<Ém¡v ØÎi?Æ"m‰3¹-bçtÝ2~k']|NªÐœ ü¢`…ðc"à |.ÅÈ4ŠÕƒn˜Ž“êѹ°˜ï×3ÌKú>?…™XÀÿ-è;ã¨9ù•aG¢V±úØ©jîÊré¿Á@F—,.À¶ J‘’ÉR¨4úDq «Å³u á4áx¦‰¨OâÉ] iЂɼ¢¼\¶°³‰ÜåòÈ_°`S…á"´NZ¦ƒô\¢þ/O6³N_šÂB Jpû£õa%#föŒ£Ð ³y†é;éõ8º(‚[lGh”p|¸>¤‰d¶ ü#3†å»ú ˜’\+Ü#…J ÷q ÷K ‡ÝæéÀ£p<+ÖÉkÏH ;èny¹œPt·@¸Ü@‚z›v7AÎcY U[4DX°XVùã?>M+pj•ÿ ˜4ež7eÃ3¯ú¤Õ:¾,……€¨ ,ÕpeX1Ûf¦D9Òjq‰:(B@œ¤Õ:,¾‰K4'ÊJ 
„€†@qX4㌧»ØLÂÞ6eŠÒM™B@H«u‡Ö1§Ÿö׎‹²iG\¢¬u+„€iµûÇÑ:ÎݧP?eãe×?»Úõ"…@£G@\¢q¯X\¢qÈTg|Z‡ÉMX딇K4ް:±V­1â-èåŠKÔàªQ.Ñ”qp¦Ä*G—ý çÞåáu0¦L™ÂɲðYë*<Ê«*5nÂq²j¯¸DÃHŠK>1Œ¶ IDATâ]6íaQïÑeàuG}fÀ‚Æ\J,J€¸DÅ%ŠÇ9÷™‰K4 ›¿›€°3¸fxâàöÙr‰úÌ€~) Š# .Qq‰B÷é¾Cq‰f£uœš Xò‰Ó:r‰˜ÝÛU@T'‘¤™ìÊá—h€bÕÞ <› € ðBÅ%Ú®]»Ñ£G3ìvn)ð°‡o7¸Ã1¾Ñ7¢ò Uü6í)Q” ª%® ZÇÏ’†K4Ì èKVXT!q¤™¬@¬ÂZ‹íƒŠKÔ½;q‰ŠKÔ} ˈ¡ Å)‘àÇÔ1!ãLĽy¡\¢‘Ì€1…(ZT q‰&@/.QÀ©-.Ñ„¯!ŸGf@CëD^Hˆ³°ù‹ædô%+,*Ž€¸D#_¸D}XÄ%ЦÈ÷ÊIÚ– =—¨ÿÚCÍ›o½•N”ðÎiq‰ŠKô¦›nrŸ›¸DëRž×IP*†rÎîe—¨†ÂN’‰KÔ›â54j”K40](ôV\¢…"¦ô5ˆ€¸D/]\¢ˆ¸DF^·âÍ &%eA@\¢³¸DËò¹YHÚÓE«lB@dÀr*Ñõ2”*.Ñ Á”(‡@Z­#.Q¥B@!´ZG\¢9!V! „€pĺp)’ìÄà(¨#% .ÑdÄôT!PˤÕ:`‡Öñ½ÚqQq‰ÖòW¥¶ ! âH«ul®pú‰*—(ÿñéK5âZ¨x!Ð8—hÜ{—h2ÕŸÖarÖ:eãe›ÍðáÃ7ÜpÃñãÇW'ĪU- .т޲¸D .q‰ÚiÙÂþV–K§ŠV]Üôâíª°ª+µ(a8Y".Ñ0’â—èsž²q‰2Ͳ!F6Ÿ­¯ a— ’" .Qq‰ú½“¸DÓúaûûßÿŽƒ)쪑H¡Öëý÷ߟ°`Á‚o¿ý¶k×®ƒ^¸pá]wÝ…Òzá…H=ÔÔ×\sÍŒ3X(âläl,I. IßÀa|š;wndAŠåD <׿cÚ´iVfäûÛßCŸsÑE-Z´ˆÀ 7Ü@ ƒÖŽ;N™2…_Íœ9s,ý®»îzì±Ç²5ôá‡^mµÕøÚ-þÖ[o={6¿—5ÖX"ÇŽÛ¥K{zíµ×î½÷ÞŽü ‹O‹-.½ôR µáúÏp³Í6›8q"Þ9‚j¥~ƒqié„ žxâ 4z=yä~Ú­ZµZºtiœdâ¤ASæ~õVgþbí - ܧOÜ6»øœ#Fì´ÓN– ª*Id·¼—Þ½{Ž‹ço™ÖYz¼y:QãþFöN@ñâ‹/ZšqãÆÁórÀðBÝ»€ó…ú`„Üj«­xJUÝ#'™À:ë¬sçw‚ËtöÍIÅš7o¾dÉÒœp ×_ýꫯn‰Ÿ=«ð¼yóø$èÿq6øõ×_óõòŠéÒ¹¨bxdUqí¤’¬ë¸eæ"´-r ™—èüå¿ÀW_}5² E jC@\¢¼‘ÓO?½ÿþýúõ£Ï}üñlj‰„ÅÞ¸D#¿á?ÿùÏâ "“ uü¤i¸DÏ<óLDÝxãx c”¼/Ya!P…ˆK”—ÂdË^ Ö§|p,<—¨¸DíkYö—©_¹D]=°' 4hæÌ™Ý»ww‘ *A;5ùæ›o°BÀ,†ÙáÞ{ïmÛ¶­_Cè _{í5,<¿ýíoBÇ.æ‹–-[’Ì™ øÎá¦brE޹‚“€©yÔ¨Qü$ `opñq\÷sÌ1<ðÀ~ûíY?£cü´H«L\CüŒ9ÃNrÒ˜$± Œ"0âaM¢-Ø£JŒdæ-`½|ûí·Í—ñ¬Y³6ÝtS„ÄÅóˆñô—_„/¸à`O(4ŸÞI\¢ ÙÇÇ;ˆ¼Hí~*ÁœÞ}Ñ\¢‹/61˜Œ™å`Jö¤*(ª>Ë—_~óÑQG…i›:a¯ïܹ3摯¾úŠ[ΜñÛ7*mqÈ!‡p‹}œ¾¯M›6˜þQWXùQ6ök‚†Š^²}ûöô¡|ð’IÏEºì²Ë?üp‹aå†å »¿¾òÊ+Ü¢ŸžþyTWd},}˜ñ“øu×]÷7Þ Ñ²'ü K.BûúPÞV ‹Á,>Y8²á<Šd6ÌY]#Keøí6ãâIfìxñ¦\q½[ŸxƒÄâåÛÎ÷ÊIÚ– =—(~½(­-¶ØÂgëË·öJ'²F ¼›@\¢âõ{'q‰¦ÝÖ Tìçœ3ÿ«/”K”Á[×Þÿ}_ˆÂB Ú—hàˆKÔ—h`Η×me¹DYRÂÎWE•HTDÎJF žM®è&1vÛºuk ¸, ¿Ø`EÎ¥gLvÐA¹[ÀvÄ®hwëáúðÛµ+Ë¥lÖ¬™+—ÈÈŒ.q8 .QÃD\¢áo#wŒ¸Dsc¤B \ˆKÔ—h¹¾¸bÊɽѥ©Ê#„@Ù—hÙ!WÅ Vë°‹Æm¡ñËÇ¢«°B@H«uP9ì" @‰ÿv­Hñ`Ñ­B@dàG\¢úŒ„€B OÒjŠ—hžX+™B@¤Õ:ì:Gë°Ó¿ÊÉ%ªW(jq‰Æ½hC¦÷4ïŸ÷Þ{/i·8z@TøQd<‡ñ2€¿¢pz“’é˜ÊàHoÍN`ãd uð…ã«Âhòp‰Ú;À#D C† i¯D­hˆˆK´ 
·&.QƒK\¢ãÃyÝV–KÔªÈÞüM6Ù'WyÕX‰„@){Äɪ4q‰†‘—(S.\ bpÂ3øÀ¯³òÊ+ß~ûí+¼…Á¾ÓøùuÊÆ%ÊxФ—^zé¸ãŽ÷‚F[J,ʃ€¸DÅ%*.Qÿ·–… “š9ƒ‘¨\¿<>úè£wß}÷ƒ>˜ÙÒZk­¿´¡8?묳pø ×)aÇÃG,<}ÐoÀ(Š2ǹ8žä¢Ü3Î8+>ÝñN}Ûm·ø Æe¤-°|Ò4ZDKÿô§?QÒðŸmõ¹ð ’‰“vòÉ'Å%—\byÝ_ÜlÓœUï±Ç€ó©{”À´EFHHI&µZýõ- ‡mmá'.f 0d°ké9=‚ç¡Èâx#xî¹ç u ~ýë_wëÖÍÝÂ-Ëë[÷.>ýôS„`ï1NÇGŽé¹\.À\áü…ëÁ"á­`,nŽÌo¹å–H—H.{EÙh§f–|âðÊ„Ký#!“ÊŠ`§B…@qÐO=ýôÓ»ì² Öt°¼#‡É:×ý÷ßÏr·yìg™£oÆÀ¯ÃmóÍ7'•H‡&ÝŸ Ñ`ääȶà´ÿˆ#ŽH¨|É(:é¡p@YËfüD™á„ GjÐÿ$dDOÀDÏÎ/ý”SNšÅ@Wèœü‡%'H—häÛ—h, ZÇO]—(£ôÐC!tßsÏ=“›ÌºOŸ>Ì`ãíÝ»7ƒýÁƒ[&R.o˜ñÓ="×aÀ°”t6¥ àç KN&.Qq‰þøñ0³cüøžÜc‘ÀÝÆŒ3B*û:Q!‘¿4—ÝæO̦¯¸â ‹d¼†÷V&¤.B z—h»—(àˆK4á >2€Ö‰¼HgaóÇ%Ú©S',lv!õžœã;¿P……@y—h$Îâõa©).ÑŸÌy}ò £Tœ­9œ…GùhÌlõc±‘ß'úcذa úƒHV·Ùf›pqŠUˆ,Ùÿ·Íf6£ƒbù3öá¶mÛâÆ=5T›ù:+n¸áÎ;ï¿úÕ¯,Òý2 ß²f\„4Ö{ØæÀŠë[lÙ8ï¼ó¬ È†ó̯¿þz°µ ³Š¥Ç8}xÏž=¡nÁäHÅ’ã9H¡X&‡>yòäHþ$¬— ¼õÖ[;–.Ë 9ˆ¥Îl€2ù¬x‘€ #XQ<®/=÷ÜsYwàÛà³!ÁÀ-}þ±±zY±ü…”*å¨Öu#®„n(îÊIš3_n¡\¢~^……@Õ" .ÑÀ«a[›&8ßÍ~ŠÀ#n1¶£ƒÃñq1ôøðŸN˜0á/ù‹“O˜­kìh§ dzp€ÃÛUNo1¤ÉÉtL3Ù²dÉ’H!È_°`Aä£ê‰d§àãFœý/¼Ó¥K—òr¹0,s †GÖLÔMZŸÓ•å-•*–\!)qCÎ0§OÐiU—hÎWÁöâ@š8Õ@²À­Û<3žIIòÚ3H““éX\¢¨óº—h^0)‘( â5˜Å%Z–Ï­ÈBÒÎuŠ,VÙ„€È{¸2”Ú·oß ¥I”0Òjq‰êKB@üH«uX|—hþp+¥B ÆÈfçô¿½‹ý lf`«%'«k\5_! ¤Õ:ˆc«†ïúÓŽ‹²i‡SœR<¸u+„€¨qÒj¶£u8Þå_(!~šâyæ™gâ Æõç˜8éÆ)ª¸4ŠBÀ—hÜ— .Ñ8dª3>­ÃäÆW9„Ñ:åácâ«N¬U«FŒ€¸D z¹â5¸âz0ü,vØaøJÀ3”•S§N à ™.VqÜ®];çõ˜Hœ|søÔ¥3f ƒéÓ§»˜ÊRú&¨,—è¸qãðÃV=ÇtU! .Ñ|¾\ÅÜwß}ù¤LN#.Q< ˆKtÙ´ÛN‡Ðý棉T­8Â÷>‘øþph{ Dð7…bÇ‹»eÁ-Õ<€ïwœRÛ±/HAf̘a¼R8Zˆ”¬H!PUˆKT\¢âõ’ÙXØüÝ„ÁpœÖÉ„K”–„™øüæ),*Ž€¸DÅ%*.Qÿg˜Öqj&`É'NëdÂ% ï5ŽZGàÆ•Ëo˜ÂB jˆ$ÍÄøÎ%.ÑŪ½D~Ý¸ÓÆæx§ì#Àåf|6³q)Oû„x˜æàù6iœû à|åáÍbââI /5®©±Äàå:aÃÔüå${﹚C¾‡ûp»ìÁ˜ò äƒ>È_˜aažuÙý€¸D}4~'h?uq\¢HÀ‚‡[x.<ÆÃ¸Ž»röÎù’Uˆ@i¦¸Dã(MÅ%*.ÑȬe1@C»üå…xD/":X—¨/ËgâóãÕ€þÞ©]*¨ 0ZbT;£%¿zKÁòÂ’'¤Ôxû·ßÇ®[¶lI2g6˜9s&œ>s®'–Q£Fñ‹#€ÕÁÅÇpÝ{Ì1Ç0`ßo¿ý"ëãgtŒŸi•‰kˆŸ1gØI.B él£ˆ=zœp ´eĈ %âÁ è&MšoKÆ[`E®óe4wÓ¦MÛk¯½’UêQZ‡ M\íy”Ö)šK4މ/®>ŠåG@\¢‘˜‹KXâz0q‰¢8b¯œT¡9ø¢0¬óc"à ødy”ÌÄ™W‘B üˆK4€¹¸D äL\¢‘㕺Êr‰&3ñE×X±B ì¸ýQ’YÉĈK@°P ÜŠK4H•ß.[r\Úd¥¢k).Ñ¢¡SF!9â5HÅ%šù§••@ÔM<â|öç4­û®ù÷Ë6u ! „€(¨œoëVX6×éСÃg_/ÛÙ©K! „@‰hVWײÅr­sd·â-l%ªœÄ ! 
„@£D íÎéF Š%„€%B@Z§DÀJ¬B@D ­Š¢„€B DHë”X‰B@r¸Äóà¢E‹ð„ß!<­EP”B@åžýp"Çá};›ËÚHT’´*5HYuÕUq„'­‰ "…€Bð¦ŠórÜ¡š{Ó8ÅÓ¤®WÝ¢‰‹"!Ã-STŽ‘ÖàÊ“+2e\$¾†x¤\†Ðð¿¡!4|ü°¾ІùðFëàé ¾Éý†¸pÒ\ç믿FY¡r˜èª9\ ! „@- €¦±3LeqM®õZÜ£eñèIå$a¤gB@!°”*Å‘€GÒ2si9'>=B@T†é?Òן»¹«°B@"4×)a±-„€5‰@ŽušÄDB@R! ¹N©•\! „€#ДuÂÑŠ¥DàÓO?7oÞÂ… á/e9’]ëÀ¾ÖZkµoß¾PVìZ®”íOÚßVÊr%»v@åÌš5‹dП¯¹æšµ „Z^z>ùä“Ù³gó½m¼ñÆR<¥Ç;¯š&Ÿ×ÉK† BÀçcÏí¶Û®LJ+ŠA€a >Vøê¤uŠA°y´®SP%2|4uîÜ91‰ ,à{3Ï`Y •¬bÐyb‘S¾bøòË/~›û‡|¸Ü0§OÜûá‚ı R”¸âð½ñÕU¼ª€! ¹NU ²[tÚ"Oè,X0nܸ?þØO߯_¿³Î:Ëbü°Ÿæ¾ûî#£»^yåž>ôÐC/¾øbrF_ˆÂ ¾« /¼0î늋o(­kLõÔºN%ßæ£> }‘«ÁV[mµÉ&›Lž<™]7[l±ñtP=N8!¯m†‘Òœðª И—À|juùå—ŸwÞysçÎ=ãŒ3\z²sa¯'Æ»Î=÷\öȱgÁ"Ûµk·Ùf›Ý|óÍÄtéÒ%!£/DᆂÀ½÷Þûøã¯·Þz ÔYž$€TöV{Ø*‰ÿW\cÊÆ*Ѷm[´Î=÷ÜCŒi‚*)­ E$Æ\N×Ïò¢'ò{2Û»á†:uê4~üøQ£Fù^}!~دÉn»ívþùç» EëpëJËè²(Ð `cäí·ß-ËwÞ¹Ã;6øŸMƒhN㮤ÎëTøýöîÝûÔSOõ+qå•Wú·…ÃÒ Ê^DâSN9eýõ×?ýôÓ ÊK§ÏEŸ3׃>ˆâAåЕ0™Ûu×]-‹i 7×Aš…}~ä‘Gn³Í6C‡%ÆO@·uÜqÇ=ñÄk¯½öe—]¶ÓN;‘`úôéçœsÎ /¼@wöôÓO§_ŽrÕP [®¹æT ´,»ð òSšèd uzišë¤Ç0c ÇïÚµëᇋ-näÈ‘Ï<ó }ߘ1c¶ß~û@‚ÈÛ%K–œyæ™O>ù$OwÞygÂ?ÿùÏ SÊ–[nùî»ïÞÿýô­üJ)”à évùÝΜ9ó’K.iÖ¬3Œ¾}û’…®ðÎ;ï`§1bé™]ÑϘ1ËÆi§Æú ý5Ý÷ÙgŸYiÝ=]L\à¯ýë!‡Âäoë­·¦»ì²‹¥$/ʆ[?ÃTìÙgŸ%VÝÍ7_æéKF˜pÆC=”#Dì¸ãŽ;N<ñDEâ“O>s ëC~ø!ŠÇrŠÐm5 ÀèÇi¾B¼{T UU’ü°ÙY•Ééß¹Þ|óM+hþüù¨Šp¡ŒÄ[µj…†8âˆ#âæaiä"ò®å’¹u¥°^Ò¡CÉÊ+¯Ì#n›7oŽqœ#ý”)S0m¡ÛÐ=–¥M›6¨“çŸóŵ×^K$5éÖ­ÛÞ{ï=qâD›ôìÙÝ®| ÆxòÅ ôÙa‡F²Aƒ±€¶X:!~ØH‚iÓ¦Ñq]zé¥ ß{ゥS§bŽÃcê ݃º"=Û{ì1ªPP†øÂ®*¾ù曥ÿ»êÆ—øu[R<󤹎%}1&üÿø=>a–F˜(Ä•ÈX›ôºë®cC&ÌKŸ€4ra2bŠÀº’QØ—>úè#ÖÕ¹Ý}÷Ý=öXWâž{îyüñÇsËb;¹PB˜,˜1ö·4{챇HpõÕWF]¡¨Z´h±úê«Û#ŒWHþk}DÂwiÙQ{«¬²Êرc¹eç+¹X•ùÝï~Ç-Ý ãYÌ)°e´¿ÈßgŸ}Fm·áÄNÇ×IsñÅÛ­@ñ ã)úø[°È1Ñ4*w¿…«ú±œ_Zõ·¢ÁÕ0A}äX×IÈÙàP¨Î ÷ïßß·AÇU’18èë­[üå/9gΜ°Ö H³\¶_‹ì Ò´–4¿8§9(‚× ò?Fý˜ÔXe!W~Æ(•[n¹å€0•I¡h\bLëø5O¶Ý't c¾4¦8&L@—3Ó" :Ûªpõ Àž6Ä›‚ÁöËì¶zêVƒ5ñûpó“æ:áÔŠ©¦'ØBm†¬<«ÁlR¾ýöÛxë­·¸ (›ª@šë$€S]°n±·ŠåzL=hzÉîÝ»'W+ŠŠ¬ù3xßh£¸MžüÆ ¤çeÿ(ŒoüãY{gM誫®B1 <˜šp,œé‚­<á NŽÅS‡œÕ¸õÖ[ÃBl3ñl½sOý°‹$0iÒ$ÿÖÂ~b?ÌõÌêØ:ÓåsÏ=gÚyOX”bª Ö YÈìÓ§æ_ÿëZþ­5©ªªÖxešÔõª[4ñÇãñ>l«mݺ5[NýH…+‹@ [̳2lŠc;Vú1 rØx2£k¦hÈK¬l-£§¶¥ œUâ4‹ G(œ‰•@äæ5Œlô|–̘!áé#ðH·%B€Ãï‹/Æ^)¿iݲƒº lã*´º«­¶Z¡Y"Ó;9NßX²‚ÖŠÇ`¯ãø ƒÐ2ìJˆlˆ"%a•c“¾7ž«ç7Õºz*£šÔ¨(v*c²³£|R<5ñÖ+ÑHS9ÌòùÞ 
U¢²5T¦ÖujèeWISÙ$f;Øìà&OUR7U£‘!€M˜#À8¤¥Sõ¼Ù¦ç.®;¡zª£šÔø8`E‡Žàõ×_Çþ[-V+††5VY`à««X%TðOÐ\ç§xè®ôàZïÚôXØ´¡ ôx×t l%ÀÂÆ.¾ºš¢š¯ujz5Sºõ5ó¶ÕP!ðêr§! „€¥D žu]B@! ʃ€æ:åÁY¥! „À2êu^G‚B@” Íuʵ B@º¤uvµréè¸>! „€ÈT†éޏôIsö¹Këħx! „€#`Zõ~d1Ië:8jśܑÒ=qð)^!  Ó7¨ ‡ÏŒÀ'é¼.>íñV ¿\Iœé3WzüºB@!hTÀhXrâ0YFvd4‘‘)>ÿüsX#„¸<ÙS"å(R! 7XŘœ0EÊ *“¸Æ6Mæå%gBæ8¡ŠB@!‰@ý:Ýù@‘B@! 2G i[æ…I B@Ô8õì9°Æ!Pó…€B lÔ¯ÓþÖ²¦‚„€Üûœø ÊIDATB Æ¨ÿþÿ~_ã¨ùB@!P6ê¥sʆµ B@úº£–ÙÑ%„€B  Ô?N³2à¬"„€B`õÿO8! „€(Z×)Ò*G! ˜ë|?Hë:ú„€B Lh®S& UŒB@€@ý÷ãµ›@_‚B@” ¦Ò9eBZÅÆŽÀ'Ÿ|2wîÜ?þø?ÿùO#hëÏ~ö³µ×^»C‡k®¹fÊæ|ùå—ÿþ÷¿ù ñLJQeËÞ¬Y³•W^–5þf[hÓy»ì›­DIB @å¼ñÆíÛ·ïÚµëk¬ÑøôÓOçÌ™C£6ÝtÓ4Še8PδiÓ&ó¼t8Síýë_Ôœ"²­vSèÚJWoIB Fxÿý÷ñ‹_tëÖ­Ñ´ÝÉõÝwßÑ´4Z‡¾–²víÚ5,dÐ4\(êO ÃÊ×·›|w†â$JÚD`þüù;vl|m§Q4-M»`d^mµÕÒH¨`^jNý³­€øu²ÅSÒ„@"ðÅ_03¨òÆûí·°,TIEÓ ÊH¼téÒlç ù%½¥æÔ?Û"š^þßîgd+RÒ„€¨= íÍ ! 8=ôЋ/¾¸`Á‚õÖ[‰Eß¾}YÁ.~ýúõÛi§Î<óÌ‚ä×4WDÑ«ï¼óÎ?ÿùOJgSÃFmT__™IBÑõwM÷ºB@‡ýcA=Ô¢E‹:è ·ß~{àÀ¬ØÏž=ûÆoDåì¼óÎÅU g.ªÇÅRMΔ.A“&œ£/¨DŠþïÿûÀ|öÙgÀÂ^²gŸ}ö¹çžÛk¯½Xrk¸¦ÃV|¶áÖ^5B z C/hZ0jÔ¨÷Þ{.Õ­Õ_pÁ… )´ù…ÊÏj†A¹ùWõ©§žZ²dÉ¡‡Ú¼ysrmµÕVwÞyç”)SP<ù É$e&J7PÍu€èV"@åpåÙ½²3êú믿âŠ+Z·nž \uÕU÷ÜsϬY³ºté2zôèm¶Ù†:yä‘[o½532²%ì /´øÅ‹ŸrÊ)“&MbÝ{ƒ 6˜>}:‰Ù÷|ÜqÇ=ñʧ.»ì2 kDšÊ ×à¬ú\ʵ+® ?ž‰Î+¯¼²å–[6mÚúèZk­o½õj©W¯^´šd_}õÕäÉ“92ÅbLŸ>}ØÅN$‰A™¯½öS%³Û/.!œUóEÔ³®ˆÒ­B èX—ë¼þÐc’µ™šMØ]t LƒÐ–†þM³pá‰'Ò“¶x tÓ¦M;vì‡~ˆB²H& tÐè­¡C‡žxâ‰i]¿…óüK–" d±róü‹¥në¬³ŽŸž[d¢J‰D¹¾ð ¨ }{ì1Òÿð÷hÑâˆ#ŽàÈÔÔ©S-;‰g̘Á¹ÝþýûsîKÅçÿ7Жô·•YžJ_oIB Ú #˳+'™?d'2Ëo~ó›Í6ÛŒ%=zÐQZÚ»ãŽ;b…ci}×]weñØèèa‡ ÂN&;wv‘»í¶Û¼yó˜¡{lY UQ¤Ïçü{yóìЪU+? 
3BªÁlÆê³ñÆ£ZˆìÙ³ç×_Í^ æŽ(Ýu×]—wÌ„P]ó´ÄLz˜ê¡ƒÙ¯ç_lÎp&mѺNÝ !P$¸{a—­õt9E˜çÝwßÜoýÈ#\rÉ%æ?ˆˆXýõ-ŒÑ‰[ÂìAàQ÷îÝ-Þʵȋ/¾ØVe˜Q¡xèÇÉ‚yÍO™\OLLy6'Y…r%§qOWYeÂLìü½¶{›GÈ¡JTÌ¢˜WZi%têꫯN.æ@&‡¹ ŠgÅW$1ªÝ/Ý»Bã+¬°BÜ£¢ãµ®S4tÊ(„@ñl¸á† ÕY$ßn»íR˜¾ 4èŽ;î`3Ûí·ßŽ%-À¿µÕ zÛ-¶ØÂÅ[äI'Ô»woÙPhGf˜¸°ÍÕyΜ9(·íÂÅ3Ña„62…o³Å¹UкN¾UI4~Xg Àm·ÝÆÆ}3a‰‚i «´Ÿi K8O>ù$Ëæ p`5Úa‡XÎyùå—IFväɲ9 B̈Äf•ò¤gB2Äü »â|`•G>f1Öf0²lcÅaRcÆÆ<æõ×_'¥øyC×bu4Wœ‡­ZO£šëdþÍH y!À„†ñ;[ÔÆŒC‰nØ|óÍÏ>ûìm·Ý–%™í·ßž¥ˆ#F°/먣Ž7n\œÐk®¹æè£æx)émÁ•Fä°að­Ñ#³ÂZ¢+“Pmñ,\Q%t0ø „˜Í`BdW›«'‘&L@µpÉxÄÊ»«o¾ùfæ=¨œÝwß½mÛ¶.Kõš kUwÆìEÕS!ÕD†ˆÝàÁƒó_×ñÛˆ£3æ"lö·ê²,­‰îÕV×Q'~–p˜.$ø5@ß=l˜ gŒ¡>(°ë®»îC‰LO$“0Ü0Ÿ+b‰ˆy¹˜ÇøÈÜ{ィt3­ ûÚAß ŠXìɧnÉi(”u™3g²=9eAO5×).%B :)®"úÖÈ!¹íÚ¢°œúÆ*ÄD'\3:åp¿Nc-Š{Z†xôMB)‘McÞcSŸ„Œ•}Tù¿*[•.„@c@ÝÀì„–ÐS7†öü¯!4*Oµ×jtkþ"öé2ð&Ø{¤æ™ë0Íuâ>Å !Pì„fŸ•™’ÈVÄŒ§€ÂJŸÔú\ìK4*ÿÃü‘õbƒD¢œ¡ĦGf“M6‰,(ÃHk>©¹íäÎPxSÖut ! R"À†]ÎÄ „ýc¬Ç¤”V Ù9w ŸkNÌ-ÒÔ‡¥&w$–4Ö¡§Xž¼ì‘Cåp½*WOÍuâQ¼ €¶,sÊÝcëÿd®Ê¤Ö˜ à쀦¥© ;ʘßÐ}£{ªv7s¸ÖX7â ¯X5œ¬ˆi"@S! ‚à†Åºi6P¥7"¥WâžI ¢óåošòÙYǦzp´rB†æSslŒüMÓüpÞTh†Å)FšE€Þ9eÝX¡£ãμïn¸Xe¬Ä.ª¹B@”i2€¬"„€Bà¤uô)! „@ù¨¯ØHŽt•3•$„€Å"P_wK6œEÅV@ù„€B †…­†^¶š*„€¨8²°Uü¨B@B þ£¨¡æª©B@!PQêÛM¾»¢PáB@!PCÔ×i [ ½n5U!Paêç÷Ý·ÂUPñB@!P3ÔÃQ3UC…€B ÂÔwšªu ¿/„€¨ê[¯ …ÚyÝj©B ÂÔÏû?Ã*\/„€5ƒ@“ã‡ÿæÛoW¿ýª÷¾ù¾íГàúî»ïŒ}ˆ¿\ÜÂDda\caÿ/ ‘Œ¿D€.O+„¶šXžüòK¸’ÒW^! „@CA E‹®ª‹-¢÷3Ö5ãï‡ Úihè¸%¼dÉKLØå"кuk:Rºñ¯¾úŠ[’µiÓÆ,X°`Ýu×%Œ—ËÆdaþºk¥•V¢vëªájE<‘v µdØÔ¹ÉUkÕ½±ßï–~óÍÏË%ã]ŸîÖé[/ïÂü¥®Ä¸¿ÈÅß4—_„ÕÁþ\°›wèÐ!|åB@4DæÎÛªU+t }ºuëüe ¾âŠ+ »Ÿ?>išýum„ò•~Ê×Ï?ÿœH’uìØÑž¾ûî»]ºt!ŒS3„}!„Ýeà·&¥EZ5,l(ð×ü°|ØÑÍßIEND®B`‚phylip-3.697/doc/kitsch.html0000644004732000473200000003404212406201173015464 0ustar joefelsenst_g kitsch

version 3.696

Kitsch -- Fitch-Margoliash and Least Squares Methods
with Evolutionary Clock

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program carries out the Fitch-Margoliash and Least Squares methods, plus a variety of others of the same family, with the assumption that all tip species are contemporaneous, and that there is an evolutionary clock (in effect, a molecular clock). This means that branches of the tree cannot be of arbitrary length, but are constrained so that the total length from the root of the tree to any species is the same. The quantity minimized is the same weighted sum of squares described in the Distance Matrix Methods documentation file.

The options are set using the menu:


Fitch-Margoliash method with contemporary tips, version 3.696

Settings for this run:
  D      Method (F-M, Minimum Evolution)?  Fitch-Margoliash
  U                 Search for best tree?  Yes
  P                                Power?  2.00000
  -      Negative branch lengths allowed?  No
  L         Lower-triangular data matrix?  No
  R         Upper-triangular data matrix?  No
  S                        Subreplicates?  No
  J     Randomize input order of species?  No. Use input order
  M           Analyze multiple data sets?  No
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4       Write out trees onto tree file?  Yes

  Y to accept these or type the letter for one to change

Most of the options are described in the Distance Matrix Programs documentation file.

The D (methods) option allows choice between the Fitch-Margoliash criterion and the Minimum Evolution method (Kidd and Sgaramella-Zonta, 1971; Rzhetsky and Nei, 1993). Minimum Evolution (not to be confused with parsimony) uses the Fitch-Margoliash criterion to fit branch lengths to each topology, but then chooses topologies based on their total branch length (rather than the goodness of fit sum of squares). There is no constraint on negative branch lengths in the Minimum Evolution method; it sometimes gives rather strange results, because it can favor solutions with large negative branch lengths, which reduce the total sum of branch lengths!
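As a rough illustration of the two criteria, the following Python sketch computes a weighted sum of squares of the form sum of (D - d)^2 / D^P over pairs of species, and a total branch length. It is not code from Kitsch; the function names, the three-species distances, and the two lists of branch lengths are all hypothetical, chosen only to show how a negative branch length can lower the total.

```python
def weighted_sum_of_squares(observed, expected, power=2.0):
    """Sum over species pairs of (D - d)**2 / D**power.

    observed and expected map a pair of species names to the observed
    distance D and the tree-implied distance d (hypothetical inputs).
    """
    total = 0.0
    for pair, d_obs in observed.items():
        d_exp = expected[pair]
        total += (d_obs - d_exp) ** 2 / d_obs ** power
    return total


def total_branch_length(branch_lengths):
    """Criterion used to choose among topologies under Minimum Evolution."""
    return sum(branch_lengths)


# Hypothetical observed distances and distances implied by a clocklike tree.
observed = {("A", "B"): 0.2, ("A", "C"): 0.5, ("B", "C"): 0.6}
expected = {("A", "B"): 0.2, ("A", "C"): 0.55, ("B", "C"): 0.55}

ss_fm = weighted_sum_of_squares(observed, expected, power=2.0)
ss_unweighted = weighted_sum_of_squares(observed, expected, power=0.0)

# Two hypothetical sets of fitted branch lengths: the one containing a
# large negative branch has the smaller total length.
all_positive = [0.1, 0.1, 0.3]
with_negative = [0.5, -0.4, 0.2]

print(ss_fm, ss_unweighted)
print(total_branch_length(all_positive), total_branch_length(with_negative))
```

With P = 0 every pair gets the same weight, while P = 2 down-weights the larger observed distances.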

Note that the User Trees (used by option U) must be rooted trees (with a bifurcation at their base). If you take a user tree from Fitch and try to evaluate it in Kitsch, it must first be rooted. This can be done using Retree. Of the options available in Fitch, the O option is not available, as Kitsch estimates a rooted tree which cannot be rerooted, and the G option is not available, as global rearrangement is the default condition anyway. It is also not possible to specify that specific branch lengths of a user tree be retained when it is read into Kitsch, unless all of them are present. In that case the tree should be properly clocklike. Readers who wonder why we have not provided the feature of holding some of the user tree branch lengths constant while iterating others are invited to tell us how they would do it. As you consider particular possible patterns of branch lengths you will find that the matter is not at all simple.

If you use a User Tree (option U) with branch lengths with Kitsch, and the tree is not clocklike, when two branch lengths give conflicting positions for a node, Kitsch will use the first of them and ignore the other. Thus the user tree:

     ((A:0.1,B:0.2):0.4,(C:0.06,D:0.01):0.43);

is nonclocklike, so it will be treated as if it were actually the tree:

     ((A:0.1,B:0.1):0.4,(C:0.06,D:0.06):0.44);
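One way to see what "clocklike" means here is to add up the branch lengths from the root to each tip and check that the totals agree. The Python sketch below does this with a minimal Newick reader written for this illustration (it is not PHYLIP's parser and handles only simple trees without comments or quoted names). The trees are the two from the example above, with the last branch length taken as 0.43 and 0.44 so that the second tree has the same root-to-tip total (0.5) on both sides.

```python
def parse_newick(text):
    """Parse a simple Newick string into (name, length, children) tuples."""
    pos = 0

    def node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1                      # skip "("
            children.append(node())
            while text[pos] == ",":
                pos += 1                  # skip ","
                children.append(node())
            pos += 1                      # skip ")"
        start = pos
        while text[pos] not in ":,();":
            pos += 1
        name = text[start:pos]
        length = 0.0                      # the root has no branch length
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return (name, length, children)

    return node()


def tip_times(node, time_so_far=0.0):
    """Cumulative branch length from the root to each tip."""
    name, length, children = node
    time_here = time_so_far + length
    if not children:
        return {name: time_here}
    times = {}
    for child in children:
        times.update(tip_times(child, time_here))
    return times


def is_clocklike(newick, tolerance=1e-9):
    """True if every tip is the same distance from the root."""
    times = tip_times(parse_newick(newick)).values()
    return max(times) - min(times) <= tolerance


user_tree = "((A:0.1,B:0.2):0.4,(C:0.06,D:0.01):0.43);"
adjusted_tree = "((A:0.1,B:0.1):0.4,(C:0.06,D:0.06):0.44);"

print(tip_times(parse_newick(user_tree)), is_clocklike(user_tree))
print(tip_times(parse_newick(adjusted_tree)), is_clocklike(adjusted_tree))
```

The same cumulative totals are what the output table calls the "time" at the upper end of each branch.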

The input is exactly the same as described in the Distance Matrix Methods documentation file. The output is a rooted tree, together with the sum of squares, the number of tree topologies searched, and, if the power P is at its default value of 2.0, the Average Percent Standard Deviation. The lengths of the branches of the tree are given in a table that also shows, for each branch, the time at the upper end of the branch. "Time" here really means cumulative branch length from the root, going upwards (on the printed diagram, rightwards). For each branch, the "time" given is for the node at the right (upper) end of the branch. It is important to realize that the branch lengths are not exactly proportional to the lengths drawn on the printed tree diagram! In particular, short branches are exaggerated in the length on that diagram so that they are more visible.

The method may be considered as providing an estimate of the phylogeny. Alternatively, it can be considered as a phenetic clustering of the tip species. This method minimizes an objective function, the sum of squares, not only setting the levels of the clusters so as to do so, but rearranging the hierarchy of clusters to try to find alternative clusterings that give a lower overall sum of squares. When the power option P is set to a value of P = 0.0, so that we are minimizing a simple sum of squares of the differences between the observed distance matrix and the expected one, the method is very close in spirit to Unweighted Pair Group Arithmetic Average Clustering (UPGMA), also called Average-Linkage Clustering. If the topology of the tree is fixed and there turn out to be no branches of negative length, its result should be the same as UPGMA in that case. But since it tries alternative topologies and (unless the "-" option, which allows negative branch lengths, is set) it combines nodes that otherwise could result in a reversal of levels, it is possible for it to give a different, and better, result than simple sequential clustering. Of course UPGMA itself is available as an option in program Neighbor.
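For comparison, simple sequential clustering by UPGMA can be sketched in a few lines of Python. This is the textbook average-linkage procedure, not the code used in Kitsch or Neighbor; the four-species distance matrix and the function names are hypothetical. At each step the two clusters with the smallest average tip-to-tip distance are joined, and the new node is placed at half that distance.

```python
from itertools import combinations


def average_distance(cluster_a, cluster_b, distances):
    """Unweighted mean of all tip-to-tip distances between two clusters."""
    total = sum(distances[frozenset((a, b))]
                for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def upgma(distances):
    """Return the merges as (cluster_a, cluster_b, node_height) in order.

    distances maps frozenset({name1, name2}) to the observed distance.
    Clusters are tuples of tip names.
    """
    tips = sorted({name for pair in distances for name in pair})
    clusters = [(name,) for name in tips]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters by average linkage.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_distance(pair[0], pair[1],
                                                     distances))
        height = average_distance(a, b, distances) / 2.0
        merges.append((a, b, height))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical distances for four species.
example = {
    frozenset(("A", "B")): 0.2,
    frozenset(("A", "C")): 0.6,
    frozenset(("B", "C")): 0.6,
    frozenset(("A", "D")): 1.0,
    frozenset(("B", "D")): 1.0,
    frozenset(("C", "D")): 1.0,
}

merges = upgma(example)
for cluster_a, cluster_b, height in merges:
    print(cluster_a, cluster_b, height)
```

Unlike this greedy procedure, which never revisits a join once it is made, the search described above rearranges the hierarchy and keeps whichever clustering gives the lower sum of squares.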

The U (User Tree) option requires a bifurcating tree, unlike Fitch, which requires an unrooted tree with a trifurcation at its base. Thus the tree shown below would be written:

     ((D,E),(C,(A,B)));

If a tree with a trifurcation at the base is mistakenly fed into the U option of Kitsch then some of its species (the entire rightmost furc, in fact) will be ignored and too small a tree read in. This should result in an error message and the program should stop. It is important to understand the difference between the User Tree formats for Kitsch and Fitch. You may want to use Retree to convert a user tree that is suitable for Fitch into one suitable for Kitsch or vice versa.
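A quick check of a user tree before it is given to Kitsch or Fitch is to count how many subtrees are joined at the outermost pair of parentheses. The Python sketch below is a hypothetical helper, not part of PHYLIP: it returns 2 for a tree with a bifurcation at its base and 3 for one with a trifurcation. The second tree in the example is an invented trifurcating arrangement of the same five species.

```python
def root_degree(newick):
    """Count the subtrees joined at the outermost parentheses."""
    depth = 0
    count = 0
    for ch in newick:
        if ch == "(":
            depth += 1
            if depth == 1:
                count = 1          # first subtree at the base
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 1:
            count += 1             # each top-level comma adds a subtree
    return count


rooted_tree = "((D,E),(C,(A,B)));"       # bifurcation at the base
unrooted_tree = "((D,E),C,(A,B));"       # hypothetical trifurcating form

print(root_degree(rooted_tree))
print(root_degree(unrooted_tree))
```

This only inspects the base of the tree; converting between the two forms is still a job for Retree.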

An important use of this method will be to do a formal statistical test of the evolutionary clock hypothesis. This can be done by comparing the sums of squares achieved by Fitch and by Kitsch, BUT SOME CAVEATS ARE NECESSARY. First, the assumption is that the observed distances are truly independent, that no original data item contributes to more than one of them (not counting the two reciprocal distances from i to j and from j to i). THIS WILL NOT HOLD IF THE DISTANCES ARE OBTAINED FROM GENE FREQUENCIES, FROM MORPHOLOGICAL CHARACTERS, OR FROM MOLECULAR SEQUENCES. It may be invalid even for immunological distances and levels of DNA hybridization, provided that the use of a common standard for all members of a row or column allows an error in the measurement of the standard to affect all these distances simultaneously. It will also be invalid if the numbers have been collected in experimental groups, each measured by taking differences from a common standard which itself is measured with error. Only if the numbers in different cells are measured from independent standards can we depend on the statistical model. The details of the test and the assumptions are discussed in my review paper on distance methods (Felsenstein, 1984a). For further and sometimes irrelevant controversy on these matters see the papers by Farris (1981, 1985, 1986) and myself (Felsenstein, 1986, 1988b).

A second caveat is that the distances must be expected to rise linearly with time, not according to any other curve. Thus it may be necessary to transform the distances to achieve an expected linearity. If the distances have an upper limit beyond which they could not go, this is a signal that linearity may not hold. It is also VERY important to choose the power P at a value that results in the standard deviation of the variation of the observed from the expected distances being the P/2-th power of the expected distance.

To carry out the test, fit the same data with both Fitch and Kitsch, and record the two sums of squares. If the topology has turned out the same, we have N = n(n-1)/2 distances which have been fit with 2n-3 parameters in Fitch, and with n-1 parameters in Kitsch. Then the difference between S(K) and S(F) has d1 = n-2 degrees of freedom. It is statistically independent of the value of S(F), which has d2 = N-(2n-3) degrees of freedom. The ratio of mean squares

      [S(K)-S(F)]/d1
     ----------------
          S(F)/d2

should, under the evolutionary clock, have an F distribution with n-2 and N-(2n-3) degrees of freedom respectively. The test desired is that the F ratio is in the upper tail (say the upper 5%) of its distribution. If the S (subreplication) option is in effect, the above degrees of freedom must be modified by noting that N is not n(n-1)/2 but is the sum of the numbers of replicates of all cells in the distance matrix read in, which may be either square or triangular. A further explanation of the statistical test of the clock is given in a paper of mine (Felsenstein, 1986).
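The arithmetic of the test can be sketched as follows in Python. This is an illustration under the assumptions stated above (same topology in both programs, independent distances); the function name clock_f_ratio is hypothetical, and the sums of squares in the example call are made-up numbers, not results from any data set.

```python
def clock_f_ratio(s_kitsch, s_fitch, n_species, n_distances=None):
    """Return (F, d1, d2) for the test of the evolutionary clock.

    n_distances defaults to n(n-1)/2; pass the total number of
    replicates instead when the S (subreplication) option is used.
    """
    if n_distances is None:
        n_distances = n_species * (n_species - 1) // 2
    d1 = n_species - 2                        # (2n-3) - (n-1)
    d2 = n_distances - (2 * n_species - 3)    # residual d.f. for Fitch
    if d1 <= 0 or d2 <= 0:
        raise ValueError("too few species or distances for the test")
    f = ((s_kitsch - s_fitch) / d1) / (s_fitch / d2)
    return f, d1, d2


# Made-up sums of squares for 7 species, for illustration only.
f, d1, d2 = clock_f_ratio(0.3, 0.1, n_species=7)
```

The resulting F value would then be compared with the upper tail of the F distribution with d1 and d2 degrees of freedom, using statistical tables or a statistics package.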

The program uses a similar tree construction method to the other programs in the package and, like them, is not guaranteed to give the best-fitting tree. The assignment of the branch lengths for a given topology is a least squares fit, subject to the constraints against negative branch lengths, and should not be able to be improved upon. Kitsch runs more quickly than Fitch.

The constant available for modification at the beginning of the program is "epsilon", which defines a small quantity needed in some of the calculations. There is no feature saving multiple trees tied for best, because exact ties are not expected, except in cases where it should be obvious from the tree printed out what is the nature of the tie (as when an interior branch is of length zero).


TEST DATA SET

    7
Bovine      0.0000  1.6866  1.7198  1.6606  1.5243  1.6043  1.5905
Mouse       1.6866  0.0000  1.5232  1.4841  1.4465  1.4389  1.4629
Gibbon      1.7198  1.5232  0.0000  0.7115  0.5958  0.6179  0.5583
Orang       1.6606  1.4841  0.7115  0.0000  0.4631  0.5061  0.4710
Gorilla     1.5243  1.4465  0.5958  0.4631  0.0000  0.3484  0.3083
Chimp       1.6043  1.4389  0.6179  0.5061  0.3484  0.0000  0.2692
Human       1.5905  1.4629  0.5583  0.4710  0.3083  0.2692  0.0000


TEST SET OUTPUT FILE (with all numerical options on)


   7 Populations

Fitch-Margoliash method with contemporary tips, version 3.69

                  __ __             2
                  \  \   (Obs - Exp)
Sum of squares =  /_ /_  ------------
                                2
                   i  j      Obs

negative branch lengths not allowed


Name                       Distances
----                       ---------

Bovine        0.00000   1.68660   1.71980   1.66060   1.52430   1.60430
              1.59050
Mouse         1.68660   0.00000   1.52320   1.48410   1.44650   1.43890
              1.46290
Gibbon        1.71980   1.52320   0.00000   0.71150   0.59580   0.61790
              0.55830
Orang         1.66060   1.48410   0.71150   0.00000   0.46310   0.50610
              0.47100
Gorilla       1.52430   1.44650   0.59580   0.46310   0.00000   0.34840
              0.30830
Chimp         1.60430   1.43890   0.61790   0.50610   0.34840   0.00000
              0.26920
Human         1.59050   1.46290   0.55830   0.47100   0.30830   0.26920
              0.00000


                                           +-------Human     
                                         +-6 
                                    +----5 +-------Chimp     
                                    !    ! 
                                +---4    +---------Gorilla   
                                !   ! 
       +------------------------3   +--------------Orang     
       !                        ! 
  +----2                        +------------------Gibbon    
  !    ! 
--1    +-------------------------------------------Mouse     
  ! 
  +------------------------------------------------Bovine    


Sum of squares =      0.107

Average percent standard deviation =   5.16213

From     To            Length          Height
----     --            ------          ------

   6   Human           0.13460         0.81285
   5      6            0.02836         0.67825
   6   Chimp           0.13460         0.81285
   4      5            0.07638         0.64990
   5   Gorilla         0.16296         0.81285
   3      4            0.06639         0.57352
   4   Orang           0.23933         0.81285
   2      3            0.42923         0.50713
   3   Gibbon          0.30572         0.81285
   1      2            0.07790         0.07790
   2   Mouse           0.73495         0.81285
   1   Bovine          0.81285         0.81285


PHYLIP

Phylogeny Inference Package


Version 3.697

December, 2017

by Joseph Felsenstein


Department of Genome Sciences and Department of Biology
University of Washington

address:
Department of Genome Sciences
Box 355065
Seattle, WA   98195-5065
USA

E-mail address:    joe (at) gs.washington.edu


Contents of This Document


Contents of This Document
A Brief Description of the Programs
Copyright Notice for PHYLIP
The Documentation Files and How to Read Them
What The Programs Do
Running the Programs
      A word about input files
      Installing a recent version of Oracle Java
      Running the programs on a Windows machine
      Running the programs on a Macintosh with Mac OS X
      Running the programs on a Unix or Linux system
      Running the programs on a Macintosh with Mac OS 8 or 9 (deprecated)
      Running the programs in MSDOS
      Running the Drawgram and Drawtree Java interfaces
      Running the Drawgram and Drawtree Java GUI interfaces in Windows
      Running the programs in background or under control of a command file
            An example (Unix, Linux or Mac OS X)
            Subtleties (in Unix, Linux, or Mac OS X)
            An example (Windows)
            Testing for existence of files
            Prototyping keyboard response files
Preparing Input Files
      Input and output files
      Where the files are
      Data file format
The Menu
The Output File
The Tree File
The Options and How To Invoke Them
      Common options in the menu
        The U (User tree) option
        The G (Global) option
        The J (Jumble) option
        The O (Outgroup) option
        The T (Threshold) option
        The M (Multiple data sets) option
        The W (Weights) option
        The option to write out the trees into a tree file
        The (0) terminal type option
The Algorithm for Constructing Trees
      Local rearrangements
      Global rearrangements
      Multiple jumbles
      Saving multiple tied trees
      Strategy for finding the best tree
A Warning on Interpreting Results
Relative Speed of Different Programs and Machines
      Relative speed of the different programs
      Speed with different numbers of species
      Relative speed of different machines
General Comments on Adapting the Package to Different Computer Systems
Compiling the programs
      Unix and Linux
      On Windows systems
           Compiling with Cygnus Gnu C++
           Compiling with Microsoft Visual C++
      Macintosh
           Compiling with GCC on Mac OS X with our Makefile
           Compiling with GCC on Mac OS X with X Windows
           What about the Metrowerks Codewarrior compiler?
      VMS VAX systems
      Parallel computers
      Other computer systems
      Compiling the Java interfaces
Frequently Asked Questions
      How to make it do various things
      Background information needed:
      Questions about distribution and citation:
      Questions about documentation
      Additional Frequently Asked Questions, or: "Why didn't it occur to you to ...
      (Fortunately) obsolete questions
New Features in This Version
Coming Attractions, Future Plans
Endorsements
      From the pages of Cladistics
      ... in the pages of other journals:
      ... and in the comments made by users when they register:
References for the Documentation Files
Credits
Other Phylogeny Programs Available Elsewhere
      PAUP*
      MrBayes
      MEGA
      PAML
      Phyml
      RAxML
      TNT
      DAMBE
How You Can Help Me
In Case of Trouble


A Brief Description of the Programs

PHYLIP, the Phylogeny Inference Package, is a package of programs for inferring phylogenies (evolutionary trees). It has been distributed since 1980, and has over 30,000 registered users, making it the most widely distributed package of phylogeny programs. It is available free, from its web site:

http://evolution.gs.washington.edu/phylip.html

PHYLIP is available as source code in C, and also as executables for some common computer systems. It can infer phylogenies by parsimony, compatibility, distance matrix methods, and likelihood. It can also compute consensus trees, compute distances between trees, draw trees, resample data sets by bootstrapping or jackknifing, edit trees, and compute distance matrices. It can handle data that are nucleotide sequences, protein sequences, gene frequencies, restriction sites, restriction fragments, distances, discrete characters, and continuous characters.



Copyright Notice for PHYLIP

The copyright notice given below is intended to cover all source code, all documentation, and all executable programs of the PHYLIP package. This is a "BSD 2-Clause License" which is open source. It is not a GNU license and does not insist that other materials distributed with PHYLIP be under a similar license.

© Copyright 1980-2014, Joseph Felsenstein
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.



The Documentation Files and How to Read Them

PHYLIP comes with an extensive set of documentation files. These include the main documentation file (this one), which you should read fairly completely. In addition there are files for groups of programs, including ones for the molecular sequence programs, the distance matrix programs, the gene frequency and continuous characters programs, the discrete characters programs, and the tree drawing programs. Finally, each program has its own documentation file. References for the documentation files are all gathered together in this main documentation file. A good strategy is to:

  1. Read this main documentation file.
  2. Tentatively decide which programs are of interest to you.
  3. Read the documentation files for the groups of programs that contain those.
  4. Read the documentation files for those individual programs.

There is an excellent guide to using PHYLIP 3.6 also available. It was written by Jarno Tuimala of the Center for Scientific Computing in Espoo, Finland and is available as a PDF here. It is also distributed at the main PHYLIP web site.


What The Programs Do

Here is a short description of each of the programs. For more detailed discussion you should definitely read the documentation file for the individual program and the documentation file for the group of programs it is in. In this list the name of each program is a link which will take you to the documentation file for that program. Note that there is no program in the PHYLIP package called PHYLIP.

Clique
Finds the largest clique of mutually compatible characters, and the phylogeny which they recommend, for discrete character data with two states. The largest clique (or all cliques within a given size range of the largest one) are found by a very fast branch and bound search method. The method does not allow for missing data. For such cases the T (Threshold) option of Pars or Mix may be a useful alternative. Compatibility methods are particularly useful when some characters are of poor quality and the rest of good quality, but when it is not known in advance which ones are which.
Consense
Computes consensus trees by the majority-rule consensus tree method, which also allows one to easily find the strict consensus tree. Is not able to compute the Adams consensus tree. Trees are input in a tree file in standard nested-parenthesis notation, which is produced by many of the tree estimation programs in the package. This program can be used as the final step in doing bootstrap analyses for many of the methods in the package.
Contml
Estimates phylogenies from gene frequency data by maximum likelihood under a model in which all divergence is due to genetic drift in the absence of new mutations. Does not assume a molecular clock. An alternative method of analyzing this data is to compute Nei's genetic distance and use one of the distance matrix programs. This program can also do maximum likelihood analysis of continuous characters that evolve by a Brownian Motion model, but it assumes that the characters evolve at equal rates and in an uncorrelated fashion, so that it does not take into account the usual correlations of characters.
Contrast
Reads a tree from a tree file, and a data set with continuous characters data, and produces the independent contrasts for those characters, for use in any multivariate statistics package. Will also produce covariances, regressions and correlations between characters for those contrasts. Can also correct for within-species sampling variation when individual phenotypes are available within a population.
Dnacomp
Estimates phylogenies from nucleic acid sequence data using the compatibility criterion, which searches for the largest number of sites which could have all states (nucleotides) uniquely evolved on the same tree. Compatibility is particularly appropriate when sites vary greatly in their rates of evolution, but we do not know in advance which are the less reliable ones.
Dnadist
Computes four different distances between species from nucleic acid sequences. The distances can then be used in the distance matrix programs. The distances are the Jukes-Cantor formula, one based on Kimura's 2-parameter method, the F84 model used in Dnaml, and the LogDet distance. The distances can also be corrected for gamma-distributed and gamma-plus-invariant-sites-distributed rates of change in different sites. Rates of evolution can vary among sites in a prespecified way, and also according to a Hidden Markov model. The program can also make a table of percentage similarity among sequences.
Dnainvar
For nucleic acid sequence data on four species, computes Lake's and Cavender's phylogenetic invariants, which test alternative tree topologies. The program also tabulates the frequencies of occurrence of the different nucleotide patterns. Lake's invariants are the method which he calls "evolutionary parsimony".
Dnaml
Estimates phylogenies from nucleotide sequences by maximum likelihood. The model employed allows for unequal expected frequencies of the four nucleotides, for unequal rates of transitions and transversions, and for different (prespecified) rates of change in different categories of sites, and also use of a Hidden Markov model of rates, with the program inferring which sites have which rates. This also allows gamma-distribution and gamma-plus-invariant sites distributions of rates across sites.
Dnamlk
Same as Dnaml but assumes a molecular clock. The use of the two programs together permits a likelihood ratio test of the molecular clock hypothesis to be made.
Dnamove
Interactive construction of phylogenies from nucleic acid sequences, with their evaluation by parsimony and compatibility and the display of reconstructed ancestral bases. This can be used to find parsimony or compatibility estimates by hand.
Dnapars
Estimates phylogenies by the parsimony method using nucleic acid sequences. Allows use of the full IUB ambiguity codes, and estimates ancestral nucleotide states. Gaps are treated as a fifth nucleotide state. It can also do transversion parsimony. Can cope with multifurcations, reconstruct ancestral states, use 0/1 character weights, and infer branch lengths.
Dnapenny
Finds all most parsimonious phylogenies for nucleic acid sequences by branch-and-bound search. This may not be practical (depending on the data) for more than 10-11 species or so.
Dollop
Estimates phylogenies by the Dollo or polymorphism parsimony criteria for discrete character data with two states (0 and 1). Also reconstructs ancestral states and allows weighting of characters. Dollo parsimony is particularly appropriate for restriction sites data; with ancestor states specified as unknown it may be appropriate for restriction fragments data.
Dolmove
Interactive construction of phylogenies from discrete character data with two states (0 and 1) using the Dollo or polymorphism parsimony criteria. Evaluates parsimony and compatibility criteria for those phylogenies and displays reconstructed states throughout the tree. This can be used to find parsimony or compatibility estimates by hand.
Dolpenny
Finds all most parsimonious phylogenies for discrete-character data with two states, for the Dollo or polymorphism parsimony criteria using the branch-and-bound method of exact search. May be impractical (depending on the data) for more than 10-11 species.
Drawgram
Plots rooted phylogenies, cladograms, circular trees and phenograms in a wide variety of user-controllable formats. The program is interactive. It has an interface in the Java language which gives it a closely similar menu on all three major operating systems. Final output can be to a file formatted for one of the drawing programs, for a ray-tracing or VRML browser, or one that can be sent to a laser printer (such as Postscript or PCL-compatible printers), or it can go to graphics screens or terminals, pen plotters, or dot matrix printers capable of graphics. Many of these formats are historic, so we no longer have hardware to test them. If you find a problem please report it.
Drawtree
Similar to Drawgram but plots unrooted phylogenies. It also has a Java interface for previews.
Factor
Takes discrete multistate data with character state trees and produces the corresponding data set with two states (0 and 1). Written by Christopher Meacham. This program was formerly used to accommodate multistate characters in Mix, but this is less necessary now that Pars is available.
Fitch
Estimates phylogenies from distance matrix data under the "additive tree model" according to which the distances are expected to equal the sums of branch lengths between the species. Uses the Fitch-Margoliash criterion and some related least squares criteria, or the Minimum Evolution distance matrix method. Does not assume an evolutionary clock. This program will be useful with distances computed from molecular sequences, restriction sites or fragments distances, with DNA hybridization measurements, and with genetic distances computed from gene frequencies.
Gendist
Computes one of three different genetic distance formulas from gene frequency data. The formulas are Nei's genetic distance, the Cavalli-Sforza chord measure, and the genetic distance of Reynolds et al. The former is appropriate for data in which new mutations occur in an infinite isoalleles neutral mutation model, the latter two for a model without mutation and with pure genetic drift. The distances are written to a file in a format appropriate for input to the distance matrix programs.
Kitsch
Estimates phylogenies from distance matrix data under the "ultrametric" model which is the same as the additive tree model except that an evolutionary clock is assumed. The Fitch-Margoliash criterion and other least squares criteria, or the Minimum Evolution criterion are possible. This program will be useful with distances computed from molecular sequences, restriction sites or fragments distances, with distances from DNA hybridization measurements, and with genetic distances computed from gene frequencies.
Mix
Estimates phylogenies by some parsimony methods for discrete character data with two states (0 and 1). Allows use of the Wagner parsimony method, the Camin-Sokal parsimony method, or arbitrary mixtures of these. Also reconstructs ancestral states and allows weighting of characters (does not infer branch lengths).
Move
Interactive construction of phylogenies from discrete character data with two states (0 and 1). Evaluates parsimony and compatibility criteria for those phylogenies and displays reconstructed states throughout the tree. This can be used to find parsimony or compatibility estimates by hand.
Neighbor
An implementation by Mary Kuhner and John Yamato of Saitou and Nei's "Neighbor Joining Method," and of the UPGMA (Average Linkage clustering) method. Neighbor Joining is a distance matrix method producing an unrooted tree without the assumption of a clock. UPGMA does assume a clock. The branch lengths are not optimized by the least squares criterion but the methods are very fast and thus can handle much larger data sets.
Pars
Multistate discrete-characters parsimony method. Up to 8 states (as well as "?") are allowed. Cannot do Camin-Sokal or Dollo Parsimony. Can cope with multifurcations, reconstruct ancestral states, use character weights, and infer branch lengths.
Penny
Finds all most parsimonious phylogenies for discrete-character data with two states, for the Wagner, Camin-Sokal, and mixed parsimony criteria using the branch-and-bound method of exact search. May be impractical (depending on the data) for more than 10-11 species.
Proml
Estimates phylogenies from protein amino acid sequences by maximum likelihood. The PAM, JTT, or PMB models can be employed, and also use of a Hidden Markov model of rates, with the program inferring which sites have which rates. This also allows gamma-distribution and gamma-plus-invariant sites distributions of rates across sites. It also allows different rates of change at known sites.
Promlk
Same as Proml but assumes a molecular clock. The use of the two programs together permits a likelihood ratio test of the molecular clock hypothesis to be made.
Protdist
Computes a distance measure for protein sequences, using maximum likelihood estimates based on the Dayhoff PAM matrix, the JTT matrix model, the PMB model, Kimura's 1983 approximation to these, or a model based on the genetic code plus a constraint on changing to a different category of amino acid. The distances can also be corrected for gamma-distributed and gamma-plus-invariant-sites-distributed rates of change in different sites. Rates of evolution can vary among sites in a prespecified way, and also according to a Hidden Markov model. The program can also make a table of percentage similarity among sequences. The distances can be used in the distance matrix programs.
Protpars
Estimates phylogenies from protein sequences (input using the standard one-letter code for amino acids) using the parsimony method, in a variant which counts only those nucleotide changes that change the amino acid, on the assumption that silent changes are more easily accomplished.
Restdist
Distances calculated from restriction sites data or restriction fragments data. The restriction sites option is the one to use to also make distances for RAPDs or AFLPs.
Restml
Estimation of phylogenies by maximum likelihood using restriction sites data (not restriction fragments but presence/absence of individual sites). It employs the Jukes-Cantor symmetrical model of nucleotide change, which does not allow for differences of rate between transitions and transversions. This program is very slow.
Retree
Reads in a tree (with branch lengths if necessary) and allows you to reroot the tree, to flip branches, to change species names and branch lengths, and then write the result out. Can be used to convert between rooted and unrooted trees, and to write the tree into a preliminary version of a new XML tree file format which is under development and which is described in the Retree documentation web page.
Seqboot
Reads in a data set, and produces multiple data sets from it by bootstrap resampling. Since most programs in the current version of the package allow processing of multiple data sets, this can be used together with the consensus tree program Consense to do bootstrap (or delete-half-jackknife) analyses with most of the methods in this package. This program also allows the Archie/Faith technique of permutation of species within characters. It can also rewrite a data set to convert it between the PHYLIP Interleaved and Sequential forms, and into a preliminary version of a new XML sequence alignment format which is under development and which is described in the Seqboot documentation web page.
Threshml
Reads a tree from a tree file, and a data set with discrete 0/1 characters. Using the threshold model of quantitative genetics, the program runs a Markov Chain Monte Carlo (MCMC) sampler to sample the underlying continuous characters (the liabilities) that cause the discrete characters. The covariances of the liabilities are estimated, as well as the transformation from the liabilities to underlying independently evolving characters.
Treedist
Computes the Branch Score distance between trees, which allows for differences in tree topology and which also makes use of branch lengths. Also computes another distance by Robinson and Foulds that uses branch lengths, and the Symmetric Difference distance between trees, which allows for differences in tree topology but does not use branch lengths.


Running the Programs

This section assumes that you have obtained PHYLIP as compiled executables (for Windows, Mac OS X, or Linux), or else you have obtained the source code and compiled it yourself (for Linux, Unix, Mac OS X, or Windows). For the programs Drawtree and Drawgram you will also need a recent version of Java installed on your computer to run them interactively. Note that for machines for which compiled executables are available, there will usually be no need for you to have a compiler or compile the programs yourself. This section describes how to run the programs. Later in this document we will discuss how to download and install PHYLIP (in case you are reading this without yet having done that). Normally you will only read your copy of the documentation files after downloading and installing PHYLIP.

After describing the input files, we will describe how to run most of the programs on Windows, Mac OS X, Linux, and Unix systems. After that, we will give special descriptions of the interactive Java interface for the tree-drawing programs Drawgram and Drawtree, including how to run these interfaces on Windows, Mac OS X, and Linux systems. These may require you to download and install on your computer the most recent version of Oracle Java, which is available from Oracle at no cost. We describe this below after discussing input files.

A word about input files.

For all of these types of machines, it is important to have the input files for the programs (typically data files) prepared in advance. They can be prepared in any editor, but it is important that they be saved in Text Only ("flat ASCII") format, not in the format that word processors such as Microsoft Word want to write (in Microsoft Word, make sure that the data encoding used is "US ASCII", as using any of the Unicode codings can cause trouble). It is up to you to read the PHYLIP documentation files which describe the file formats that are needed. There is a partial description in the next section of this document. The input files can also be obtained by running a program that produces output files in PHYLIP format (some of these programs do, and so do programs by others, including sequence alignment programs such as ClustalW and sequence format conversion programs such as Readseq). There is not any input file editor available in any program in PHYLIP (you should not simply start running one of the programs and then expect to click a mouse somewhere to start creating a data file).
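If you are unsure whether an editor really saved a file as flat ASCII, a quick check can help. The following Python sketch is only an illustration and is not part of PHYLIP; the function name non_ascii_positions is hypothetical.

```python
def non_ascii_positions(data):
    """Return 1-based (line, column) pairs for bytes outside 7-bit ASCII."""
    positions = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 127:
                positions.append((line_no, col))
    return positions
```

It takes the raw bytes of a file, for example the result of open("infile", "rb").read(), and an empty list means every byte is plain ASCII.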

When they start running, the programs look first for input files with particular names (such as infile, treefile, intree, or fontfile). Exactly which file names they look for varies a bit from program to program, and you should read the documentation file for the particular program to find out. If you have files with those names the programs will use them and not ask you for the file name. If they do not find files of those names, the programs will say that they cannot find a file of that name, and ask you to type in the file name. For example, if Dnaml looks for the file infile and does not find one of that name, it prints the message:

dnaml: can't find input file "infile"
Please enter a new file name>

This does not mean that an error has occurred. All you need to do is to type in the name of the file.
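When running the programs from a script it can be convenient to know beforehand which of the default file names are absent, so that the prompt does not come as a surprise. Here is a minimal Python sketch of such a check; it is not part of the package, the function name missing_files is hypothetical, and which names a given program actually looks for must be taken from that program's documentation file.

```python
import os


def missing_files(folder, names=("infile",)):
    """Return the names from `names` that are not files in `folder`."""
    return [
        name for name in names
        if not os.path.isfile(os.path.join(folder, name))
    ]
```

For example, missing_files(".", ("infile", "intree")) lists whichever of those two names are not present in the current folder.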

The program looks for the input files in the same folder that the program is in (a folder is the same thing as a "directory"). In Windows, Mac OS X, Linux, or Unix, if you are asked for the file name you can type in the path to the file as part of the name (thus, if the file is in the folder containing the current folder, that is, the folder "above" it, you can type in a file name such as ../myfile.dna). If you are not familiar with folders, or with what it means for one folder to be "above" another, you should ask someone to explain folders to you.

Running the programs on a Macintosh with Mac OS X

We have provided a Mac OS X version of the executables, in the form of "universal binaries" that should run either on PowerMac or Intel iMac systems (to ensure that they will run on both 32-bit and 64-bit Mac OS X systems, we have made sure that we compiled the executables as 32-bit executables). The programs can be run by clicking on their icons. They open a Terminal window, and the menu appears in it. Note that after the program is finished, the Terminal window remains open, and operations can be done in it. You will have to close the window yourself if you don't want it. The programs can be terminated by typing control-C (press down the "control" key in the lower-left corner of the keyboard and type "c").

It is also possible to run the executables from within a Terminal window by typing the program name, but this is a little harder. You will find the Terminal utility available in the Utilities folder in the Applications folder. You do need to have links made in the exe folder to the programs. This can be done the first time you need them, by entering the exe folder and opening a Terminal window, and then typing source linkmac. This creates the proper links, and thereafter you do not need to do this again. The programs can be run by typing their names in a Terminal window whose current working directory is exe. The programs work well this way, though the programs Drawgram and Drawtree may be slow to open and close plotting windows. The programs can be terminated by typing control-C or by closing the Terminal window by using the red button in the upper-left corner of the window.

One problem we have often encountered using Mac OS X is that it is possible for data files to have the wrong kind of characters at the ends of their lines. They may have carriage-return (ASCII/ISO 13 or control-M) characters at the ends of their lines when they should instead have the Unix newline character (ASCII/ISO 10 or control-J) there. This can happen with files transferred from other operating systems or files produced in some word processors. It results in segmentation-fault or memory errors. If you encounter these, check this possibility carefully.
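A minimal shell sketch of one way to check a data file for this problem and repair it is given here. The function name to_unix_newlines and the file names in the usage line are hypothetical. It relies only on the standard Unix utilities grep, tr, wc and cp, and it assumes that a file has one kind of line ending throughout.

```shell
# Hypothetical helper: write a copy of file $1 to file $2 with Unix
# newline (control-J) characters at the ends of its lines.
to_unix_newlines() {
    cr=$(printf '\r')
    if ! grep -q "$cr" "$1"; then
        cp "$1" "$2"                    # no carriage returns: nothing to repair
    elif [ "$(tr -cd '\n' < "$1" | wc -c)" -gt 0 ]; then
        tr -d '\r' < "$1" > "$2"        # carriage return + newline: drop the returns
    else
        tr '\r' '\n' < "$1" > "$2"      # carriage returns only: turn each into a newline
    fi
}

# Usage (file names are examples only):
#   to_unix_newlines mydata.txt infile
```

The repaired copy is written to a new file rather than over the original, so that the original can still be inspected if something looks wrong.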

If you normally run Mac OS X applications using open -a, you may need to use the command lsregister -f -r /your/path/to/apps. You can find it with the command locate lsregister.

Running the programs on a Unix or Linux system.

Type the name of the program in lower-case letters (such as dnaml). To terminate the program while it is running, type Control-C (which means to press down on the Ctrl key while typing the letter C).

On some systems you may need to type ./ before the program name, so that in the above case it would be ./dnaml. This is mostly needed if the user's PATH does not include their current directory, something which is often done as a security precaution.

Running the programs on a Macintosh with Mac OS 8 or 9 (deprecated)

We no longer produce and distribute Mac OS 8 and Mac OS 9 executables of the Phylip programs, as we no longer have access to these operating systems to produce and test them. As a last resort, only if you do not have access to a system that will run the current distribution, you have two choices:

Once you have the executables, you may follow the directions below.


Running the Drawgram and Drawtree Java interfaces

With version 3.695 we have released an interactive Java interface for the tree-drawing programs, Drawgram and Drawtree. The reason is that the graphic interface language for Mac OS X has changed from the Carbon GUI to the Cocoa GUI, which would require a lot of rewriting of code. The alternative X11 (X Windows) GUI machinery on Mac OS X has been deprecated by Apple, and is showing its age on Linux systems.

Looking at available options, it seemed best to use Java to construct GUI interfaces, as this could be done in a reasonably compatible way across all three major platforms. There are disadvantages too -- to get full compatibility we need to ask users to download the most recent available Java from its maker, Oracle. That is not difficult but is a tiresome extra step. Oracle owns Java, and Java is not public-source, but there seems to be no sign that Oracle is going to make Java runtime machinery unavailable or charge for it.

Not all Java implementations will run PHYLIP's Drawgram and Drawtree GUIs. A reasonably compatible Java is distributed with Mac OS X, but no Java is distributed along with Windows, and the Java distributed with Linux distributions is unfortunately not compatible enough with our Java GUI. So for these two platforms you will need to download Oracle Java. We will give you instructions for that below.

The new GUI for Drawgram and Drawtree is a testbed for a general set of GUI interfaces for all our programs. These will be present in version 4.0, which will be distributed soon. The work you do to put a recent version of Oracle Java on your system will make using version 4.0 easier.

For people who use Drawgram or Drawtree in a "pipeline" run by shell scripts, there should be no interruption in your ability to do that. The current C code for those programs can either be called by the Java GUI or be run from a command line or a shell script (for which see below). Almost all of the features of Drawgram and Drawtree are available from their character-mode menu when run that way, except for the interactive previewing of plots. We hope that the shell scripts will still work and will not need modification for this version of PHYLIP.

Running the Drawgram and Drawtree Java GUI interfaces in Windows

To run the Drawgram or Drawtree programs, you find the Drawgram.jar or Drawtree.jar files, which are Java Archive files in our folder of executable programs. You can run them by clicking on their icons. Detailed instructions for using the interfaces are given in the general documentation file for tree-drawing programs draw.html (which you should read), and the documentation files for the two programs drawgram.html and drawtree.html.

Installing a recent version of Oracle Java

To run the interactive interfaces of the tree-drawing programs Drawgram and Drawtree, you need to have an appropriate version of Java installed on your computer. If you have Java installed, you should test whether it is an appropriate version by trying to run Drawgram or Drawtree (for this you will need an input tree file present as well). Is it likely that you have a compatible Java on your system?

Once a useable version of Java is installed, you do not have to repeat the installation every time you run one of the programs Drawgram or Drawtree.

Running the programs on a Windows machine.

Double-click on the icon for the program. A window should open with a menu in it. Further dialog with the program occurs by typing on the keyboard in response to what you see in the window. The programs can be terminated either by typing Control-C (which means to press down on the Ctrl key while typing the letter C), or by using the mouse to open the File menu in the upper-left corner of the program's window area and then select Quit. Other than this, most PHYLIP programs make no use of the mouse. The tree-drawing programs Drawtree and Drawgram do allow use of the mouse to select some options.

The programs open a window for their menus. This window may be too small for your tastes. It can be resized by tugging on the lower-right corner of the window. In addition, the font may be too small. On most versions of Windows, you can click on the small C:\ icon symbol at the upper-left corner of the window, and choose the Properties menu choice there. One of its tab options allows you to change the font and size of the print. I prefer large font sizes such as 16x12.

The programs can also be run in a Command Prompt window under Windows, in much the same way as they were under the MSDOS operating system, which is what the Command Prompt window emulates. Command Prompt windows can be opened by choosing that option in the Accessories menu which is in the All Programs menu. Once in the Command Prompt window, make sure that you are in the correct folder, using the cd command as needed to find the folder where the executable PHYLIP programs are. Then type the name of the program that you want to use in lower-case letters (such as dnaml). To terminate the program while it is running, type Control-C (which means to press down on the Ctrl key while typing the letter C).

Running the programs in background or under control of a command file

In running the programs, you may sometimes want to put them in background so you can proceed with other work. On systems with a windowing environment they can be put in their own window, and commands like the Unix and Linux nice command used to make them have lower priority so that they do not interfere with interactive applications in other windows. This part of the discussion will assume either a Windows system or a Unix or Linux system. I will note when the commands work on one of these systems but not the other. Mac OS X is actually Unix (surprise! surprise!) and you can run PHYLIP programs in background on any Mac OS X system by simply following the instructions for Unix, using a terminal window to do so if necessary. (The Terminal utility can be found in the Utilities folder which is inside the Applications folder).

If there is no windowing environment, or if you want to make PHYLIP programs part of a larger workflow of some sort, on a Unix or Linux system you will want to use an ampersand (&) after the command file name when invoking it to put the job in the background. You will have to put all the responses to the interactive menu of the program into a file and tell the background job to take its input from that file (we cover this below).

On Windows systems there is no & or nice command, but input and output redirection and command files work fine in a Command Prompt window. A command file can either be invoked by clicking on its icon or by typing its name in a Command Prompt window. The file of commands must have a name ending in .bat or .cmd, such as foofile.bat. You can run the batch file from a Command Prompt window by typing its name (such as foofile) without the .bat.

Here are examples, for the different operating systems:

An example (Unix, Linux or Mac OS X)

Here is an example for Unix, Linux, or a Terminal window in Mac OS X. Below you will find a separate example for Windows. If you are using Windows you should read that section instead.

Suppose you want to run Dnaml in the background, taking its input data from a file called sequences.dat, putting its interactive output into a file called screenout, and using a file called input as the place to store the interactive input. The file input need only contain two lines:

sequences.dat
Y

which is what you would have typed to run the program interactively: the first line in response to the program's request for an input file name if it did not find a file named infile, and the second in response to the menu.

To run the program in background, in Unix or Linux you would simply give the command:

dnaml < input > screenout &

This runs the program with input responses coming from input and interactive output being put into file screenout. The usual output file and tree file will also be created by this run (keep that in mind: if you run any other PHYLIP program from the same directory while this one is running in background, you may overwrite the output file from one program with that from the other!).

Subtleties (in Unix, Linux, or Mac OS X)

If you wanted to give the program lower priority, so that it would not interfere with other work, and you have Berkeley Unix type job control facilities in your Unix or Linux (and you usually do), you can use the nice command:

nice +10 dnaml < input > screenout &

which lowers the priority of the run. To also time the run and put the timing at the end of screenout, you can do this:

nice +10 ( time dnapars < input ) >& screenout &

which I will not attempt to explain.
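The two nice commands above use the spelling of the C shell (csh): nice +10 is that shell's way of lowering priority, and >& sends both ordinary output and error messages to the same file. For users of Bourne-type shells (sh, bash), a rough equivalent can be sketched as follows. The function name run_low_priority is hypothetical, the sketch leaves out the timing of the run, and it assumes that the standard nice utility is available.

```shell
# Hypothetical helper: run program $1 in the background at lowered priority,
# taking its keyboard responses from file $2 and putting everything it would
# have shown on the screen (error messages included) into file $3.
run_low_priority() {
    nice -n 10 "$1" < "$2" > "$3" 2>&1 &
}

# Usage (assumes dnaml is on your PATH and the response file input exists):
#   run_low_priority dnaml input screenout
#   wait      # optional: pause a script until the background job has finished
```

Here 2>&1 plays the role of >& in the C shell version, and the final & inside the function is what puts the job in the background.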

On Unix or Linux systems you may also want to explore putting the interactive output into the null file /dev/null so as to not be bothered with it (but then you cannot look at it to see why something went wrong). If you have problems with creating output files that are too large, you may want to explore carefully the turning off of options in the programs you run.

If you are doing several runs in one, as for example when you do a bootstrap analysis using Seqboot, Dnapars (say), and Consense, you can use an editor to create a "command file" with these commands:

seqboot < input1 > screenout
mv outfile infile
dnapars < input2 >> screenout
mv outtree intree
consense < input3 >> screenout

The command file might be named something like foofile.

It must be given execute permission by using the command chmod +x foofile. The job that foofile describes can be run in background on Unix or Linux by giving the command

foofile &

Note that you must also have the interactive input commands for Seqboot (including the random number seed), Dnapars, and Consense in the separate files input1, input2, and input3.

An example (Windows)

Suppose you have a Windows system and want to run Dnaml in the background, taking its input data from a file called sequences.dat, putting its interactive output into a file called screenout, and using a file called input as the place to store the interactive input. The file input need only contain two lines:

sequences.dat
Y

which is what you would have typed to run the program interactively: the first line in response to the program's request for an input file name if it did not find a file named infile, and the second in response to the menu.

To run the program in background, you can place the command

dnaml < input > screenout

in a file called something like foofile.bat. This "batch file", which contains commands and has a name ending in .bat or .cmd, can be run simply by double-clicking on the file icon, which will usually have a picture of a gear. A Command Prompt window (an MSDOS window) will then open and the commands in the batch file will be run in it. Alternatively, you can open a Command Prompt window yourself. It will be found in the All Programs menu, as one of the options under Accessories. Make sure that after it opens, you tell it to change its working directory to the one that has the batch file in it.

The batch file with this command runs the program with input responses coming from input and interactive output being put into file screenout. The usual output file and tree file will also be created by this run (keep that in mind as, if you run any other PHYLIP program from the same directory while this one is running in background, you may overwrite the output file from one program with that from the other!).

Testing for existence of files

Note also that when PHYLIP programs attempt to open a new output file (such as outfile, outtree, or plotfile), if they see a file of that name already in existence they will ask you if you want to overwrite it, and offer alternatives including writing to another file, appending information to that file, or quitting the program without writing to the file. This means that in writing batch files it is important to know whether there will be a prompt of this sort. You must know in advance whether the file will exist. You may want to put in your batch file a command that tests for the existence of a pre-existing output file and if so, removes it, such as these commands in Unix, Linux, or Mac OS X:

if test -e fubarfile
then
   rm fubarfile
fi

You might even want to put in a command that creates a file of that name, so that you can be sure it is there! Either way, you will then know whether to put into your file of keyboard responses the proper response to the inquiry about overwriting that output file.
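When a run can leave several output files behind, the same test can be wrapped in a small shell function, sketched here under the assumption that a Bourne-type shell is in use. The function name clear_old_outputs is hypothetical; the file names in the usage line are the default PHYLIP output names mentioned above.

```shell
# Hypothetical helper: remove any leftover output files named in the
# arguments, so that the program will not ask whether to overwrite them.
clear_old_outputs() {
    for f in "$@"; do
        if test -e "$f"; then
            rm "$f"
        fi
    done
}

# Usage, in the folder where the run will take place:
#   clear_old_outputs outfile outtree plotfile
```

After this has been run, none of the named files exists, so the file of keyboard responses should not contain an answer to the overwriting question.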

In a Windows batch file, the corresponding test can be made with the if exist command, for example: if exist fubarfile del fubarfile

Prototyping keyboard response files

Making the proper files of keyboard responses for use with command files is most easily done if you prototype the process by simply running the program and keeping a careful record of the keyboard responses that you need to give to get the program to run properly. Then create a file in an editor and type those keyboard responses into it. Thus if the program requires that you answer a question about what to do with the output file with a keyboard response of R, then wants you to type a menu selection of U (to have it use a User tree), then wants you to answer Y to end the menu, and another R to tell it to replace the output file, you would have the file of keyboard responses be

R
U
Y
R

Since when you run the program interactively, each keyboard response is ended by pressing the Enter key on your keyboard, in the file of keyboard responses you must end each line after typing the appropriate character.
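If you would rather create the file of keyboard responses from a command file than in an editor, one possible sketch is the following. The function name write_responses is hypothetical; it uses only the standard printf command, which writes each response followed by the end-of-line that the Enter key would have produced.

```shell
# Hypothetical helper: write each keyboard response on its own line of
# file $1, as if Enter had been pressed after each one.
write_responses() {
    out=$1
    shift
    printf '%s\n' "$@" > "$out"
}

# The four responses from the example above, stored in a file called input:
#   write_responses input R U Y R
```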

Testing the keyboard responses with an interactive run will be essential to having batch runs succeed.


Preparing Input Files

The input files for PHYLIP programs must be prepared separately - there is no data editor within PHYLIP. You can use a word processor (or text editor) to prepare them yourself, or you can use a program that produces a PHYLIP-format output.

With the 3.695 release of Phylip we have included a directory called TestData which contains the data used to generate the examples shown in the individual program html pages and the output files they produce. Within this TestData directory there is a subdirectory that has the name of the program (for example contrast) and within that there are the files contrastinfile.txt, contrastintree.txt and contrastoutfile.txt. If you look at the Contrast documentation you can see infile, intree, and outfile mentioned in the example. The testdata/contrast/*.txt files exactly match those in the example, so if you wish to experiment with Contrast you have both a good infile and a good intree and the outfile expected from the example, if you set your conditions to match the example.

Sequence alignment programs such as ClustalW commonly have an option to produce PHYLIP files as output, and some other phylogeny programs, such as MacClade and TreeView, are capable of producing a PHYLIP-format file.

It is very important that the input files be in "Text Only" or "ASCII" format. This means that they contain only printable ASCII/ISO characters, and not any unprintable characters. Many word processors such as Microsoft Word save their files in a format that contains unprintable characters, unless you tell them not to. In the Microsoft Word family of word processors, the first time you edit a file, when you go to Save in the File menu, the program will instead do a Save As function, and ask you in what format you want the file to be written.

For these word processors, the next time you edit the same file, using Save, the program should use those settings without asking you. If you have some trouble getting an input file that the programs can read, look into whether you properly set these options. This can usually be done by using the Save As choice in the File menu and making the right settings.

Text editors such as the vi and emacs editors on Unix and Linux (and available on Mac OS X too), or the pico editor that comes with the pine mailer program, produce their files in Text Only format and should not cause any trouble.

The format of the input files is discussed below, and you should also read the other PHYLIP documentation relevant to the particular type of data that you are using, and the particular programs you want to run, as there will be more details there.

Input and output files

For most of the PHYLIP programs, information comes from a series of input files, and ends up in a series of output files:

                   -------------------
                  |                   |
infile ---------> |                   |
                  |                   |
intree ---------> |                   | -----------> outfile
                  |                   |
weights --------> |      program      | -----------> outtree
                  |                   |
categories -----> |                   | -----------> plotfile
                  |                   |
fontfile -------> |                   |
                  |                   |
                   -------------------

The programs interact with the user by presenting a menu. Aside from the user's choices from the menu, they read all other input from files. These files have default names. The program will try to find a file of that name - if it does not, it will ask the user to supply the name of that file. Input data such as DNA sequences comes from a file whose default name is infile. If the user supplies a tree, this is in a file whose default name is intree. Values of weights for the characters are in weights, and the tree plotting programs need some digitized fonts which are supplied in fontfile (all these are default names).

For example, if Dnaml looks for the file infile and does not find one of that name, it prints the message:

dnaml: can't find input file "infile"
Please enter a new file name>

This simply means that it wants you to type in the name of the input file.

Where the files are

When you run a program, you are in a current folder. If you run it by clicking on an icon, the folder is the one that has the icon. If you run it by typing the name of the program, the folder is the current folder when you do that. The program will look for default files (such as infile and intree) in that folder. When it writes files, their default locations are also in the current folder.

The program need not actually be in the current folder. An icon can sometimes be a link to a program located elsewhere. A program name typed by you can contain a “path”, so that if you type /usr/local/phylip/dnaml the program run will be located in folder /usr/local/phylip. The operating system maintains a default path for your account, which is a series of names of folders. When you type the name of a program, the operating system will look in that series of folders until it finds the program, and then run it. But in all of these cases, the input and output files will, by default, be in the current folder, even if the program is located in some other folder.

Users can change where the input files are, or where the output files go. If no file called infile is found in the current folder, you will be asked to type the name of the file. In that case you can type a filename with a path, such as foobar/mydata, and in that case the program will look for file mydata in folder foobar within the current folder. A similar process occurs when the program cannot find file intree.

When the program starts to write an output file, such as outfile, a similar series of events happens, with one important difference. It is when a file outfile already exists in the current folder that the user will be asked what to do. (In the case of input files, it was when they did not exist that the user is asked what to do). You will be given the opportunity to Replace the file, Append to the file, write to a different File, or Quit. If you choose the response F you will be asked for the name of the different file, and that is when you can give a filename with a path, such as foobar/myoutput.out, and the file will be written in that folder instead of the current folder.

Understanding which folder is the current folder, and whether there are files named infile, intree, outfile, or outtree there, is crucial to successfully running PHYLIP programs, and making sure that they analyze the correct data set and write their files in the right place.
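Because the default files are always looked for and written in the current folder, one way to keep two analyses from overwriting each other's outfile is to give each run a folder of its own. A minimal shell sketch of this is given here; the function name run_in_folder and the folder name run1 are hypothetical, and a Bourne-type shell is assumed.

```shell
# Hypothetical helper: run a command with folder $1 as its current folder.
# The parentheses start a subshell, so the cd does not change the current
# folder of the shell that called this function.
run_in_folder() {
    dir=$1
    shift
    mkdir -p "$dir" && ( cd "$dir" && "$@" )
}

# Usage (assumes dnaml is on your PATH, and that run1 already holds the
# data file infile and a response file called input):
#   run_in_folder run1 sh -c 'dnaml < input > screenout'
```

With this arrangement outfile, outtree and screenout all end up inside run1, wherever the program itself is located.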

Data file format

I have tried to adhere to a rather stereotyped input and output format. For the parsimony, compatibility and maximum likelihood programs, excluding the distance matrix methods, the simplest version of the input data file looks something like this:

   6   13
Archaeopt CGATGCTTAC CGC
HesperorniCGTTACTCGT TGT
BaluchitheTAATGTTAAT TGT
B. virginiTAATGTTCGT TGT
BrontosaurCAAAACCCAT CAT
B.subtilisGGCAGCCAAT CAC

The first line of the input file contains the number of species and the number of characters (in this case sites). These are in free format, separated by blanks. The information for each species follows, starting with a ten-character species name (which can include blanks and some punctuation marks), and continuing with the characters for that species. The name should be on the same line as the first character of the data for that species. (I will use the term "species" for the tips of the trees, recognizing that in some cases these will actually be populations or individual gene sequences).

The name should be ten characters in length, and either terminated by a Tab character or filled out to the full ten characters by blanks if shorter. Any printable ASCII/ISO character is allowed in the name, except for parentheses ("(" and ")"), square brackets ("[" and "]"), colon (":"), semicolon (";") and comma (","). If you forget to extend the names to ten characters in length by blanks, and do not terminate them with a Tab character, the program will get out of synchronization with the contents of the data file, and an error message will result. A Tab character that terminates a name will not be taken as part of the name that is read; the name will then automatically be filled with blanks to a total length of 10 characters.
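If the data lines are produced by a script, the padding of names can be left to the standard printf command. The sketch below is only an illustration: the function name phylip_line is hypothetical, it assumes plain ASCII names, and it does not check for the characters that are not allowed in names.

```shell
# Hypothetical helper: print a species name filled out with blanks (or cut
# off) to exactly ten characters, followed by the data for that species.
phylip_line() {
    printf '%-10.10s%s\n' "$1" "$2"
}

# Usage:
#   phylip_line Human ACGT          # name is followed by five blanks
#   phylip_line Hesperornis CGTT    # name is cut to its first ten characters
```

In the format %-10.10s the first 10 pads short names on the right with blanks and the second 10 cuts off names that are too long.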

In the discrete-character programs, DNA sequence programs and protein sequence programs the characters are each a single letter or digit, sometimes separated by blanks. In the continuous-characters programs they are real numbers with decimal points, separated by blanks:

Latimeria 2.03 3.457 100.2 0.0 -3.7

The conventions about continuing the data beyond one line per species are different between the molecular sequence programs and the others. The molecular sequence programs can take the data in "aligned" or "interleaved" format, in which we first have some lines giving the first part of each of the sequences, then some lines giving the next part of each, and so on. Thus the sequences might look like this:

    6   39
Archaeopt CGATGCTTAC CGCCGATGCT
HesperorniCGTTACTCGT TGTCGTTACT
BaluchitheTAATGTTAAT TGTTAATGTT
B. virginiTAATGTTCGT TGTTAATGTT
BrontosaurCAAAACCCAT CATCAAAACC
B.subtilisGGCAGCCAAT CACGGCAGCC

TACCGCCGAT GCTTACCGC
CGTTGTCGTT ACTCGTTGT
AATTGTTAAT GTTAATTGT
CGTTGTTAAT GTTCGTTGT
CATCATCAAA ACCCATCAT
AATCACGGCA GCCAATCAC

Note that in these sequences we have a blank every ten sites to make them easier to read: any such blanks are allowed. The blank line which separates the two groups of lines (the ones containing sites 1-20 and ones containing sites 21-39) may or may not be present. It is important that the number of sites in each group be the same for all species (i.e., it will not be possible to run the programs successfully if the first species line contains 20 bases, but the first line for the second species contains 21 bases).
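A rough way to check these requirements before running a program is sketched below, using the standard awk utility from a shell function. The function name check_interleaved is hypothetical, and the sketch assumes that every name occupies exactly ten characters (names ended by a Tab are not handled) and that no options are in use.

```shell
# Hypothetical helper: check the interleaved file $1. Within each group of
# lines every species must have the same number of sites, and the total for
# each species must match the number of sites given on the first line.
check_interleaved() {
    awk '
        NR == 1 { nspecies = $1; nsites = $2; next }
        NF == 0 { next }                    # blank lines between groups
        {
            s = n % nspecies                # species this line belongs to
            g = int(n / nspecies)           # group this line belongs to
            data = (n < nspecies) ? substr($0, 11) : $0   # drop the 10-character name
            gsub(/[ \t\r]/, "", data)       # blanks are there only for readability
            len = length(data)
            if (s == 0) width[g] = len
            else if (len != width[g]) {
                printf "group %d, species %d: %d sites, expected %d\n", g + 1, s + 1, len, width[g]
                bad = 1
            }
            total[s] += len
            n++
        }
        END {
            for (s = 0; s < nspecies; s++)
                if (total[s] != nsites) {
                    printf "species %d: %d sites in all, expected %d\n", s + 1, total[s], nsites
                    bad = 1
                }
            exit (bad ? 1 : 0)
        }
    ' "$1"
}

# Usage:
#   check_interleaved infile && echo "format looks consistent"
```

The function prints one line for each problem it finds and returns a nonzero status if there were any, so it can be used as a test in a command file.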

Alternatively, an option can be selected in the menu to take the data in "sequential" format, with all of the data for the first species, then all of the characters for the next species, and so on. This is also the way that the discrete characters programs and the gene frequencies and quantitative characters programs want to read the data. They do not allow the interleaved format.

In the sequential format, the character data can run on to a new line at any time (except in the middle of a species name or, in the case of the continuous character and distance matrix programs, in the middle of a real number). Thus it is legal to have:

Archaeopt 001100
1101

or even:

Archaeopt
0011001101

though note that the full ten characters of the species name must then be present: in the above case there must be a blank after the "t". In all cases it is possible to put internal blanks between any of the character values, so that

Archaeopt 0011001101 0111011100

is allowed.

Note that you can convert molecular sequence data between the interleaved and the sequential data formats by using the Rewrite option of the J menu item in Seqboot.

If you make an error in the format of the input file, the programs can sometimes detect that they have been fed an illegal character or illegal numerical value and issue an error message such as BAD CHARACTER STATE:, often printing out the bad value, and sometimes the number of the species and character in which it occurred. The program will then stop shortly after. One of the things which can lead to a bad value is the omission of something earlier in the file, or the insertion of something superfluous, which causes the reading of the file to get out of synchronization. The program then starts reading things it didn't expect, and concludes that they are in error. So if you see this error message, you may also want to look for the earlier problem that may have led to the program becoming confused about what it is reading.

Some options are described below, but you should also read the documentation for the groups of the programs and for the individual programs.


The Menu

The menu is straightforward. It typically looks like this (this one is for Dnapars):

DNA parsimony algorithm, version 3.695

Setting for this run:
  U                 Search for best tree?  Yes
  S                        Search option?  More thorough search
  V              Number of trees to save?  10000
  J   Randomize input order of sequences?  No. Use input order
  O                        Outgroup root?  No, use as outgroup species  1
  T              Use Threshold parsimony?  No, use ordinary parsimony
  N           Use Transversion parsimony?  No, count all steps
  W                       Sites weighted?  No
  M           Analyze multiple data sets?  No
  I          Input sequences interleaved?  Yes
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4          Print out steps in each site  No
  5  Print sequences at all nodes of tree  No
  6       Write out trees onto tree file?  Yes

  Y to accept these or type the letter for one to change

If you want to accept the default settings (they are shown in the above case) you can simply type Y followed by pressing on the Enter key. If you want to change any of the options, you should type the letter shown to the left of its entry in the menu. For example, to set a threshold type T. Lower-case letters will also work. For many of the options the program will ask for supplementary information, such as the value of the threshold.

Note the Terminal type entry, which you will find on all menus. It allows you to specify which type of terminal your screen is. The options are an IBM PC screen, an ANSI standard terminal, or none. Choosing zero (0) toggles among these three options in cyclical order, changing each time the 0 option is chosen. If one of them is right for your terminal the screen will be cleared before the menu is displayed. If none works, the none option should probably be chosen. The programs should start with a terminal option appropriate for your computer, but if they do not, you can change the terminal type manually. This is particularly important in program Retree where a tree is displayed on the screen - if the terminal type is set to the wrong value, the tree can look very strange.

The other numbered options control which information the program will display on your screen or on the output files. The option to Print indications of progress of run will show information such as the names of the species as they are successively added to the tree, and the progress of rearrangements. You will usually want to see these as reassurance that the program is running and to help you estimate how long it will take. But if you are running the program "in background" as can be done on multitasking and multiuser systems, and do not have the program running in its own window, you may want to turn this option off so that it does not disturb your use of the computer while the program is running. Note also menu option 3, "Print out tree". This can be useful when you are running many data sets, and will be using the resulting trees from the output tree file. It may be helpful to turn off the printing out of the trees in that case, particularly if those files would be too big.


The Output File

Most of the programs write their output onto a file called (usually) outfile, and a representation of the trees found onto a file called outtree.

The exact contents of the output file vary from program to program and also depend on which menu options you have selected. For many programs, if you select all possible output information, the output will consist of (1) the name of the program and its version number, (2) some of the input information printed out, and (3) a series of phylogenies, some with associated information indicating how much change there was in each character or on each part of the tree. A typical rooted tree looks like this:

                                     +-------------------Gibbon
        +----------------------------2
        !                            !      +------------------Orang
        !                            +------4
        !                                   !  +---------Gorilla
  +-----3                                   +--6
  !     !                                      !    +---------Chimp
  !     !                                      +----5
--1     !                                           +-----Human
  !     !
  !     +-----------------------------------------------Mouse
  !
  +------------------------------------------------Bovine

The interpretation of the tree is fairly straightforward: it "grows" from left to right. The numbers at the forks are arbitrary and are used (if present) merely to identify the forks. For many of the programs the tree produced is unrooted. Rooted and unrooted trees are printed in nearly the same form, but the unrooted ones are accompanied by the warning message:

remember: this is an unrooted tree!

to indicate that this is an unrooted tree and to warn against taking the position of its root too seriously. (Mathematicians still call an unrooted tree a tree, though some systematists unfortunately use the term "network" for an unrooted tree. This conflicts with standard mathematical usage, which reserves the name "network" for a completely different kind of graph). The root of this tree could be anywhere, say on the line leading immediately to Mouse. As an exercise, see if you can tell whether the following tree is or is not a different one from the above:

             +-----------------------------------------------Mouse
             !
   +---------4                                   +------------------Orang
   !         !                            +------3
   !         !                            !      !       +---------Chimp
---6         +----------------------------1      !  +----2
   !                                      !      +--5    +-----Human
   !                                      !         !
   !                                      !         +---------Gorilla
   !                                      !
   !                                      +-------------------Gibbon
   !
   +-------------------------------------------Bovine

   remember: this is an unrooted tree!

(it is not different). It is important also to realize that the lengths of the segments of the printed tree may not be significant: some may actually represent branches of zero length, in the sense that there is no evidence that those branches are nonzero in length. Some of the diagrams of trees attempt to print branches approximately proportional to estimated branch lengths, while in others the lengths are purely conventional and are presented just to make the topology visible. You will have to look closely at the documentation that accompanies each program to see what it presents and what is known about the lengths of the branches on the tree. The above tree attempts to represent branch lengths approximately in the diagram. But even in those cases, some of the smaller branches are likely to be artificially lengthened to make the tree topology clearer. Here is what a tree from Dnapars looks like, when no attempt is made to make the lengths of branches in the diagram proportional to estimated branch lengths:

                 +--Human
              +--5
           +--4  +--Chimp
           !  !
        +--3  +-----Gorilla
        !  !
     +--2  +--------Orang
     !  !
  +--1  +-----------Gibbon
  !  !
--6  +--------------Mouse
  !
  +-----------------Bovine

  remember: this is an unrooted tree!

When a tree has branch lengths, it will be accompanied by a table showing for each branch the numbers (or names) of the nodes at each end of the branch, and the length of that branch. For the first tree shown above, the corresponding table is:

 Between        And            Length      Approx. Confidence Limits
 -------        ---            ------      ------- ---------- ------

    1          Bovine            0.90216     (  0.50346,     1.30086) **
    1          Mouse             0.79240     (  0.42191,     1.16297) **
    1             2              0.48553     (  0.16602,     0.80496) **
    2             3              0.12113     (     zero,     0.24676) *
    3             4              0.04895     (     zero,     0.12668)
    4             5              0.07459     (  0.00735,     0.14180) **
    5          Human             0.10563     (  0.04234,     0.16889) **
    5          Chimp             0.17158     (  0.09765,     0.24553) **
    4          Gorilla           0.15266     (  0.07468,     0.23069) **
    3          Orang             0.30368     (  0.18735,     0.41999) **
    2          Gibbon            0.33636     (  0.19264,     0.48009) **

      *  = significantly positive, P < 0.05
      ** = significantly positive, P < 0.01

Ignoring the asterisks and the approximate confidence limits, which will be described in the documentation file for Dnaml, we can see that the table gives a more precise idea of what the lengths of all the branches are. Similar tables exist in distance matrix and likelihood programs, as well as in the parsimony programs Dnapars and Pars.

Some of the parsimony programs in the package can print out a table of the number of steps that different characters (or sites) require on the tree. This table may not be obvious at first. A typical example looks like this:

 steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       2   2   2   2   1   1   2   2   1
   10!   1   2   3   1   1   1   1   1   1   2
   20!   1   2   2   1   2   2   1   1   1   2
   30!   1   2   1   1   1   2   1   3   1   1
   40!   1

The numbers across the top and down the side indicate which site is being referred to. Thus site 23 is column "3" of row "20" and has 1 step in this case.
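For readers who want to pull individual values out of such a table, here is a minimal Python sketch. It is not part of PHYLIP. It assumes that sites are numbered from 1 (so the row labelled 0 has no entry under column 0, as in the example) and that every data row contains a "!" separating the row label from the step counts. The names TABLE and parse_steps_table are made up for this illustration.

```python
TABLE = """\
 steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       2   2   2   2   1   1   2   2   1
   10!   1   2   3   1   1   1   1   1   1   2
   20!   1   2   2   1   2   2   1   1   1   2
   30!   1   2   1   1   1   2   1   3   1   1
   40!   1
"""


def parse_steps_table(text):
    """Return a dict mapping site number -> number of steps."""
    steps = {}
    for line in text.splitlines():
        if "!" not in line:
            continue  # title, column header and ruler lines carry no data
        label, _, cells = line.partition("!")
        row_base = int(label)
        values = [int(tok) for tok in cells.split()]
        # Assumption: there is no site 0, so the first row starts at site 1.
        first_site = max(row_base, 1)
        for offset, value in enumerate(values):
            steps[first_site + offset] = value
    return steps


steps = parse_steps_table(TABLE)
print(steps[23])  # the site discussed in the text (row "20", column "3")
```

The row label plus the column header gives the site number, which is all that the lookup in the dictionary relies on.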

There are many other kinds of information that can appear in the output file. They vary from program to program, and we leave their description to the documentation files for the specific programs.


The Tree File

In output from most programs, a representation of the tree is also written into the tree file outtree. The tree is specified by nested pairs of parentheses, enclosing names and separated by commas. We will describe how this works below. If there are any blanks in the names, these must be replaced by the underscore character "_". Trailing blanks in the name may be omitted. The pattern of the parentheses indicates the pattern of the tree by having each pair of parentheses enclose all the members of a monophyletic group. The tree file could look like this:

((Mouse,Bovine),(Gibbon,(Orang,(Gorilla,(Chimp,Human)))));

In this tree the first fork separates the lineage leading to Mouse and Bovine from the lineage leading to the rest. Within the latter group there is a fork separating Gibbon from the rest, and so on. The entire tree is enclosed in an outermost pair of parentheses. The tree ends with a semicolon. In some programs such as Dnaml, Fitch, and Contml, the tree will be unrooted. An unrooted tree should have its bottommost fork have a three-way split, with three groups separated by two commas:

(A,(B,(C,D)),(E,F));

Here the three groups at the bottom node are A, (B,C,D), and (E,F). The single three-way split corresponds to one of the interior nodes of the unrooted tree (it can be any interior node of the tree). The remaining forks are encountered as you move out from that first node. Some of the newer programs are able to tolerate these other forks being multifurcations (multi-way splits). You should check the documentation files for the particular programs you are using to see which of these forms the user tree is expected to be in. Note that many of the programs that actually estimate an unrooted tree (such as Dnapars) produce trees in the treefile in rooted form! This is done for reasons of arbitrary internal bookkeeping. The placement of the root is arbitrary. We are working toward having all programs be able to read all trees, whether rooted or unrooted, multifurcating or bifurcating, and having them do the right thing with them. But this is a long-term goal and it is not yet achieved.

For programs that infer branch lengths, these are given in the trees in the tree file as real numbers following a colon, and placed immediately after the group descended from that branch. Here is a typical tree with branch lengths:

((cat:47.14069,(weasel:18.87953,((dog:25.46154,(raccoon:19.19959,
bear:6.80041):0.84600):3.87382,(sea_lion:11.99700,
seal:12.00300):7.52973):2.09461):20.59201):25.0,monkey:75.85931);

Note that the tree may continue to a new line at any time except in the middle of a name or the middle of a branch length, although in trees written to the tree file this will only be done after a comma.
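The nested-parenthesis notation described above can be read with a short recursive routine. The following Python sketch is only an illustration and is not the code PHYLIP itself uses. It assumes a single tree terminated by a semicolon, treats all blanks and line breaks as insignificant (names use underscores instead of blanks), and returns each node as a (label, length, children) tuple. The function names parse_newick and tip_names are hypothetical.

```python
def parse_newick(text):
    """Parse one tree into nested (label, length, children) tuples."""
    s = "".join(text.split())  # line breaks and blanks are not significant
    pos = 0

    def peek():
        return s[pos] if pos < len(s) else ""

    def token():
        # Read characters up to the next punctuation mark.
        nonlocal pos
        start = pos
        while pos < len(s) and s[pos] not in "(),:;":
            pos += 1
        return s[start:pos]

    def node():
        nonlocal pos
        children = []
        if peek() == "(":
            pos += 1  # consume "("
            children.append(node())
            while peek() == ",":
                pos += 1
                children.append(node())
            if peek() != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ")"
        label = token().replace("_", " ")  # underscores stand for blanks
        length = None
        if peek() == ":":
            pos += 1
            length = float(token())  # branch length follows the colon
        return (label, length, children)

    tree = node()
    if peek() != ";":
        raise ValueError("tree must end with a semicolon")
    return tree


def tip_names(tree):
    """List the names at the tips, left to right."""
    label, _, children = tree
    if not children:
        return [label]
    return [name for child in children for name in tip_names(child)]


example = """((cat:47.14069,(weasel:18.87953,((dog:25.46154,(raccoon:19.19959,
bear:6.80041):0.84600):3.87382,(sea_lion:11.99700,
seal:12.00300):7.52973):2.09461):20.59201):25.0,monkey:75.85931);"""
tree = parse_newick(example)
print(tip_names(tree))
```

An unrooted tree such as (A,(B,(C,D)),(E,F)); simply yields a top-level node with three children rather than two.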

These representations of trees are a subset of the standard adopted on 24 June 1986 at the annual meetings of the Society for the Study of Evolution by an informal committee (its final session in Newick's lobster restaurant - hence its name, the Newick standard) consisting of Wayne Maddison (author of MacClade), David Swofford (PAUP), F. James Rohlf (NTSYS-PC), Chris Meacham (COMPROB and the original PHYLIP tree drawing programs), James Archie, William H.E. Day, and me. This standard is a generalization of PHYLIP's format, itself based on a well-known representation of trees in terms of parenthesis patterns which is due to the famous mathematician Arthur Cayley, and which has been around for over a century. The standard is now employed by most phylogeny computer programs but unfortunately has yet to be given a formal published description. Other descriptions by me and by Gary Olsen can be accessed using the Web at:

http://evolution.gs.washington.edu/phylip/newicktree.html


The Options and How To Invoke Them

Most of the programs allow various options that alter the amount of information the program is provided or what is done with the information. Options are selected in the menu.

Common options in the menu

A number of the options from the menu, the U (User tree), G (Global), J (Jumble), O (Outgroup), W (Weights), T (Threshold), M (multiple data sets), and the tree output options, are used so widely that it is best to discuss them in this document.

The U (User tree) option. This option toggles between the default setting, which allows the program to search for the best tree, and the User tree setting, which reads a tree or trees ("user trees") from the input tree file and evaluates them. The input tree file's default name is intree. In many cases the programs will also tolerate having the trees be preceded by a line giving the number of trees:

((Alligator,Bear),((Cow,(Dog,Elephant)),Ferret));
((Alligator,Bear),(((Cow,Dog),Elephant),Ferret));
((Alligator,Bear),((Cow,Dog),(Elephant,Ferret)));

An initial line with the number of trees was formerly required, but this now can be omitted. Some programs require rooted trees, some unrooted trees, and some can handle multifurcating trees. You should read the documentation for the particular program to find out which it requires. Program Retree can be used to convert trees among these forms (on saving a tree from Retree, you are asked whether you want it to be rooted or unrooted).

In using the user tree option, check the pattern of parentheses carefully. The programs do not always detect whether the tree makes sense, and if it does not there will probably be a crash (hopefully, but not inevitably, with an error message indicating the nature of the problem). Trees written out by programs are typically in the proper form.

The G (Global) option. In the programs which construct trees (except for Neighbor, the "...penny" programs and Clique, and of course the "...move" programs where you construct the trees yourself), after all species have been added to the tree a rearrangements phase ensues. In most of these programs the rearrangements are automatically global, which in this case means that subtrees will be removed from the tree and put back on in all possible ways so as to have a better chance of finding a better tree. Since this can be time consuming (it roughly triples the time taken for a run) it is left as an option in some of the programs, specifically Contml, Fitch, Dnaml and Proml. In these programs the G menu option toggles between the default of local rearrangement and global rearrangement. The rearrangements are explained more below.

The J (Jumble) option. In most of the tree construction programs (except for the "...penny" programs and Clique), the exact details of the search of different trees depend on the order of input of species. In these programs the J option enables you to tell the program to use a random number generator to choose the input order of species. This option is toggled on and off by selecting option J in the menu. The program will then prompt you for a "seed" for the random number generator. The seed should be an integer between 1 and 2^32-3 (which is 4,294,967,293), and should be of form 4n+1, which means that it must give a remainder of 1 when divided by 4. This can be judged by looking at the last two digits of the number (for example, in the upper limit given above, the last two digits are 93, which is of form 4n+1). Each different seed leads to a different sequence of addition of species. By simply changing the random number seed and re-running the programs one can look for other, and better, trees. If the seed entered is not odd, the program will not proceed, but will prompt for another seed.
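The recommendation about seeds can be checked mechanically before a run. This small Python sketch only restates the rule given above (an integer from 1 to 2^32-3 that leaves remainder 1 when divided by 4); it is not the check the programs themselves perform, and the function names are invented for the example.

```python
MAX_SEED = 2**32 - 3  # 4,294,967,293, the upper limit quoted in the text


def is_recommended_seed(seed):
    """True if seed is in range and of the form 4n+1."""
    return 1 <= seed <= MAX_SEED and seed % 4 == 1


def last_two_digits_rule(seed):
    """Same remainder test using only the last two digits.

    This works because 100 is divisible by 4, so the digits above the
    last two never change the remainder on division by 4.
    """
    return (seed % 100) % 4 == 1


for candidate in (133, 135, MAX_SEED):
    print(candidate, is_recommended_seed(candidate))
```

A seed such as 135 is odd but leaves remainder 3, so it does not meet the 4n+1 recommendation even though it passes a simple oddness test.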

The Jumble option also causes the program to ask you how many times you want to restart the process. If you answer 10, the program will try ten different orders of species in constructing the trees, and the results printed out will reflect this entire search process (that is, the best trees found among all 10 runs will be printed out, not the best trees from each individual run).

Some people have asked what are good values of the random number seed. The random number seed is used to start a process of choosing "random" (actually pseudorandom) numbers, which behave as if they were unpredictably randomly chosen between 0 and 2^32-1 (which is 4,294,967,295). You could put in the number 133 and find that the next random number was 221,381,825. As they are effectively unpredictable, there is no such thing as a choice that is better than any other, provided that the numbers are of the form 4n+1. However if you re-use a random number seed, the sequence of random numbers that result will be the same as before, resulting in exactly the same series of choices, which may not be what you want.

The O (Outgroup) option. This specifies which species is to have the root of the tree be on the line leading to it. For example, if the outgroup is a species "Mouse" then the root of the tree will be placed in the middle of the branch which is connected to this species, with Mouse branching off on one side of the root and the lineage leading to the rest of the tree on the other. This option is toggled on and off by choosing O in the menu (the alphabetic character O, not the digit 0). When it is on, the program will then prompt for the number of the outgroup (the species being taken in the numerical order that they occur in the input file). Responding by typing 6 and then an Enter character indicates that the sixth species in the data (the 6th in the first set of data if there are multiple data sets) is taken as the outgroup. Outgroup-rooting will not be attempted if the data have already established a root for the tree from some other consideration, and may not be if it is a user-defined tree, despite your invoking the option. Thus programs such as Dollop that produce only rooted trees do not allow the Outgroup option. It is also not available in Kitsch, Dnamlk, Promlk or Clique. When it is used, the tree as printed out is still listed as being an unrooted tree, though the outgroup is connected to the bottommost node so that it is easy to visually convert the tree into rooted form.

The T (Threshold) option. This sets a threshold for the parsimony programs such that if the number of steps counted in a character is higher than the threshold, it will be taken to be the threshold value rather than the actual number of steps. The default is a threshold so high that it will never be surpassed (in which case the steps will simply be counted). The T menu option toggles on and off asking the user to supply a threshold. The use of thresholds to obtain methods intermediate between parsimony and compatibility methods is described in my 1981b paper. When the T option is in force, the program will prompt for the numerical threshold value. This will be a positive real number greater than 1. In programs Mix, Move, Penny, Protpars, Dnapars, Dnamove, and Dnapenny, do not use threshold values less than or equal to 1.0, as they have no meaning and lead to a tree which depends only on considerations such as the input order of species and not at all on the character state data! In programs Dollop, Dolmove, and Dolpenny the threshold should never be 0.0 or less, for the same reason. The T option is an important and underutilized one: it is, for example, the only way in this package (except for program Dnacomp) to do a compatibility analysis when there are missing data. It is a method of de-weighting characters that evolve rapidly. I wish more people were aware of its properties.
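The effect of the threshold on the total count can be written in a line or two. The Python sketch below is an illustration of the rule as stated above, not the programs' own code: each character contributes its number of steps, or the threshold if that is smaller. The function name and the example step counts are made up.

```python
def thresholded_total(steps_per_character, threshold=float("inf")):
    """Sum the steps over characters, capping each one at the threshold."""
    return sum(min(steps, threshold) for steps in steps_per_character)


example_steps = [1, 1, 2, 3, 5]  # hypothetical step counts for five characters
print(thresholded_total(example_steps))       # no threshold: plain step count
print(thresholded_total(example_steps, 2.0))  # characters with 3 and 5 steps count as 2
```

With the default (infinite) threshold this is ordinary parsimony. As the threshold is lowered, characters that need many steps on a tree are penalized no more than characters that just exceed the threshold, which is the sense in which rapidly evolving characters are de-weighted.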

The M (Multiple data sets) option. In menu programs there is an M menu option which allows one to toggle on the multiple data sets option. The program will ask you how many data sets it should expect. The data sets have the same format as the first data set. Here is a (very small) input file with two five-species data sets:

      5    6
Alpha     CCACCA
Beta      CCAAAA
Gamma     CAACCA
Delta     AACAAC
Epsilon   AACCCA
5    6
Alpha     CACACA
Beta      CCAACC
Gamma     CAACAC
Delta     GCCTGG
Epsilon   TGCAAT

The main use of this option will be to allow all of the methods in these programs to be bootstrapped. Using the program Seqboot one can take any DNA, protein, restriction sites, gene frequency or binary character data set and make multiple data sets by bootstrapping. Trees can be produced for all of these using the M option. They will be written on the tree output file if that option is left in force. Then the program Consense can be used with that tree file as its input file. The result is a majority rule consensus tree which can be used to make confidence intervals. The present version of the package allows, with the use of Seqboot and Consense and the M option, bootstrapping of many of the methods in the package.
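As an illustration of the layout of such a file, here is a minimal Python sketch that reads a given number of data sets from text like the two-data-set example above. It is not PHYLIP's input routine. It assumes that each species occupies one line, that the name takes up the first ten columns as in the example, and that each data set begins with a line giving the number of species and the number of sites. The name read_data_sets is hypothetical.

```python
EXAMPLE = """\
      5    6
Alpha     CCACCA
Beta      CCAAAA
Gamma     CAACCA
Delta     AACAAC
Epsilon   AACCCA
5    6
Alpha     CACACA
Beta      CCAACC
Gamma     CAACAC
Delta     GCCTGG
Epsilon   TGCAAT
"""


def read_data_sets(text, n_sets):
    """Return a list of {species name: sequence} dicts, one per data set."""
    lines = [line for line in text.splitlines() if line.strip()]
    data_sets = []
    i = 0
    for _ in range(n_sets):
        n_species, n_sites = (int(field) for field in lines[i].split())
        i += 1
        data = {}
        for line in lines[i:i + n_species]:
            name = line[:10].strip()          # assumption: ten-column name field
            sequence = "".join(line[10:].split())
            if len(sequence) != n_sites:
                raise ValueError(f"{name}: expected {n_sites} sites")
            data[name] = sequence
        i += n_species
        data_sets.append(data)
    return data_sets


sets = read_data_sets(EXAMPLE, 2)
print(len(sets), sets[1]["Delta"])
```

The number of data sets is passed in explicitly, mirroring the way the program asks how many data sets it should expect rather than reading that number from the file.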

Programs Dnaml, Dnapars and Pars can also take multiple weights instead of multiple data sets. They can then do bootstrapping by reading in one data set, together with a file of weights that show how the characters (or sites) are reweighted in each bootstrap sample. Thus a site that is omitted in a bootstrap sample has effectively been given weight 0, while a site that has been duplicated has effectively been given weight 2. Seqboot has a menu selection to produce the file of weights information automatically, instead of producing a file of multiple data sets. It can be renamed and used as the input weights file.
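The equivalence between a bootstrap sample and a set of weights can be seen in a few lines of Python. This is a sketch of the idea only: it uses Python's own random number generator rather than the one in Seqboot, and it does not reproduce Seqboot's file format. The function name bootstrap_weights is invented for the example.

```python
import random
from collections import Counter


def bootstrap_weights(n_sites, rng):
    """Draw n_sites sites with replacement; return how often each was drawn."""
    draws = [rng.randrange(n_sites) for _ in range(n_sites)]
    counts = Counter(draws)
    return [counts[site] for site in range(n_sites)]


rng = random.Random(133)  # any seed will do for this illustration
weights = bootstrap_weights(10, rng)
print(weights)

# Expanding the weights back out gives the resampled list of site indices.
resampled_sites = [site for site, w in enumerate(weights) for _ in range(w)]
```

A site with weight 0 was omitted from the sample, a site with weight 2 was drawn twice, and the weights always sum to the original number of sites.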

The W (Weights) option. This signals the program that, in addition to the data set, you want to read in a series of weights that tell how many times each character is to be counted. If the weight for a character is zero (0) then that character is in effect to be omitted when the tree is evaluated. If it is one (1) the character is to be counted once. Some programs allow weights greater than 1 as well. These have the effect that the character is counted as if it were present that many times, so that a weight of 4 means that the character is counted 4 times. The values 0-9 give weights 0 through 9, and the values A-Z give weights 10 through 35. By use of the weights we can give overwhelming weight to some characters, and drop others from the analysis. In the molecular sequence programs only two values of the weights, 0 or 1, are allowed.

The weights are used to analyze subsets of the characters, and also can be used for resampling of the data as in bootstrap and jackknife resampling. For those programs that allow weights to be greater than 1, they can also be used to emphasize information from some characters more strongly than others. Of course, you must have some rationale for doing this.

The weights are provided as a sequence of digits. Thus they might be

10011111100010100011110001100

The weights are to be provided in an input file whose default name is weights. The weights in it are a simple string of digits. Blanks in the weightfile are skipped over and ignored, and the weights can continue to a new line. In programs such as Seqboot that can also output a file of weights, the input weights have a default file name of inweights, and the output file name has a default file name of outweights.
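The coding of weights described above (0-9 for weights 0 through 9, A-Z for 10 through 35, blanks and line breaks ignored) can be decoded as follows. This Python sketch is an illustration, not the programs' own reader, and the name parse_weights is hypothetical. Whether lower-case letters are accepted is not stated here, so the sketch rejects them.

```python
import string

# '0'..'9' map to 0..9 and 'A'..'Z' map to 10..35
WEIGHT_OF = {symbol: value
             for value, symbol in enumerate(string.digits + string.ascii_uppercase)}


def parse_weights(text, n_characters=None):
    """Decode a weights string, skipping blanks and line breaks."""
    symbols = "".join(text.split())
    unknown = sorted(set(symbols) - set(WEIGHT_OF))
    if unknown:
        raise ValueError(f"not a weight symbol: {unknown}")
    weights = [WEIGHT_OF[symbol] for symbol in symbols]
    if n_characters is not None and len(weights) < n_characters:
        raise ValueError("fewer weights than characters")
    return weights if n_characters is None else weights[:n_characters]


weights = parse_weights("10011111100010100011110001100")
print(len(weights), weights[:4])
print(parse_weights("0 9A\nZ"))
```

The first call decodes the example string of weights given above; the second shows a blank and a line break being skipped.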

Weights can be used to analyze different subsets of characters (by weighting the rest as zero). Alternatively, in the discrete characters programs they can be used to force a certain group to appear on the phylogeny (in effect confining consideration to only phylogenies containing that group). This is done by adding an imaginary character that has 1's for the members of the group, and 0's for all the other species. That imaginary character is then given the highest weight possible: the result will be that any phylogeny that does not contain that group will be penalized by such a heavy amount that it will not (except in the most unusual circumstances) be considered. Of course, the new character brings extra steps to the tree, but the number of these can be calculated in advance and subtracted out of the total when reporting the results. This use of weights is an important one, and one sadly ignored by many users who could profit from it. In the case of molecular sequences we cannot use weights this way, so that to force a given group to appear we have to add a large extra segment of sites to the molecule, with (say) A's for that group and C's for every other species.
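Constructing the imaginary character is simple enough to sketch. The Python fragment below builds the column of 1's and 0's for a chosen group, given the species in input order. The function name and the species list are made up for the illustration, and appending the column to a real data file and assigning it a high weight is left to the user.

```python
def group_constraint_character(species, group):
    """Return one state per species: '1' inside the group, '0' outside."""
    missing = set(group) - set(species)
    if missing:
        raise ValueError(f"not in the data set: {sorted(missing)}")
    return "".join("1" if name in group else "0" for name in species)


species = ["Mouse", "Bovine", "Gibbon", "Orang", "Gorilla", "Chimp", "Human"]
column = group_constraint_character(species, {"Chimp", "Human"})
print(column)  # one state per species, in input order
```

In a program that allows weights greater than 1, the corresponding position in the weights file would then carry the largest weight symbol, Z (35), following the coding described earlier.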

The option to write out the trees into a tree file. This specifies that you want the program to write out the tree not only on its usual output, but also onto a file in nested-parenthesis notation (as described above). This option is sufficiently useful that it is turned on by default in all programs that allow it. You can optionally turn it off if you wish, by typing the appropriate number from the menu (it varies from program to program). This option is useful for creating tree files that can be directly read into the programs, including the consensus tree and tree distance programs, and the tree plotting programs.

The output tree file has a default name of outtree.

The (0) terminal type option. (This is the digit 0, not the alphabetic character O). The program will default to one particular assumption about your terminal (ANSI in the case of Linux, Unix, or Mac OS X, and IBM PC in the case of Windows). You can alternatively select it to be either an IBM PC, or nothing. This affects the ability of the programs to clear the screen when they display their menus, and the graphics characters used to display trees in the programs Dnamove, Move, Dolmove, and Retree. In the case of Windows, the screen will clear properly with either the IBM PC or the ANSI settings, but the graphics characters needed by Move, Dnamove, Dolmove, or Retree will display correctly only with the IBM PC setting.


The Algorithm for Constructing Trees

All of the programs except Factor, Dnadist, Gendist, Dnainvar, Seqboot, Contrast, Retree, and the plotting and consensus tree programs act to construct an estimate of a phylogeny. Move, Dolmove, and Dnamove let you construct it yourself by hand. All of the rest but Neighbor, the "...penny" programs and Clique make use of a common approach involving additions and rearrangements. They are trying to minimize or maximize some quantity over the space of all possible evolutionary trees. Each program contains a part that, given the topology of the tree, evaluates the quantity that is being minimized or maximized. The straightforward approach would be to evaluate all possible tree topologies one after another and pick the one which, according to the criterion being used, is best. This would not be possible for more than a small number of species, since the number of possible tree topologies is enormous. A review of the literature on the counting of evolutionary trees will be found in one of my papers (Felsenstein, 1978a) and in my book (Felsenstein, 2004, chapter 3).

Since we cannot search all topologies, these programs are not guaranteed to always find the best tree, although they seem to do quite well in practice. The strategy they employ is as follows: the species are taken in the order in which they appear in the input file. The first two (in some programs the first three) are taken and a tree constructed containing only those. There is only one possible topology for this tree. Then the next species is taken, and we consider where it might be added to the tree. If the initial tree is (say) a rooted tree with two species and we want the resulting three-species tree to be a bifurcating tree, there are only three places where we could add the third species. Each of these is tried, and each time the resulting tree is evaluated according to the criterion. The best one is chosen to be the basis for further operations. Now we consider adding the fourth species, again at each of the five possible places that would result in a bifurcating tree. Again, the best of these is accepted. This is usually known as the Sequential Addition strategy.
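The addition step can be sketched in Python using nested pairs to stand for rooted bifurcating trees. This is an illustration of the strategy only, not PHYLIP's implementation: the rearrangement phase is left out, and the criterion is a user-supplied function called score here (a hypothetical name), with smaller values taken to be better.

```python
def insertions(tree, tip):
    """Yield every rooted tree made by attaching tip to one branch of tree.

    A tree is either a species name or a pair (left, right).
    """
    yield (tree, tip)  # attach on the branch leading down from this subtree
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in insertions(left, tip):
            yield (new_left, right)
        for new_right in insertions(right, tip):
            yield (left, new_right)


def sequential_addition(species, score):
    """Add species in the given order, keeping the best placement each time."""
    tree = (species[0], species[1])  # the only two-species topology
    for tip in species[2:]:
        # min() keeps the first of any tied candidates
        tree = min(insertions(tree, tip), key=score)
    return tree


two_species = ("A", "B")
print(len(list(insertions(two_species, "C"))))         # places for the third species
print(len(list(insertions((("A", "B"), "C"), "D"))))   # places for the fourth species
```

The two counts printed are the three and five possible places mentioned in the text; in general a rooted tree of k species offers 2k-1 places.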

Local rearrangements

The process continues in this manner, with one important exception. After each species is added, and before the next is added, a number of rearrangements of the tree are tried, in an effort to improve it. The algorithms move through the tree, making all possible local rearrangements of the tree. A local rearrangement involves an internal segment of the tree in the following manner. Each internal segment of the tree is of this form (where T1, T2, and T3 are subtrees - parts of the tree that can contain further forks and tips):

            T1      T2       T3
             \      /        /
              \    /        /
               \  /        /
                \/        /
                 *       /
                  *     /
                   *   /
                    * /
                     *
                     !
                     !

the segment we are discussing being indicated by the asterisks. A local rearrangement consists of switching the subtrees T1 and T3 or T2 and T3, so as to obtain one of the following:

          T3       T2      T1            T1       T3      T2
           \       /       /              \       /       /
            \     /       /                \     /       /
             \   /       /                  \   /       /
              \ /       /                    \ /       /
               \       /                      \       /
                \     /                        \     /
                 \   /                          \   /
                  \ /                            \ /
                   !                              !
                   !                              !
                   !                              !

Each time a local rearrangement is successful in finding a better tree, the new arrangement is accepted. The phase of local rearrangements does not end until the program can traverse the entire tree, attempting local rearrangements, without finding any that improve the tree.
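In terms of nested pairs, the two alternatives for one internal segment can be written down directly. The following Python sketch is illustrative only; the function name is made up, and a segment is represented as ((T1, T2), T3) to match the first diagram.

```python
def local_rearrangements(segment):
    """Return the two neighbours of ((T1, T2), T3) shown in the diagrams."""
    (t1, t2), t3 = segment
    return [
        ((t3, t2), t1),  # T1 and T3 switched
        ((t1, t3), t2),  # T2 and T3 switched
    ]


for neighbour in local_rearrangements((("T1", "T2"), "T3")):
    print(neighbour)
```

Because T1, T2, and T3 may themselves be nested pairs, the same function applies to any internal segment of a larger tree.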

This strategy of adding species and making local rearrangements will look at about (n-1)x(2n-3) different topologies, though if rearrangements are frequently successful the number may be larger. I have been describing the strategy when rooted trees are being considered. For unrooted trees there is a precisely similar strategy, though the first tree constructed may be a three-species tree and the rearrangements may not start until after the addition of the fifth species.
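One way to arrive at the figure (n-1)x(2n-3) is to tally the trees examined at each stage. The breakdown in this Python sketch is an assumption of the sketch rather than something stated in the text: one starting tree, 2k-1 placements when a species is added to a rooted tree of k species, and two local rearrangements for each of the k-1 internal segments of the resulting tree, with no rearrangement ever accepted.

```python
def topologies_examined(n):
    """Tally topologies tried for n species under the assumptions above."""
    count = 1  # the single two-species starting tree
    for k in range(2, n):
        count += 2 * k - 1    # places to add a species to a k-species rooted tree
        count += 2 * (k - 1)  # two rearrangements per internal segment afterwards
    return count


for n in (3, 4, 10):
    print(n, topologies_examined(n), (n - 1) * (2 * n - 3))
```

Under these assumptions the tally agrees with (n-1)x(2n-3) for every n, so the number of topologies examined grows only as the square of the number of species.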

These local rearrangements have come to be called Nearest Neighbor Interchanges (NNIs) in the phylogeny literature.

Though we are not guaranteed to have found the best tree topology, we are guaranteed that no nearby topology (i.e. none accessible by a single local rearrangement) is better. In this sense we have reached a local optimum of our criterion. Note that the whole process is dependent on the order in which the species are present in the input file. We can try to find a different and better solution by reordering the species in the input file and running the program again (or, more easily, by using the J option). If none of these attempts finds a better solution, then we have some indication that we may have found the best topology, though we can never be certain of this.

Note also that a new topology is never accepted unless it is better than the previous one, so that the rearrangement process can never fall into an endless loop. This is also the way ties in our criterion are resolved, namely by sticking with the tree found first. However, the tree construction programs other than Clique, Contml, Fitch, and Dnaml do keep a record of all trees found that are tied with the best one found. This gives you some immediate idea of which parts of the tree can be altered without affecting the quality of the result.

Global rearrangements

A feature of most of the programs, such as Protpars, Dnapars, Dnacomp, Dnaml, Dnamlk, Restml, Kitsch, Fitch, Contml, Mix, and Dollop, is "global" optimization of the tree. In four of these (Contml, Fitch, Dnaml and Dnamlk) this is an option, G. In the others it automatically applies. When it is present there is an additional stage to the search for the best tree. Each possible subtree is removed from the tree and added back in all possible places. This process continues until all subtrees can be removed and added again without any improvement in the tree. The purpose of this extra rearrangement is to make it less likely that one or more species get "stuck" in a suboptimal region of the space of all possible trees. The use of global optimization results in approximately a tripling (3x) of the run-time, which is why I have left it as an option in some of the slower programs.

What PHYLIP calls "global" rearrangements are more properly called SPR (subtree pruning and regrafting) by Swofford et al. (1996) as distinct from the NNI (nearest neighbor interchange) rearrangements that PHYLIP also uses, and the TBR (tree bisection and reconnection) rearrangements that it does not use. My book (Felsenstein, 2004, chapter 4) contains a review of work on these and other rearrangements and search methods.

The programs doing global optimization print out a dot "." after each group is removed and re-added to the tree, to give the user some sign that the rearrangements are proceeding. A new line of dots is started whenever a new round of global rearrangements is started following an improvement in the tree. On the line before the dots are printed there is printed a bar of the form "!---------------!" to show how many dots to expect. The dots will not be printed out at a uniform rate, but the later dots, which represent removal of larger groups from the tree and trying them consequently in fewer places, will print out more quickly. With some compilers each row of dots may not be printed out until it is complete.

It should be noted that Penny, Dolpenny, Dnapenny and Clique use a more sophisticated strategy of "depth-first search" with a "branch and bound" search method that guarantees that all of the best trees will be found. In the case of Penny, Dolpenny and Dnapenny there can be a considerable sacrifice of computer time if the number of species is greater than about ten: it is a matter for you to consider whether it is worth it for you to guarantee finding all the most parsimonious trees, and that depends on how much free computer time you have! Clique finds all largest cliques, and does so without undue burning of computer time. Although all of these problems that have been investigated fall into the category of "NP-hard" problems that in effect do not have a rapid solution, the cases that cause this trouble for the largest-cliques algorithm in Clique apparently are not biologically realistic and do not occur in actual data.

Multiple jumbles

As just mentioned, for most of these programs the search depends on the order in which the species are entered into the tree. Using the J (Jumble) option you can supply a random number seed which will allow the program to add the species in a random order. Jumbling can be done multiple times. For example, if you tell the program to do it 10 times, it will go through the tree-building process 10 times, each with a different random order of adding species. It will keep a record of the trees tied for best over the whole process. In other words, it does not just record the best trees from each of the 10 runs, but records the best ones overall. Of course this is slow, taking 10 times longer than a single run. But it does give us a much greater chance of finding all of the most parsimonious trees. In the terminology of Maddison (1991) it can find different "islands" of trees. The present algorithms do not guarantee that we will find all trees in a given "island" from a single run, so multiple runs also help explore those "islands" that are found.
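
The bookkeeping described above can be sketched in a few lines of Python. This is an illustration only and not PHYLIP's own code: build_tree and score are hypothetical stand-ins for a program's sequential-addition search and its optimality criterion (lower taken as better), and the seeding uses Python's random module rather than PHYLIP's random number generator.

```python
import random

def jumble_search(species, build_tree, score, n_jumbles, seed):
    """Run a search n_jumbles times with random addition orders and
    keep every distinct tree tied for best over the whole process."""
    rng = random.Random(seed)            # the seed makes the run repeatable
    best_score = None
    best_trees = []
    for _ in range(n_jumbles):
        order = list(species)
        rng.shuffle(order)               # a new random order of adding species
        tree = build_tree(order)         # hypothetical search routine
        s = score(tree)                  # hypothetical criterion, lower is better
        if best_score is None or s < best_score:
            best_score, best_trees = s, [tree]      # strictly better: start over
        elif s == best_score and tree not in best_trees:
            best_trees.append(tree)                 # tied for best overall: keep it
    return best_score, best_trees

# Toy stand-ins so that the sketch runs: the "tree" is just the addition
# order, and the "score" counts adjacent pairs that are out of order.
def toy_build(order):
    return tuple(order)

def toy_score(tree):
    return sum(1 for a, b in zip(tree, tree[1:]) if a > b)

best, trees = jumble_search("ABCDE", toy_build, toy_score, n_jumbles=10, seed=1)
```

Note that the list of tied trees is reset whenever a strictly better score turns up, which is what "the best ones overall" requires.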

Saving multiple tied trees

For the parsimony and compatibility programs, one can have a perfect tie between two or more trees. In these programs these trees are all saved. For the newer parsimony programs such as Dnapars and Pars, global rearrangement is carried out on all of these tied trees. This can be turned off in the menu.

For trees with criteria which are real numbers, such as the distance matrix programs Fitch and Kitsch, and the likelihood programs Dnaml, Dnamlk, Contml, and Restml, it is difficult to get an exact tie between trees. Consequently these programs save only the single best tree (even though the others may be only a tiny bit worse).

Strategy for finding the best tree

In practice, it is advisable to use the Jumble option and specify that it be done many times, so that many different orderings of the input species are evaluated. (This is usually not necessary when bootstrapping, though the programs will then default to doing it once to avoid artifacts caused by the order in which species are added to the tree.)

People who want a magic "black box" program whose results they do not have to question (or think about) often are upset that these programs give results that are dependent on the order in which the species are entered in the data. To me this property is an advantage, for it permits you to try different searches for better trees, simply by varying the input order of species. If you do not use the multiple Jumble option, but do multiple individual runs instead, you can easily decide which to pay most attention to - the one or ones that are best according to the criterion employed (for example, with parsimony, the one out of the runs that results in the tree with the fewest changes).

In practice, in a single run, it usually seems best to put species that are likely to be sources of confusion in the topology last, as by the time they are added the arrangement of the earlier species will have stabilized into a good configuration, and then the last few species will be fitted into that topology. There will be less chance this way of a poor initial topology that would affect all subsequent parts of the search. However, a variety of arrangements of the input order of species should be tried, as can be done if the J option is used, and no species should be kept in a fixed place in the order of input. Note that the results of the "...penny" programs and Clique are not sensitive to the input order of species, and Neighbor is only slightly sensitive to it, so that multiple Jumbling is not possible with those programs. Note also that with global search, which is standard in many programs and in others is an option, each group (including each individual species) will be removed and re-added in all possible positions, so that a species causing confusion will have more chance of moving to a new location than it would without global rearrangement.

Nixon's search strategy

An innovative search strategy was developed by Kevin Nixon (1999). If you use a manual rearrangement program such as Dnamove, Move, or Dolmove, and look at the distribution of characters on the trees, you will see some characters whose distributions appear to recommend alternative groupings. One would want a program that automatically found such alternative suggestions and used them to rearrange the tree so as to explore trees that had those groups. Nixon had the idea of using resampling methods to do this. Using either bootstrap or jackknife sampling, one can make data sets that emphasize randomly sampled subsets of characters. We then search for trees that fit those data sets. After finding them, we revert to the initial data set and then search using those trees as starting points. This sampling allows us to explore parts of tree space recommended by particular subsets of characters. (This is not exactly Nixon's original strategy, which started the searches for each resampled data set from the best tree found so far. For each resampled data set we instead start from scratch, doing sequential addition of taxa.)

Nixon's method has proven to be very effective in searching for most parsimonious trees -- it is currently the state of the art for that. Nixon called his method the "parsimony ratchet", but actually it can be applied straightforwardly to any method of phylogeny inference that has an optimality criterion, including likelihood and least squares distance methods. Starting with version 3.7, PHYLIP programs have the ability to search by rearranging a tree supplied to them by the user. This makes it possible to implement our variant of Nixon's strategy. You need to do so in multiple steps:

  1. Use bootstrap sampling to make a number of resampled versions of the data set. You can also use jackknifing. In either case, there may be advantages to sampling a smaller fraction of the sites (Nixon recommends sampling about 30-35%).
  2. Take these replicates, and do quick estimates of the phylogeny for each one. This could be done with faster methods such as neighbor-joining or parsimony.
  3. Take the resulting trees, together with the original data set. Using the method of phylogeny estimation that you prefer, read the trees in as multiple user-defined trees, selecting the setting of the U menu option that uses these trees as the starting point for rearrangement. The program will report the best tree or trees found by rearranging all of those input trees. This accomplishes Nixon's search strategy.
It will not necessarily be fast to do this, as the last step may be slow. But the resampling will cause emphasis on different sets of characters in the initial searches, allowing the process to explore regions of tree space not usually examined by conventional rearrangement strategies.
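
To make step 1 concrete, here is a minimal Python sketch of resampling the sites of an alignment. In PHYLIP itself this job is done by Seqboot; the function below, its name resample_sites, and its parameters are assumptions made for illustration, and the alignment is simply a dictionary from species name to sequence.

```python
import random

def resample_sites(alignment, fraction=1.0, replace=True, seed=0):
    """Return a resampled copy of an alignment (dict: name -> sequence).

    replace=True  draws sites with replacement (bootstrap style);
    replace=False draws distinct sites (jackknife style).
    fraction sets how many sites are drawn relative to the original."""
    rng = random.Random(seed)
    n_sites = len(next(iter(alignment.values())))
    k = max(1, round(fraction * n_sites))
    if replace:
        cols = [rng.randrange(n_sites) for _ in range(k)]
    else:
        cols = rng.sample(range(n_sites), k)
    # The same columns are taken from every species, so sites stay aligned.
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}

# Cut-down example: draw about a third of the sites, with replacement.
aln = {"A": "CACACACAAAAA", "B": "CACACAACAAAA", "C": "CACAACAAAAAA"}
replicate = resample_sites(aln, fraction=0.33, seed=7)
```

Each replicate would then be written out and passed to a fast tree-building method, as in step 2.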

There is some more information on how this may be done in the documentation files for Seqboot and for the individual tree inference programs.


A Warning on Interpreting Results

Probably the most important thing to keep in mind while running any of the parsimony or compatibility programs is not to overinterpret the result. Some users treat the set of most parsimonious trees as if it were a confidence interval. If a group appears in all of the most parsimonious trees then they treat it as well established. Unfortunately the confidence interval on phylogenies appears to be much larger than the set of all most parsimonious trees (Felsenstein, 1985b). Likewise, variation of result among different methods will not be a good indicator of the size of the confidence interval. Consider a simple data set in which, out of 100 binary characters, 51 recommend the unrooted tree ((A,B),(C,D)) and 49 the tree ((A,D),(B,C)). Many different methods will all give the same result on such a data set: they will estimate the tree as ((A,B),(C,D)). Nevertheless it is clear that the 51:49 margin by which this tree is favored is not statistically significantly different from 50:50. So consistency among different methods is a poor guide to statistical significance.
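
The 51:49 example can be checked with a simple exact sign test. The Python sketch below treats the 100 characters as independent tosses of a fair coin, which is an assumption made only for this illustration; the function name two_sided_binomial_p is likewise hypothetical.

```python
from math import comb

def two_sided_binomial_p(k, n):
    """Exact two-sided p-value for k successes out of n fair-coin trials:
    the probability of an outcome at least as far from n/2 as k is."""
    dev = abs(k - n / 2)
    favourable = sum(comb(n, i) for i in range(n + 1)
                     if abs(i - n / 2) >= dev)
    return favourable / 2 ** n

p = two_sided_binomial_p(51, 100)   # about 0.92, nowhere near significance
```

Every outcome except an exact 50:50 split is at least as extreme as 51:49, which is why the p-value is so close to 1.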


Relative Speed of Different Programs and Machines

Relative speed of the different programs

C compilers differ in efficiency of the code they generate, and some deal with some features of the language better than with others. Thus a program which is unusually fast on one computer may be unusually slow on another. Nevertheless, as a rough guide to relative execution speeds, I have tested the programs on three data sets, each of which has 10 species and 40 characters. The first is an imaginary one in which all characters are compatible - ("The Willi Hennig Memorial Data Set" as J. S. Farris once called ones like it). The second is the binary recoded form of the fossil horses data set of Camin and Sokal (1965). The third data set has data that is completely random: 10 species and 20 characters that have a 50% chance that each character state is 0 or 1 (or A or G). The data sets thus range from a completely compatible one in which there is no homoplasy (parallelism or convergence), through the horses data set, which requires 29 steps where the possible minimum number would be 20, to the random data set, which requires 49 steps. We can thus see how this increasing messiness of the data affects running times. The three data sets have all had 20 sites of A's added to the end of each sequence, so as to prevent likelihood or distance matrix programs from having infinite branch lengths (the test data sets used for timing previous versions of PHYLIP were the same except that they lacked these 20 extra sites).

Here are the nucleotide sequence versions of the three data sets:

    10   40
A         CACACACAAAAAAAAAAACAAAAAAAAAAAAAAAAAAAAA
B         CACACAACAAAAAAAAAACAAAAAAAAAAAAAAAAAAAAA
C         CACAACAAAAAAAAAAAACAAAAAAAAAAAAAAAAAAAAA
D         CAACAAAACAAAAAAAAACAAAAAAAAAAAAAAAAAAAAA
E         CAACAAAAACAAAAAAAACAAAAAAAAAAAAAAAAAAAAA
F         ACAAAAAAAACACACAAAACAAAAAAAAAAAAAAAAAAAA
G         ACAAAAAAAACACAACAAACAAAAAAAAAAAAAAAAAAAA
H         ACAAAAAAAACAACAAAAACAAAAAAAAAAAAAAAAAAAA
I         ACAAAAAAAAACAAAACAACAAAAAAAAAAAAAAAAAAAA
J         ACAAAAAAAAACAAAAACACAAAAAAAAAAAAAAAAAAAA

    10   40
MesohippusAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
HypohippusAAACCCCCCCAAAAAAAAACAAAAAAAAAAAAAAAAAAAA
ArchaeohipCAAAAAAAAAAAAAAAACACAAAAAAAAAAAAAAAAAAAA
ParahippusCAAACAACAACAAAAAAAACAAAAAAAAAAAAAAAAAAAA
MerychippuCCAACCACCACCCCACACCCAAAAAAAAAAAAAAAAAAAA
M. secunduCCAACCACCACCCACACCCCAAAAAAAAAAAAAAAAAAAA
Nannipus  CCAACCACAACCCCACACCCAAAAAAAAAAAAAAAAAAAA
NeohippariCCAACCCCCCCCCCACACCCAAAAAAAAAAAAAAAAAAAA
Calippus  CCAACCACAACCCACACCCCAAAAAAAAAAAAAAAAAAAA
PliohippusCCCACCCCCCCCCACACCCCAAAAAAAAAAAAAAAAAAAA

    10   40
A         CACACAACCAAACAAACCACAAAAAAAAAAAAAAAAAAAA
B         AAACCACACACACAAACCCAAAAAAAAAAAAAAAAAAAAA
C         ACAAAACCAAACCACCCACAAAAAAAAAAAAAAAAAAAAA
D         AAAAACACAACACACCAAACAAAAAAAAAAAAAAAAAAAA
E         AAACAACCACACACAACCAAAAAAAAAAAAAAAAAAAAAA
F         CCCAAACACCCCCAAAAAACAAAAAAAAAAAAAAAAAAAA
G         ACACCCCCACACCCACCAACAAAAAAAAAAAAAAAAAAAA
H         AAAACAACAACCACCCCACCAAAAAAAAAAAAAAAAAAAA
I         ACACAACAACACAAACAACCAAAAAAAAAAAAAAAAAAAA
J         CCAAAAACACCCAACCCAACAAAAAAAAAAAAAAAAAAAA
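
For readers who want to handle data sets like these in their own scripts, here is a minimal Python sketch that reads the layout shown above: a first line with the numbers of species and sites, then one line per species with the name in the first 10 columns. It is not part of PHYLIP, the name read_phylip is an assumption, and it deliberately ignores interleaved continuation blocks and other options of the real input format.

```python
def read_phylip(text):
    """Parse a simple one-line-per-species PHYLIP data set.

    Returns a dict mapping species name to sequence."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_species, n_sites = (int(x) for x in lines[0].split())
    data = {}
    for ln in lines[1:1 + n_species]:
        name = ln[:10].strip()              # names occupy exactly 10 columns
        seq = ln[10:].replace(" ", "")      # blanks inside sequences are ignored
        if len(seq) != n_sites:
            raise ValueError(f"{name}: expected {n_sites} sites, got {len(seq)}")
        data[name] = seq
    if len(data) != n_species:
        raise ValueError("number of species does not match the header")
    return data

# Cut-down example with two species and six sites.
example = "    2    6\nMesohippusAAAAAA\nNannipus  CCAACC\n"
horses = read_phylip(example)
```

Reading the name by column position rather than by splitting on blanks matters here, because a 10-character name such as Mesohippus runs straight into the sequence.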

Here are the timings of many of the version 3.6 programs on these three data sets as run after being compiled by Gnu C (version 3.2) and run on an AMD Athlon XP 2200+ computer under Linux.

Program     Hennigian Data   Horses Data   Random Data
Protpars    0.00500          0.00670       0.01289
Dnapars     0.01050          0.00940       0.00980
Dnapenny    0.01400          0.00860       1.71100
Dnacomp     0.00240          0.00250       0.00590
Dnaml       0.17749          0.23970       0.21350
Dnamlk      0.21740          0.19450       0.24400
Proml       1.3527           3.2085        2.0055
Promlk      3.3567           8.6078        4.4886
Dnainvar    0.00020          0.00020       0.00020
Dnadist     0.00140          0.00080       0.00150
Protdist    0.09220          0.09210       0.09310
Restml      0.14560          0.28810       0.21540
Restdist    0.00110          0.00090       0.00080
Fitch       0.00760          0.01280       0.00880
Kitsch      0.00180          0.00260       0.00280
Neighbor    0.00020          0.00050       0.00050
Contml      0.01310          0.01500       0.01780
Gendist     0.00070          0.00070       0.00070
Pars        0.00780          0.00610       0.02930
Mix         0.00360          0.00410       0.00610
Penny       0.00190          0.00470       0.8060
Dollop      0.00480          0.00450       0.00820
Dolpenny    0.00200          0.01060       1.1270
Clique      0.00100          0.00070       0.00130


In all cases the programs were run under the default options with optimized compiler switches (-O3 -fomit-frame-pointer), except as specified here. The data sets used for the discrete characters programs have 0's and 1's instead of A's and C's. For Contml the A's and C's were made into 0.0's and 1.0's and considered as 40 2-allele loci. For the distance programs 10 x 10 distance matrices were computed from the three data sets. For the restriction sites programs A and C were changed into + and -. It does not make much sense to benchmark Move, Dolmove, or Dnamove, although when there are many characters and many species the response time after each alteration of the tree should be proportional to the product of the number of species and the number of characters. For Dnaml, Dnamlk, and Dnadist the frequencies of the four bases were set to be equal rather than determined empirically as is the default. For Restml the number of enzymes was set to 1.

In most cases, the benchmark was made more accurate by analyzing 100 data sets using the M (Multiple data sets) option and dividing the resulting time by 100. Times were determined as user times using the Linux time command. Several patterns will be apparent from this. The algorithms (Mix, Dollop, Contml, Fitch, Kitsch, Protpars, Dnapars, Dnacomp, Dnaml, Dnamlk, and Restml) that use the above-described addition strategy have run times that do not depend strongly on the messiness of the data. The only exception to this is that if a data set such as the Random data requires extra rounds of global rearrangements it takes longer. The programs differ greatly in run time: the protein likelihood programs Proml and Promlk are very slow, and the other likelihood programs Restml, Dnaml and Contml are slower than the rest of the programs. The protein sequence parsimony program, which has to do a considerable amount of bookkeeping to keep track of which amino acids can mutate to each other, is also relatively slow.

Another class of algorithms includes Penny, Dolpenny, Dnapenny and Clique. These are branch-and-bound methods: in principle they should have execution times that rise exponentially with the number of species and/or characters, and they might be much more sensitive to messy data. This is apparent with Penny, Dolpenny, and Dnapenny, which go from being reasonably fast with clean data to very slow with messy data. Dolpenny is particularly slow on messy data - this is because this algorithm cannot make use of some of the lower-bound calculations that are possible with Dnapenny and Penny. Clique is very fast on all data sets. Although in theory it should bog down if the number of cliques in the data is very large, that does not happen with random data, which in fact has few cliques and those small ones. Apparently the "worst-case" data sets that cause exponential run time are much rarer for Clique than for the other branch-and-bound methods.

Neighbor is quite fast compared to Fitch and Kitsch, and should make it possible to run much larger cases, although the results are expected to be a bit rougher than with those programs.

Speed with different numbers of species

How will the speed depend on the number of species and the number of characters? For the sequential-addition algorithms, the speed should be proportional to somewhere between the cube of the number of species and the square of the number of species, and to the number of characters. Thus a case that has, instead of 10 species and 20 characters, 20 species and 50 characters would take (in the cubic case) 2 x 2 x 2 x 2.5 = 20 times as long. This implies that cases with more than 20 species will be slow, and cases with more than 40 species very slow. This places a premium on working on small subproblems rather than just dumping a whole large data set into the programs.

An exception to these rules will be some of the DNA programs that use an aliasing device to save execution time. In these programs execution time will not necessarily increase in proportion to the number of sites, as sites that show the same pattern of nucleotides will be detected as identical and the calculations for them will be done only once, so that they do not add to the execution time. This is particularly likely to happen with few species and many sites, or with data sets that have small amounts of evolutionary divergence.
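
The idea behind this aliasing can be illustrated with a short Python sketch. It is not the code used in the PHYLIP programs, and the function name site_patterns is an assumption; it simply collapses identical columns of an alignment into distinct patterns with counts, so that a per-site calculation would need to be done once per pattern rather than once per site.

```python
from collections import Counter

def site_patterns(sequences):
    """Map each distinct column pattern to the number of sites showing it.

    sequences is a list of equal-length strings, one per species."""
    return Counter(zip(*sequences))

# Three species, five sites: the last three columns are all (A, A, A).
patterns = site_patterns(["CAAAA", "CAAAA", "ACAAA"])
n_sites = sum(patterns.values())     # 5 sites in the data
n_distinct = len(patterns)           # but only 3 distinct patterns to evaluate
```

In the benchmark data sets above, the 20 added sites of A's would all fall into a single pattern in this way.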

For programs Fitch and Kitsch, the distance matrix is square, so that when we double the number of species we also double the number of "characters", so that running times will go up as the fourth power of the number of species rather than the third power. Thus a 20-species case with Fitch is expected to run sixteen times more slowly than a 10-species case.

For programs like Penny and Clique the run times will rise faster than the cube of the number of species (in fact, they can rise faster than any power since these algorithms are not guaranteed to work in polynomial time). In practice, Penny will frequently bog down above 11 species, while Clique easily deals with larger numbers.

For Neighbor the speed should vary only as the cube of the number of species, so a case twice as large will take only eight times as long. This will make it an attractive alternative to Fitch and Kitsch for large data sets.
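
The scaling rules in the preceding paragraphs amount to simple arithmetic, which the following Python sketch reproduces. The function relative_time and its parameter names are illustrative assumptions; the exponents are the ones quoted in the text, and real run times will of course depart from these idealized figures.

```python
def relative_time(species_ratio, character_ratio=1.0, species_power=3):
    """Predicted run time relative to a reference case, assuming time grows
    as (number of species) ** species_power times the number of characters."""
    return species_ratio ** species_power * character_ratio

# Sequential addition, cubic case: 10 -> 20 species, 20 -> 50 characters.
cubic = relative_time(20 / 10, 50 / 20)                       # 20.0
# The same change in the quadratic case.
quadratic = relative_time(20 / 10, 50 / 20, species_power=2)  # 10.0
# Fitch or Kitsch: fourth power of the number of species.
fitch = relative_time(20 / 10, species_power=4)               # 16.0
# Neighbor: cube of the number of species.
neighbor = relative_time(20 / 10)                             # 8.0
```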

Suggestion: If you are unsure of how long a program will take, try it first on a few species, then work your way up until you get a feel for the speed and for what size programs you can afford to run.

Execution time is not the most important criterion for a program, particularly as computer time gets much cheaper than your time or a programmer's time. With workstations on which background jobs can be run all night, execution speed is not overwhelmingly relevant. Some of us have been conditioned by an earlier era of computing to consider execution speed paramount. But ease of use, ease of adaptation to your computer system, and ease of modification are much more important in practice, and in these respects I think these programs are adequate. Only if you are engaged in 1960's style mainframe computing, or if you have very large amounts of data is minimization of execution time paramount. If you spent six months getting your data, it may not be overwhelmingly important whether your run takes 10 seconds or 10 hours.

Nevertheless it would have been nice to have made the programs faster. The present speeds are a compromise between speed and effectiveness: by making them slower and trying more rearrangements in the trees, or by enumerating all possible trees, I could have made the programs more likely to find the best tree. By trying fewer rearrangements I could have speeded them up, but at the cost of finding worse trees. I could also have speeded them up by writing critical sections in assembly language, but this would have sacrificed ease of distribution to new computer systems. There are also some options included in these programs that make it harder to adopt some of the economies of bookkeeping that make other programs faster. However to some extent I have simply made the decision not to spend time trying to speed up program bookkeeping when there were new likelihood and statistical methods to be developed.

Relative speed of different machines

It is interesting to compare different machines using Dnapars as the standard task. One can rate a machine on the Dnapars benchmark by summing the times for all three of the data sets. Here are relative total timings over all three data sets (done with various versions of Dnapars) for some machines, taking an AMD Athlon 1.2 GHz computer running Linux with gcc as the standard. Benchmarks from versions 3.4 and 3.5 of the program are also included (respectively the Pascal and C versions whose timings are in parentheses). They are compared only with each other and are scaled to the rest of the timings using the joint runs on the 386SX and the Pentium MMX 266. This use of separate standards is necessary not because of different languages but because different versions of the package are being compared. Thus, the "Time" is the ratio of the Total to that for the Pentium, adjusted by the scalings of machines using 3.4 and 3.5 when appropriate. The Relative Speed is the reciprocal of the Time. For the moment these benchmarks are for version 3.6; they will be updated when 3.7 is fully released.

Machine   Operating System   Compiler   Total   Time   Relative Speed
Toshiba T1100+ MSDOS Turbo Pascal 3.01A (269) 10542 0.00009486
Apple Mac Plus Mac OS Lightspeed Pascal 2 (175.84)   6891 0.00014511
Toshiba T1100+ MSDOS Turbo Pascal 5.0 (162)   6349 0.00015750
Macintosh Classic Mac OS Think Pascal 3 (160)   6271 0.00015947
Macintosh Classic Mac OS Think C   (43.0)   4771 0.0002096
IBM PS2/60 MSDOS Turbo Pascal 5.0   (58.76)   2303 0.0004343
80286 (12 Mhz) MSDOS Turbo Pascal 5.0   (47.09)   1845.4 0.0005419
Apple Mac IIcx Mac OS Think Pascal 3   (42)   1645.5 0.0006077
Apple Mac SE/30 Mac OS Think Pascal 3   (42)   1645.6 0.0006077
Apple Mac IIcx Mac OS Lightspeed Pascal 2   (39.84)   1561.6 0.0006404
Apple Mac IIcx Mac OS Lightspeed Pascal 2#   (39.69)   1555.0 0.0006431
Zenith Z386 (16MHz) MSDOS Turbo Pascal 5.0   (38.27)   1539.0 0.0006498
Macintosh SE/30 Mac OS Think C   (13.6)   1508.4 0.0006630
386SX (16 MHz) MSDOS Turbo Pascal 6.0   (34)   1333.6 0.0007498
386SX (16 MHz) MSDOS Microsoft Quick C   (12.01)   1333.6 0.0007499
Sequent-S81 DYNIX Silicon Valley Pascal   (13.0)     509.0 0.0019646
VAX 11/785 Unix Berkeley Pascal   (11.9)     466.3 0.002144
80486-33 MSDOS Turbo Pascal 6.0   (11.46)     449.0 0.002227
Sun 3/60 SunOS Sun C     (3.93)     435.7 0.002295
NeXT Cube (68030) Mach Gnu C     (2.608)     289.3 0.003456
Sequent S-81 DYNIX Sequent Symmetry C     (2.604)      288.9 0.003461
VAXstation 3500 Unix Berkeley Pascal     (7.3)     286.5 0.003491
Sequent S-81 DYNIX Berkeley Pascal     (5.6)     219.5 0.004557
Unisys 7000/40 Unix Berkeley Pascal     (5.24)     205.3 0.004870
VAX 8600 VMS DEC VAX Pascal     (3.96)     155.23 0.006442
Sun SPARC IPX SunOS Gnu C version 2.1     (1.28)     142.04 0.007040
VAX 6000-530 VMS DEC C     (0.858)       95.14 0.010511
VAXstation 4000 VMS DEC C     (0.809)       89.81 0.011135
IBM RS/6000 540 AIX XLP Pascal     (2.276)       89.14 0.011219
NeXTstation(040/25) Mach Gnu C     (0.75)       83.15 0.012027
Sun SPARC IPX SunOS Sun C     (0.68)       75.43 0.01326
486DX (33 MHz) Linux Gnu C #     (0.63)       69.95 0.01430
Sun SPARCstation-1 Unix Sun Pascal     (1.7)       66.62 0.01501
DECstation 5000/200 Unix DEC Ultrix C     (0.45)       49.97 0.02001
Sun SPARC 1+ SunOS Sun C     (0.40)       44.37 0.02254
DECstation 3100 Unix DEC Ultrix Pascal     (0.77)       30.11 0.03321
IBM 3090-300E AIX Metaware High C     (0.27)       29.98 0.03336
DECstation 5000/125 Unix DEC Ultrix C     (0.267)       29.58 0.03381
DECstation 5000/200 Unix DEC Ultrix C     (0.256)       28.38 0.03524
Sun SPARC 4/50 SunOS Sun C     (0.249)       27.62 0.03621
DEC 3000/400 AXP Unix DEC C     (0.224)       24.85 0.04024
DECstation 5000/240 Unix DEC Ultrix C     (0.1889)       20.96 0.04771
SGI Iris R4000 Unix SGI C     (0.184)       20.41 0.04898
IBM 3090-300E VM Pascal VS     (0.464)       18.12 0.05519
DECstation 5000/200 Unix DEC Ultrix Pascal     (0.39)       15.188 0.06583
Pentium 120 Linux Gnu C      1.848       11.953 0.08366
Pentium Pro 180 Linux Gnu C      1.009         6.527 0.1532
Pentium 266 MMX Linux Gnu C (PHYLIP 3.5)     (0.054)         5.996 0.1668
Pentium 266 MMX Linux Gnu C      0.927         5.996 0.1668
Pentium 200 Linux Gnu C      0.853         5.517 0.1812
SGI PowerChallenge Irix Gnu C      0.844         5.459 0.1832
DEC Alpha 400 4/233 DUNIX Digital C (cc -fast)      0.730         4.722 0.2118
Pentium II 500 Linux Gnu C      0.368         2.380 0.4201
Dual 448/633 MHz Pentiums Linux gcc      0.3069         1.985 0.5037
Sun Ultra 10 Solaris 8 gcc      0.25848         1.672 0.5981
Macintosh G3 300 MHz Mac OS X Gnu C (-O 3)      0.2330         1.5071 0.6635
Compaq/Digital Alpha 500au DUNIX Digital C (cc -fast)      0.167         1.080 0.9257
AMD Athlon 1.2 GHz Linux gcc      0.1546         1.0 1.0
Intel Pentium 4 2.26 GHz Windows XP Cygwin gcc      0.1078         0.6973 1.434
Pentium 4 1700 MHz Linux Gnu C      0.10730         0.6940 1.441
SGI Fuel R16000/700MHz IRIX 6.5.30 MipsPro 7.4.4      0.09         0.58 1.72
Macintosh G4 1.2GHz Mac OS X Gnu C (-O 3)      0.0582         0.3765 2.656
AMD Athlon 2800 2.1 GHz Linux gcc (-O 3)      0.0455         0.2943 3.398
iMac 2 Ghz Intel Core Duo Mac OS X gcc (-O 3)      0.0300         0.1940 5.153
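
For the rows whose totals are not in parentheses, the last two columns follow directly from the Total column and the total for the standard machine (0.1546 for the AMD Athlon 1.2 GHz). The Python sketch below shows that calculation; the function name rate_machine is an assumption, and the extra rescaling applied to the parenthesized version 3.4 and 3.5 timings is not reproduced here.

```python
def rate_machine(total_seconds, standard_seconds=0.1546):
    """Return (Time, Relative Speed) for a machine on the Dnapars benchmark.

    Time is the machine's total divided by the standard machine's total;
    Relative Speed is the reciprocal of Time."""
    time = total_seconds / standard_seconds
    return time, 1.0 / time

# Pentium 4 1700 MHz row: total 0.10730
p4_time, p4_speed = rate_machine(0.10730)      # roughly 0.694 and 1.441
# iMac 2 GHz Intel Core Duo row: total 0.0300
imac_time, imac_speed = rate_machine(0.0300)   # roughly 0.194 and 5.153
```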

This benchmark not only reflects integer performance of these machines (as Dnapars has few floating-point operations) but also the efficiency of the compilers. Some of the machines (the DEC 3000/400 AXP and the IBM RS/6000, in particular) are much faster than this benchmark would indicate. The numerical programs benchmark below gives them a fairer test. The Compaq/Digital Alpha 500au times are exaggerated because, although their compiles are optimized for that processor, some of the Pentium compiles are not similarly optimized.

Note that parallel machines like the Sequent and the SGI PowerChallenge are not really as slow as indicated by the data here, as these runs did nothing to take advantage of their parallelism.

These benchmarks have now extended over 22 years (1986-2008), and in the Dnapars benchmark they extend over a range of over 54,000-fold in speed! The experience of our laboratory, which seems typical, is that computer power grows by a factor of about 1.85 per year. This is roughly consistent with these benchmarks.
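
As a rough cross-check, one can compute the average yearly factor implied by the two ends of the Dnapars table alone. The Python sketch below does this; the function name annual_growth is an assumption, and an estimate from endpoints depends heavily on which machines happen to sit at the top and bottom of the table, so it should be read as an order-of-magnitude comparison and not as a replacement for the figure quoted above.

```python
def annual_growth(fold_change, years):
    """Constant yearly factor that compounds to fold_change over years."""
    return fold_change ** (1.0 / years)

# Slowest and fastest "Time" entries in the Dnapars table.
fold = 10542 / 0.1940              # a bit over 54,000-fold
growth = annual_growth(fold, 22)   # about 1.64 per year over 1986-2008
```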

For a picture of speeds for a more numerically intensive program, here are benchmarks using Dnaml, with an AMD Athlon 1.2 GHz Linux system as the standard. Some of the timings, the ones in parentheses, are using PHYLIP version 3.5, and those are compared to that version run on the Pentium 266. Runs using the PHYLIP 3.4 Pascal version are adjusted using the 386SX timings where both were run. Numbers are total run times (total user time in the case of Unix) over all three data sets.

Machine   Operating System   Compiler   Seconds   Time   Relative Speed
386SX 16 Mhz PCDOS Turbo Pascal 6 (7826) 1027.55 0.0009732
386SX 16 Mhz PCDOS Quick C (6549.79) 1027.55 0.0009732
Compudyne 486DX/33 Linux Gnu C (1599.9)   251.0 0.003984
SUN Sparcstation 1+ SunOS Sun C (1402.8)   220.1 0.004543
Everex STEP 386/20 PCDOS Turbo Pascal 5.5 (1440.8)   189.17 0.005286
486DX/33 PCDOS Turbo C++ (1107.2)   173.70 0.005757
Compudyne 486DX/33 PCDOS Waterloo C/386 (1045.78)   164.07 0.006094
Sun SPARCstation IPX SunOS Gnu C   (960.2)   150.64 0.006638
NeXTstation(68040/25) Mach Gnu C   (916.6)   143.80 0.006954
486DX/33 PCDOS Waterloo C/386   (861.0)   135.08 0.007403
Sun SPARCstation IPX SunOS Sun C   (787.7)   123.58 0.008091
486DX/33 PCDOS Gnu C   (650.9)   102.12 0.009792
VAX 6000-530 VMS DEC C   (637.0)     99.94 0.01001
DECstation 5000/200 Unix DEC Ultrix RISC C   (423.3)     66.41 0.01506
IBM 3090-300E AIX Metaware High C   (201.8)     31.65 0.03159
Convex C240/1024 Unix C   (101.6)     15.940 0.06274
DEC 3000/400 AXP Unix DEC C     (98.29)     15.42 0.06485
Pentium 120 Linux Gnu C      25.26     19.230 0.05200
Pentium Pro 180 Linux Gnu C      18.88     14.372 0.06957
Pentium 200 Linux Gnu C      16.51     12.569 0.07956
SGI PowerChallenge IRIX Gnu C      12.446       9.475 0.10554
DEC Alpha 400 4/233 Linux Gnu C (cc -fast)       8.0418       6.122 0.16335
Pentium MMX 266 Linux Gnu C (PHYLIP 3.5)      (36.15)       5.671 0.17632
Pentium MMX 266 Linux Gnu C       7.45       5.671 0.17632
Pentium II 500 Linux Gnu C       6.02       4.583 0.2182
Dual 448/633 MHz Pentiums Linux Gnu C       3.7225       2.834 0.3529
Sun Ultra 10 Solaris 8 Gnu C       3.7101       2.824 0.3541
Pentium 4 1.7 GHz Linux Gnu C       2.0668       1.5734 0.6356
Macintosh G3 300 MHz Mac OS X Gnu C (-O 3)       1.805       1.3741 0.7278
Intel Pentium 4 2.26 GHz Windows XP Cygwin gcc       1.55457       1.1834 0.8450
AMD Athlon 1.2 GHz Linux Gnu C       1.3136       1.0 1.0
Compaq/Digital Alpha 500au Linux Gnu C (cc -fast)       0.9383       0.7143 1.4000
Macintosh G4 1.2 GHz Mac OS X Gnu C (-O 3)       0.7080       0.5390 1.8554
SGI Fuel R16000/700Mhz IRIX 6.5.30 MipsPro 7.4.4       0.55       0.41 2.43
AMD Athlon 2800 2.1 GHz Linux gcc (-O 3)       0.3065       0.2333 4.286
iMac 2 Ghz Intel Core Duo Mac OS X gcc (-O 3)       0.2535       0.1930 5.182

As before, the parallel machines such as the Convex and the SGI PowerChallenge were only run using one processor, which does not take into account the gain that could be obtained by parallelizing the programs. The speed of the Compaq/Digital Alpha 500au is exaggerated because it was compiled in a way optimized for its processor, while some of the Pentium compiles were not.

You are invited to send me figures for your machine for inclusion in future tables. Use the data sets above and compute the total times for Dnapars and for Dnaml for the three data sets (setting the frequencies of the four bases to 0.25 each for the Dnaml runs). Be sure to tell me the name and version of your compiler, and the version of PHYLIP you tested. If the times are too small to be measured accurately, obtain the times for 10 or 100 data sets (the Multiple data sets option) and divide by 10 or 100.


General Comments on Adapting the Package to Different Computer Systems

In the sections following you will find instructions on how to adapt the programs to different computers and compilers. The programs should compile without alteration on most versions of C. They use the "malloc" or "calloc" library functions to allocate memory, so that the upper limits on how many species or how many sites or characters they can run are set by the system memory available to those memory-allocation functions.

In the document file for each program, I have supplied a small input example, and the output it produces, to help you check whether the programs are running properly.


Compiling the programs

If you have not been able to get executables for PHYLIP, you should be able to make your own. This can be easy under Linux and Unix, but more difficult if you have a Macintosh or a Windows system. If you have either of those, we strongly recommend that you download and use the Macintosh and Windows executables that we distribute. If you do that, you will not need to have any compiler or to do any compiling. I get a certain number of inquiries each year from confused users who are not sure what a compiler is but think they need one. After downloading the executables they contact me and complain that they did not find a compiler included in the package, and would I please e-mail them the compiler. What they really need to do is use the executables and forget about compiling them.

Some users may also need to compile the programs in order to modify them. The instructions below will help with this.

I will discuss how to compile PHYLIP using one of a number of widely-used compilers. After these I will comment on compiling PHYLIP on other, less widely-used systems.

Unix and Linux

For Unix and Linux (which is Unix in all important functional respects, if not in all legal respects) you must compile PHYLIP yourself. This is usually easy to do. Unix (and Linux) systems generally have a C compiler and have the make utility. We distribute with the PHYLIP source code a Unix-compatible Makefile. We use GNU's make utility, which might be installed on your system as "make" or as "gmake".

However, note that some popular Linux distributions do not include a C compiler in their default configuration. For example, in RedHat Linux version 8, the "Personal Workstation" installation that is the default does not include the C compiler or the X Windows libraries needed to compile PHYLIP. These are available, and can be loaded from the CDROMs in the distribution. The following instructions assume that you have the C compiler and X libraries. If you cannot easily configure your system to include them, you should look into using the RedHat RPM binary distribution, mentioned on the PHYLIP 3.6 web page.

As is mentioned below (under Macintoshes) the Mac OS X operating system is a Unix, and if the X windows windowing system is installed, these Unix instructions will work for it.

After you have finished unpacking the Documentation and Source Code archive, you will find that you have created a folder phylip-3.6 in which there are three folders, called exe, src, and doc. There is also an HTML web page, phylip.html. The exe folder will be empty. The src folder contains the source code files, including the Makefile, and the doc folder contains the documentation files.

Enter the src folder. Before you compile, you will want to look at the Makefile and see whether you want to alter the compilation command. By default the C compiler flags (CFLAGS) are empty. If you have modified the programs, you might want to use the debugging flag "-g". On the other hand, if you are trying to make a fast executable using the GCC compiler, you may want to use the one which is "An optimized one for gcc". In either case, remove the "#" before that CFLAGS command, and place it before the CFLAGS command that was previously in use. There are careful instructions on this in the Makefile. Once you have set up the CFLAGS and DFLAGS statements to be the way you want, to compile all the programs just type:

make install

You will then see the compiling commands as they happen, with occasional warning messages. If these are warnings, rather than errors, they are not too serious. A typical warning would be like this:

dnaml.c:1204: warning: static declaration for re_move follows non-static

After a time the compiler will finish compiling. If you have done a make install the system will then move the executables into the exe folder and also save space by erasing all the relocatable object files that were produced in the process. You should be left with useable executables in the exe folder, and the src folder should be as before. To run the executables, go into the exe folder and type the program name (say dnaml, which you may or may not have to precede by a dot and a slash, ./). The names of the executables will be the same as the names of the C programs, but without the .c suffix. Thus dnaml.c compiles to make an executable called dnaml.

Our two tree-drawing programs, Drawgram and Drawtree, require an X Windows installation including the Athena Widgets. These are provided with most X Windows installations.

If you see messages that the compilation could not find "Xlib.h" and other, similar header files, this means that some parts of the X Windows development environment are not installed on your system, or are not installed in the default location. Similarly, if you get error messages saying that some files with "Xaw" in the name cannot be found, this means that the Athena Widgets are not installed on your system, or are not installed in the default location.

In either case, you will need to make sure that they are installed properly. If they are there but not found during the compile, change the DFLAGS and DLIBS variables in the Makefile to point to the locations of the header files and libraries, respectively.

Another possible problem is that the usual Linux C compiler is the Gnu GCC compiler. In some Linux systems it is not invoked by the command cc but by gcc. You would then need to edit the Makefile to reflect this (see below for comments on that process).

A typical Unix or Linux installation would put the directory phylip-3.6 in /usr/local. The name of the executables directory EXEDIR could be changed to be /usr/local/bin, so that the make install command puts the executables there. If the users have /usr/local/bin in their paths, the programs would be found when their names are typed. The font files font1 through font6 could also be placed there. A batch script containing the lines

      ln -s /usr/local/bin/font1 font1
      ln -s /usr/local/bin/font2 font2
      ln -s /usr/local/bin/font3 font3
      ln -s /usr/local/bin/font4 font4
      ln -s /usr/local/bin/font5 font5
      ln -s /usr/local/bin/font6 font6

could be used to establish links in the user's working directory so that Drawtree and Drawgram would find these font files when users type a name such as font1 when the program asks them for a font file name. The documentation web pages are in subdirectory doc of the main PHYLIP directory, except for one, phylip.html which is in the main PHYLIP directory. It has a table of all of the documentation pages, including this one. If users create a bookmark to that page it can be used to access all of the other documentation pages.

To compile just one program, such as Dnaml, type:

make dnaml

After this compilation, dnaml will be in the src subdirectory. So will some relocatable object code files that were used to create the executable. These have names ending in .o - they can safely be deleted.

If you have problems with the compilation command, you can edit the Makefile. It has careful explanations at its front of how you might want to do so. For example, you might want to change the C compiler name cc to the name of the Gnu C compiler, gcc. This can be done by removing the comment character # from the front of one line, and placing it at the front of a nearby line. How to do so should be clear from the material at the beginning of the Makefile. We have included sample lines for using the gcc compiler and for using the Cygwin Gnu C++ environment on Windows, as well as the default of cc.

We have encountered some problems, in our code for generating random numbers, with the Gnu C Compiler (gcc) on 64-bit Itanium processors when compiling with the -O3 optimization level.

Some older C compilers (notably the Berkeley C compiler which is included free with some Sun systems) do not adhere to the ANSI C standard (because they were written before it was set down). They have trouble with the function prototypes which are in our programs. We have included an #ifndef preprocessor command to eliminate the problem, if you use the switch -DOLDC when compiling. Thus with these compilers you need only use this in your C flags (in the Makefile) and compilers such as Berkeley C will cause no trouble.
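
The idea of hiding function prototypes behind an #ifndef can be sketched as follows. The function total_length is hypothetical, and the old-style fallback definition under #else is an assumption about what a pre-ANSI compiler would need; it is not copied from the PHYLIP source.

```c
#ifndef OLDC
/* function prototype: seen only when -DOLDC is not given */
long total_length(long nspecies, long nsites);
#endif

#ifndef OLDC
long total_length(long nspecies, long nsites)
#else
long total_length(nspecies, nsites)   /* old-style definition (assumption) */
long nspecies, nsites;
#endif
{
    /* number of characters in a full data matrix */
    return nspecies * nsites;
}
```

Compiled normally, the prototype is used. Compiled with -DOLDC in the C flags, the preprocessor removes it before the compiler ever sees it.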

Windows systems

We distribute Windows executables, and most likely you can use these and do not need to recompile them. The following instructions will only be necessary if you want to modify the programs and need to recompile them. They are given for several different compilers available on Windows systems. Another major compiler is the Intel compiler -- we do not have information yet on how to use it, but expect that PHYLIP will compile on it.

Compiling with Cygnus Gnu C++

Cygnus Solutions (now a part of Red Hat, Inc.) has adapted the Gnu C compiler to Windows systems and provided an environment, Cygwin, which mimics Unix for compiling. Currently, this is the compiler that we use to prepare the Windows executables. Cygwin is available for purchase, and they also make it available to be downloaded for free. The download is large. To get it, go to the Cygwin web site at http://www.cygwin.com and follow the instructions there. To download it you need to download their setup.exe program, which will download the rest when it is run. You will need a lot of disk space for it (about a gigabyte).

When installing Cygwin it is important to install gcc and make. During the course of the setup program, Setup will ask you to select packages. Expand the Devel category by clicking on it. Scroll down to gcc and check whether the "New" column says "Skip". If it does, click on "Skip". "Skip" will change to the current version of gcc. Scroll down to the make package, and if it has "Skip" click on "Skip". These two programs are necessary to install PHYLIP.

Once you have installed the free Cygwin environment and the associated Gnu C compiler on your Windows system, compiling PHYLIP is closely similar to what one does for Unix or Linux:

Compiling with Microsoft Visual C++

We have had success in the past compiling PHYLIP with Microsoft Visual C++ (the compiler in the Microsoft .NET package), although the Windows executables that we distribute are built using the Cygwin GCC compiler. The following instructions are the ones we have used for Visual C++ for the .NET 2008 version with Visual C++ version 9.0. Microsoft also makes a free download version of their C++ compiler from 2005 available as Visual C++ Express Edition. That version has a somewhat different content, and these instructions will not work with it. If you figure out how to get the compiler and Makefiles to work together, please let us know -- we don't have the energy to figure this out for all possible configurations of the Microsoft C++ compiler.

The instructions use the nmake command that uses a Makefile which is called Makefile.msvc in our distribution. At the end of this section we have some comments on how to compile the programs with Visual C++ version 7.0, which also has a somewhat different file folder structure.

With Microsoft Visual C++, you can compile using a Makefile. We have supplied this in the source code distribution as Makefile.msvc. You may wish to preserve the Unix Makefile by renaming Makefile to, say, Makefile.unix, then make a copy of Makefile.msvc and call it Makefile. (You may have to change your Windows desktop settings to make the three-letter extensions visible, or you could use the RENAME command in the Command tool).

If instead you have an earlier version of Visual Studio .NET which has the Visual C++ 7.0 compiler, you should proceed as above, but instead, set MSVC to C:\Program Files\Microsoft Visual Studio .NET, and then type

PATH=%PATH%;%MSVC%\Vc7\bin;%MSVC%\Common7\IDE

You will also need to edit the line in the Makefile that defines the variable MSVCPATH. You should change this to

MSVCPATH="C:\Program Files\Microsoft Visual Studio .NET\Vc7"

If this does not work with your Visual C++ 7.0 compiler, then the most likely reason is that your installation was not placed into the folder C:\Program Files, or has a name that is not exactly identical to Microsoft Visual Studio .NET. In that case, you will need to find the correct path to the Visual C++ 7.0 installation on your system, and supply this in the MSVC variable above, and also in the Makefile. (Note that in the Makefile, you will need to follow this path with \Vc7.)

Compiling with Borland C++

Borland C++ can be downloaded for free. It is a compiler released in 2000 and is now owned by Embarcadero Technologies, Inc. (see their site http://www.codegear.com/downloads/free/cppbuilder). To download it you need to register with them. It has a somewhat restrictive license, so we cannot use it for the widely-distributed executables.

You should download the compiler as it includes all the utilities needed to compile phylip. It can compile using a Makefile. We have supplied this in the source code distribution as Makefile.bcc. You will need to preserve the Unix Makefile by renaming it to, say, Makefile.unix, then make a copy of Makefile.bcc and call it Makefile. The Makefile is invoked using the make command.

You will first need to create an ilink32.cfg and a bcc32.cfg file and put the files into the src folder. These files are text files and their contents are described in the readme.txt that comes with the Borland tools. If the Borland tools are in the default location the contents of ilink32.cfg would be:

-L"c:\Borland\Bcc55\lib"

and the contents of bcc32.cfg would be:

-I"c:\Borland\Bcc55\include"
-L"c:\Borland\Bcc55\lib"

These files can be created in a text editor such as Notepad or Wordpad.

To invoke the make command you will first need to open a command prompt window. Then set the path appropriately. To set the path, type

set BORLAND=Path

Where "Path" is where Borland is installed, such as C:\Borland\BCC55. Then type

PATH=%PATH%;%BORLAND%\Bin

If you simply type make you will get a list of possible make commands. For example, to compile a single program such as Dnaml but not install it, type make dnaml. To compile and install all programs type make install. We have supplied all the support files and icons needed for the compilations. They are in folder bcc of the main PHYLIP source code folder. We have had to supply a complete second set of the resource files with names *.brc because Borland resource files have a minor incompatibility with Microsoft Visual C++ resource files.

Macintosh

Compiling with GCC on Mac OS X with our Makefile

The executables distributed by us for Mac OS X are currently compiled using the GCC compiler that is distributed with Mac OS X. We are distributing 32-bit "universal binaries" that work on both PowerMac and Intel iMac. You may not need to recompile them unless you need a version of the executables more closely adapted to your system, or unless you want to modify the programs. One reason to recompile might be if you want 64-bit executables, which you might need to address large amounts of memory.

If you do want to recompile, consider the following:

Compiling with GCC on Mac OS X with X Windows

On Mac OS X systems you can also use the GCC compiler and X Windows to compile a version of the executables that runs from the command line in native mode. To do that, you must have the GCC compiler and the X11 windows development kit materials installed. X Windows is an optional install present on the Mac OS X for version 10.3 (Panther) and 10.4 (Tiger) distribution disks, and part of the default distribution for 10.5 (Leopard) on. You can search for the latest X11 release for your system by looking at results after searching for "x11" on the Apple Downloads page. It is easy to download and install on a Mac OS X system.

If you have the GCC compiler and the X11 libraries installed, you can use a Terminal window (which you will find available in the Utilities folder in the Applications folder) and compile PHYLIP by treating it as a Unix or Linux application and following the instructions given above under "Unix and Linux". Basically you just get into the folder that contains the PHYLIP source code and type

make install

This uses the ordinary Unix/Linux Makefile, which works in creating programs using X11 for Mac OS X with the gcc compiler. Note that to run the programs drawgram and drawtree that actually use the X Windows, you will need to

What about the Metrowerks Codewarrior compiler?

We previously also supported the Metrowerks Codewarrior compiler, for both Mac OS 9 and Mac OS X (and even for producing Windows executables). Codewarrior required that one maintain "projects" for each program, and we distributed the projects as well as the source code. As Metrowerks was bought out by Freescale and has retargeted its compilers for building embedded applications, we are ceasing this rather cumbersome support. That means that we do not at present give you a way to recompile our programs if you have Mac OS 9. We may not make a set of executables for Mac OS 9 ourselves. If you absolutely need to obtain compilation support routines and projects for Metrowerks Codewarrior, contact us and we will send you what we have. Of course, for Mac OS X the GCC compiler is available, and we describe above how to compile the programs with it.

VMS VAX systems

VMS VAX systems have almost disappeared, so we have not tried to compile version 3.695 on an OpenVMS system. The following instructions should work. On the OpenVMS operating system with DEC VAX VMS C the programs will compile without alteration. The commands for compiling a typical program (Dnapars, which depends on the separately compiled files phylip.c and seq.c) are:

$ DEFINE LNK$LIBRARY SYS$LIBRARY:VAXCRTL
$ CC DNAPARS.C
$ CC PHYLIP.C
$ CC SEQ.C
$ LINK DNAPARS,PHYLIP,SEQ

Once you use this $ DEFINE statement during a given interactive session, you need not repeat it again as the symbol LNK$LIBRARY is thereafter properly defined. The compilation process leaves a file DNAPARS.OBJ in your directory: this can be discarded. The executable program is named DNAPARS.EXE. To run the program one then uses the command:

$ R DNAPARS

The compiler defaults to the filenames INFILE., OUTFILE., and TREEFILE.. If the input file INFILE. does not exist the program will prompt you to type in its name. Note that some commands on VMS such as TYPE OUTFILE will fail because the name of the file that it will attempt to type out will be not OUTFILE. but OUTFILE.LIS. To get it to type the right file you would have to instead issue the command TYPE OUTFILE..
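
The fallback from a default input file name to a name supplied by the user, as described above, might be sketched in C like this. The function open_input and its arguments are hypothetical, and the interactive prompt is replaced by a second argument to keep the sketch short; this is not PHYLIP's own code.

```c
#include <stdio.h>

/* Hypothetical sketch: try the default name first and, if no file
   of that name can be opened, fall back to a name supplied by the
   user.  Returns an open stream, or NULL if neither could be opened. */
FILE *open_input(const char *default_name, const char *user_name)
{
    FILE *fp = fopen(default_name, "r");

    if (fp == NULL && user_name != NULL)
        fp = fopen(user_name, "r");
    return fp;
}
```

In a real program the second name would come from the user's reply to the prompt, and the program would ask again if that file could not be opened either.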

When you are using the interactive previewing feature of Drawgram (or Drawtree) on a Tektronix or DEC ReGIS compatible terminal, you will want before running the program to have issued the command:

$ SET TERM/NOWRAP/ESCAPE

so that you do not run into trouble from the VMS line length limit of 255 characters or the filtering of escape characters.

To know which files to compile together, look at the entries in the Makefile.

Parallel computers

As parallel computers become more common, the issue of how to compile PHYLIP for them has become more pressing. People have been compiling PHYLIP for vector machines and parallel machines for many years. We have not made a version for parallel machines because there is still no standard parallel programming environment on such machines (or rather, there are many standards, so that one cannot find one that makes a parallel execution version of PHYLIP widely distributable). However symmetric multiprocessing using the MPI Message Passing Interface is spreading rapidly, and we will probably support it in future versions of PHYLIP.

Although the underlying algorithms of most programs, which treat sites independently, should be amenable to vector and parallel processors, there are details of the code which might best be changed. In certain of the programs (Dnaml, Dnamlk, Proml, Promlk) I have put a special comment statement next to the loops in the program where the program will spend most of its time, and which are the places most likely to benefit from parallelization. This comment statement is:

           /* parallelize here */

In particular within these innermost loops of the programs there are often scalar quantities that are used for temporary bookkeeping. These quantities (such as sum1, sum2, zz, z1, yy, y1, aa, bb, cc, sum, and denom in procedure makenewv of Dnaml and similar quantities in procedure nuview) are there to minimize the number of array references. For vectorizing and parallelizing compilers it will be better to replace them by arrays so that processing can occur simultaneously.
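
The kind of change meant here can be sketched with a toy loop. The functions and the arithmetic below are hypothetical and far simpler than makenewv or nuview; they only show a scalar temporary being replaced by an array with one slot per site.

```c
#include <stddef.h>

/* Scalar version: the single temporary "sum" is reused on every
   pass through the loop, which, per the text above, is what some
   vectorizing and parallelizing compilers handle poorly. */
void site_terms_scalar(const double *a, const double *b,
                       double *out, size_t nsites)
{
    double sum;
    size_t i;

    for (i = 0; i < nsites; i++) {
        sum = a[i] + b[i];
        out[i] = sum * sum;
    }
}

/* Array version: each site has its own slot in tmp[], so the work
   splits into two passes with no scalar shared between sites. */
void site_terms_array(const double *a, const double *b,
                      double *tmp, double *out, size_t nsites)
{
    size_t i;

    for (i = 0; i < nsites; i++)
        tmp[i] = a[i] + b[i];
    for (i = 0; i < nsites; i++)
        out[i] = tmp[i] * tmp[i];
}
```

Both versions give the same results. The array version costs extra memory for tmp[], which is the price of removing the shared temporary.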

If you succeed in making a parallel version of PHYLIP we would like to know how you did it. In particular, if you can prepare a web page which describes how to do it for your computer system, we would like to use material from it in our PHYLIP web pages. Please e-mail it to me. We hope to have a set of pages that give detailed instructions on how to make parallel versions of PHYLIP on various kinds of machines. Alternatively, if we were given your modified version of the program we might be able to figure out how to make modifications to our source code to allow users to compile the program in a way which makes those modifications.

Other computer systems

As you can see from the variety of different systems on which these programs have been successfully run, there are no serious incompatibility problems with most computer systems. PHYLIP in various past Pascal versions has also been compiled on 8080 and Z80 CP/M Systems, Apple II systems running UCSD Pascal, a variety of minicomputer systems such as DEC PDP-11's and HP 1000's, on 1970's era mainframes such as CDC Cyber systems, and so on. In a later era it was also compiled on IBM 370 mainframes, and of course on DOS and Windows systems and on Macintosh systems. We have gradually accumulated experience on a wider variety of C compilers. If you succeed in compiling the C version of PHYLIP on a different machine or a different compiler, I would like to hear the details so that I can consider including the instructions in a future version of this manual.

Compiling the Java interfaces

The ONLY reason you should do this is if you want to add or modify functionality on the Java interface. In all other cases, the .jar files that already exist in the javajars folder will run on your Mac / MS / Linux / Unix system and you should not be here.

Welcome to a fairly complex process. Unless you are an experienced object oriented programmer, you will find Java has a steep learning curve and will cause you headaches.

The general overview is that there is a Java interface that gathers and validates input from the user, there is a call from the Java code to a dynamic C library that contains the Phylip functionality, and there is feedback to the user from the Java interface as to the status of the underlying C code. Because one has two very different kinds of software running, the feedback is not as elegant as one would expect from a single integrated environment.

Now for the specifics. We have developed these Java interfaces using the Eclipse environment (available from www.eclipse.org). Go there and download the version of the Java development environment appropriate to your operating system.

In the distribution there is a javasrc folder which contains folders that match the programs. These folders contain the program's Java interfaces. For example folder drawgram contains DrawgramInterface.java and DrawgramUserInterface.java. The former does the interaction with the compiled C library, the latter contains the user interface.

If you want to modify the Java interface for Drawgram, open the Eclipse Java development environment, create a project called Phylip3.695, create a folder under it called src. Under that create a project called drawgram. Now import the two Drawgram associated java files (DrawgramUserInterface.java and DrawgramInterface.java) into that project. You will also need to create a project called util and import all the items in the javasrc/util directory. Open DrawgramUserInterface.java with the Eclipse WindowBuilderEditor and you can edit it however you want. Remember that you'll need to add ActionListeners (described in Java manuals) to anything that changes things on the screen. There are plenty of examples of them in DrawgramUserInterface.java, for example, TreeGrowToggle which handles the toggling "Tree grows:" between "Horizontal" and "Vertical" using Radio buttons. Most of the pieces you'll need are in the existing code. You can clone them and edit to fit. Beyond that, "Google is your friend".

Once you have added new functionality or changed existing functionality in the user interface, you will need to pass the information it collects from the user to the underlying C code. This is a bit tricky because C and Java are very different kinds of languages. Luckily Sun provided the Java Native Access / Java Native Interface (JNA/JNI) interface package to take care of it. We used JNA (which calls JNI) because it is simpler to use and our needs were basic enough we could live within its confines. In order to use it you will also need to get two public jars off the web (do a Google search for these as they keep moving around):

JNA passes everything via an enormous list of variables. This is simple to program but very hard to keep track of, as you have to keep things exactly parallel in the Java and C code and there is no debugger that will help you. We have found it best to build a public class in Java that contains everything that is going to the C code and create an instance of it when the user is finished with data entry and decides to execute the process (in the Drawgram case, selects Preview). We then copy all the data from the screen into the members of the class, and pass these directly into the JNA call to the underlying C code (look in DrawgramInterface.java for an example).

In the underlying C code (which must be compiled as a library so that Java can access it), there is an entry point that is the name of the program (for example the function drawgram in drawgram.c) containing as arguments every one of the variables that were passed by the Java interface, in the same order. If you have weird bugs, most likely you messed this up. Make a copy of the Java class definition, paste it into the C code and check everything. Another wrinkle that can bite you is that booleans come through as integers and Java and C do not agree as to what that means. False is 0 in both languages. True is "not 0" in Java and often set to all bits on (which is -1 when read as a signed integer in C). C often has problems with this. Each compiler is different and there are environment variables that affect this also. It is safest to explicitly fix things before you execute any C code. There are a lot of other odd quirks, but you have two working examples (Drawgram.c and Drawtree.c), so you can probably figure them out.
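
One way to "explicitly fix things" for boolean arguments is to collapse every nonzero value to 1 at the top of the entry point. The sketch below is an illustration only: normalize_flag and example_entry are hypothetical names, and the two arguments are not the argument list of drawgram.c.

```c
/* Collapse any nonzero value arriving from the Java side to 1, so
   that later C code can safely compare against 0 and 1. */
int normalize_flag(int raw)
{
    return raw != 0;
}

/* Hypothetical entry point: names and argument order are for
   illustration and are not those of drawgram.c. */
int example_entry(int grows_vertically, int use_lengths)
{
    grows_vertically = normalize_flag(grows_vertically);
    use_lengths = normalize_flag(use_lengths);
    /* pack the two flags so a caller can see what was received */
    return grows_vertically * 2 + use_lengths;
}
```

With this in place a "true" that arrives as -1, 1, or 255 is treated identically by the rest of the C code.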

Feedback from C to Java can be difficult. In Drawgram and Drawtree it is fairly easy, as the plotting is done (to the file JavaPreview.ps in case you need to know) and the program returns. The Java interface waits until the C code completes and returns, then reads JavaPreview.ps and displays the preview. In cases where one needs progress indicators, one needs to multithread the Java code and display a continually updating progress file. Phylip 3.695 has no need of multithreading but it will be implemented in Phylip 4.0.
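
The "write the file, then return" pattern on the C side can be sketched as below. The function write_preview and the PostScript it writes are hypothetical stand-ins for the real plotting code. The point is only that the file is closed before the function returns, so that a caller which waits for the return finds a complete file.

```c
#include <stdio.h>

/* Hypothetical sketch: write a tiny PostScript preview and close
   the file before returning.  Returns 0 on success, -1 on failure. */
int write_preview(const char *path)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    fputs("%!PS-Adobe-3.0\n", fp);
    fputs("newpath 100 100 moveto 200 200 lineto stroke\n", fp);
    fputs("showpage\n", fp);
    /* fclose flushes buffered output; if it fails, the write failed */
    return (fclose(fp) == 0) ? 0 : -1;
}
```

A caller would check the return value before trying to read and display the preview file.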


Frequently Asked Questions

This set of Frequently Asked Questions, and their answers, is from the PHYLIP web site. A more up-to-date version can be found there, at:

http://evolution.gs.washington.edu/phylip/faq.html

Problems that are encountered

"It doesn't work! It doesn't work!! It says can't find infile."
Actually, it's working just fine. Many of the programs look for an input file called infile, and if one of that name is not present in the current folder, they then ask you to type in the name of the input file. That's all that it's doing. This is done so that you can get the program to read the file without you having to type in its name, by making a copy of your input file and calling it infile. If you don't do that, then the program issues this message. It looks alarming, but really all that it is trying to do is to get you to type in the name of the input file. Try giving it the name of the input file.
"The program reads my data file and then says it has a memory allocation error!"
This is what tends to happen if there is a problem with the format of the data file, so that the programs get confused and think they need to set aside memory for 1,000,000 species or so. The result is a "memory allocation error" (the error message may say that "the function asked for an inappropriate amount of memory"). Check the data file format against the documentation: make sure that the data files have not been saved in the format of your word processor (such as Microsoft Word) but in a "flat ASCII" or "text only" mode. Note that adding memory to your computer is not the way to solve this problem -- you probably have plenty of memory to run the program once the data file is in the correct format.
"I opened the program but I don't see where to create a data file!"
The programs (there are more than one) use data files that have been created outside of the program. They do not have any data editor within them. You can create a data file by using an editor, such as Microsoft Word, Emacs, vi, TextEdit, Notepad, etc. But be sure not to save the file in Microsoft Word's own format. It should be saved in Text Only format (in Mac OS X TextEdit you need to use the Make Plain Text menu choice in the Format menu). You can use the documentation files, including the examples at the end of those files, to figure out the format of the input file. Documentation files such as main.html, sequence.html, distance.html and many others should be consulted. Many users create their data files by having their alignment program (such as ClustalW) output its alignments in PHYLIP format. Many alignment programs have options to do that.
"There is an error message saying that there is already a file named outfile!"
This is perfectly normal. When any PHYLIP program starts to open an output file to write its output on it, it tries to open a file called "outfile". If there is already an output file of that name, it asks you whether you want to replace it, or whether you want to append to it, or whether you want to open instead a file of a new name, or whether you just want to quit. Choose one of these. If you do not need the information that is in the old "outfile", just tell it to overwrite (replace) the file by typing the letter R and then pressing the Enter key. The program will proceed normally after that. There are also options available to you to Append your output to "outfile" or to have the output written to a new File whose name you provide. (Of course, it is good practice to rename any output file called "outfile" that contains results that you want to keep, to prevent that file from being overwritten).
"The program ran but it analyzed the wrong data set!"
This can happen if you put a data set in the current folder, perhaps as a file named myfile.dna, and intend to have the program analyze that. But you fail to notice that the folder already has another data file in it, named infile. The programs will always try to find a file named infile, and they will read that file if they find it. You should either copy your file into file infile, or delete file infile so that when the program does not find it, it will ask you for the name of the input file.
"I ran PHYLIP, and all it did was say it was extracting a bunch of files!"
There is no executable program named PHYLIP in the PHYLIP package! But in some cases (especially the Windows distribution) there is a file called phylip-3.695.exe. That file is an archive of documentation and source code. Once you have run it and extracted the files in it, so that they are in the folder, running it again will just do the extraction again, which is unnecessary.
"One program makes an output file and then the next program crashes while reading it!"
Did you rename the file? If a program makes a file called outfile, and then the next program is told to use outfile as its input file, things can get confusing. The second program first tries to open outfile as an output file, and since it finds one of that name already there, it asks you whether to overwrite that file. If you say to do that, the program overwrites the file, thus erasing it. When it then also tries to read from this empty outfile a psychological crisis can ensue. The solution is simply to rename outfile before trying to use it as an input file.
"I make a file called infile and then the program can't find it!"
Let me guess. You are using Windows, right? You made your file in Word or in Notepad or WordPad, right? If you made a file in one of these editors, and saved it, not in Word format, but in Text Only format, then you were doing the right thing. But when you told the operating system to save the file as infile, it actually didn't. It saved it as infile.txt. Then just to make life harder for you, the operating system is set up by default to not show that three-letter extension to the file name. Next to its icon it will show the name infile. So you think, quite reasonably, that there is a file called infile. But there isn't a file of that name, so the program, quite reasonably, can't find a file called infile. If you want to check what the actual file name is, use the Properties menu item of the File item on your folder. If you are annoyed at not seeing the full file name, with the three-letter extensions, then you can set the operating system to show them by choosing in the folder's Tools menu (at the top of its window) the Folder Options and then the View tab, and setting the "Hide extensions for known file types" to not be selected. In any case, you should be able to get the program to work by telling it that the file name is infile.txt.
"Consense gives weird branch lengths! How do I get more reasonable ones?"
Consense gives branch lengths which are simply the numbers of replicates that support the branch. This is not a good reflection of how long those branches are estimated to be. The best way to put better branch lengths on a consensus tree is to use it as a User Tree in a program that will estimate branch lengths for it, such as Dnaml. You may need to convert it to being an unrooted tree, using Retree, first. If the original program you were using was a program that does not estimate branch lengths, you may instead have to use one that does. You can use a likelihood program, or make some distances between your species (using, for example, Dnadist) and use Fitch to put branch lengths on the user tree. Here is the sequence of steps you should go through:
  1. Take the tree and use Retree to make sure it is Unrooted (just read it into Retree and then save it, specifying Unrooted)
  2. Use the unrooted tree as a User Tree (option U) in one of our programs (such as Dnaml or Fitch). If you use Fitch, you also first need to use one of the distance programs such as Dnadist to compute a set of distances to serve as its input.
  3. Specify that the branch lengths of the tree are not to be used but should be re-estimated. This is actually the default.
"I looked at the tree printed in the output file outfile and it looked weird. Do I always need to look at it in Drawgram?"
It's possible you are using the wrong font for looking at the tree in the output file. The tree is drawn with dashes and exclamation points. If a proportional font such as Times Roman or Helvetica is used, the tree lines may not connect. Try selecting the whole tree and setting the font to a fixed-width one such as Courier. You may be astounded how much clearer the tree has become.
"Drawtree (or Drawgram) doesn't work: it can't find the font file!"
Six font files, called font1 through font6, are distributed with the executables (and with the source code too). The program looks for a copy of one of them called fontfile. If you haven't made such a copy called fontfile it then asks you for the name of the font file. If they are in the current folder, just type one of font1 through font6. The reason for having the program look for fontfile is so that you can copy your favorite font file, call the copy fontfile, and then it will be found automatically without you having to type the name of the font file each time.
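Making the copy can be done by hand, or with a short script such as this Python sketch. The function name install_fontfile is hypothetical; the file names font1 through font6 and fontfile are the ones described above, and the sketch assumes they live in the same folder.

```python
import os
import shutil

def install_fontfile(choice="font1", folder="."):
    """Copy one of the distributed font files to the name 'fontfile'.

    Hypothetical helper: 'choice' should be one of font1 ... font6.
    The original font file is left in place.
    """
    src = os.path.join(folder, choice)
    dst = os.path.join(folder, "fontfile")
    shutil.copyfile(src, dst)
    return dst
```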
"Can Drawgram draw a scale beside the tree? Print the branch lengths as numbers?"
It can't do either of these. Doing so would make the program more complex, and it is not obvious how to fit the branch length numbers into a tree that has many very short internal branches. If you want these scales or numbers, choose an output plot file format (such as Postscript, PICT or PCX) that can be read by a drawing program such as Adobe Illustrator, Freehand, Canvas, CorelDraw, or MacDraw. Then you can add the scales and branch length numbers yourself by hand. Note the menu option in Drawtree and Drawgram that specifies the tree size to be a given number of centimeters per unit branch length.
"How can I get Drawgram or Drawtree to print the bootstrap values next to the branches?"
When you do bootstrapping and use Consense, it prints the bootstrap values in its output file (both in a table of sets, and on the diagram of the tree which it makes). These are also in the output tree file of Consense. There they are in place of branch lengths. So to get them to be on the output of Drawgram or Drawtree, you must write the tree in the format of a drawing program and use it to put the values in by hand, as mentioned in the answer to the previous question.
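If you only need the numbers, they can be pulled out of the tree file with a few lines of Python. This is a rough sketch, not a PHYLIP feature: it assumes the tree is a Newick string in which the value for each internal group follows the group's closing parenthesis and a colon, and the example tree is invented for illustration.

```python
import re

def support_values(newick):
    """Return the numbers that follow '):' in a Newick tree string.

    In a consensus tree these sit where branch lengths would normally be.
    Numbers attached to tips (after a species name) are ignored.
    """
    return [float(x) for x in re.findall(r"\):([0-9]+(?:\.[0-9]+)?)", newick)]

# Invented example tree, for illustration only.
example = "((A:100.0,B:100.0):87.0,(C:100.0,D:100.0):64.0);"
values = support_values(example)  # [87.0, 64.0]
```

The values come out in the order in which the groups close in the file, so you still have to match them to branches on the drawing yourself.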
"Dnaml won't read the treefile that is produced by Dnapars!"
That's because the Dnapars tree file is a rooted tree, and Dnaml wants an unrooted tree. Try using Retree to change the file to be an unrooted tree file. Our most recent versions of the programs usually automatically convert a rooted tree into an unrooted one as needed. But the programs such as Dnamlk or Dollop that need a rooted tree won't be able to use an unrooted tree.
"What is a good value for the random number seed?"
The random number seed is used to start a process of choosing "random" (actually pseudorandom) numbers, which behave as if they were unpredictably randomly chosen between 0 and 2^32-1 (which is 4,294,967,295). You could put in the number 133 and find that the next random number was 221,381,825. As they are effectively unpredictable, there is no such thing as a choice that is better than any other, provided that the numbers are of the form 4n+1 (this can be judged from the last two digits of the number: for example if they are 37 it is of this form as 37=4*9+1). However if you re-use a random number seed, the sequence of random numbers that result will be the same as before, resulting in exactly the same series of choices, which may not be what you want.
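The 4n+1 rule is easy to check mechanically. Here is a small Python sketch; the function names are hypothetical and nothing like this is built into the programs.

```python
def is_valid_seed(seed):
    """True if seed is a positive integer of the form 4n+1."""
    return seed > 0 and seed % 4 == 1

def last_two_digits_ok(seed):
    """The same check using only the last two digits.

    This works because 100 is divisible by 4, so the remainder on
    division by 4 depends only on the last two digits.
    """
    return (seed % 100) % 4 == 1

def next_valid_seed(seed):
    """Smallest number of the form 4n+1 that is >= seed (and >= 1)."""
    seed = max(seed, 1)
    while seed % 4 != 1:
        seed += 1
    return seed
```

For example, 133 passes the check because 133 = 4*33+1, while 100 does not; next_valid_seed(100) moves it up to 101.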
"In bootstrapping, Seqboot makes too large a file"
If there are 1000 bootstrap replicates, it will make a file 1000 times as long as your original data set. But for many methods there is another way that uses much less file space. You can use Seqboot to make a file of multiple sets of weights, and use those together with the original data set to do bootstrapping.
"In bootstrapping, the output file gets too big."
When running a program such as Neighbor or Dnapars with multiple data sets (or multiple weights) for purposes of bootstrapping, the output file is usually not needed, as it is the output tree file that is used next. You can use the menu of the program to turn off the writing of trees into the output file. The trees will still be written into the output tree file.
"Why don't your programs correctly read the sequence alignment files produced by ClustalW?"
They do read them correctly if you make the right kind. Files from ClustalV or ClustalW whose names end in ".aln" are not in PHYLIP format, but in Clustal's own format which will not work in PHYLIP. You need to find the option to output PHYLIP format files, which ClustalW and ClustalV usually assign the extension .phy.
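A quick way to tell which kind of file you have is to look at its first line. The Python sketch below rests on two assumptions that are not stated above: that a Clustal-format file begins with the word CLUSTAL, and that a PHYLIP-format file begins with two integers (the number of species and the number of sites). The function name is hypothetical.

```python
def guess_alignment_format(first_line):
    """Guess the format of an alignment file from its first line.

    Assumptions (see the text above this block): Clustal files start
    with 'CLUSTAL'; PHYLIP files start with two whole numbers.
    """
    if first_line.strip().upper().startswith("CLUSTAL"):
        return "clustal"
    tokens = first_line.split()
    if len(tokens) >= 2 and tokens[0].isdigit() and tokens[1].isdigit():
        return "phylip"
    return "unknown"
```

If it reports "clustal", go back to the alignment program and ask it for PHYLIP output rather than editing the file by hand.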
"Why doesn't Neighbor read my DNA sequences correctly?"
Because it wants to have as input a distance matrix, not sequences. You have to use Dnadist to make the distance matrix first.
"On our Mac OS 9 system, larger data files fail to run."
We have set the memory allowances on the Mac OS 9 executables to be generous, but not too big. You therefore may need to increase them. Use the Get Info item on the Finder File menu.

How to make it do various things

"How do I bootstrap?"
The general method of bootstrapping involves running Seqboot to make multiple bootstrapped data sets out of your one data set, renaming the output file, then running one of the tree-making programs with the Multiple data sets option to analyze them all, renaming the output tree file, then finally running Consense to make a majority rule consensus tree from the resulting tree file. Read the documentation of Seqboot to get further information. With this system almost any of the tree-making methods in the package can be bootstrapped. It is somewhat tedious but you will find it generally useable.
"How do I specify a multi-species outgroup with your parsimony programs?"
It's not a feature but is not too hard to do in many of the programs. In parsimony programs like Mix, for which the W (Weights) and A (Ancestral states) options are available, and weights can be larger than 1, all you need to do is:
(a) In Mix, make up an extra character with states 0 for all the outgroups and 1 for all the ingroups. If using Dnapars the ingroup can have (say) G and the outgroup A.
(b) Assign this character an enormous weight (such as Z for 35) using the W option, all other characters getting weight 1, or whatever weight they had before.
(c) If it is available, use the A (Ancestral states) option to designate that for that new character the state found in the outgroup is the ancestral state.
(d) In Mix do not use the O (Outgroup) option.
(e) After the tree is found, the designated ingroup should have been held together by the fake character. The tree will be rooted somewhere in the outgroup (the program may or may not have a preference for one place in the outgroup over another). Make sure that you subtract from the total number of steps on the tree all steps in the new character.
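Steps (a) and (b) amount to a little bookkeeping, sketched here in Python. The function names and the dictionary layout are hypothetical, and the sketch only builds the character strings and the string of weights; how the weights are actually supplied to the program is described in that program's documentation.

```python
def add_outgroup_character(data, ingroup):
    """Append a fake 0/1 character: 1 for ingroup species, 0 for the rest.

    data    -- dict mapping species name to its string of 0/1 states
    ingroup -- set of species names that should be held together
    """
    return {name: states + ("1" if name in ingroup else "0")
            for name, states in data.items()}

def weights_line(n_original, original_weight="1", fake_weight="Z"):
    """Weight 1 for each original character, then Z (35) for the fake one."""
    return original_weight * n_original + fake_weight

# Invented example: species A and B form the ingroup.
data = {"A": "01", "B": "11", "C": "00", "D": "10"}
new_data = add_outgroup_character(data, {"A", "B"})
weights = weights_line(2)  # "11Z"
```

Remember that the number of characters given at the top of the data file goes up by one when the fake character is added.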
In programs like Dnapars, you cannot use this method as weights of sites cannot be greater than 1. But you can do an analogous trick, by adding a largish number of extra sites to the data, with one nucleotide state ("A") for the ingroup and another ("G") for the outgroup. You will then have to use Retree to manually reroot the tree in the desired place.
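The extra-sites trick can be sketched the same way in Python. The function name is hypothetical, and the default of 20 extra sites is just a guess at what "largish" might mean for a small data set.

```python
def add_outgroup_sites(seqs, ingroup, n_extra=20,
                       ingroup_base="A", outgroup_base="G"):
    """Append n_extra fake sites to every sequence.

    seqs    -- dict mapping species name to its nucleotide sequence
    ingroup -- set of species names that should be held together
    Ingroup species get ingroup_base at each fake site, all others
    get outgroup_base.
    """
    return {name: seq + (ingroup_base if name in ingroup else outgroup_base) * n_extra
            for name, seq in seqs.items()}
```

The number of sites in the first line of the data file must be increased by n_extra, and the steps contributed by the fake sites should be subtracted from the total, as in step (e) above.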
"How do I force certain groups to remain monophyletic in your parsimony programs?"
By the same method as in the previous question, using multiple fake characters, any number of groups of species can be forced to be monophyletic. In Move, Dolmove, and Dnamove you can specify whatever outgroups you want without going to this trouble.
"How can I reroot one of the trees written out by PHYLIP?"
Use the program Retree. But keep in mind whether the tree inferred by the original program was already rooted, or whether you are free to reroot it without changing its meaning.
"What do I do about deletions and insertions in my sequences?"
The molecular sequence programs will accept sequences that have gaps (the "-" character). They do various things with them, mostly not optimal. Programs such as Dnaml and Dnadist count gaps as equivalent to unknown nucleotides (or unknown amino acids) on the grounds that we don't know what would be there if something were there. This completely leaves out the information from the presence or absence of the gap itself, but does not bias the gapped sequence to be close to or far from other gapped or ungapped sequences. Sequences that share a gap at a site do not tend to cluster together on the tree. So it is not necessary to remove gapped regions from your sequences, unless the presence of gaps indicates that the region is badly aligned. An exception to this is Dnapars, which counts "gap" as if it were a fifth nucleotide state (in addition to A, C, G, and T). Each site counts one change when a gap arises or disappears. The disadvantage of this treatment is that a long gap will be overweighted, with one event per gapped site. So a gap of 10 nucleotides will count as being as much evidence as 10 single site nucleotide substitutions. If there are not overlapping gaps, one way to correct this is to recode the first site in the gap as "-" but make all the others be "?" so the gap only counts as one event.
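The recoding suggested in the last sentence can be done mechanically. This Python sketch (the function name is hypothetical) treats each run of "-" in a sequence as one gap, keeps the first "-", and turns the rest into "?". As noted above, it is only appropriate when gaps in different sequences do not overlap.

```python
import re

def recode_gaps(seq):
    """Keep the first '-' of each gap and replace the rest with '?'.

    A gap of 10 sites then counts as one event instead of 10.
    """
    return re.sub(r"-+", lambda m: "-" + "?" * (len(m.group()) - 1), seq)
```

For example, recode_gaps("AC----GT") gives "AC-???GT", while a sequence with no gaps or with single-site gaps comes back unchanged.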
"How can I produce distances for my data set which has 0's and 1's?"
You can't do it in a simple and general way, for a straightforward reason. Distance methods must correct the distances for superimposed changes. Unless we know specifically how to do this for your particular characters, we cannot accomplish the correction. There are many formulas we could use, but we can't choose among them without much more information. There are issues of superimposed changes, as well as heterogeneity of rates of change in different characters. Thus we have not provided a distance program for 0/1 data. It is up to you to figure out what is an appropriate stochastic model for your data and to find the right distance formulas. If the 0's and 1's are presences and absences of restriction sites or restriction fragments, you can use program Restdist to compute appropriate distances.
"I have RFLP fragment data: which programs should I use?"
This is a more difficult question than you may imagine. Here is a quick tour of the issues:
  • You can code fragments as 0 and 1 and use a parsimony program. It is not obvious in advance whether 0 or 1 is ancestral, though it is likely that change in one direction is more likely than change in the other for each fragment. One can use either Wagner parsimony (programs Pars, Mix, Penny or Move) or use Dollo parsimony (Dollop, Dolpenny or Dolmove) with the ancestral states all set as unknown ("?").
  • You can use a distance matrix method using the RFLP distance of Nei and Li (1979). Their restriction fragment distance is available in our program RestDist.
  • You should be very hesitant to bootstrap RFLP's. The individual fragments do not evolve independently: a single nucleotide substitution can eliminate one fragment and create two (or vice versa).
For restriction sites (rather than fragments) life is a bit easier: they evolve nearly independently so bootstrapping is possible and Restml can be used, as well as restriction sites distances computed in Restdist. Also directionality of change is less ambiguous when parsimony is used. A more complete tour of the issues for restriction sites and restriction fragments is given in chapter 15 of my book (Felsenstein, 2004).
"Why don't your parsimony programs print out branch lengths?"
Well, Dnapars and Pars can. The others have not yet been upgraded to the same level. The longer answer is that it is because there are problems defining the branch lengths. If you look closely at the reconstructions of the states of the hypothetical ancestral nodes for almost any data set and almost any parsimony method you will find some ambiguous states on those nodes. There is then usually an ambiguity as to which branch the change is actually on. Other parsimony programs resolve this in one or another arbitrary fashion, sometimes with the user specifying how (for example, methods that push the changes up the tree as far as possible or down it as far as possible). Our older programs leave it to the user to do this. In Dnapars and Pars we use an algorithm discovered by Hochbaum and Pathria (1997) (and independently by Wayne Maddison) to compute branch lengths that average over all possible placements of the changes. But these branch lengths, as nice as they are, do not correct for multiple superimposed changes. Few programs available from others currently correct the branch lengths for multiple changes of state that may have overlain each other. One possible way to get branch lengths with nucleotide sequence data is to take the tree topology that you got, use Retree to convert it to be unrooted, prepare a distance matrix from your data using Dnadist, and then use Fitch with that tree as User Tree and see what branch lengths it estimates.
"Why can't your programs handle unordered multistate characters?"
There is a program Pars which does parsimony for unordered multistate characters with up to 8 states, plus ?. The other discrete characters parsimony programs can only handle two states, 0 and 1. This is mostly because I have not yet had time to modify them to do so - the modifications would have to be extensive. Ultimately I hope to get these done. If you have four or fewer states and need a feature that is not in Pars, you could recode your states to look like nucleotides and use the parsimony programs in the molecular sequence section of PHYLIP, or you could use one of the excellent parsimony programs produced by others.

Background information needed:

"What file format do I use for the sequences?"
"How do I use the programs? I can't find any documentation!"
These are discussed in the documentation files. Do you have them? If you have a copy of this page you probably do. They are distributed in the same archive as the rest of the package. Input file formats are discussed in main.html, in sequence.html, distance.html, contchar.html, discrete.html, and the documentation files for the individual programs.
"Where can I find out how to infer phylogenies?"
There are now a few books. For molecular data you could use one of these:

At the upper-undergraduate level:

  • Graur, D. and W.-H. Li. 2000. Fundamentals of Molecular Evolution. Sinauer Associates, Sunderland, Massachusetts. (or the earlier edition by Li and Graur).
  • Page, R. D. P. and E. C. Holmes. 1998. Molecular Evolution: A Phylogenetic Approach. Blackwell, Oxford.

and as graduate-level texts:

  • Nei, M. and S. Kumar. 2000. Molecular Evolution and Phylogenetics. Oxford University Press, Oxford.
  • Li, W.-H. 1999. Molecular Evolution. Sinauer Associates, Sunderland, Massachusetts.

For more mathematically-oriented readers, there is the book

  • Semple, C., and M. Steel. 2003. Phylogenetics. Oxford Lecture Series in Mathematics and Its Applications, volume 24. Oxford University Press, Oxford.

Best of all is of course my own book on phylogenies, which covers the subject for many data types, at a graduate course level:

  • Felsenstein, J. 2004. Inferring Phylogenies. Sinauer Associates, Sunderland, Massachusetts.

There are also some recent books that take a more practical hands-on approach, and give some detailed information on how to use programs, including PHYLIP programs. These include:

  • Lemey, P., Salemi, M., and A.-M. Vandamme (eds.) 2009. The Phylogenetic Handbook. A Practical Approach to Phylogenetic Analysis and Hypothesis Testing, 2nd edition. Cambridge University Press, Cambridge.
  • Hall, B. G. 2007. Phylogenetic Trees Made Easy: A How-To Manual, 3rd edition. Sinauer Associates, Sunderland, Massachusetts. (The Second Edition contained some information on using PHYLIP, but most of that has been dropped from this third edition).

In addition, one of these three review articles may help:

  • Swofford, D. L., G. J. Olsen, P. J. Waddell, and D. M. Hillis. 1996. Phylogenetic inference. pp. 407-514 in Molecular Systematics, 2nd ed., ed. D. M. Hillis, C. Moritz, and B. K. Mable. Sinauer Associates, Sunderland, Massachusetts.
  • Felsenstein, J. 1988. Phylogenies from molecular sequences: inference and reliability. Annual Review of Genetics 22: 521-565.
  • Felsenstein, J. 1988. Phylogenies and quantitative characters. Annual Review of Ecology and Systematics 19: 445-471.

A useful article introducing the inference of phylogenies at a more elementary level is:

  • Baldauf, S. L. 2003. Phylogeny for the faint of heart: a tutorial. Trends in Genetics 19: 345-351.

I have already mentioned above that there is an excellent guide to using PHYLIP 3.6 for molecular analyses available. It is by Jarno Tuimala:

  • Tuimala, J. 2004. A Primer to Phylogenetic Analysis using Phylip Package. 2nd edition. Center for Scientific Computing, Espoo, Finland.
and it is available as a PDF here.

Questions about distribution and citation:

"If I copied PHYLIP from a friend without you knowing, should I try to keep you from finding out?"
No. It is to your advantage and mine for you to let me know. If you did not get PHYLIP "officially" from me or from someone authorized by me, but copied a friend's version, you are not in my database of users. You may also have an old version which has since been substantially improved. I don't mind you "bootlegging" PHYLIP (it's free anyway), but you should realize that you may have copied an outdated version. If you are reading this Web page, you can get the latest version just as quickly over Internet. It will help both of us if you get onto my mailing list. If you are on it, then I will give your name to other nearby users when they ask for the names of nearby users, and they are urged to contact you and update your copy. (I benefit by getting a better feel for how many distributions there have been, and having a better mailing list to use to give other users local people to contact). Use the registration form which can be accessed through our web site's registration page.
"How do I make a citation to the PHYLIP package in the paper I am writing?"
One way is like this:

Felsenstein, J. 2009. PHYLIP (Phylogeny Inference Package) version 3.695. Distributed by the author. Department of Genome Sciences, University of Washington, Seattle.

or if the editor for whom you are writing insists that the citation must be to a printed publication, you could cite a notice for version 3.2 published in Cladistics:

Felsenstein, J. 1989. PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics 5: 164-166.

(This citation has been so commonly made that this is the most-cited paper ever in the journal Cladistics, I am the most-cited author ever in that journal, and these citations are responsible for more than 15% of the impact factor of that journal!).

For a while a printed version of the PHYLIP documentation was available and one could cite that. This is no longer true. Other than that, this is difficult, because I have never written a paper announcing PHYLIP! My 1985b paper in Evolution on the bootstrap method contains a one-paragraph Appendix describing the availability of this package, and that can also be cited as a reference for the package, although it was distributed since 1980 while the bootstrap paper is 1985. A paper on PHYLIP is needed mostly to give people something to cite, as word-of-mouth, references in other people's papers, and electronic newsgroup postings have spread the word about PHYLIP's existence quite effectively.

"Can I make copies of PHYLIP available to the students in my class?"
Generally, yes. Read the Copyright notice near the front of this main documentation page. If you charge money for PHYLIP, other than a minimal charge to cover cost of distribution, or you use it in a service for which you charge money, you will need to negotiate a royalty. But you can make it freely available and you do not need to get any special permission from us to do so.
"How many copies of PHYLIP have been distributed?"
We have about 28,000 registrations for PHYLIP. The number is not exact, since it does not count repeat registrations by the same person, and these are not always easy to detect (this number is an estimate based on a carefully examined sample of the registrations, to find out how many of them were re-registrations). Of course there are many more people who have got copies from friends, or who downloaded it without registering it. PHYLIP is probably the most widely distributed phylogeny package. In recent years magnetic tape distribution, diskette distribution and e-mail distribution of PHYLIP have disappeared (as I insist people use the Web distribution). But all this has been more than offset by, first, an explosion of distributions by anonymous ftp over Internet, and then a bigger explosion of Web distributions and registrations (about 6 registrations per day at the moment).
"Isn't it great that PHYLIP is the most widely-used package of phylogeny programs?"
It would be great if that were true, but I suspect that it is not true. Developers of other packages usually do not give out numbers of distributions or numbers of registrations of their package. Probably the best indication of level of use is the number of citations to these packages in the scientific literature. Doing a search using the Web Of Science, I find that PHYLIP is either third or fourth, the order of packages being PAUP*, MrBayes, and then either PHYLIP or PHYML. PHYLIP gets about 1,000 literature citations per year, PAUP* and MrBayes each get 2-3 times as many as that. As for uses rather than citations, that is very hard to assess. PHYLIP is widely used in teaching, which would account for many runs, but I do not know of a way to count these.

Questions about documentation

"Where can I get a printed version of the PHYLIP documents?"
For the moment, you can only get a printed version by printing it yourself. For versions 3.1 to 3.3 a printed version was sold by Christopher Meacham and Tom Duncan, then at the University Herbarium of the University of California at Berkeley. But they have had to discontinue this as it was too much work. You should be able to print out the documentation files on almost any printer and make yourself a printed version of whichever of them you need.
"Why have I been dropped from your newsletter mailing list?"
You haven't. The newsletter was dropped. It simply was too hard to mail it out to such a large mailing list. The last issue of the newsletter was Number 9 in May, 1987. The Listserver News Bulletins that we tried for a while have also been dropped as too hard to keep up to date. I am hoping that our World Wide Web site will take their place.

Additional Frequently Asked Questions, or: "Why didn't it occur to you to ...

... allow the options to be set on the command line?"
We could in Unix and Linux, or somewhat differently in Windows. But there are so many options that this would be difficult, especially when the options require additional information to be supplied such as rates of evolution for many categories of sites. You may be asking this question because you want to automate the operation of PHYLIP programs using batch files (command files) to run in background. If that is the issue, see the section of this main documentation page on "Running the programs in background or under control of a command file". It explains how to set the options using input redirection and a file that has the menu responses as keystrokes.
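The same idea, feeding the menu responses to the program as if they had been typed, can also be driven from a script. The Python sketch below is an illustration only: the function names are hypothetical, the responses you pass in must be the ones your program's menu actually expects, and instead of a redirected file it hands the keystrokes to the program's standard input directly.

```python
import subprocess

def menu_keystrokes(responses):
    """Turn a list of menu responses into text, one response per line.

    Each response is followed by a newline, standing in for the Enter key.
    Writing this text to a file gives a keystroke file for input redirection.
    """
    return "".join(r + "\n" for r in responses)

def run_with_menu(program, responses):
    """Run a program, supplying the menu responses on its standard input.

    'program' is the path to an executable; nothing here checks that the
    responses make sense for that program's menu.
    """
    return subprocess.run([program], input=menu_keystrokes(responses),
                          text=True, capture_output=True, check=False)
```

A call might look like run_with_menu("./dnaml", ["R", "Y"]), where R answers the question about replacing an existing outfile and Y is assumed here to be the response that accepts the menu settings; check both against what the program actually asks.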
... write these programs in Java?"
Well, we might. It is not completely clear which of two contenders, C++ and Java, will become more widespread, and which one will gradually fade away. Whichever one is more successful, we will probably want to use for future versions of PHYLIP. As the C compilers that are used to compile PHYLIP are usually also able to compile C++, we will be moving in that direction, but with constant worrying about whether to convert PHYLIP to Java instead.
... forget about all those inferior systems and just develop PHYLIP for Unix?"
This is self-answering, since the same people first said I should just develop it for Apple II's, then just for CP/M Z-80's, then just for IBM PCDOS, then just for Macintoshes or for Sun workstations, and then for Windows. If I had listened to them and done any one of these, I would have had a very hard time adapting the package to any of the other ones once these folks changed their mind (and most of them did)!
... write these programs in Pascal?"
These programs started out in Pascal in 1980. In 1993 we released both Pascal and C versions. The present version (3.6) and future versions will be C-only. I make fewer mistakes in Pascal and do like the language better than C, but C has overtaken Pascal and Pascal compilers are starting to be hard to find on some machines. Also C is a bit better standardized which makes the number of modifications a user has to make to adapt the programs to their system much less.
... write these programs in PROLOG (or Ada, or Modula-2, or SIMULA, or BCPL, or PL/I, or APL, or LISP)?"
These are all languages I have considered. All have advantages, but they are not really widespread (as are C, C++, and Java).
... include in the package a program to do the Distance Wagner method, (or successive approximations character weighting)?"
In most cases where I have not included other methods, it is because I decided that they had no substantial advantages over methods that were included (such as the programs Fitch, Kitsch, Neighbor, the T option of Mix and Dollop, and the "?" ancestral states option of the discrete characters parsimony programs).
... include in the package ordination methods and more clustering algorithms?"
Because this is not a clustering package, it's a package for phylogeny estimation. Those are different tasks with different objectives and mostly different methods. Mary Kuhner and Jon Yamato have, however, included in Neighbor an option for UPGMA clustering, which will be very similar to Kitsch in results.
... include in the package a program to do nucleotide sequence alignment?"
Well, yes, I should have, and this is scheduled to be in future releases. But multiple sequence alignment programs, in the era after Sankoff, Morel, and Cedergren's 1973 classic paper, need to use substantial computer horsepower to estimate the alignment and the tree together (but see Karl Nicholas's program GeneDoc or Ward Wheeler and David Gladstein's MALIGN, as well as more approximate methods of tree-based alignment used in ClustalW, TreeAlign, or POY).

(Fortunately) obsolete questions

(The following four questions, once common, have finally disappeared, I am pleased to report. I include them to give you some idea of what kinds of requests I had to cope with.)

"Why didn't it occur to you to ...

... let me log in to your computer in Seattle and copy the files out over a phone line?"
No thanks. It would cost you for a lot of long-distance telephone time, plus a half hour of my time and yours in which I had to explain to you how to log in and do the copying.
... send me a listing of your program?"
Damn it, it's not "a program", it's 37 programs, in a great many files. What were you thinking of doing, having 1800-line programs typed in by slaves at your end? If you were going to go to all that trouble why not try network transfer? If you have these then you can print out all the listings you want to and add them to the huge stack of printed output in the corner of your office.
... write a magnetic tape in our computer center's favorite format (inverted Ruritanian EBCDIC at 998 bpi)?"
Because the ANSI standard format is the most widely used one, and even though your computer center may pretend it can't read a tape written this way, if you sniff around you will find a utility to read it. It's just a lot easier for me to let you do that work. If I tried to put the tape into your format, I would probably get it wrong anyway.
... give us a version of these in FORTRAN?"
Because the programs are far easier to write and debug in C or Pascal, and cannot easily be rewritten into FORTRAN (they make extensive use of recursive calls and of records and pointers). In any case, C is widely available. If you don't have a C compiler or don't know how to use it, you are going to have to learn a language like C or Pascal sooner or later, and the sooner the better.


New Features in This Version

Version 3.6 has many new features:

There are many more, lesser features added as well.

Version 3.7 has some new features:


Coming Attractions, Future Plans

There are some obvious deficiencies in this version. Some of these holes will be filled in the next few releases (leading to version 4.0). They include:

  1. Obviously we need to start thinking about a more visual mouse/windows interface, but only if that can be used on X windows, Macintoshes, and Windows.
  2. Program Penny and its relatives will be improved so as to run faster and find all most parsimonious trees more quickly.
  3. An "evolutionary clock" version of Contml will be done, and the same may also be done for Restml.
  4. We are gradually generalizing the tree structures in the programs to infer multifurcating trees as well as bifurcating ones. We should be able to have any program read any tree and know what to do with it, without the user having to fret about whether an unrooted tree was fed to a program that needs a rooted tree.
  5. In general, we need more support for protein sequences, including a codon model of change, allowing for different rates for synonymous and nonsynonymous changes.
  6. We also need more support for combining runs from multiple loci, allowing for different rates of evolution at the different loci.
  7. We will be expanding our use and production of XML data set files and XML tree files.
  8. A program to align molecular sequences on a predefined User Tree may ultimately be included. This will allow alignment and phylogeny reconstruction to proceed iteratively by successive runs of two programs, one aligning on a tree and the other finding a better tree based on that alignment. In the shorter run a simple two-sequence alignment program may be included.
  9. An interactive "likelihood explorer" for DNA sequences will be written. This will allow, either with or without the assumption of a molecular clock, trees to be varied interactively so that the user can get a much better feel for the shape of the likelihood surface. Likelihood will be able to be plotted against branch lengths for any branch.
  10. If possible we will allow use of Hidden Markov Models for correcting for purine/pyrimidine richness variations among species, within the framework of the maximum likelihood programs. That the maximum likelihood programs do not allow for base composition variation is their major limitation at the moment.
  11. The Hidden Markov Model (regional rates) option of Dnaml and Dnamlk will be generalized to allow for rates at sites to gradually change as one moves along the tree, in an attempt to implement Fitch and Markowitz's (1970) notion of "covarions".
  12. A more sophisticated compatibility program should be included, if I can find one.
  13. We are economizing on the size of the source code, and enforcing some standardization of it, by putting frequently used routines in separate files which can be linked into various programs. This will enforce a rather complete standardization of our code.
  14. We will move our code to an object-oriented language, most likely C++. One could describe the language that version 3.4 was written in as "Pascal", version 3.5 as "Pascal written in C", version 3.6 as "C written in C", version 3.7 as "C++ written in C" and then 4.0 as "C++ written in C++". At least that scenario is one possibility.

There will also be many future developments in the programs that treat continuously-measured data (quantitative characters) and morphological or behavioral data with discrete states, as I have new ideas for analyzing these data in ways that connect to within-species quantitative genetic analyses. This will compete with parsimony analysis.


Endorsements

Here are some comments people have made in print about PHYLIP. Explanatory material in square brackets is my own. They fall naturally into three groups:

From the pages of Cladistics:

"Under no circumstances can we recommend PHYLIP/WAG [their name for the Wagner parsimony option of Mix]."
Luckow, M. and R. A. Pimentel (1985)

"PHYLIP has not proven very effective in implementing parsimony (Luckow and Pimentel, 1985)."
J. Carpenter (1987a)

"... PHYLIP. This is the computer program where every newsletter concerning it is mostly bug-catching, some of which have been put there by previous corrections. As Platnick (1987) documents, through dint of much labor useful results may be attained with this program, but I would suggest an easier way: FORMAT b:"
J. Carpenter (1987b)

"PHYLIP is bug-infested and both less effective and orders of magnitude slower than other programs ...."
"T. N. Nayenizgani" [J. S. Farris] (1990)

"Hennig86 [by J. S. Farris] provides such substantial improvements over previously available programs (for both mainframes and microcomputers) that it should now become the tool of choice for practising systematists."
N. Platnick (1989)

... in the pages of other journals:

"The availability, within PHYLIP of distance, compatibility, maximum likelihood, and generalized `invariants' algorithms (Cavender and Felsenstein, 1987) sets it apart from other packages .... One of the strengths of PHYLIP is its documentation ...."
Michael J. Sanderson (1990)
(Sanderson also criticizes PHYLIP for slowness and inflexibility of its parsimony algorithms, and compliments other packages on their strengths).

"This package of programs has gradually become a basic necessity to anyone working seriously on various aspects of phylogenetic inference .... The package includes more programs than any other known phylogeny package. But it is not just a collection of cladistic and related programs. The package has great value added to the whole, and for this it is unique and of extreme importance .... its various strengths are in the great array of methods provided ...."
Bernard R. Baum (1989)

(note also W. Fink's critical remarks (1986) on version 2.8 of PHYLIP).

... and in the comments made by users when they register:

"a program on phylogeny -- PHYLOGENY INTERFERENCE PACKAGE (PHYLIP). We would therefore like to ask ..."
[names withheld] (in 1994)

"I am struglling with your clever programs."
[name withheld] (in 1995)

"I'm famously computer illiterate - I look forward to many frustrating hours trying to run this program"
Desmond Maxwell (in 1998)

"I am a brave man. PHYLIP is a brave program. We'll do fine together."
Christopher Winchell (in 2000)

"The Mahabarata of phylogenetics looks better than ever."
Ross Crozier (in 2001)

"I love phylip. Tastes great and less filling!"
Byron Adams (in 2002)

References for the Documentation Files

In the documentation files that follow I frequently refer to papers in the literature. In order to centralize the references they are given in this section. If you want to find further papers beyond these, my book (Felsenstein, 2004) lists more than 1,000 further references.

Adams, E. N. 1972. Consensus techniques and the comparison of taxonomic trees. Systematic Zoology 21: 390-397.

Adams, E. N. 1986. N-trees as nestings: complexity, similarity, and consensus. Journal of Classification 3: 299-317.

Archie, J. W. 1989. A randomization test for phylogenetic information in systematic data. Systematic Zoology 38: 239-252.

Backeljau, T., L. De Bruyn, H. De Wolf, K. Jordaens, S. Van Dongen, and B. Winnepenninckx. 1996. Multiple UPGMA and neighbor-joining trees and the performance of some computer packages. Molecular Biology and Evolution 13: 309-313.

Barry, D., and J. A. Hartigan. 1987. Statistical analysis of hominoid molecular evolution. Statistical Science 2: 191-210.

Baum, B. R. 1989. PHYLIP: Phylogeny Inference Package. Version 3.2. (Software review). Quarterly Review of Biology 64: 539-541.

Bourque, M. 1978. Arbres de Steiner et reseaux dont certains sommets sont à localisation variable. Ph. D. Dissertation, Université de Montréal, Quebec.

Bron, C., and J. Kerbosch. 1973. Algorithm 457: Finding all cliques of an undirected graph. Communications of the Association for Computing Machinery 16: 575-577.

Camin, J. H., and R. R. Sokal. 1965. A method for deducing branching sequences in phylogeny. Evolution 19: 311-326.

Carpenter, J. 1987a. A report on the Society for the Study of Evolution workshop "Computer Programs for Inferring Phylogenies". Cladistics 3: 363-375.

Carpenter, J. 1987b. Cladistics of cladists. Cladistics 3: 363-375.

Cavalli-Sforza, L. L., and A. W. F. Edwards. 1967. Phylogenetic analysis: models and estimation procedures. Evolution 21: 550-570 (also American Journal of Human Genetics 19: 233-257).

Cavender, J. A. and J. Felsenstein. 1987. Invariants of phylogenies in a simple case with discrete states. Journal of Classification 4: 57-71.

Churchill, G. A. 1989. Stochastic models for heterogeneous DNA sequences. Bulletin of Mathematical Biology 51: 79-94.

Conn, E. E. and P. K. Stumpf. 1963. Outlines of Biochemistry. John Wiley and Sons, New York.

Day, W. H. E. 1983. Computationally difficult parsimony problems in phylogenetic systematics. Journal of Theoretical Biology 103: 429-438.

Dayhoff, M. O. and R. V. Eck. 1968. Atlas of Protein Sequence and Structure 1967-1968. National Biomedical Research Foundation, Silver Spring, Maryland.

Dayhoff, M. O., R. M. Schwartz, and B. C. Orcutt. 1979. A model of evolutionary change in proteins. pp. 345-352 in Atlas of Protein Sequence and Structure, volume 5, supplement 3, 1978, ed. M. O. Dayhoff. National Biomedical Research Foundation, Silver Spring, Maryland.

Dayhoff, M. O. 1979. Atlas of Protein Sequence and Structure, Volume 5, Supplement 3, 1978. National Biomedical Research Foundation, Washington, D.C.

DeBry, R. W. and N. A. Slade. 1985. Cladistic analysis of restriction endonuclease cleavage maps within a maximum-likelihood framework. Systematic Zoology 34: 21-34.

Dempster, A. P., N. M. Laird, and D. B. Rubin. 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B 39: 1-38.

Eck, R. V., and M. O. Dayhoff. 1966. Atlas of Protein Sequence and Structure 1966. National Biomedical Research Foundation, Silver Spring, Maryland.

Edwards, A. W. F., and L. L. Cavalli-Sforza. 1964. Reconstruction of evolutionary trees. pp. 67-76 in Phenetic and Phylogenetic Classification, ed. V. H. Heywood and J. McNeill. Systematics Association Volume No. 6. Systematics Association, London.

Estabrook, G. F., C. S. Johnson, Jr., and F. R. McMorris. 1976a. A mathematical foundation for the analysis of character compatibility. Mathematical Biosciences 23: 181-187.

Estabrook, G. F., C. S. Johnson, Jr., and F. R. McMorris. 1976b. An algebraic analysis of cladistic characters. Discrete Mathematics 16: 141-147.

Estabrook, G. F., F. R. McMorris, and C. A. Meacham. 1985. Comparison of undirected phylogenetic trees based on subtrees of four evolutionary units. Systematic Zoology 34: 193-200.

Faith, D. P. 1990. Chance marsupial relationships. Nature 345: 393-394.

Faith, D. P. and P. S. Cranston. 1991. Could a cladogram this short have arisen by chance alone?: On permutation tests for cladistic structure. Cladistics 7: 1-28.

Farris, J. S. 1977. Phylogenetic analysis under Dollo's Law. Systematic Zoology 26: 77-88.

Farris, J. S. 1978a. Inferring phylogenetic trees from chromosome inversion data. Systematic Zoology 27: 275-284.

Farris, J. S. 1981. Distance data in phylogenetic analysis. pp. 3-23 in Advances in Cladistics: Proceedings of the first meeting of the Willi Hennig Society, ed. V. A. Funk and D. R. Brooks. New York Botanical Garden, Bronx, New York.

Farris, J. S. 1983. The logical basis of phylogenetic analysis. pp. 1-47 in Advances in Cladistics, Volume 2, Proceedings of the Second Meeting of the Willi Hennig Society. ed. Norman I. Platnick and V. A. Funk. Columbia University Press, New York.

Farris, J. S. 1985. Distance data revisited. Cladistics 1: 67-85.

Farris, J. S. 1986. Distances and statistics. Cladistics 2: 144-157.

Farris, J. S. [“T. N. Nayenizgani”]. 1990. The systematics association enters its golden years (review of Prospects in Systematics, ed. D. Hawksworth). Cladistics 6: 307-314.

Farris, J. S., V. A. Albert, M. Källersjö, D. Lipscomb, and A. G. Kluge. 1996. Parsimony jackknifing outperforms neighbor-joining. Cladistics 12: 99-124.

Felsenstein, J. 1973a. Maximum likelihood and minimum-steps methods for estimating evolutionary trees from data on discrete characters. Systematic Zoology 22: 240-249.

Felsenstein, J. 1973b. Maximum-likelihood estimation of evolutionary trees from continuous characters. American Journal of Human Genetics 25: 471-492.

Felsenstein, J. 1978a. The number of evolutionary trees. Systematic Zoology 27: 27-33.

Felsenstein, J. 1978b. Cases in which parsimony and compatibility methods will be positively misleading. Systematic Zoology 27: 401-410.

Felsenstein, J. 1979. Alternative methods of phylogenetic inference and their interrelationship. Systematic Zoology 28: 49-62.

Felsenstein, J. 1981a. Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution 17: 368-376.

Felsenstein, J. 1981b. A likelihood approach to character weighting and what it tells us about parsimony and compatibility. Biological Journal of the Linnean Society 16: 183-196.

Felsenstein, J. 1981c. Evolutionary trees from gene frequencies and quantitative characters: finding maximum likelihood estimates. Evolution 35: 1229-1242.

Felsenstein, J. 1982. Numerical methods for inferring evolutionary trees. Quarterly Review of Biology 57: 379-404.

Felsenstein, J. 1983b. Parsimony in systematics: biological and statistical issues. Annual Review of Ecology and Systematics 14: 313-333.

Felsenstein, J. 1984a. Distance methods for inferring phylogenies: a justification. Evolution 38: 16-24.

Felsenstein, J. 1984b. The statistical approach to inferring evolutionary trees and what it tells us about parsimony and compatibility. pp. 169-191 in: Cladistics: Perspectives in the Reconstruction of Evolutionary History, edited by T. Duncan and T. F. Stuessy. Columbia University Press, New York.

Felsenstein, J. 1985a. Confidence limits on phylogenies with a molecular clock. Systematic Zoology 34: 152-161.

Felsenstein, J. 1985b. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39: 783-791.

Felsenstein, J. 1985c. Phylogenies from gene frequencies: a statistical problem. Systematic Zoology 34: 300-311.

Felsenstein, J. 1985d. Phylogenies and the comparative method. American Naturalist 125: 1-12.

Felsenstein, J. 1986. Distance methods: a reply to Farris. Cladistics 2: 130-144.

Felsenstein, J. and E. Sober. 1986. Parsimony and likelihood: an exchange. Systematic Zoology 35: 617-626.

Felsenstein, J. 1988a. Phylogenies and quantitative characters. Annual Review of Ecology and Systematics 19: 445-471.

Felsenstein, J. 1988b. Phylogenies from molecular sequences: inference and reliability. Annual Review of Genetics 22: 521-565.

Felsenstein, J. 1992. Phylogenies from restriction sites, a maximum likelihood approach. Evolution 46: 159-173.

Felsenstein, J. and G. A. Churchill. 1996. A hidden Markov model approach to variation among sites in rate of evolution. Molecular Biology and Evolution 13: 93-104.

Felsenstein, J. 2004. Inferring Phylogenies. Sinauer Associates, Sunderland, Massachusetts.

Felsenstein, J. 2005. Using the threshold model of quantitative genetics for inferences within and between species. Philosophical Transactions of the Royal Society of London, Series B 360: 1427-1434.

Felsenstein, J. 2008. Comparative methods with sampling error and within-species variation: contrasts revisited and revised. American Naturalist 171: 713-725.

Fink, W. L. 1986. Microcomputers and phylogenetic analysis. Science 234: 1135-1139.

Fitch, W. M., and E. Markowitz. 1970. An improved method for determining codon variability in a gene and its application to the rate of fixation of mutations in evolution. Biochemical Genetics 4: 579-593.

Fitch, W. M., and E. Margoliash. 1967. Construction of phylogenetic trees. Science 155: 279-284.

Fitch, W. M. 1971. Toward defining the course of evolution: minimum change for a specified tree topology. Systematic Zoology 20: 406-416.

Fitch, W. M. 1975. Toward finding the tree of maximum parsimony. pp. 189-230 in Proceedings of the Eighth International Conference on Numerical Taxonomy, ed. G. F. Estabrook. W. H. Freeman, San Francisco.

George, D. G., L. T. Hunt, and W. C. Barker. 1988. Current methods in sequence comparison and analysis. pp. 127-149 in Macromolecular Sequencing and Synthesis, ed. D. H. Schlesinger. Alan R. Liss, New York.

Gilmour, R. 2000. Taxonomic markup language: applying XML to systematic data. Bioinformatics 16: 406-407.

Goldman, N., and Z. Yang. 1994. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Molecular Biology and Evolution 11: 725-736.

Goldstein, D. B., A. Ruíz-Linares, M. Feldman, and L. L. Cavalli-Sforza. 1995. Genetic absolute dating based on microsatellites and the origin of modern humans. Proceedings of the National Academy of Sciences USA 92: 6720-6727.

Gomberg, D. 1968. "Bayesian" post-diction in an evolution process. unpublished manuscript, University of Pavia, Italy.

Graham, R. L., and L. R. Foulds. 1982. Unlikelihood that minimal phylogenies for a realistic biological study can be constructed in reasonable computational time. Mathematical Biosciences 60: 133-142.

Hasegawa, M. and T. Yano. 1984a. Maximum likelihood method of phylogenetic inference from DNA sequence data. Bulletin of the Biometric Society of Japan No. 5: 1-7.

Hasegawa, M. and T. Yano. 1984b. Phylogeny and classification of Hominoidea as inferred from DNA sequence data. Proceedings of the Japan Academy 60 B: 389-392.

Hasegawa, M., Y. Iida, T. Yano, F. Takaiwa, and M. Iwabuchi. 1985a. Phylogenetic relationships among eukaryotic kingdoms as inferred from ribosomal RNA sequences. Journal of Molecular Evolution 22: 32-38.

Hasegawa, M., H. Kishino, and T. Yano. 1985b. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. Journal of Molecular Evolution 22: 160-174.

Hendy, M. D., and D. Penny. 1982. Branch and bound algorithms to determine minimal evolutionary trees. Mathematical Biosciences 59: 277-290.

Higgins, D. G. and P. M. Sharp. 1989. Fast and sensitive multiple sequence alignments on a microcomputer. Computer Applications in the Biological Sciences (CABIOS) 5: 151-153.

Hochbaum, D. S. and A. Pathria. 1997. Path costs in evolutionary tree reconstruction. Journal of Computational Biology 4: 163-175.

Holmquist, R., M. M. Miyamoto, and M. Goodman. 1988. Higher-primate phylogeny - why can't we decide? Molecular Biology and Evolution 5: 201-216.

Inger, R. F. 1967. The development of a phylogeny of frogs. Evolution 21: 369-384.

Jin, L. and M. Nei. 1990. Limitations of the evolutionary parsimony method of phylogenetic analysis. Molecular Biology and Evolution 7: 82-102.

Jones, D. T., W. R. Taylor and J. M. Thornton. 1992. The rapid generation of mutation data matrices from protein sequences. Computer Applications in the Biosciences (CABIOS) 8: 275-282.

Jukes, T. H. and C. R. Cantor. 1969. Evolution of protein molecules. pp. 21-132 in Mammalian Protein Metabolism, ed. H. N. Munro. Academic Press, New York.

Kidd, K. K. and L. A. Sgaramella-Zonta. 1971. Phylogenetic analysis: concepts and methods. American Journal of Human Genetics 23: 235-252.

Kim, J. and M. A. Burgman. 1988. Accuracy of phylogenetic-estimation methods using simulated allele-frequency data. Evolution 42: 596-602.

Kimura, M. 1980. A simple model for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution 16: 111-120.

Kimura, M. 1983. The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge.

Kishino, H. and M. Hasegawa. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. Journal of Molecular Evolution 29: 170-179.

Kluge, A. G., and J. S. Farris. 1969. Quantitative phyletics and the evolution of anurans. Systematic Zoology 18: 1-32.

Kosiol, C., and N. Goldman. 2005. Different versions of the Dayhoff rate matrix. Molecular Biology and Evolution 22: 193-199.

Kuhner, M. K. and J. Felsenstein. 1994. A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates. Molecular Biology and Evolution 11: 459-468 (Erratum 12: 525, 1995).

Künsch, H. R. 1989. The jackknife and the bootstrap for general stationary observations. Annals of Statistics 17: 1217-1241.

Lake, J. A. 1987. A rate-independent technique for analysis of nucleic acid sequences: evolutionary parsimony. Molecular Biology and Evolution 4: 167-191.

Lake, J. A. 1994. Reconstructing evolutionary trees from DNA and protein sequences: paralinear distances. Proceedings of the National Academy of Sciences, USA 91: 1455-1459.

Le Quesne, W. J. 1969. A method of selection of characters in numerical taxonomy. Systematic Zoology 18: 201-205.

Le Quesne, W. J. 1974. The uniquely evolved character concept and its cladistic application. Systematic Zoology 23: 513-517.

Lewis, H. R., and C. H. Papadimitriou. 1978. The efficiency of algorithms. Scientific American 238: 96-109 (January issue).

Lockhart, P. J., M. A. Steel, M. D. Hendy, and D. Penny. 1994. Recovering evolutionary trees under a more realistic model of sequence evolution. Molecular Biology and Evolution 11: 605-612.

Luckow, M. and D. Pimentel. 1985. An empirical comparison of numerical Wagner computer programs. Cladistics 1: 47-66.

Lynch, M. 1990. Methods for the analysis of comparative data in evolutionary biology. Evolution 45: 1065-1080.

Maddison, D. R. 1991. The discovery and importance of multiple islands of most-parsimonious trees. Systematic Zoology 40: 315-328.

Margush, T. and F. R. McMorris. 1981. Consensus n-trees. Bulletin of Mathematical Biology 43: 239-244.

Muse, S. V. and B. S. Gaut. 1994. A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Molecular Biology and Evolution 11: 715-724.

Nelson, G. 1979. Cladistic analysis and synthesis: principles and definitions, with a historical note on Adanson's Familles des Plantes (1763-1764). Systematic Zoology 28: 1-21.

Nei, M. 1972. Genetic distance between populations. American Naturalist 106: 283-292.

Nei, M. and W.-H. Li. 1979. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proceedings of the National Academy of Sciences, USA 76: 5269-5273.

Nei, M. and T. Gojobori. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Molecular Biology and Evolution 3: 418-426.

Nielsen, R., and Z. Yang. 1998. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148: 929-936.

Nixon, K. C. 1999. The parsimony ratchet, a new method for rapid parsimony analysis. Cladistics 15: 407-414.

Page, R. D. M. 1989. Comments on component-compatibility in historical biogeography. Cladistics 5: 167-182.

Penny, D. and M. D. Hendy. 1985. Testing methods of evolutionary tree construction. Cladistics 1: 266-278.

Platnick, N. 1987. An empirical comparison of microcomputer parsimony programs. Cladistics 3: 121-144.

Platnick, N. 1989. An empirical comparison of microcomputer parsimony programs. II. Cladistics 5: 145-161.

Reynolds, J. B., B. S. Weir, and C. C. Cockerham. 1983. Estimation of the coancestry coefficient: basis for a short-term genetic distance. Genetics 105: 767-779.

Robinson, D. F. and L. R. Foulds. 1979. Comparison of weighted labelled trees. pp. 119-126 in Combinatorial Mathematics VI. Proceedings of the Sixth Australian Conference on Combinatorial Mathematics, Armidale, Australia, August, 1978, ed. A. F. Horadam and W. D. Wallis. Lecture Notes in Mathematics, No. 748. Springer-Verlag, Berlin.

Robinson, D. F. and L. R. Foulds. 1981. Comparison of phylogenetic trees. Mathematical Biosciences 53: 131-147.

Rohlf, F. J. and M. C. Wooten. 1988. Evaluation of the restricted maximum likelihood method for estimating phylogenetic trees using simulated allele-frequency data. Evolution 42: 581-595.

Rzhetsky, A., and M. Nei. 1992. Statistical properties of the ordinary least-squares, generalized least-squares, and minimum-evolution methods of phylogenetic inference. Journal of Molecular Evolution 35: 367-375.

Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4: 406-425.

Sanderson, M. J. 1990. Flexible phylogeny reconstruction: a review of phylogenetic inference packages using parsimony. Systematic Zoology 39: 414-420.

Sankoff, D. D., C. Morel, and R. J. Cedergren. 1973. Evolution of 5S RNA and the nonrandomness of base replacement. Nature New Biology 245: 232-234.

Shimodaira, H. and M. Hasegawa. 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Molecular Biology and Evolution 16: 1114-1116.

Shimodaira, H. 2002. An approximately unbiased test of phylogenetic tree selection. Systematic Biology 51: 492-508.

Smouse, P. E. and W.-H. Li. 1987. Likelihood analysis of mitochondrial restriction-cleavage patterns for the human-chimpanzee-gorilla trichotomy. Evolution 41: 1162-1176.

Sober, E. 1983a. Parsimony in systematics: philosophical issues. Annual Review of Ecology and Systematics 14: 335-357.

Sober, E. 1983b. A likelihood justification of parsimony. Cladistics 1: 209-233.

Sober, E. 1988. Reconstructing the Past: Parsimony, Evolution, and Inference. MIT Press, Cambridge, Massachusetts.

Sokal, R. R., and P. H. A. Sneath. 1963. Principles of Numerical Taxonomy. W. H. Freeman, San Francisco.

Steel, M. A., P. J. Lockhart, and D. Penny. 1993. Confidence in evolutionary trees from biological sequence data. Nature 364: 440-442.

Steel, M. A. 1994. Recovering a tree from the Markov leaf colourations it generates under a Markov model. Applied Mathematics Letters 7: 19-23.

Studier, J. A. and K. J. Keppler. 1988. A note on the neighbor-joining algorithm of Saitou and Nei. Molecular Biology and Evolution 5: 729-731.

Swofford, D. L. and G. J. Olsen. 1990. Phylogeny reconstruction. Chapter 11, pages 411-501 in Molecular Systematics, ed. D. M. Hillis and C. Moritz. Sinauer Associates, Sunderland, Massachusetts.

Swofford, D. L., G. J. Olsen, P. J. Waddell, and D. M. Hillis. 1996. Phylogenetic inference. pp. 407-514 in Molecular Systematics, 2nd ed., ed. D. M. Hillis, C. Moritz, and B. K. Mable. Sinauer Associates, Sunderland, Massachusetts.

Templeton, A. R. 1983. Phylogenetic inference from restriction endonuclease cleavage site maps with particular reference to the evolution of humans and the apes. Evolution 37: 221-244.

Thompson, E. A. 1975. Human Evolutionary Trees. Cambridge University Press, Cambridge.

Veerassamy, S., A. Smith and E. R. M. Tillier. 2003. A transition probability model for amino acid substitutions from Blocks. Journal of Computational Biology 10: 997-1010.

Wright, S. 1934. An analysis of variability in number of digits in an inbred strain of guinea pigs. Genetics 19: 506-536.

Wu, C. F. J. 1986. Jackknife, bootstrap and other resampling plans in regression analysis. Annals of Statistics 14: 1261-1295.

Yang, Z. 1993. Maximum-likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites. Molecular Biology and Evolution 10: 1396-1401.

Yang, Z. 1994. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. Journal of Molecular Evolution 39: 306-314.

Yang, Z. 1995. A space-time process model for the evolution of DNA sequences. Genetics 139: 993-1005.

Yang, Z. 1998. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Molecular Biology and Evolution 15: 568-573.

Yang, Z., and R. Nielsen. 1998. Synonymous and nonsynonymous rate variation in nuclear genes of mammals. Journal of Molecular Evolution 46: 409-418.

Yang, Z. 2006. Computational Molecular Evolution. Oxford University Press, Oxford.

Zharkikh, A. and W.-H. Li. 1995. Estimation of confidence in phylogeny: the complete-and-partial bootstrap technique. Molecular Phylogenetics and Evolution 4: 44-63.


Credits

Over the years various granting agencies have contributed to the support of the PHYLIP project (at first without knowing it). They are:

Years      Agency                       Grant or Contract Number
2005-2009  NIH NIGMS                    R01 GM071639
2003-2007  NIH NIGMS                    R01 GM51929-05 (PI: Mary Kuhner)
1999-2003  NSF                          BIR-9527687
1999-2002  NIH NIGMS                    R01 GM51929-04
1999-2001  NIH NIMH                     R01 HG01989-01
1995-1999  NIH NIGMS                    R01 GM51929-01
1992-1995  National Science Foundation  DEB-9207558
1992-1994  NIH NIGMS Shannon Award      2 R55 GM41716-04
1989-1992  NIH NIGMS                    1 R01-GM41716-01
1990-1992  National Science Foundation  BSR-8918333
1987-1990  National Science Foundation  BSR-8614807
1979-1987  U.S. Department of Energy    DE-AM06-76RLO2225 TA DE-AT06-76EV71005

However, starting in April, 2009 there is no grant support for PHYLIP.

I am particularly grateful to past program administrators William Moore, Irene Eckstrand, Peter Arzberger, and Conrad Istock, who have gone beyond the call of duty to make sure that PHYLIP continued.

Booby prizes for funding are awarded to:

The original Camin-Sokal parsimony program and the polymorphism parsimony program were written by me in 1977 and 1978. They were Pascal versions of earlier FORTRAN programs I wrote in 1966 and 1967 using the same algorithm to infer phylogenies under the Camin-Sokal and polymorphism parsimony criteria. Harvey Motulsky worked for me as a programmer in 1971 and wrote FORTRAN programs to carry out the Camin-Sokal, Dollo, and polymorphism methods (he is better-known these days as the author of the scientific data analysis package GraphPad). But most of the early work on PHYLIP other than my own was by Jerry Shurman and Mark Moehring. Jerry Shurman worked for me in the summers of 1979 and 1980, and Mark Moehring worked for me in the summers of 1980 and 1981. Both wrote original versions of many of the other programs, based on the original versions of my Camin-Sokal parsimony program and my polymorphism parsimony program. These formed the basis of Version 1 of the Package, first distributed in October, 1980.

Version 2, released in the spring of 1982, involved a fairly complete rewrite by me of many of those programs. Hisashi Horino for version 3.3 reworked some parts of the programs Clique and Consense to make their output more comprehensible, and has added some code to the tree-drawing programs Drawgram and Drawtree as well. He also worked on some of the Drawtree and Drawgram driver code.

Later programmers Akiko Fuseki, Sean Lamont, Andrew Keeffe, Daniel Yek, Dan Fineman, Patrick Colacurcio, Mike Palczewski, Doug Buxton, Ian Robertson, Marissa LaMadrid, Eric Rynes, and Elizabeth Walkup gave me substantial help with the 3.6 releases, and their excellent work is greatly appreciated. Akiko, in over 10 years of excellent work, did much of the hard work of adding new features and changing old ones in the 3.4 and 3.5 releases, centralized many of the C routines in support files, and is responsible for the new versions of Dnapars and Pars. Andrew prepared the Macintosh version, wrote Retree, added the ray-tracing and PICT code to the Draw... programs and has since done much other work. Sean was central to the conversion to C, and tested it extensively. Mike Palczewski reorganized the code and centralized routines, bringing us closer to object-oriented structure. My (then) postdoctoral fellow Mary Kuhner and her associate Jon Yamato created Neighbor, the neighbor-joining and UPGMA program, for the current release, for which I am also grateful (Naruya Saitou and Li Jin kindly encouraged us to use some of the code from their own implementation of this method). Lucas Mix created the protein likelihood programs Protml and Protmlk. Elisabeth Tillier provided the code for her PMB amino acid model. My current programmers Jim McGill and Bob Giansiracusa have made a great contribution to getting the current version working.

I am very grateful to over 400 users for algorithmic suggestions, complaints about features (or lack of features), and information about the behavior of their operating systems and compilers. A list of some of their names will be found at the credits page on the PHYLIP web site which is at http://evolution.gs.washington.edu/phylip/credits.html

A major contribution to this package has been made by others writing programs or parts of programs. Chris Meacham contributed the important program Factor, long demanded by users, and the even more important ones PLOTREE and PLOTGRAM. Important parts of the code in Drawgram and Drawtree were taken over from those two programs. Kent Fiala wrote function "reroot" to do outgroup-rooting, which was an essential part of many programs in earlier versions. Someone at the Western Australia Institute of Technology suggested the name PHYLIP (by writing it on the label on the outside of a magnetic tape). Probably it was the late Julian Ford (I've lost the relevant letter).

The distribution of the package also owes much to Buz Wilson and Willem Ellis, who put a lot of effort into the early distributions of the PCDOS and Macintosh versions respectively. Christopher Meacham and Tom Duncan for three versions distributed a printed version of these documentation files (they could not continue to do so), and I am very grateful to them for those efforts. William H.E. Day and F. James Rohlf were very helpful in setting up the listserver news bulletin service which succeeded the PHYLIP newsletter for a time.

I also wish to thank the people who have made computer resources available to me, mostly in the loan of use of microcomputers. These include Jeremy Field, Clem Furlong, Rick Garber, Dan Jacobson, Rochelle Kochin, Monty Slatkin, Jim Archie, Jim Thomas, and George Gilchrist.

I should also note the computers used to develop this package: These include a CDC 6400, two DECSystem 1090s, my trusty old SOL-20, my old Osborne-1, a VAX 11/780, a VAX 8600, a MicroVAX I, a DECstation 3100, my old Toshiba 1100+, my DECstation 5000/200, a DECstation 5000/125, a Compudyne 486DX/33, a Trinity Genesis 386SX, a Zenith Z386, a Mac Classic, a DEC Alphastation 400 4/233, a Pentium 120, a Pentium 200, a PowerMac 6100, and a Macintosh G3. (One of the reasons we have been successful in achieving compatibility between different computer systems is that I have had to run them myself under so many different operating systems and compilers).


Other Phylogeny Programs Available Elsewhere

A comprehensive list of phylogeny programs is maintained at the PHYLIP web site on the Phylogeny Programs pages:

http://evolution.gs.washington.edu/phylip/software.html

Here we will simply mention some of the major general-purpose programs. For many more and much more, see those web pages.

PAUP*   A comprehensive program with parsimony, likelihood, and distance matrix methods. It competes with PHYLIP to be responsible for the most trees published. Written by David Swofford, now of Duke University, and distributed by Sinauer Associates of Sunderland, Massachusetts. It is described in a web page at http://www.sinauer.com/detail.php?id=8060. Current prices are $100 for the Macintosh version, $85 for the Windows version, and $150 for Unix versions for many kinds of workstations.

MrBayes   The leading program for Bayesian inference of phylogenies. It uses Markov Chain Monte Carlo inference to assess support for clades and to infer posterior distributions of parameters. Produced by John Huelsenbeck and Fredrik Ronquist, it is available at its web site at http://mrbayes.net as a Mac OS X or Windows executable, or in source code in C.

MEGA   A program by Sudhir Kumar of Arizona State University (written together with Koichiro Tamura, Joel Dudley and Masatoshi Nei). It can carry out parsimony and distance matrix methods for DNA sequence data. Version 4 for Windows, Macintosh, and Linux can be downloaded from the MEGA web site at http://www.megasoftware.net.

PAML   Ziheng Yang of the Department of Genetics and Biometry at University College, London, has written this package of programs to carry out likelihood analysis of DNA and protein sequence data. It is one of the few packages able to use the codon model for protein sequence data which takes the genetic code reasonably fully into account. PAML is particularly strong in the options for coping with variability of rates of evolution from site to site, though it is less able than some other packages to search effectively for the best tree. It is available as C source code and as Mac OS X and Windows executables from its web site at http://abacus.gene.ucl.ac.uk/software/paml.html.

Phyml   Stephane Guindon, currently of the University of Auckland, New Zealand, has written Phyml, a fast likelihood program for molecular sequence data. It is available as binaries from its web page at the ATGC site at the Université de Montpellier in France. Source code for Phyml, including later developments of the program, is available at its site at Google Code.

RAxML   Alexandros Stamatakis, of the Exelexis Lab at the Technische Universität München, has written RAxML, a very fast likelihood program for molecular sequences. It is available from the Exelexis Lab software web page. Source code is available too. RAxML seems to be the fastest implementation of likelihood for molecular data.

TNT   This program, by Pablo Goloboff, J. S. Farris, and Kevin Nixon, is for searching large data sets for most parsimonious trees. The authors are respectively at the Instituto Miguel Lillo in Tucumán, Argentina, the Naturhistoriska Riksmuseet in Stockholm, Sweden, and the Hortorium, Cornell University, Ithaca, New York. TNT is described as faster than other methods, though not faster than NONA for small to medium data sets. It is distributed as Windows, Linux, and Mac OS X executables (the latter two require the PVM Parallel Virtual Machine library to be installed). The program and some support files including documentation are available from its download area at http://www.zmuc.dk/public/phylogeny/tnt (see the ReadMe! web page there). It is free, provided you agree to a license with some reasonable limitations.

DAMBE    A package written by Xuhua Xia of the Department of Biology of the University of Ottawa. Its initials stand for Data Analysis in Molecular Biology and Evolution. DAMBE is a general-purpose package for DNA and protein sequence phylogenies. It can read and convert a number of file formats, has many features for descriptive statistics, and can compute a number of commonly-used distance matrix measures and infer phylogenies by parsimony, distance, or likelihood methods, including bootstrapping and jackknifing. There are a number of kinds of statistical tests of trees available, and it can also display phylogenies. DAMBE includes a copy of ClustalW as well. DAMBE consists of Windows executables. It is available from its web site at http://dambe.bio.uottawa.ca/dambe.asp.

These are only a few of the over 380 different phylogeny packages that are now available (as of July, 2010 - the number keeps increasing). The others are described (and web links provided) at my Phylogeny Programs web pages at the address given above.


How You Can Help Me

Simply let me know of any problems you have had adapting the programs to your computer. I can often make "transparent" changes that, by making the code avoid the wilder, woolier, and less standard parts of C, not only help others who have your machine but even improve the chance of the programs functioning on new machines. I would like fairly detailed information on what gave trouble, on what operating system, machine, and (if relevant) compiler, and what had to be done to make the programs work. I am sometimes able to do some over-the-telephone trouble-shooting, particularly if I don't have to pay for the call, but electronic mail is the best way for me to be asked about problems, as you can include your input and output files so I can see what is going on (please do not send them as Attachments, but as part of the body of a message). I'd really like these programs to be able to run with only routine changes on absolutely everything, down to and possibly including the Amana Touchmatic Radarange Microwave Oven which was an Intel 8080 system (in fact, early versions of this package did run successfully on Intel 8080 systems running the CP/M operating system). A PalmPilot version was contemplated too.

I would also like to know timings of programs from the package, when run on the three test input files provided above, for various computer and compiler combinations, so that I can provide this information in the section on speeds of this document.

For the phylogeny plotting programs Drawgram and Drawtree, I am particularly interested in knowing what has to be done to adapt them for other graphic file formats.

You can also be helpful to PHYLIP users in your part of the world by helping them get the latest version of PHYLIP from our web site and by helping them with any problems they may have in getting PHYLIP working on their data.

Your help is appreciated. I am always happy to hear suggestions for features and programs that ought to be incorporated in the package, but please do not be upset if I turn out to have already considered the particular possibility you suggest and decided against it.


In Case of Trouble

Read The (documentation) Files Meticulously ("RTFM"). If that doesn't solve the problem, please check the Frequently Asked Questions web page at the PHYLIP web site:

http://evolution.gs.washington.edu/phylip/faq.html

and the PHYLIP Bugs web page at that site:

http://evolution.gs.washington.edu/phylip/bugs.html

If none of these answers your question, get in touch with me. My email address is given below. If you do ask about a problem, please specify the program name, version of the package, computer operating system, and send me your data file so I can test the problem. Also it will help if you have the relevant output and documentation files so that you can refer to them in any correspondence. I can also be reached by telephone by calling me in my office: +1-(206)-543-0150, or at home: +1-(206)-526-9057 (how's that for user support!). If I cannot be reached at either place, a message can be left at the office of the Department of Genome Sciences, +1-(206)-221-7377 but I prefer strongly that I not call you, as in any phone consultation the least you can do is pay the phone bill. Better yet, use email.

Particularly if you are in a part of the world distant from me, you may also want to try to get in touch with other users of PHYLIP nearby. I can also, if requested, provide a list of nearby users.

Joe Felsenstein
Department of Genome Sciences
University of Washington
Box 355065
Seattle, Washington 98195-5065, U.S.A.

Electronic mail addresses:      joe (at) gs.washington.edu


version 3.696

Mix - Mixed method discrete characters parsimony

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Mix is a general parsimony program which carries out the Wagner and Camin-Sokal parsimony methods in mixture, where each character can have its method specified separately. The program defaults to carrying out Wagner parsimony.

The Camin-Sokal parsimony method explains the data by assuming that changes 0 --> 1 are allowed but not changes 1 --> 0. Wagner parsimony allows both kinds of changes. (This is under the assumption that 0 is the ancestral state, though the program allows reassignment of the ancestral state, in which case we must reverse the state numbers 0 and 1 throughout this discussion.) The criterion is to find the tree which requires the minimum number of changes. The Camin-Sokal method is due to Camin and Sokal (1965) and the Wagner method to Eck and Dayhoff (1966) and to Kluge and Farris (1969).
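
As a rough illustration of what "number of changes" means under the Wagner criterion, the following Python sketch counts steps for one tree with a Fitch-style set-intersection pass. It is not the code used by Mix: the function name wagner_steps and the nested-tuple tree format are assumptions made for this example, and "?", "P", and "B" states are not handled. The data are the test data set given later in this document, and the tree is the first tree of the test output, rooted on the branch leading to Alpha.

```python
def wagner_steps(tree, states):
    """Count Wagner (Fitch) steps for one binary character.

    tree   -- a species name, or a 2-tuple of subtrees (hypothetical format)
    states -- dict mapping species name to '0' or '1'
    Returns (set of possible states at this node, steps within this subtree).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = wagner_steps(tree[0], states)
    right_set, right_steps = wagner_steps(tree[1], states)
    common = left_set & right_set
    if common:
        # The two subtrees can agree on a state: no extra change is needed.
        return common, left_steps + right_steps
    # No shared state: one change must occur on a branch below this node.
    return left_set | right_set, left_steps + right_steps + 1


data = {
    "Alpha": "110110",
    "Beta": "110000",
    "Gamma": "100110",
    "Delta": "001001",
    "Epsilon": "001110",
}
tree = ((("Epsilon", "Gamma"), ("Delta", "Beta")), "Alpha")

steps = [
    wagner_steps(tree, {name: seq[i] for name, seq in data.items()})[1]
    for i in range(6)
]
print(steps, sum(steps))  # [2, 2, 2, 1, 1, 1] 9
```

The per-character counts agree with the "steps in each character" table printed for that tree in the test output. Under Wagner parsimony the total does not depend on where the tree is rooted, which is why the rooting chosen here is arbitrary.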

Here are the assumptions of these two methods:

  1. Ancestral states are known (Camin-Sokal) or unknown (Wagner).
  2. Different characters evolve independently.
  3. Different lineages evolve independently.
  4. Changes 0 --> 1 are much more probable than changes 1 --> 0 (Camin-Sokal) or equally probable (Wagner).
  5. Both of these kinds of changes are a priori improbable over the evolutionary time spans involved in the differentiation of the group in question.
  6. Other kinds of evolutionary event such as retention of polymorphism are far less probable than 0 --> 1 changes.
  7. Rates of evolution in different lineages are sufficiently low that two changes in a long segment of the tree are far less probable than one change in a short segment.

That these are the assumptions of parsimony methods has been documented in a series of papers of mine: (1973a, 1978b, 1979, 1981b, 1983b, 1988b). For an opposing view arguing that the parsimony methods make no substantive assumptions such as these, see the papers by Farris (1983) and Sober (1983a, 1983b), but also read the exchange between Felsenstein and Sober (1986).

INPUT FORMAT

The input for Mix is the standard input for discrete characters programs, described above in the documentation file for the discrete-characters programs. States "?", "P", and "B" are allowed.

The options are selected using a menu:


Mixed parsimony algorithm, version 3.69

Settings for this run:
  U                 Search for best tree?  Yes
  X                     Use Mixed method?  No
  P                     Parsimony method?  Wagner
  J     Randomize input order of species?  No. Use input order
  O                        Outgroup root?  No, use as outgroup species  1
  T              Use Threshold parsimony?  No, use ordinary parsimony
  A   Use ancestral states in input file?  No
  W                       Sites weighted?  No
  M           Analyze multiple data sets?  No
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4     Print out steps in each character  No
  5     Print states at all nodes of tree  No
  6       Write out trees onto tree file?  Yes

Are these settings correct? (type Y or the letter for one to change)

The options U, X, J, O, T, A, and M are the usual User Tree, miXed methods, Jumble, Outgroup, Threshold, Ancestral States, and Multiple Data Sets options, described either in the main documentation file or in the Discrete Characters Programs documentation file. The user-defined trees supplied if you use the U option must be given as rooted trees with two-way splits (bifurcations). The O option is acted upon only if the final tree is unrooted and is not a user-defined tree. One of the important uses of the O option is to root the tree so that if there are any characters in which the ancestral states have not been specified, the program will print out a table showing which ancestral states require the fewest steps. Note that when any of the characters has Camin-Sokal parsimony assumed for it, the tree is rooted and the O option will have no effect.

The option P toggles between the Camin-Sokal parsimony criterion and the default Wagner parsimony criterion. Option X invokes mixed-method parsimony. If the A option is invoked, the ancestor is not to be counted as one of the species.

The F (Factors) option is not available in this program, as it would have no effect on the result even if that information were provided in the input file.

OUTPUT FORMAT

Output is standard: a list of equally parsimonious trees, which will be printed as rooted or unrooted depending on which is appropriate, and, if the user chooses, a table of the number of changes of state required in each character. If the Wagner option is in force for a character, it may not be possible to unambiguously locate the places on the tree where the changes occur, as there may be multiple possibilities. If the user selects menu option 5, a table is printed out after each tree, showing for each branch whether there are known to be changes in the branch, and what the states are inferred to have been at the top end of the branch. If the inferred state is a "?" there will be multiple equally-parsimonious assignments of states; the user must work these out for themselves by hand.
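
The "." notation in the option-5 table can be expanded mechanically. The following Python sketch (the function name expand and the row layout are assumptions made for this illustration) fills in each "." from the node at the lower end of the branch, using the rows of the first tree in the test output shown later in this document.

```python
def expand(lower_states, shown):
    """Replace each '.' in a table row by the state at the lower node."""
    lower_states = lower_states.replace(" ", "")
    shown = shown.replace(" ", "")
    return "".join(low if s == "." else s for low, s in zip(lower_states, shown))


# (From, To, State at upper node) rows copied from the first test tree.
rows = [
    ("1", "2", "..... ."),
    ("2", "4", ".0... ."),
    ("4", "Epsilon", "0.1.. ."),
    ("4", "Gamma", "..... ."),
    ("2", "3", "...00 ."),
    ("3", "Delta", "001.. 1"),
    ("3", "Beta", ".1... ."),
    ("1", "Alpha", ".1... ."),
]

full = {"1": "1?0110"}  # states listed for the bottom node of the tree
for lower, upper, shown in rows:
    full[upper] = expand(full[lower], shown)

for name in ("Alpha", "Beta", "Gamma", "Delta", "Epsilon"):
    print(name, full[name])
```

Expanding the rows in the order printed works because each branch appears after the branch below it. The states recovered for the five species match the input data, which is a quick check that a table has been read correctly.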

If the Camin-Sokal parsimony method is invoked and the Ancestors option is also used, then the program will infer, for any character whose ancestral state is unknown ("?") whether the ancestral state 0 or 1 will give the fewest state changes. If these are tied, then it may not be possible for the program to infer the state in the internal nodes, and these will all be printed as ".". If this has happened and you want to know more about the states at the internal nodes, you will find it helpful to use Move to display the tree and examine its interior states, as the algorithm in Move shows all that can be known in this case about the interior states, including where there is and is not ambiguity. The algorithm in Mix gives up more easily on displaying these states.

If the A option is not used, then the program will assume 0 as the ancestral state for those characters following the Camin-Sokal method, and will assume that the ancestral state is unknown for those characters following Wagner parsimony. If any characters have unknown ancestral states, and if the resulting tree is rooted (even by outgroup), a table will also be printed out showing the best guesses of which are the ancestral states in each character. You will find it useful to understand the difference between the Camin-Sokal parsimony criterion with unknown ancestral state and the Wagner parsimony criterion.
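
To make the role of the ancestral state concrete, here is a small Python sketch of step counting under the Camin-Sokal criterion on a rooted tree, tried with each possible ancestral state. It is only an illustration under stated assumptions: the function name camin_sokal_steps, the nested-tuple tree format, and the particular rooting of the first test tree are all choices made for this example, not details taken from Mix.

```python
def camin_sokal_steps(tree, states, ancestral="0"):
    """Steps for one character when only changes away from `ancestral` occur."""
    derived = "1" if ancestral == "0" else "0"

    def walk(node):
        # Returns (node is in the derived state, steps within the subtree).
        if isinstance(node, str):
            return states[node] == derived, 0
        (l_der, l_steps), (r_der, r_steps) = walk(node[0]), walk(node[1])
        if l_der and r_der:
            # Both descendants are derived: a single change lower down suffices.
            return True, l_steps + r_steps
        # The node keeps the ancestral state; each derived child costs a change.
        return False, l_steps + r_steps + l_der + r_der

    root_derived, steps = walk(tree)
    return steps + root_derived  # one more change if the root itself is derived


data = {
    "Alpha": "110110",
    "Beta": "110000",
    "Gamma": "100110",
    "Delta": "001001",
    "Epsilon": "001110",
}
tree = ((("Epsilon", "Gamma"), ("Delta", "Beta")), "Alpha")  # assumed rooting

for i in range(6):
    column = {name: seq[i] for name, seq in data.items()}
    with_0 = camin_sokal_steps(tree, column, "0")
    with_1 = camin_sokal_steps(tree, column, "1")
    guess = "0" if with_0 < with_1 else "1" if with_1 < with_0 else "?"
    print(i + 1, with_0, with_1, guess)
```

For the first character this gives 3 steps with ancestral state 0 and 2 steps with ancestral state 1, whereas Wagner parsimony needs only 2 steps on the same tree whatever the root. The second character is tied (2 steps either way), which is the situation in which the ancestral state cannot be chosen.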

If the U (User Tree) option is used and more than one tree is supplied, the program also performs a statistical test of each of these trees against the best tree. This test is a version of the test proposed by Alan Templeton (1983) and evaluated in a test case by me (1985a). It is closely parallel to a test using log likelihood differences invented by Kishino and Hasegawa (1989), and uses the mean and variance of step differences between trees, taken across characters. If the mean is more than 1.96 standard deviations different, then the trees are declared significantly different. The program prints out a table of the steps for each tree, the differences of each from the lowest one, the variance of that quantity as determined by the step differences at individual characters, and a conclusion as to whether that tree is or is not significantly worse than the best one. It is important to understand that the test assumes that all the binary characters are evolving independently, which is unlikely to be true for many suites of morphological characters.
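
The arithmetic of such a paired test can be sketched in a few lines of Python. This is an illustration only: the function name paired_steps_test and the particular variance estimator (the sample variance of the per-character differences, scaled up to the variance of their sum) are assumptions, and the program's own computation may differ in detail. The step counts are those printed for the first two trees in the test output later in this document.

```python
import math


def paired_steps_test(steps_a, steps_b, critical=1.96):
    """Compare two trees using per-character step differences.

    Returns (total difference, standard deviation of the total,
    whether the total is more than `critical` standard deviations from 0).
    """
    diffs = [a - b for a, b in zip(steps_a, steps_b)]
    n = len(diffs)
    total = sum(diffs)
    mean = total / n
    # Assumed estimator: variance of the sum of n independent differences.
    var_total = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    sd = math.sqrt(var_total)
    return total, sd, abs(total) > critical * sd


tree1 = [2, 2, 2, 1, 1, 1]
tree2 = [1, 2, 1, 2, 2, 1]
total, sd, significant = paired_steps_test(tree1, tree2)
print(total, round(sd, 3), significant)  # 0 2.191 False
```

Both trees need 9 steps in all, so the total difference is zero and the trees cannot be distinguished, even though they disagree about four individual characters.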

If there are more than two trees, the test done is an extension of the KHT (Kishino-Hasegawa-Templeton) test, due to Shimodaira and Hasegawa (1999). They pointed out that a correction for the number of trees was necessary, and they introduced a resampling method to make this correction. In the version used here the variances and covariances of the sums of steps across characters are computed for all pairs of trees. To test whether the difference between each tree and the best one is larger than could have been expected if they all had the same expected number of steps, numbers of steps for all trees are sampled with these covariances and equal means (Shimodaira and Hasegawa's "least favorable hypothesis"), and a P value is computed from the fraction of times the difference between the tree's value and the lowest number of steps exceeds that actually observed. Note that this sampling needs random numbers, and so the program will prompt the user for a random number seed if one has not already been supplied. With the two-tree KHT test no random numbers are used.
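
The resampling idea can be sketched as follows in Python. This is a loose illustration, not the program's algorithm: the function name sh_style_test, the use of multivariate normal samples drawn through a Cholesky factor, the variance estimator, the number of samples, and the seed are all assumptions made here, and the step counts are hypothetical.

```python
import math
import random


def sh_style_test(steps_per_tree, n_samples=1000, seed=12345):
    """Return one P value per tree from per-character step counts."""
    rng = random.Random(seed)
    k = len(steps_per_tree)
    n = len(steps_per_tree[0])
    totals = [sum(s) for s in steps_per_tree]
    means = [t / n for t in totals]
    # Covariances of the summed steps between every pair of trees.
    cov = [[n / (n - 1) * sum((steps_per_tree[i][c] - means[i]) *
                              (steps_per_tree[j][c] - means[j])
                              for c in range(n))
            for j in range(k)] for i in range(k)]
    # Lower-triangular Cholesky factor, tolerating zero variances.
    chol = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = cov[i][j] - sum(chol[i][m] * chol[j][m] for m in range(j))
            if i == j:
                chol[i][j] = math.sqrt(max(s, 0.0))
            elif chol[j][j] > 1e-12:
                chol[i][j] = s / chol[j][j]
    observed = [t - min(totals) for t in totals]
    exceed = [0] * k
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        # Sampled totals with equal (zero) means and the covariances above.
        x = [sum(chol[i][m] * z[m] for m in range(i + 1)) for i in range(k)]
        low = min(x)
        for i in range(k):
            if x[i] - low > observed[i]:
                exceed[i] += 1
    return [e / n_samples for e in exceed]


# Hypothetical per-character steps for three trees over eight characters.
tree_a = [1, 1, 2, 1, 1, 2, 1, 1]
tree_b = [2, 1, 2, 2, 1, 2, 2, 1]
tree_c = [3, 3, 3, 3, 3, 3, 3, 3]
p_values = sh_style_test([tree_a, tree_b, tree_c])
print(p_values)
```

A tree whose observed excess over the best tree is rarely matched in the samples, such as tree_c here, receives a P value near zero and would be declared significantly worse.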

In either the KHT or the SH test the program prints out a table of the number of steps for each tree, the differences of each from the lowest one, the variance of that quantity as determined by the differences of the numbers of steps at individual characters, and a conclusion as to whether that tree is or is not significantly worse than the best one.

If option 6 is left in its default state the trees found will be written to a tree file, so that they are available to be used in other programs. If the program finds multiple trees tied for best, all of these are written out onto the output tree file. Each is followed by a numerical weight in square brackets (such as [0.25000]). This is needed when we use the trees to make a consensus tree of the results of bootstrapping or jackknifing, to avoid overrepresenting replicates that find many tied trees.
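
A program that reads such a tree file has to separate each tree from its bracketed weight. The Python sketch below shows one way to do that; the function name read_weighted_trees and the exact layout of the sample text (weight placed just before the terminating semicolon) are assumptions made for illustration, so check them against an actual tree file before relying on them.

```python
import re


def read_weighted_trees(text):
    """Return (newick, weight) pairs; a tree without a weight gets 1.0."""
    trees = []
    for chunk in text.split(";"):
        chunk = "".join(chunk.split())  # drop whitespace and line breaks
        if not chunk:
            continue
        match = re.fullmatch(r"(.*?)(?:\[([0-9.]+)\])?", chunk)
        weight = float(match.group(2)) if match.group(2) else 1.0
        trees.append((match.group(1) + ";", weight))
    return trees


# Hypothetical tree file contents for the four tied trees of the test run.
sample = """(((Epsilon,Gamma),(Delta,Beta)),Alpha)[0.25000];
((Gamma,((Epsilon,Delta),Beta)),Alpha)[0.25000];
((Epsilon,(Gamma,(Delta,Beta))),Alpha)[0.25000];
((Gamma,(Epsilon,(Delta,Beta))),Alpha)[0.25000];
"""

trees = read_weighted_trees(sample)
for newick, weight in trees:
    print(weight, newick)
print(sum(weight for _, weight in trees))  # 1.0
```

Because the weights of the tied trees from one data set sum to 1, a bootstrap replicate that finds four tied trees contributes no more to a consensus than a replicate that finds a single tree.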

At the beginning of the program is a constant, maxtrees, the maximum number of trees which the program will store for output.

The program is descended from earlier programs Sokal and Wagner which have long since been removed from the PHYLIP package, since Mix has all their capabilities and more.


TEST DATA SET

     5    6
Alpha     110110
Beta      110000
Gamma     100110
Delta     001001
Epsilon   001110


TEST SET OUTPUT (with all numerical options on)


Mixed parsimony algorithm, version 3.69

5 species, 6 characters

Wagner parsimony method


Name         Characters
----         ----------

Alpha        11011 0
Beta         11000 0
Gamma        10011 0
Delta        00100 1
Epsilon      00111 0



     4 trees in all found




           +--Epsilon   
     +-----4  
     !     +--Gamma     
  +--2  
  !  !     +--Delta     
--1  +-----3  
  !        +--Beta      
  !  
  +-----------Alpha     

  remember: this is an unrooted tree!


requires a total of      9.000

steps in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       2   2   2   1   1   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                1?011 0
  1       2         no     ..... .
  2       4        maybe   .0... .
  4    Epsilon      yes    0.1.. .
  4    Gamma        no     ..... .
  2       3         yes    ...00 .
  3    Delta        yes    001.. 1
  3    Beta        maybe   .1... .
  1    Alpha       maybe   .1... .





     +--------Gamma     
     !  
  +--2     +--Epsilon   
  !  !  +--4  
  !  +--3  +--Delta     
--1     !  
  !     +-----Beta      
  !  
  +-----------Alpha     

  remember: this is an unrooted tree!


requires a total of      9.000

steps in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       1   2   1   2   2   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                1?011 0
  1       2         no     ..... .
  2    Gamma       maybe   .0... .
  2       3        maybe   ...?? .
  3       4         yes    001.. .
  4    Epsilon     maybe   ...11 .
  4    Delta        yes    ...00 1
  3    Beta        maybe   .1.00 .
  1    Alpha       maybe   .1... .





     +--------Epsilon   
  +--4  
  !  !  +-----Gamma     
  !  +--2  
--1     !  +--Delta     
  !     +--3  
  !        +--Beta      
  !  
  +-----------Alpha     

  remember: this is an unrooted tree!


requires a total of      9.000

steps in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       2   2   2   1   1   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                1?011 0
  1       4        maybe   .0... .
  4    Epsilon      yes    0.1.. .
  4       2         no     ..... .
  2    Gamma        no     ..... .
  2       3         yes    ...00 .
  3    Delta        yes    0.1.. 1
  3    Beta         yes    .1... .
  1    Alpha       maybe   .1... .





     +--------Gamma     
  +--2  
  !  !  +-----Epsilon   
  !  +--4  
--1     !  +--Delta     
  !     +--3  
  !        +--Beta      
  !  
  +-----------Alpha     

  remember: this is an unrooted tree!


requires a total of      9.000

steps in each character:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       2   2   2   1   1   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                1?011 0
  1       2        maybe   .0... .
  2    Gamma        no     ..... .
  2       4        maybe   ?.?.. .
  4    Epsilon     maybe   0.1.. .
  4       3         yes    ...00 .
  3    Delta        yes    0.1.. 1
  3    Beta         yes    110.. .
  1    Alpha       maybe   .1... .



version 3.696

Move - Interactive mixed method parsimony

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Move is an interactive parsimony program, inspired by Wayne Maddison and David Maddison's marvellous program MacClade, which was written for Macintosh computers. Move reads in a data set which is prepared in almost the same format as one for the mixed method parsimony program Mix. It allows the user to choose an initial tree, and displays this tree on the screen. The user can look at different characters and the way their states are distributed on that tree, given the most parsimonious reconstruction of state changes for that particular tree. The user then can specify how the tree is to be rearranged, rerooted or written out to a file. By looking at different rearrangements of the tree the user can manually search for the most parsimonious tree, and can get a feel for how different characters are affected by changes in the tree topology.

This program is compatible with fewer computer systems than the other programs in PHYLIP. It can be adapted to MSDOS systems or to any system whose screen or terminals emulate DEC VT100 terminals (such as Telnet programs for logging in to remote computers over a TCP/IP network, VT100-compatible windows in the X windowing system, and any terminal compatible with ANSI standard terminals). For other screen types, there is a generic option which does not make use of screen graphics characters to display the character states. This will be less effective, as the states will be less easy to see when displayed.

The input data file is set up almost identically to the data files for Mix.

The user interaction starts with the program presenting a menu. The menu looks like this:


Interactive mixed parsimony algorithm, version 3.696

Settings for this run:
  X                         Use Mixed method?  No
  P                         Parsimony method?  Wagner
  A                     Use ancestral states?  No
  F                  Use factors information?  No
  O                            Outgroup root?  No, use as outgroup species   1
  W                           Sites weighted?  No
  T                  Use Threshold parsimony?  No, use ordinary parsimony
  U  Initial tree (arbitrary, user, specify)?  Arbitrary
  0       Graphics type (IBM PC, ANSI, none)?  ANSI
  S                 Width of terminal screen?  80
  L                Number of lines on screen?  24

Are these settings correct? (type Y or the letter for one to change)

The P (Parsimony method) option selects between Wagner parsimony and Camin-Sokal parsimony. If X (miXed methods) is selected the P menu item disappears, as it is then irrelevant.

The X (miXed methods), A (Ancestors), F (Factors), O (Outgroup), T (Threshold), and 0 (Graphics type) options are the usual ones and are described in the main documentation page and in the discrete characters program documentation page.

The U (initial tree) option allows the user to choose whether the initial tree is to be arbitrary, interactively specified by the user, or read from a tree file. Typing U causes the program to change among the three possibilities in turn. I would recommend that for a first run, you allow the tree to be set up arbitrarily (the default), as the "specify" choice is difficult to use and the "user tree" choice requires that you have available a tree file with the tree topology of the initial tree. Its default name is intree. The program will ask you for its name if it looks for the input tree file and does not find one of this name. If you wish to set up some particular tree you can also do that by the rearrangement commands specified below.

The T (threshold) option allows a continuum of methods between parsimony and compatibility. Thresholds less than or equal to 1.0 do not have any meaning and should not be used: they will result in a tree dependent only on the input order of species and not at all on the data!
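
The following Python sketch illustrates why small thresholds are meaningless. It assumes the usual definition of threshold parsimony, in which a character that requires more steps than the threshold is counted as contributing only the threshold value; the function name threshold_score and the step counts are hypothetical.

```python
def threshold_score(steps, threshold):
    """Sum the steps, counting no character for more than `threshold`."""
    return sum(min(s, threshold) for s in steps)


# Hypothetical steps per character for two competing trees.
tree_a = [1, 3, 1, 4]
tree_b = [2, 2, 2, 2]

for threshold in (10.0, 2.0, 1.0):
    print(threshold,
          threshold_score(tree_a, threshold),
          threshold_score(tree_b, threshold))
# 10.0 9 8      ordinary parsimony: tree_b is better
# 2.0 6.0 8.0   tree_a is better: it fits two characters perfectly
# 1.0 4.0 4.0   every tree gets the same score
```

A large threshold reproduces ordinary parsimony, while a threshold of 2.0 rewards trees on which many characters fit with a single step, which is what a compatibility method does. Once the threshold is 1.0 or less, every variable character contributes the same amount on every tree, so the data can no longer influence the choice of tree.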

The usual W (Weights) option is available in Move. It allows integer weights from 0 to 35, using the symbols 0-9 and A-Z. Increased weight on a character increases both the number of parsimony steps counted for that character and the contribution it makes to the number of compatibilities.
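
As a small sketch of how such weight symbols can be decoded and applied, the Python code below assumes the convention that the digits 0-9 stand for themselves and the letters A-Z continue from 10 to 35. The function names and the example strings are hypothetical.

```python
def decode_weights(symbols):
    """Turn a string of weight symbols (0-9, A-Z) into integer weights."""
    return [int(ch, 36) for ch in symbols]


def weighted_steps(weights, steps):
    """Total number of steps, with each character multiplied by its weight."""
    return sum(w * s for w, s in zip(weights, steps))


weights = decode_weights("0129AZ")
print(weights)                                      # [0, 1, 2, 9, 10, 35]
print(weighted_steps(weights, [2, 2, 2, 1, 1, 1]))  # 60
```

A weight of 0 removes a character from the analysis altogether, which is a convenient way to exclude characters without editing the data file.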

The F (Factors) option is available in this program. It is only used to inform the program which groups of characters are to be counted together in computing the number of characters compatible with the tree. Thus if three binary characters are all factors of the same multistate character, the multistate character will be counted as compatible with the tree only if all three factors are compatible with it.
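
The counting rule can be sketched in Python as follows. The sketch assumes that the factors information supplies one symbol per binary character and that adjacent characters sharing a symbol are factors of the same multistate character; the function name count_compatible and the example values are hypothetical.

```python
from itertools import groupby


def count_compatible(factor_symbols, compatible):
    """Count multistate characters all of whose binary factors fit the tree.

    factor_symbols -- one symbol per binary character, e.g. "AAABBC"
    compatible     -- one boolean per binary character
    """
    count = 0
    position = 0
    for _, group in groupby(factor_symbols):
        size = len(list(group))
        if all(compatible[position:position + size]):
            count += 1
        position += size
    return count


# Three multistate characters coded as 3, 2, and 1 binary factors.
factors = "AAABBC"
fits = [True, True, False, True, True, True]
print(count_compatible(factors, fits))  # 2
```

The first multistate character is not counted because one of its three factors conflicts with the tree, even though the other two fit.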

The S (Screen width) option allows the width in characters of the display to be adjusted when more than 80 characters can be displayed on the user's screen.

The L (screen Lines) option allows the user to change the height of the screen (in lines of characters) that is assumed to be available on the display. This may be particularly helpful when displaying large trees on terminals that have more than 24 lines per screen, or on workstation or X-terminal screens that can emulate the ANSI terminals with more than 24 lines.

After the initial menu is displayed and the choices are made, the program then sets up an initial tree and displays it. Below it will be a one-line menu of possible commands, which looks like this:

NEXT? (Options: R # + - S . T U W O F C H ? X Q) (H or ? for Help)

If you type H or ? you will get a single screen showing a description of each of these commands in a few words. Here are slightly more detailed descriptions:

R
("Rearrange"). This command asks for the number of a node which is to be removed from the tree. It and everything to the right of it on the tree is to be removed (by breaking the branch immediately below it). The command also asks for the number of a node below which that group is to be inserted. If an impossible number is given, the program refuses to carry out the rearrangement and asks for a new command. The rearranged tree is displayed: it will often have a different number of steps than the original. If you wish to undo a rearrangement, use the Undo command, for which see below.

#
This command, and the +, - and S commands described below, determine which character has its states displayed on the branches of the trees. The initial tree displayed by the program does not show states of characters. When # is typed, the program does not ask the user which character is to be shown but automatically shows the states of the next binary character that is not compatible with the tree (the next character that does not perfectly fit the current tree). The search for this character "wraps around" so that if it reaches the last character without finding one that is not compatible with the tree, the search continues at the first character; if no incompatible character is found the current character is shown, and if no current character is shown then the first character is shown. The display takes the form of different symbols or textures on the branches of the tree. The state of each branch is actually the state of the node above it. A key of the symbols or shadings used for states 0, 1 and ? is shown next to the tree. State ? means that either state 0 or state 1 could exist at that point on the tree, and that the user may want to consider the different possibilities, which are usually apparent by inspection.
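
The wrap-around search can be sketched in Python as below. The function name next_incompatible, the use of 0 to mean "no character is being shown", and the rule that a binary character fits the tree when it needs at most one step are assumptions made for this illustration.

```python
def next_incompatible(compatible, current=0):
    """Return the 1-based number of the character that # would display.

    compatible -- compatible[i] is True when character i + 1 fits the tree
    current    -- number of the character now shown, or 0 if none is shown
    """
    n = len(compatible)
    for offset in range(1, n + 1):
        # Start just after the current character and wrap past the last one.
        candidate = (current - 1 + offset) % n + 1 if current else offset
        if not compatible[candidate - 1]:
            return candidate
    # Every character fits: stay where we are, or show the first character.
    return current if current else 1


steps = [2, 2, 2, 1, 1, 1]            # steps per character on some tree
fits = [s <= 1 for s in steps]        # assumed test for compatibility
print(next_incompatible(fits))        # 1
print(next_incompatible(fits, 1))     # 2
print(next_incompatible(fits, 3))     # 1 (wraps around past character 6)
```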

+
This command is the same as # except that it goes forward one character, showing the states of the next character. If no character has been shown, using + will cause the first character to be shown. Once the last character has been reached, using + again will show the first character.

-
This command is the same as + except that it goes backwards, showing the states of the previous character. If no character has been shown, using - will cause the last character to be shown. Once character number 1 has been reached, using - again will show the last character.

S
("Show"). This command is the same as + and - except that it causes the program to ask you for the number of a character. That character is the one whose states will be displayed. If you give the character number as 0, the program will go back to not showing the states of the characters.

. (dot)
This command simply causes the current tree to be redisplayed. It is of use when the tree has partly disappeared off of the top of the screen owing to too many responses to commands being printed out at the bottom of the screen.

T
("Try rearrangements"). This command asks for the number of a node. The part of the tree at and above that node is removed from the tree. The program tries to re-insert it in each possible location on the tree (this may take some time, and the program reminds you to wait). Then it prints out a summary. For each possible location the program prints out the number of the node to the right of the place of insertion and the number of steps required in each case. These are divided into those that are better, tied, or worse than the current tree. Once this summary is printed out, the group that was removed is inserted into its original position. It is up to you to use the R command to actually carry out any of the rearrangements that have been tried.

U
("Undo"). This command reverses the effect of the most recent rearrangement, outgroup re-rooting, or flipping of branches. It returns to the previous tree topology. It will be of great use when rearranging the tree and when a rearrangement proves worse than the preceding one -- it permits you to abandon the new one and return to the previous one without remembering its topology in detail.

W
("Write"). This command writes out the current tree onto a tree output file. If the file already has been written to by this run of Move, it will ask you whether you want to replace the contents of the file, add the tree to the end of the file, or not write out the tree to the file. The tree is written in the standard format used by PHYLIP (a subset of the Newick standard). It is in the proper format to serve as the User-Defined Tree for setting up the initial tree in a subsequent run of the program. Note that if you provided the initial tree topology in a tree file and replace its contents, that initial tree will be lost.

O
("Outgroup"). This asks for the number of a node which is to be the outgroup. The tree will be redisplayed with that node as the left descendant of the bottom fork. Under some options (for example the Camin-Sokal parsimony method or the Ancestor state options), the number of steps required on the tree may change on re-rooting. Note that it is possible to use this to make a multi-species group the outgroup (i.e., you can give the number of an interior node of the tree as the outgroup, and the program will re-root the tree properly with that on the left of the bottom fork).

F
("Flip"). This asks for a node number and then flips the two branches at that node, so that the left-right order of branches at that node is changed. This does not actually change the tree topology (or the number of steps on that tree) but it does change the appearance of the tree.

C
("Clade"). When the data consist of more than 12 species (or more than half the number of lines on the screen if this is not 24), it may be difficult to display the tree on one screen. In that case the tree will be squeezed down to one line per species. This is too small to see all the interior states of the tree. The C command instructs the program to print out only that part of the tree (the "clade") from a certain node on up. The program will prompt you for the number of this node. Remember that thereafter you are not looking at the whole tree. To go back to looking at the whole tree give the C command again and enter "0" for the node number when asked. Most users will not want to use this option unless forced to.

H
("Help"). Prints a one-screen summary of what the commands do, a few words for each command.

?
("huh?"). A synonym for H. Same as Help command.

X
("Exit"). Exit from program. If the current tree has not yet been saved into a file, the program will ask you whether it should be saved.

Q
("Quit"). A synonym for X. Same as the eXit command.
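
The wrap-around behaviour of the #, + and - commands described above can be sketched in a few lines of Python. This is only an illustration of the rules as stated in the text, not code taken from Move; the function names and the convention that 0 means "no character is shown" (borrowed from the S command) are assumptions made for the example.

```python
def step_forward(current, n_chars):
    """Character shown after '+': 0 (none shown) -> 1, last -> 1."""
    return current % n_chars + 1


def step_backward(current, n_chars):
    """Character shown after '-': 0 (none shown) -> last, 1 -> last."""
    return n_chars if current <= 1 else current - 1


def next_incompatible(current, compatible):
    """Character shown after '#'.

    current    -- number of the character now shown, 0 if none is shown
    compatible -- list of booleans, True if that character fits the tree
    """
    n_chars = len(compatible)
    # Look at current+1, current+2, ..., wrapping past the last character.
    for offset in range(1, n_chars + 1):
        candidate = (current + offset - 1) % n_chars + 1
        if candidate != current and not compatible[candidate - 1]:
            return candidate
    # No incompatible character: keep the current one, else show the first.
    return current if current else 1
```

With six characters of which only character 2 fails to fit the tree, calling next_incompatible(4, ...) walks through 5, 6 and 1 before stopping at 2.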

ADAPTING THE PROGRAM TO YOUR COMPUTER AND TO YOUR TERMINAL

As we have seen, the initial menu of the program allows you to choose among three screen types (PCDOS, Ansi, and none). If you want to avoid having to make this choice every time, you can change some of the constants in the file phylip.h to have the terminal type initialize itself in the proper way, and recompile. We have tried to have the default values be correct for PC, Macintosh, and Unix screens. If the setting is "none" (which is necessary on Macintosh MacOS 9 screens), the special graphics characters will not be used to indicate nucleotide states, but only letters will be used for the four nucleotides. This is less easy to look at.

The constants that need attention are ANSICRT and IBMCRT. Currently these are both set to "false" on Macintosh MacOS 9 systems, to "true" on MacOS X and on Unix/Linux systems, and IBMCRT is set to "true" on Windows systems. If your system has an ANSI compatible terminal, you might want to find the definition of ANSICRT in phylip.h and set it to "true", and IBMCRT to "false".

MORE ABOUT THE PARSIMONY CRITERION

Move uses as its numerical criterion the Wagner and Camin-Sokal parsimony methods in mixture, where each character can have its method specified separately. The program defaults to carrying out Wagner parsimony.

The Camin-Sokal parsimony method explains the data by assuming that changes 0 --> 1 are allowed but not changes 1 --> 0. Wagner parsimony allows both kinds of changes. (This is under the assumption that 0 is the ancestral state, though the program allows reassignment of the ancestral state, in which case we must reverse the state numbers 0 and 1 throughout this discussion.) The criterion is to find the tree which requires the minimum number of changes. The Camin-Sokal method is due to Camin and Sokal (1965) and the Wagner method to Eck and Dayhoff (1966) and to Kluge and Farris (1969).
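
To make the difference between the two criteria concrete, here is a minimal Python sketch that counts the steps one binary character needs on a fixed rooted, bifurcating tree. It is not the code used by Move. The Wagner count uses the familiar Fitch set-intersection pass, and the Camin-Sokal count assumes state 0 at the root and charges one 0 --> 1 change for every maximal group of tips that all show state 1. The nested-tuple tree format and the example topology are assumptions chosen for the illustration.

```python
def wagner_steps(tree, states):
    """Minimum number of 0 <-> 1 changes (Fitch pass) for one character."""
    def visit(node):
        if isinstance(node, str):            # a tip: its observed state
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        common = left_set & right_set
        if common:                           # the children can agree
            return common, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1
    return visit(tree)[1]


def camin_sokal_steps(tree, states):
    """Number of 0 --> 1 changes when 1 --> 0 is forbidden and the root is 0."""
    def all_ones(node):
        if isinstance(node, str):
            return states[node] == 1
        return all(all_ones(child) for child in node)

    def visit(node):
        if all_ones(node):                   # one origin of state 1 suffices
            return 1
        if isinstance(node, str):            # a tip in state 0
            return 0
        return sum(visit(child) for child in node)
    return visit(tree)


# Hypothetical topology; the states are character 1 of the test data set.
TREE = ((("Epsilon", "Delta"), "Gamma"), ("Beta", "Alpha"))
CHAR1 = {"Alpha": 1, "Beta": 1, "Gamma": 1, "Delta": 0, "Epsilon": 0}
```

On this tree character 1 needs one step under Wagner parsimony (a single 1 --> 0 change leading to Delta and Epsilon) but two under Camin-Sokal, because state 1 must then arise separately in Gamma and in the Beta-Alpha group.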

Here are the assumptions of these two methods:

  1. Ancestral states are known (Camin-Sokal) or unknown (Wagner).
  2. Different characters evolve independently.
  3. Different lineages evolve independently.
  4. Changes 0 --> 1 are much more probable than changes 1 --> 0 (Camin-Sokal) or equally probable (Wagner).
  5. Both of these kinds of changes are a priori improbable over the evolutionary time spans involved in the differentiation of the group in question.
  6. Other kinds of evolutionary event such as retention of polymorphism are far less probable than 0 --> 1 changes.
  7. Rates of evolution in different lineages are sufficiently low that two changes in a long segment of the tree are far less probable than one change in a short segment.

That these are the assumptions of parsimony methods has been documented in a series of papers of mine: (1973a, 1978b, 1979, 1981b, 1983b, 1988b). For an opposing view arguing that the parsimony methods make no substantive assumptions such as these, see the papers by Farris (1983) and Sober (1983a, 1983b), but also read the exchange between Felsenstein and Sober (1986).

Below is a test data set, but we cannot show the output it generates because of the interactive nature of the program.


TEST DATA SET

     5    6
Alpha     110110
Beta      110000
Gamma     100110
Delta     001001
Epsilon   001110
neighbor

version 3.696

Neighbor -- Neighbor-Joining and UPGMA methods

© Copyright 1991-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program implements the Neighbor-Joining method of Saitou and Nei (1987) and the UPGMA method of clustering. The program was written by Mary Kuhner and Jon Yamato, using some code from program Fitch. An important part of the code was translated from FORTRAN code from the neighbor-joining program written by Naruya Saitou and by Li Jin, and is used with the kind permission of Drs. Saitou and Jin.

Neighbor constructs a tree by successive clustering of lineages, setting branch lengths as the lineages join. The tree is not rearranged thereafter. The tree does not assume an evolutionary clock, so that it is in effect an unrooted tree. It should be somewhat similar to the tree obtained by Fitch. The program cannot evaluate a User tree, nor can it prevent branch lengths from becoming negative. However the algorithm is far faster than Fitch or Kitsch. This will make it particularly effective in their place for large studies or for bootstrap or jackknife resampling studies which require runs on multiple data sets.
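
For readers who want to see the successive clustering spelled out, the following Python sketch implements the textbook Neighbor-Joining recurrences of Saitou and Nei. It is an independent illustration, not a translation of the Neighbor source; the node labels ("node1", "node2", ...) are invented here and will not match the node numbers that Neighbor prints. The distance matrix is the test data set given in this file.

```python
NAMES = ["Bovine", "Mouse", "Gibbon", "Orang", "Gorilla", "Chimp", "Human"]
DIST = [
    [0.0000, 1.6866, 1.7198, 1.6606, 1.5243, 1.6043, 1.5905],
    [1.6866, 0.0000, 1.5232, 1.4841, 1.4465, 1.4389, 1.4629],
    [1.7198, 1.5232, 0.0000, 0.7115, 0.5958, 0.6179, 0.5583],
    [1.6606, 1.4841, 0.7115, 0.0000, 0.4631, 0.5061, 0.4710],
    [1.5243, 1.4465, 0.5958, 0.4631, 0.0000, 0.3484, 0.3083],
    [1.6043, 1.4389, 0.6179, 0.5061, 0.3484, 0.0000, 0.2692],
    [1.5905, 1.4629, 0.5583, 0.4710, 0.3083, 0.2692, 0.0000],
]


def neighbor_joining(names, dist):
    """Return the unrooted tree as a list of (node, child, length) edges."""
    d = {a: {b: dist[i][j] for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    active = list(names)
    edges = []
    next_id = 1
    while len(active) > 3:
        n = len(active)
        r = {a: sum(d[a][b] for b in active) for a in active}
        # Choose the pair that minimises (n - 2) * d(a, b) - r(a) - r(b).
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = active[i], active[j]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the new interior node to a and to b.
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        u = f"node{next_id}"
        next_id += 1
        edges += [(u, a, len_a), (u, b, len_b)]
        # Distances from the new node to every remaining lineage.
        d[u] = {u: 0.0}
        for c in active:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        active = [c for c in active if c not in (a, b)] + [u]
    # Join the last three lineages at a central node.
    a, b, c = active
    centre = f"node{next_id}"
    edges.append((centre, a, 0.5 * (d[a][b] + d[a][c] - d[b][c])))
    edges.append((centre, b, 0.5 * (d[a][b] + d[b][c] - d[a][c])))
    edges.append((centre, c, 0.5 * (d[a][c] + d[b][c] - d[a][b])))
    return edges
```

The first pair joined on the test data is Bovine and Mouse, and the lengths this sketch assigns to those two branches agree with the 0.91769 and 0.76891 shown in the test output in this file. Nothing in the formulas prevents a length from coming out negative, which is the behaviour the text warns about.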

The UPGMA option constructs a tree by successive (agglomerative) clustering using an average-linkage method of clustering. It has some relationship to Kitsch, in that when the tree topology turns out the same, the branch lengths with UPGMA will turn out to be the same as with the P = 0 option of Kitsch.
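
The average-linkage clustering can be sketched as follows. Again this is a generic illustration rather than the Neighbor source: clusters are labelled with nested parenthesised strings invented for the example, and each merge is reported with the height of the new node (half the distance between the two clusters joined).

```python
def upgma(names, dist):
    """Return the merges as (cluster_a, cluster_b, height), in order."""
    sizes = {name: 1 for name in names}          # label -> number of tips
    d = {(a, b): dist[i][j]
         for i, a in enumerate(names)
         for j, b in enumerate(names) if i != j}
    merges = []
    while len(sizes) > 1:
        labels = list(sizes)
        pairs = [(labels[i], labels[j])
                 for i in range(len(labels))
                 for j in range(i + 1, len(labels))]
        a, b = min(pairs, key=lambda pair: d[pair])   # closest two clusters
        height = d[(a, b)] / 2
        merged = f"({a},{b})"
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        # Average linkage: weight each old distance by its cluster size.
        for c in sizes:
            average = (size_a * d[(a, c)] + size_b * d[(b, c)]) / (size_a + size_b)
            d[(merged, c)] = d[(c, merged)] = average
        sizes[merged] = size_a + size_b
        merges.append((a, b, height))
    return merges
```

Because every tip below a node is placed at the same height, the resulting tree is rooted and clocklike, which is why the Outgroup option has no role under UPGMA.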

The options for Neighbor are selected through the menu, which looks like this:


Neighbor-Joining/UPGMA method version 3.69

Settings for this run:
  N       Neighbor-joining or UPGMA tree?  Neighbor-joining
  O                        Outgroup root?  No, use as outgroup species  1
  L         Lower-triangular data matrix?  No
  R         Upper-triangular data matrix?  No
  S                        Subreplicates?  No
  J     Randomize input order of species?  No. Use input order
  M           Analyze multiple data sets?  No
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4       Write out trees onto tree file?  Yes


  Y to accept these or type the letter for one to change

Most of the input options (L, R, S, J, and M) are as given in the Distance Matrix Programs documentation file, and their input format is the same as given there. The O (Outgroup) option is described in the main documentation file of this package. It is not available when the UPGMA option is selected. The Jumble option (J) does not allow multiple jumbles (as most of the other programs that have it do), as there is no objective way of choosing which of the multiple results is best, there being no explicit criterion for optimality of the tree.

An important use of the Jumble option is in the use of Neighbor with bootstrap samples. Backeljau et al. (1996) and Farris et al. (1996) point out that when there are ties in the distance matrix, Neighbor will resolve them in a way dependent on the order of species in the input file. If we have many bootstrap samples from a data set, and run Neighbor on them, we can then get apparent strong support for one resolution of a multifurcation, purely as an artifact of the order of species in the input file. By using a random order of input species for each bootstrap, the problem disappears, as Farris et al. (1996) acknowledge. Neighbor therefore has the Jumble option turned on whenever the multiple distance matrices option (M) is activated. Only one Jumble needs to be done per data set in that case.
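
What jumbling amounts to is applying one random permutation to the species names and to both the rows and the columns of the distance matrix, so that every pairwise distance is preserved and only the order in which ties are met changes. A minimal sketch follows; the function name and the use of Python's random module are choices made for this illustration and say nothing about how Neighbor generates its random numbers.

```python
import random


def jumble(names, dist, seed):
    """Return the names and distance matrix in a random species order."""
    order = list(range(len(names)))
    random.Random(seed).shuffle(order)       # reproducible for a given seed
    new_names = [names[i] for i in order]
    # Permute rows and columns together so each pair keeps its distance.
    new_dist = [[dist[i][j] for j in order] for i in order]
    return new_names, new_dist
```

Using a different seed for each bootstrap replicate gives each replicate its own input order, so a tie is not always broken the same way.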

Option N chooses between the Neighbor-Joining and UPGMA methods. Option S is the usual Subreplication option. Here, however, it is present only to allow Neighbor to read the input data: the number of replicates is actually ignored, even though it is read in. Note that this means that one cannot use it to have missing data in the input file, if Neighbor is to be used.

The output consists of a tree (rooted if UPGMA, unrooted if Neighbor-Joining) and the lengths of the interior segments. The Average Percent Standard Deviation is not computed or printed out. If the tree found by Neighbor is fed into Fitch as a User Tree, it will compute this quantity if one also selects the N option of Fitch to ensure that none of the branch lengths is re-estimated.

As Neighbor runs it prints out an account of the successive clustering levels, if you allow it to. This is mostly for reassurance and can be suppressed using menu option 2. In this printout of cluster levels the word "OTU" refers to a tip species, and the word "NODE" to an interior node of the resulting tree.

The constants available for modification at the beginning of the program are "namelength" which gives the length of a species name, and the usual boolean constants that initialize the terminal type. There is no feature saving multiple trees tied for best, partly because we do not expect exact ties except in cases where the branch lengths make the nature of the tie obvious, as when a branch is of zero length.

The major advantage of Neighbor is its speed: it requires a time only proportional to the cube of the number of species. It is significantly faster than version 3.5 of this program. By contrast Fitch and Kitsch require a time that rises as the fourth power of the number of species. Thus Neighbor is well-suited to bootstrapping studies and to analysis of very large trees. Our simulation studies (Kuhner and Felsenstein, 1994) show that, contrary to statements in the literature by others, Neighbor does not get as accurate an estimate of the phylogeny as does Fitch. However it does nearly as well, and in view of its speed this will make it a quite useful program.


TEST DATA SET

    7
Bovine      0.0000  1.6866  1.7198  1.6606  1.5243  1.6043  1.5905
Mouse       1.6866  0.0000  1.5232  1.4841  1.4465  1.4389  1.4629
Gibbon      1.7198  1.5232  0.0000  0.7115  0.5958  0.6179  0.5583
Orang       1.6606  1.4841  0.7115  0.0000  0.4631  0.5061  0.4710
Gorilla     1.5243  1.4465  0.5958  0.4631  0.0000  0.3484  0.3083
Chimp       1.6043  1.4389  0.6179  0.5061  0.3484  0.0000  0.2692
Human       1.5905  1.4629  0.5583  0.4710  0.3083  0.2692  0.0000


OUTPUT FROM TEST DATA SET (with all numerical options on)


   7 Populations

Neighbor-Joining/UPGMA method version 3.69


 Neighbor-joining method

 Negative branch lengths allowed


Name                       Distances
----                       ---------

Bovine        0.00000   1.68660   1.71980   1.66060   1.52430   1.60430
              1.59050
Mouse         1.68660   0.00000   1.52320   1.48410   1.44650   1.43890
              1.46290
Gibbon        1.71980   1.52320   0.00000   0.71150   0.59580   0.61790
              0.55830
Orang         1.66060   1.48410   0.71150   0.00000   0.46310   0.50610
              0.47100
Gorilla       1.52430   1.44650   0.59580   0.46310   0.00000   0.34840
              0.30830
Chimp         1.60430   1.43890   0.61790   0.50610   0.34840   0.00000
              0.26920
Human         1.59050   1.46290   0.55830   0.47100   0.30830   0.26920
              0.00000


  +---------------------------------------------Mouse     
  ! 
  !                        +---------------------Gibbon    
  1------------------------2 
  !                        !  +----------------Orang     
  !                        +--5 
  !                           ! +--------Gorilla   
  !                           +-4 
  !                             ! +--------Chimp     
  !                             +-3 
  !                               +------Human     
  ! 
  +------------------------------------------------------Bovine    


remember: this is an unrooted tree!

Between        And            Length
-------        ---            ------
   1          Mouse           0.76891
   1             2            0.42027
   2          Gibbon          0.35793
   2             5            0.04648
   5          Orang           0.28469
   5             4            0.02696
   4          Gorilla         0.15393
   4             3            0.03982
   3          Chimp           0.15167
   3          Human           0.11753
   1          Bovine          0.91769


pars

version 3.696

Pars - Discrete character parsimony

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Pars is a general parsimony program which carries out the Wagner parsimony method with multiple states. Wagner parsimony allows changes among all states. The criterion is to find the tree which requires the minimum number of changes. The Wagner method was originated by Eck and Dayhoff (1966) and by Kluge and Farris (1969). Here are its assumptions:

  1. Ancestral states are unknown.
  2. Different characters evolve independently.
  3. Different lineages evolve independently.
  4. Changes to all other states are equally probable (Wagner).
  5. These changes are a priori improbable over the evolutionary time spans involved in the differentiation of the group in question.
  6. Other kinds of evolutionary event such as retention of polymorphism are far less probable than these state changes.
  7. Rates of evolution in different lineages are sufficiently low that two changes in a long segment of the tree are far less probable than one change in a short segment.

That these are the assumptions of parsimony methods has been documented in a series of papers of mine: (1973a, 1978b, 1979, 1981b, 1983b, 1988b). For an opposing view arguing that the parsimony methods make no substantive assumptions such as these, see the papers by Farris (1983) and Sober (1983a, 1983b), but also read the exchange between Felsenstein and Sober (1986).

INPUT FORMAT

The input for Pars is the standard input for discrete characters programs, described above in the documentation file for the discrete-characters programs, except that multiple states (up to 8 of them) are allowed. Any characters other than "?" are allowed as states, up to a maximum of 8 states. In fact, one can use different symbols in different columns of the data matrix, although it is rather unlikely that you would want to do that. The symbols you can use are:

But note that these do not include blank (" "). Blanks in the input data are simply skipped by the program, so that they can be used to make characters into groups for ease of viewing. The "?" (question mark) symbol has special meaning. It is allowed in the input but is not available as the symbol of a state. Rather, it means that the state is unknown.

Pars can handle both bifurcating and multifurcating trees. In doing its search for most parsimonious trees, it adds species not only by creating new forks in the middle of existing branches, but it also tries putting them at the end of new branches which are added to existing forks. Thus it searches among both bifurcating and multifurcating trees. If a branch in a tree does not have any characters which might change in that branch in the most parsimonious tree, it does not save that tree. Thus in any tree that results, a branch exists only if some character has a most parsimonious reconstruction that would involve change in that branch.

It also saves a number of trees tied for best (you can alter the number it saves using the V option in the menu). When rearranging trees, it tries rearrangements of all of the saved trees. This makes the algorithm slower than earlier programs such as Mix.

The options are selected using a menu:


Discrete character parsimony algorithm, version 3.69

Setting for this run:
  U                 Search for best tree?  Yes
  S                        Search option?  More thorough search
  V              Number of trees to save?  100
  J     Randomize input order of species?  No. Use input order
  O                        Outgroup root?  No, use as outgroup species 1
  T              Use Threshold parsimony?  No, use ordinary parsimony
  W                       Sites weighted?  No
  M           Analyze multiple data sets?  No
  I            Input species interleaved?  Yes
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4          Print out steps in each site  No
  5  Print character at all nodes of tree  No
  6       Write out trees onto tree file?  Yes

  Y to accept these or type the letter for one to change

The Weights (W) option takes the weights from a file whose default name is "weights". The weights follow the format described in the main documentation file, with integer weights from 0 to 35 allowed by using the characters 0, 1, 2, ..., 9 and A, B, ... Z.
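
The mapping from weight characters to integers is the same as reading each character as a base-36 digit, which Python's built-in int() can do directly. The helper below is only an illustration of that encoding; its name is invented, and it assumes one weight character per site on a single line.

```python
def decode_weights(line):
    """Turn a line of weight characters (0-9, A-Z) into integers 0 to 35."""
    return [int(symbol, 36) for symbol in line.strip()]
```

For example, decode_weights("019AZ") gives [0, 1, 9, 10, 35]. Note that int() would also accept lower-case letters, which the text above does not mention as valid weights.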

The User tree (option U) is read from a file whose default name is intree. The trees can be multifurcating. They must be preceded in the file by a line giving the number of trees in the file.

The options J, O, T, and M are the usual Jumble, Outgroup, Threshold parsimony, and Multiple Data Sets options, described either in the main documentation file or in the Discrete Characters Programs documentation file.

The S (search) option controls how, and how much, rearrangement is done on the tied trees that are saved by the program. If the "More thorough search" option (the default) is chosen, the program will save multiple tied trees, without collapsing internal branches that have no evidence of change on them. It will subsequently rearrange on all parts of each of those trees. If the "Less thorough search" option is chosen, before saving, the program will collapse all branches that have no evidence that there is any change on that branch. This leads to less attempted rearrangement. If the "Rearrange on one best tree" option is chosen, only the first of the tied trees is used for rearrangement. This is faster but less thorough. If your trees are likely to have large multifurcations, do not use the default "More thorough search" option as it could result in too large a number of trees being saved.

The M (multiple data sets option) will ask you whether you want to use multiple sets of weights (from the weights file) or multiple data sets. The ability to use a single data set with multiple weights means that much less disk space will be used for this input data. The bootstrapping and jackknifing tool Seqboot has the ability to create a weights file with multiple weights.

The O (outgroup) option will have no effect if the U (user-defined tree) option is in effect. The T (threshold) option allows a continuum of methods between parsimony and compatibility. Thresholds less than or equal to 1.0 do not have any meaning and should not be used: they will result in a tree dependent only on the input order of species and not at all on the data!
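
The warning about thresholds of 1.0 or less is easier to see with numbers. The sketch below assumes the usual definition of threshold parsimony, which is given in the main documentation rather than in this file: each character contributes its number of steps or the threshold, whichever is smaller. The function name is invented for the example, and the first list of steps is the "steps in each site" line from the test output in this file; the second list is made up.

```python
def threshold_score(steps_per_character, threshold):
    """Tree score when no character may count for more than the threshold."""
    return sum(min(steps, threshold) for steps in steps_per_character)


STEPS_BEST_TREE = [1, 1, 1, 2, 2, 1]     # from the test set output
STEPS_OTHER_TREE = [1, 2, 3, 2, 2, 1]    # hypothetical worse tree
```

With a large threshold the two trees score 8 and 11, as in ordinary parsimony. With a threshold of 1.0 both score 6.0: every character that varies needs at least one step on any tree, so all trees tie and the data no longer influence the result.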

OUTPUT FORMAT

Output is standard: if option 1 is toggled on, the data is printed out, with the convention that "." means "the same as in the first species". Then comes a list of equally parsimonious trees. Each tree has branch lengths. These are computed using an algorithm published by Hochbaum and Pathria (1997) which I first heard of from Wayne Maddison who invented it independently of them. This algorithm averages the number of reconstructed changes of state over all sites over all possible most parsimonious placements of the changes of state among branches. Note that it does not correct in any way for multiple changes that overlay each other.
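
The dot convention used when the data are printed can be reproduced with a short function. This is a sketch of the convention only, with an invented function name; it assumes all sequences have the same length.

```python
def dot_difference(sequences):
    """Replace each state equal to the first species' state by a dot."""
    first = sequences[0]
    shown = [first]                      # the first species is printed in full
    for sequence in sequences[1:]:
        shown.append("".join("." if state == reference else state
                             for state, reference in zip(sequence, first)))
    return shown
```

Applied to the five sequences of the test data set, in input order, it yields the same lines as the data listing in the test set output in this file: 110110, ...00., .0...., 001001 and 001... .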

If option 4 is toggled on, a table of the number of changes of state required in each character is also printed. If option 5 is toggled on, a table is printed out after each tree, showing for each branch whether there are known to be changes in the branch, and what the states are inferred to have been at the top end of the branch. This is a reconstruction of the ancestral sequences in the tree. If you choose option 5, a menu item D appears which gives you the opportunity to turn off dot-differencing so that complete ancestral sequences are shown. If the inferred state is a "?", there will be multiple equally-parsimonious assignments of states; the user must work these out for themselves by hand. If option 6 is left in its default state the trees found will be written to a tree file, so that they are available to be used in other programs. If the program finds multiple trees tied for best, all of these are written out onto the output tree file. Each is followed by a numerical weight in square brackets (such as [0.25000]). This is needed when we use the trees to make a consensus tree of the results of bootstrapping or jackknifing, to avoid overrepresenting replicates that find many tied trees.

If the U (User Tree) option is used and more than one tree is supplied, the program also performs a statistical test of each of these trees against the best tree. This test is a version of the test proposed by Alan Templeton (1983), evaluated in a test case by me (1985a). It is closely parallel to a test using log likelihood differences due to Kishino and Hasegawa (1989), and uses the mean and variance of step differences between trees, taken across sites. If the mean is more than 1.96 standard deviations different then the trees are declared significantly different. The program prints out a table of the steps for each tree, the differences of each from the best one, the variance of that quantity as determined by the step differences at individual sites, and a conclusion as to whether that tree is or is not significantly worse than the best one. It is important to understand that the test assumes that all the discrete characters are evolving independently, which is unlikely to be true for many suites of morphological characters.
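
The two-tree comparison can be sketched as below. The variance estimate used here, n/(n-1) times the sum of squared deviations of the per-character differences from their mean, is an assumption about the details and is not taken from the Pars source; comparing the total difference with 1.96 of its standard deviations is equivalent to comparing the mean difference with 1.96 standard errors. The function name and example numbers are invented for the illustration.

```python
import math


def two_tree_test(steps_a, steps_b):
    """Compare two trees from their steps at each character.

    Returns (total difference, its standard deviation, significant?).
    """
    n = len(steps_a)
    diffs = [a - b for a, b in zip(steps_a, steps_b)]
    total = sum(diffs)
    mean = total / n
    # Variance of the summed difference, treating characters as independent.
    variance = n / (n - 1) * sum((x - mean) ** 2 for x in diffs)
    sd = math.sqrt(variance)
    return total, sd, abs(total) > 1.96 * sd
```

If one tree needs a single extra step at one of six characters, the total difference is 1 with a standard deviation of 1.0, so the trees are not declared significantly different.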

If there are more than two trees, the test done is an extension of the two-tree test just described (the Kishino-Hasegawa-Templeton or KHT test), due to Shimodaira and Hasegawa (1999) and referred to below as the SH test. They pointed out that a correction for the number of trees was necessary, and they introduced a resampling method to make this correction. In the version used here the variances and covariances of the sums of steps across characters are computed for all pairs of trees. To test whether the difference between each tree and the best one is larger than could have been expected if they all had the same expected number of steps, numbers of steps for all trees are sampled with these covariances and equal means (Shimodaira and Hasegawa's "least favorable hypothesis"), and a P value is computed from the fraction of times the difference between the tree's value and the lowest number of steps exceeds that actually observed. Note that this sampling needs random numbers, and so the program will prompt the user for a random number seed if one has not already been supplied. With the two-tree KHT test no random numbers are used.
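
The resampling step can be sketched without any matrix routines. The trick used below is a convenience of this example, not necessarily what Pars does: multiplying the centred per-character steps by independent standard normal numbers and summing gives values for all trees that have zero mean and exactly the covariances of the summed steps. The function name, the default number of replicates, and the choice to count ties as exceedances (so that the best tree always gets P = 1) are assumptions made for the illustration.

```python
import math
import random


def sh_style_test(site_steps, replicates=1000, seed=1):
    """site_steps[t][s] = steps at character s on tree t; returns P per tree."""
    rng = random.Random(seed)
    n_trees, n_sites = len(site_steps), len(site_steps[0])
    totals = [sum(row) for row in site_steps]
    observed = [total - min(totals) for total in totals]
    # Centre each tree's per-character steps on that tree's own mean.
    centred = [[x - totals[t] / n_sites for x in site_steps[t]]
               for t in range(n_trees)]
    scale = math.sqrt(n_sites / (n_sites - 1))
    exceed = [0] * n_trees
    for _ in range(replicates):
        z = [rng.gauss(0.0, 1.0) for _ in range(n_sites)]
        # One draw of the summed steps for all trees, with equal (zero) means.
        sample = [scale * sum(zs * c for zs, c in zip(z, row))
                  for row in centred]
        lowest = min(sample)
        for t in range(n_trees):
            if sample[t] - lowest >= observed[t]:
                exceed[t] += 1
    return [count / replicates for count in exceed]
```

A tree whose P value falls below the chosen level (0.05, say) would be judged significantly worse than the best one.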

In either the KHT or the SH test the program prints out a table of the number of steps for each tree, the differences of each from the lowest one, the variance of that quantity as determined by the differences of the numbers of steps at individual characters, and a conclusion as to whether that tree is or is not significantly worse than the best one.

Option 6 in the menu controls whether the tree estimated by the program is written onto a tree file. The default name of this output tree file is "outtree". If the U option is in effect, all the user-defined trees are written to the output tree file.


TEST DATA SET

     5    6
Alpha     110110
Beta      110000
Gamma     100110
Delta     001001
Epsilon   001110


TEST SET OUTPUT (with all numerical options on)


Discrete character parsimony algorithm, version 3.69

 5 species,   6  sites


Name         Sequences
----         ---------

Alpha        110110
Beta         ...00.
Gamma        .0....
Delta        001001
Epsilon      001...



One most parsimonious tree found:


                            +Epsilon   
           +----------------3  
  +--------2                +-------------------------Delta     
  |        |  
  |        +Gamma     
  |  
  1----------------Beta      
  |  
  +Alpha     


requires a total of      8.000

  between      and       length
  -------      ---       ------
     1           2         1.00
     2           3         2.00
     3      Epsilon        0.00
     3      Delta          3.00
     2      Gamma          0.00
     1      Beta           2.00
     1      Alpha          0.00

steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0|       1   1   1   2   2   1            

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)

          1                110110
   1      2         yes    .0....
   2      3         yes    0.1...
   3   Epsilon      no     ......
   3   Delta        yes    ...001
   2   Gamma        no     ......
   1   Beta         yes    ...00.
   1   Alpha        no     ......


proml

version 3.696

Proml -- Protein Maximum Likelihood program

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program implements the maximum likelihood method for protein amino acid sequences. It uses either the Jones-Taylor-Thornton or the Dayhoff probability model of change between amino acids. The assumptions of these present models are:

  1. Each position in the sequence evolves independently.
  2. Different lineages evolve independently.
  3. Each position undergoes substitution at an expected rate which is chosen from a series of rates (each with a probability of occurrence) which we specify.
  4. All relevant positions are included in the sequence, not just those that have changed or those that are "phylogenetically informative".
  5. The probabilities of change between amino acids are given by the model of Jones, Taylor, and Thornton (1992), the PMB model of Veerassamy, Smith and Tillier (2003), or the DCMut version (Kosiol and Goldman, 2005) of the PAM model of Dayhoff (Dayhoff and Eck, 1968; Dayhoff et al., 1979).

Note the assumption that we are looking at all positions, including those that have not changed at all. It is important not to restrict attention to some positions based on whether or not they have changed; doing that would bias branch lengths by making them too long, and that in turn would cause the method to misinterpret the meaning of those positions that had changed.

This program uses a Hidden Markov Model (HMM) method of inferring different rates of evolution at different amino acid positions. This was described in a paper by me and Gary Churchill (1996). It allows us to specify to the program that there will be a number of different possible evolutionary rates, what the prior probabilities of occurrence of each is, and what the average length of a patch of positions all having the same rate is. The rates can also be chosen by the program to approximate a Gamma distribution of rates, or a Gamma distribution plus a class of invariant positions. The program computes the likelihood by summing it over all possible assignments of rates to positions, weighting each by its prior probability of occurrence.

For example, if we have used the R and A options (described below) to specify that there are three possible rates of evolution, 1.0, 2.4, and 0.0, that the prior probabilities of a position having these rates are 0.4, 0.3, and 0.3, and that the average patch length (number of consecutive positions with the same rate) is 2.0, the program will sum the likelihood over all possibilities, but giving less weight to those that (say) assign all positions to rate 2.4, or that fail to have consecutive positions that have the same rate.
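
The sum over all assignments of rates to positions does not have to be done by enumerating them; the standard forward recursion for Hidden Markov Models gives the same total. The Python sketch below illustrates this under stated assumptions and is not the program's own code. The function names are hypothetical, and the table SITE_LIK (the likelihood of the data at each position given each rate, which in the real program would come from the tree calculation) contains made-up numbers. The transition rule assumed is the one described for the A option: after each position the rate is redrawn from the prior probabilities with probability 1/(patch length), and otherwise kept.

```python
from itertools import product

def transition_matrix(probs, patch_length):
    """Keep the current rate, or with probability 1/patch_length redraw it from probs."""
    lam = 1.0 / patch_length
    k = len(probs)
    return [[(1.0 - lam) * (1.0 if i == j else 0.0) + lam * probs[j]
             for j in range(k)] for i in range(k)]

def hmm_likelihood(site_lik, probs, patch_length):
    """Forward recursion: sums over all rate assignments in O(positions * k^2) time."""
    trans = transition_matrix(probs, patch_length)
    k = len(probs)
    forward = [probs[j] * site_lik[0][j] for j in range(k)]
    for lik in site_lik[1:]:
        forward = [sum(forward[i] * trans[i][j] for i in range(k)) * lik[j]
                   for j in range(k)]
    return sum(forward)

def brute_force_likelihood(site_lik, probs, patch_length):
    """Explicit sum over every assignment of rates to positions (tiny inputs only)."""
    trans = transition_matrix(probs, patch_length)
    k = len(probs)
    total = 0.0
    for path in product(range(k), repeat=len(site_lik)):
        term = probs[path[0]] * site_lik[0][path[0]]
        for pos in range(1, len(path)):
            term *= trans[path[pos - 1]][path[pos]] * site_lik[pos][path[pos]]
        total += term
    return total

# Prior probabilities and patch length from the example in the text.
PROBS = [0.4, 0.3, 0.3]
PATCH_LENGTH = 2.0
# Hypothetical per-position likelihoods, one row per position, one column per rate.
SITE_LIK = [
    [0.020, 0.031, 0.001],
    [0.500, 0.400, 0.900],
    [0.010, 0.020, 0.000],
    [0.300, 0.250, 0.600],
]

if __name__ == "__main__":
    print(hmm_likelihood(SITE_LIK, PROBS, PATCH_LENGTH))
    print(brute_force_likelihood(SITE_LIK, PROBS, PATCH_LENGTH))
```

The two functions should agree to rounding error. A real implementation would also need to guard against numerical underflow when there are many positions, which this sketch ignores.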

The Hidden Markov Model framework for rate variation among positions was independently developed by Yang (1993, 1994, 1995). We have implemented a general scheme for a Hidden Markov Model of rates; we allow the rates and their prior probabilities to be specified arbitrarily by the user, or by a discrete approximation to a Gamma distribution of rates (Yang, 1995), or by a mixture of a Gamma distribution and a class of invariant positions.

This feature effectively removes the artificial assumption that all positions have the same rate, and also means that we need not know in advance the identities of the positions that have a particular rate of evolution.

Another layer of rate variation also is available. The user can assign categories of rates to each position (for example, we might want amino acid positions in the active site of a protein to change more slowly than other positions). This is done with the categories input file and the C option. We then specify (using the menu) the relative rates of evolution of amino acid positions in the different categories. For example, we might specify that positions in the active site evolve at relative rates of 0.2 compared to 1.0 at other positions. If we are assuming that a particular position maintains a cysteine bridge to another, we may want to put it in a category of positions (including perhaps the initial position of the protein sequence which maintains methionine) which changes at a rate of 0.0.

If both user-assigned rate categories and Hidden Markov Model rates are allowed, the program assumes that the actual rate at a position is the product of the user-assigned category rate and the Hidden Markov Model regional rate. (This may not always make perfect biological sense: it would be more natural to assume some upper bound to the rate, as we have discussed in the Felsenstein and Churchill paper). Nevertheless you may want to use both types of rate variation.

INPUT FORMAT AND OPTIONS

Subject to these assumptions, the program is a correct maximum likelihood method. The input is fairly standard, with one addition. As usual the first line of the file gives the number of species and the number of amino acid positions.

Next come the species data. Each sequence starts on a new line, has a ten-character species name that must be blank-filled to be of that length, followed immediately by the species data in the one-letter amino acid code. The sequences must either be in the "interleaved" or "sequential" formats described in the Molecular Sequence Programs document. The I option selects between them. The sequences can have internal blanks in the sequence but there must be no extra blanks at the end of the terminated line. Note that a blank is not a valid symbol for a deletion.
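
As an illustration of the layout just described, here is a minimal Python sketch of a reader for the "sequential" format only. It is not the program's own input routine: the function name parse_sequential is hypothetical, and it assumes that each species' data may continue over several lines until the stated number of positions has been read. The example data are the test data set given later in this document.

```python
def parse_sequential(text):
    """Read a sequential-format data set; return (number of positions, {name: sequence})."""
    lines = text.splitlines()
    n_species, n_sites = (int(token) for token in lines[0].split())
    sequences = {}
    row = 1
    for _ in range(n_species):
        # The name occupies exactly the first ten characters, blank-filled.
        name = lines[row][:10].rstrip()
        chars = lines[row][10:].replace(" ", "")  # internal blanks are allowed
        row += 1
        # Keep reading continuation lines until this species is complete.
        while len(chars) < n_sites:
            chars += lines[row].replace(" ", "")
            row += 1
        if len(chars) != n_sites:
            raise ValueError("wrong number of positions for species " + name)
        sequences[name] = chars
    return n_sites, sequences

# Build the example so that every name is padded to exactly ten characters.
EXAMPLE = "\n".join([
    "   5   13",
    "Alpha".ljust(10) + "AACGTGGCCAAAT",
    "Beta".ljust(10) + "AAGGTCGCCAAAC",
    "Gamma".ljust(10) + "CATTTCGTCACAA",
    "Delta".ljust(10) + "GGTATTTCGGCCT",
    "Epsilon".ljust(10) + "GGGATCTCGGCCC",
])

if __name__ == "__main__":
    n_sites, seqs = parse_sequential(EXAMPLE)
    for name, seq in seqs.items():
        print(name.ljust(10), seq)
```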

The options are selected using an interactive menu. The menu looks like this:


Amino acid sequence Maximum Likelihood method, version 3.696

Settings for this run:
  U                 Search for best tree?  Yes
  P    JTT, PMB or PAM probability model?  Jones-Taylor-Thornton
  C                One category of sites?  Yes
  R           Rate variation among sites?  constant rate of change
  W                       Sites weighted?  No
  S        Speedier but rougher analysis?  Yes
  G                Global rearrangements?  No
  J   Randomize input order of sequences?  No. Use input order
  O                        Outgroup root?  No, use as outgroup species  1
  M           Analyze multiple data sets?  No
  I          Input sequences interleaved?  Yes
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4       Write out trees onto tree file?  Yes
  5   Reconstruct hypothetical sequences?  No

  Y to accept these or type the letter for one to change

The user either types "Y" (followed, of course, by a carriage-return) if the settings shown are to be accepted, or the letter or digit corresponding to an option that is to be changed.

The options U, W, J, O, M, and 0 are the usual ones. They are described in the main documentation file of this package. Option I is the same as in other molecular sequence programs and is described in the documentation file for the sequence programs.

The P option toggles between three models of amino acid change. One is the Jones-Taylor-Thornton model, another the PMB (Probability Matrix from Blocks) model of Veerassamy, Smith and Tillier (2003), another the DCMut model (Kosiol and Goldman, 2005) based on the Dayhoff PAM matrix model. These are all based on Margaret Dayhoff's (Dayhoff and Eck, 1968; Dayhoff et al., 1979) method of empirical tabulation of changes of amino acid sequences, and conversion of these to a probability model of amino acid change which is used to make a transition probability matrix which allows prediction of the probability of changing from any one amino acid to any other, and also predicts equilibrium amino acid composition.

The default method is that of Jones, Taylor, and Thornton (1992). This is similar to the Dayhoff PAM model, except that it is based on a recounting of the number of observed changes in amino acids, using a much larger sample of protein sequences than did Dayhoff. Because its sample is so much larger this model is to be preferred over the original Dayhoff PAM model. The PMB model was recently derived from the Blocks database of conserved protein motifs, and is described in a paper by Veerassamy, Smith and Tillier (2003). The Dayhoff model uses the DCMut version (Kosiol and Goldman, 2005) of Margaret Dayhoff's PAM matrix.

The R (Hidden Markov Model rates) option allows the user to approximate a Gamma distribution of rates among positions, or a Gamma distribution plus a class of invariant positions, or to specify how many categories of substitution rates there will be in a Hidden Markov Model of rate variation, and what are the rates and probabilities for each. By repeatedly selecting the R option one toggles among no rate variation, the Gamma, Gamma+I, and general HMM possibilities.

If you choose Gamma or Gamma+I the program will ask how many rate categories you want. If you have chosen Gamma+I, keep in mind that one rate category will be set aside for the invariant class and only the remaining ones used to approximate the Gamma distribution. For the approximation we do not use the quantile method of Yang (1995) but instead use a quadrature method using generalized Laguerre polynomials. This should give a good approximation to the Gamma distribution with as few as 5 or 6 categories.

In the Gamma and Gamma+I cases, the user will be asked to supply the coefficient of variation of the rate of substitution among positions. This is different from the parameters used by Nei and Jin (1990) but related to them: their parameter a is also known as "alpha", the shape parameter of the Gamma distribution. It is related to the coefficient of variation by

     CV = 1 / a^{1/2}

or

     a = 1 / (CV)^2

(their parameter b is absorbed here by the requirement that time is scaled so that the mean rate of evolution is 1 per unit time, which means that a = b). As we consider cases in which the rates are less variable we should set a larger and larger, as CV gets smaller and smaller.
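
The conversion between the coefficient of variation and the shape parameter is simple enough to check by hand, but a small Python sketch may help when preparing runs. The function names are hypothetical and are not part of the program.

```python
import math

def alpha_from_cv(cv):
    """Shape parameter a ("alpha") of the Gamma distribution for a given CV."""
    return 1.0 / cv ** 2

def cv_from_alpha(alpha):
    """Coefficient of variation of rates for a given shape parameter a."""
    return 1.0 / math.sqrt(alpha)

if __name__ == "__main__":
    # Less variable rates (smaller CV) correspond to larger a.
    for cv in (2.0, 1.0, 0.5, 0.25):
        print(cv, alpha_from_cv(cv))
```

A coefficient of variation of 1.0 corresponds to a = 1, which is the case used in the example output later in this document.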

If the user instead chooses the general Hidden Markov Model option, they are first asked how many HMM rate categories there will be (for the moment there is an upper limit of 9, which should not be restrictive). Then the program asks for the rates for each category. These rates are only meaningful relative to each other, so that rates 1.0, 2.0, and 2.4 have the exact same effect as rates 2.0, 4.0, and 4.8. Note that an HMM rate category can have rate of change 0, so that this allows us to take into account that there may be a category of amino acid positions that are invariant. Note that the run time of the program will be proportional to the number of HMM rate categories: twice as many categories means twice as long a run. Finally the program will ask for the probabilities of a random amino acid position falling into each of these regional rate categories. These probabilities must be nonnegative and sum to 1. Default for the program is one category, with rate 1.0 and probability 1.0 (actually the rate does not matter in that case).

If more than one HMM rate category is specified, then another option, A, becomes visible in the menu. This allows us to specify that we want to assume that positions that have the same HMM rate category are expected to be clustered so that there is autocorrelation of rates. The program asks for the value of the average patch length. This is an expected length of patches that have the same rate. If it is 1, the rates of successive positions will be independent. If it is, say, 10.25, then the chance of change to a new rate will be 1/10.25 after every position. However the "new rate" is randomly drawn from the mix of rates, and hence could even be the same. So the actual observed length of patches with the same rate will be a bit larger than 10.25. Note below that if you choose multiple patches, there will be an estimate in the output file as to which combination of rate categories contributed most to the likelihood.
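
The remark that observed patches are somewhat longer than the specified value can be made concrete. Under the scheme as described, a position keeps its rate either because no redraw happens (probability 1 - 1/L, where L is the patch length parameter) or because a redraw happens to pick the same rate again (probability p/L for a rate with prior probability p). Runs of that rate are then geometrically distributed with mean L / (1 - p). The Python sketch below computes this; it is a derivation from the description above rather than anything printed by the program, and the function name is hypothetical.

```python
def expected_observed_patch_length(patch_length, prob):
    """Mean run length of one rate whose prior probability is prob (prob < 1)."""
    lam = 1.0 / patch_length               # chance of redrawing the rate after a position
    stay = (1.0 - lam) + lam * prob        # no redraw, or a redraw of the same rate
    return 1.0 / (1.0 - stay)              # mean of a geometric run length

if __name__ == "__main__":
    # Patch length from the text, with hypothetical prior probabilities of the rates.
    for p in (0.4, 0.3, 0.3):
        print(p, expected_observed_patch_length(10.25, p))
```

With a patch length of 1 this reduces to 1 / (1 - p), the mean run length when successive positions are independent.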

Note that the autocorrelation scheme we use is somewhat different from Yang's (1995) autocorrelated Gamma distribution. I am unsure whether this difference is of any importance -- our scheme is chosen for the ease with which it can be implemented.

The C option allows user-defined rate categories. The user is prompted for the number of user-defined rates, and for the rates themselves, which cannot be negative but can be zero. These numbers are defined relative to each other, so that if rates for three categories are set to 1 : 3 : 2.5 this would have the same meaning as setting them to 2 : 6 : 5. The assignment of rates to amino acid positions is then made by reading a file whose default name is "categories". It should contain a string of digits 1 through 9. A new line or a blank can occur after any character in this string. Thus the categories file might look like this:

122231111122411155
1155333333444
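
A minimal Python sketch of how such a file could be read and turned into per-position relative rates follows. It is an illustration, not the program's own reader: the function names and the five rates are hypothetical, and the check that there is exactly one digit per position is an assumption about how the file lines up with the data.

```python
def read_categories(text, n_sites):
    """Return one category number (1 to 9) per position, ignoring blanks and newlines."""
    categories = []
    for ch in text:
        if ch.isspace():
            continue
        if ch not in "123456789":
            raise ValueError("category must be a digit 1 through 9, got " + repr(ch))
        categories.append(int(ch))
    if len(categories) != n_sites:  # assumed: one digit per position
        raise ValueError("expected %d categories, found %d" % (n_sites, len(categories)))
    return categories

def rates_for_sites(categories, rates):
    """Look up the relative rate of each position from its category number."""
    return [rates[c - 1] for c in categories]

CATEGORIES_FILE = "122231111122411155\n1155333333444\n"
RATES = [1.0, 3.0, 2.5, 0.2, 0.0]  # hypothetical relative rates for categories 1 to 5

if __name__ == "__main__":
    cats = read_categories(CATEGORIES_FILE, 31)
    print(rates_for_sites(cats, RATES))
```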

With the current options R, A, and C the program has a good ability to infer different rates at different positions and estimate phylogenies under a more realistic model. Note that Likelihood Ratio Tests can be used to test whether one combination of rates is significantly better than another, provided one rate scheme represents a restriction of another with fewer parameters. The number of parameters needed for rate variation is the number of regional rate categories, plus the number of user-defined rate categories less 2, plus one if the regional rate categories have a nonzero autocorrelation.
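
The parameter count and the likelihood ratio statistic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the two log likelihoods are made-up values rather than program output, and the usual chi-square approximation is assumed to apply to the nested pair being compared.

```python
def rate_parameter_count(n_hmm_categories, n_user_categories, autocorrelated):
    """Number of rate-variation parameters, following the rule stated in the text."""
    return n_hmm_categories + n_user_categories - 2 + (1 if autocorrelated else 0)

def lrt_statistic(lnl_restricted, lnl_general):
    """Twice the difference in log likelihood between two nested models."""
    return 2.0 * (lnl_general - lnl_restricted)

if __name__ == "__main__":
    # Hypothetical comparison: one versus two user-defined categories,
    # both runs with three HMM rate categories and no autocorrelation.
    df = rate_parameter_count(3, 2, False) - rate_parameter_count(3, 1, False)
    stat = lrt_statistic(-107.10, -104.53)  # made-up log likelihoods
    critical_1df_5pct = 3.841               # chi-square, 1 degree of freedom, P = 0.05
    print(df, stat, stat > critical_1df_5pct)
```

The degrees of freedom for the test are the difference in parameter counts between the two rate schemes.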

The G (global search) option causes, after the last species is added to the tree, each possible group to be removed and re-added. This improves the result, since the position of every species is reconsidered. It approximately triples the run-time of the program.

The User tree (option U) is read from a file whose default name is intree. The trees can be multifurcating. They must be preceded in the file by a line giving the number of trees in the file.

If the U (user tree) option is chosen another option appears in the menu, the L option. If it is selected, it signals the program that it should take any branch lengths that are in the user tree and simply evaluate the likelihood of that tree, without further altering those branch lengths. This means that if some branches have lengths and others do not, the program will estimate the lengths of those that do not have lengths given in the user tree. Note that the program Retree can be used to add and remove lengths from a tree.

The U option can read a multifurcating tree. This allows us to test the hypothesis that a certain branch has zero length (we can also do this by using Retree to set the length of that branch to 0.0 when it is present in the tree). By doing a series of runs with different specified lengths for a branch we can plot a likelihood curve for its branch length while allowing all other branches to adjust their lengths to it. If all branches have lengths specified, none of them will be iterated. This is useful to allow a tree produced by another method to have its likelihood evaluated. The L option has no effect and does not appear in the menu if the U option is not used.

The W (Weights) option is invoked in the usual way, with only weights 0 and 1 allowed. It selects a set of positions to be analyzed, ignoring the others. The positions selected are those with weight 1. If the W option is not invoked, all positions are analyzed. The Weights (W) option takes the weights from a file whose default name is "weights". The weights follow the format described in the main documentation file.

The M (multiple data sets) option will ask you whether you want to use multiple sets of weights (from the weights file) or multiple data sets from the input file. The ability to use a single data set with multiple weights means that much less disk space will be used for this input data. The bootstrapping and jackknifing tool Seqboot has the ability to create a weights file with multiple weights. Note also that when we use multiple weights for bootstrapping we can also then maintain different rate categories for different sites in a meaningful way. If you use the multiple data sets option rather than multiple weights, you should not at the same time use the user-defined rate categories option (option C), because the user-defined rate categories could then be associated with the wrong sites. This is not a concern when the M option is used with multiple sets of weights.

The algorithm used for searching among trees uses a technique invented by David Swofford and J. S. Rogers. This involves not iterating most branch lengths on most trees when searching among tree topologies. This is of necessity a "quick-and-dirty" search but it saves much time. There is a menu option (option S) which can turn off this search and revert to the earlier search method which iterated branch lengths in all topologies. This will be substantially slower but will also be a bit more likely to find the tree topology of highest likelihood. If the Swofford/Rogers search finds the best tree topology, the branch lengths inferred will be almost precisely the same as they would be with the more thorough search, as the maximization of likelihood with respect to branch lengths for the final tree is not different in the two kinds of search.

OUTPUT FORMAT

The output starts by giving the number of species and the number of amino acid positions.

If the R (HMM rates) option is used a table of the relative rates of expected substitution at each category of positions is printed, as well as the probabilities of each of those rates.

There then follow the data sequences, if the user has selected the menu option to print them, with the sequences printed in groups of ten amino acids. The trees found are printed as an unrooted tree topology (possibly rooted by outgroup if so requested). The internal nodes are numbered arbitrarily for the sake of identification. The number of trees evaluated so far and the log likelihood of the tree are also given. Note that the trees printed out have a trifurcation at the base. The branch lengths in the diagram are roughly proportional to the estimated branch lengths, except that very short branches are printed out at least three characters in length so that the connections can be seen. The unit of branch length is the expected fraction of amino acids changed (so that 1.0 is 100 PAMs).

A table is printed showing the length of each tree segment (in units of expected amino acid substitutions per position), as well as (very) rough confidence limits on their lengths. If a confidence limit is negative, this indicates that rearrangement of the tree in that region is not excluded, while if both limits are positive, rearrangement is still not necessarily excluded because the variance calculation on which the confidence limits are based results in an underestimate, which makes the confidence limits too narrow.

In addition to the confidence limits, the program performs a crude Likelihood Ratio Test (LRT) for each branch of the tree. The program computes the ratio of likelihoods with and without this branch length forced to zero length. This is done by comparing the likelihoods changing only that branch length. A truly correct LRT would force that branch length to zero and also allow the other branch lengths to adjust to that. The result would be a likelihood ratio closer to 1. Therefore the present LRT will err on the side of being too significant. YOU ARE WARNED AGAINST TAKING IT TOO SERIOUSLY. If you want to get a better likelihood curve for a branch length you can do multiple runs with different prespecified lengths for that branch, as discussed above in the discussion of the L option.

One should also realize that if you are looking not at a previously-chosen branch but at all branches, you are seeing the results of multiple tests. With 20 tests, one is expected to reach significance at the P = .05 level purely by chance. You should therefore use a much more conservative significance level, such as .05 divided by the number of tests. The significance of these tests is shown by printing asterisks next to the confidence interval on each branch length. It is important to keep in mind that both the confidence limits and the tests are very rough and approximate, and probably indicate more significance than they should. Nevertheless, maximum likelihood is one of the few methods that can give you any indication of its own error; most other methods simply fail to warn the user that there is any error! (In fact, whole philosophical schools of taxonomists exist whose main point seems to be that there isn't any error, that the "most parsimonious" tree is the best tree by definition and that's that).
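
The arithmetic behind this advice is short. The Python sketch below, with hypothetical function names, gives the expected number of tests that reach a nominal level purely by chance and the more conservative per-test level obtained by dividing by the number of tests.

```python
def expected_false_positives(n_tests, level=0.05):
    """Expected number of tests significant by chance when no effect is real."""
    return n_tests * level

def corrected_level(n_tests, level=0.05):
    """Per-test significance level: the nominal level divided by the number of tests."""
    return level / n_tests

if __name__ == "__main__":
    print(expected_false_positives(20))  # the 20-test case mentioned in the text
    print(corrected_level(7))            # e.g. the 7 branches of a 5-species unrooted tree
```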

The log likelihood printed out with the final tree can be used to perform various likelihood ratio tests. One can, for example, compare runs with different values of the relative rate of change in the active site and in the rest of the protein to determine which value is the maximum likelihood estimate, and what is the allowable range of values (using a likelihood ratio test, which you will find described in mathematical statistics books). One could also estimate the base frequencies in the same way. Both of these, particularly the latter, require multiple runs of the program to evaluate different possible values, and this might get expensive.

If the U (User Tree) option is used and more than one tree is supplied, and the program is not told to assume autocorrelation between the rates at different amino acid positions, the program also performs a statistical test of each of these trees against the one with highest likelihood. If there are two user trees, the test done is one which is due to Kishino and Hasegawa (1989), a version of a test originally introduced by Templeton (1983). In this implementation it uses the mean and variance of log-likelihood differences between trees, taken across amino acid positions. If the two trees' means are more than 1.96 standard deviations different then the trees are declared significantly different. This use of the empirical variance of log-likelihood differences is more robust and nonparametric than the classical likelihood ratio test, and may to some extent compensate for any lack of realism in the model underlying this program.
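
The two-tree comparison can be sketched in Python as follows. This is a generic illustration of a test based on the empirical variance of per-position log-likelihood differences, not a transcription of the program's code: the function name is hypothetical, the per-position log likelihoods are made-up values, and the unbiased (n - 1) variance estimate is an assumption.

```python
import math

def kh_test(site_lnl_a, site_lnl_b, critical=1.96):
    """Compare two trees from their per-position log likelihoods.

    Returns (total difference, its estimated standard deviation, significant?).
    """
    n = len(site_lnl_a)
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    total = sum(diffs)
    mean = total / n
    # Empirical variance of the per-position differences (unbiased estimate).
    var_site = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    # The total is a sum of n terms, so its variance is n times larger.
    sd_total = math.sqrt(n * var_site)
    return total, sd_total, abs(total) > critical * sd_total

# Hypothetical per-position log likelihoods for two user trees.
TREE_A = [-3.1, -2.0, -4.2, -1.5, -2.8, -3.3]
TREE_B = [-3.4, -2.1, -4.9, -1.6, -3.5, -3.6]

if __name__ == "__main__":
    print(kh_test(TREE_A, TREE_B))
```

Real data sets have far more positions than this toy example; the normal approximation behind the 1.96 cutoff is not trustworthy with only a handful.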

If there are more than two trees, the test done is an extension of the KHT test, due to Shimodaira and Hasegawa (1999). They pointed out that a correction for the number of trees was necessary, and they introduced a resampling method to make this correction. In the version used here the variances and covariances of the sum of log likelihoods across amino acid positions are computed for all pairs of trees. To test whether the difference between each tree and the best one is larger than could have been expected if they all had the same expected log-likelihood, log-likelihoods for all trees are sampled with these covariances and equal means (Shimodaira and Hasegawa's "least favorable hypothesis"), and a P value is computed from the fraction of times the difference between the tree's value and the highest log-likelihood exceeds that actually observed. Note that this sampling needs random numbers, and so the program will prompt the user for a random number seed if one has not already been supplied. With the two-tree KHT test no random numbers are used.

In either the KHT or the SH test the program prints out a table of the log-likelihoods of each tree, the differences of each from the highest one, the variance of that quantity as determined by the log-likelihood differences at individual sites, and a conclusion as to whether that tree is or is not significantly worse than the best one. However the test is not available if we assume that there is autocorrelation of rates at neighboring positions (option A) and is not done in those cases.

The branch lengths printed out are scaled in terms of 100 times the expected numbers of amino acid substitutions, scaled so that the average rate of change, averaged over all the positions analyzed, is set to 100.0, if there are multiple categories of positions. This means that whether or not there are multiple categories of positions, the expected percentage of change for very small branches is equal to the branch length. Of course, when a branch is twice as long this does not mean that there will be twice as much net change expected along it, since some of the changes occur in the same position and overlie or even reverse each other. That means that a branch of length 26 is 26 times as long as one which would show a 1% difference between the amino acid sequences at the beginning and end of the branch, but we would not expect the sequences at the beginning and end of the branch to be 26% different, as there would be some overlaying of changes.

Confidence limits on the branch lengths are also given. Of course a negative value of the branch length is meaningless, and a confidence limit overlapping zero simply means that the branch length is not necessarily significantly different from zero. Because of limitations of the numerical algorithm, branch length estimates of zero will often print out as small numbers such as 0.00001. If you see a branch length that small, it is really estimated to be of zero length.

Another possible source of confusion is the existence of negative values for the log likelihood. This is not really a problem; the log likelihood is not a probability but the logarithm of a probability. When it is negative it simply means that the corresponding probability is less than one (since we are seeing its logarithm). The log likelihood is maximized by being made more positive: -30.23 is worse than -29.14.

At the end of the output, if the R option is in effect with multiple HMM rates, the program will print a list of what amino acid position categories contributed the most to the final likelihood. This combination of HMM rate categories need not have contributed a majority of the likelihood, just a plurality. Still, it will be helpful as a view of where the program infers that the higher and lower rates are. Note that the use in this calculation of the prior probabilities of different rates, and the average patch length, gives this inference a "smoothed" appearance: some other combination of rates might make a greater contribution to the likelihood, but be discounted because it conflicts with this prior information. See the example output below to see what this printout of rate categories looks like. A second list will also be printed out, showing for each position which rate accounted for the highest fraction of the likelihood. If the fraction of the likelihood accounted for is less than 95%, a dot is printed instead.

Option 3 in the menu controls whether the tree is printed out into the output file. This is on by default, and usually you will want to leave it this way. However for runs with multiple data sets such as bootstrapping runs, you will primarily be interested in the trees which are written onto the output tree file, rather than the trees printed on the output file. To keep the output file from becoming too large, it may be wisest to use option 3 to prevent trees being printed onto the output file.

Option 4 in the menu controls whether the tree estimated by the program is written onto a tree file. The default name of this output tree file is "outtree". If the U option is in effect, all the user-defined trees are written to the output tree file.

Option 5 in the menu controls whether ancestral states are estimated at each node in the tree. If it is in effect, a table of ancestral sequences is printed out (including the sequences in the tip species which are the input sequences). The symbol printed out is for the amino acid which accounts for the largest fraction of the likelihood at that position. In that table, if a position has an amino acid which accounts for more than 95% of the likelihood, its symbol is printed in capital letters (W rather than w). One limitation of the current version of the program is that when there are multiple HMM rates (option R) the reconstructed amino acids are based on only the single assignment of rates to positions which accounts for the largest amount of the likelihood. Thus the assessment of 95% of the likelihood, in tabulating the ancestral states, refers to 95% of the likelihood that is accounted for by that particular combination of rates.

PROGRAM CONSTANTS

The constants defined at the beginning of the program include "maxtrees", the maximum number of user trees that can be processed. It is small (100) at present to save some further memory but the cost of increasing it is not very great. Other constants include "maxcategories", the maximum number of position categories, "namelength", the length of species names in characters, and three others, "smoothings", "iterations", and "epsilon", that help "tune" the algorithm and define the compromise between execution speed and the quality of the branch lengths found by iteratively maximizing the likelihood. Reducing iterations and smoothings, and increasing epsilon, will result in faster execution but a worse result. These values will not usually have to be changed.

The program spends most of its time doing real arithmetic. The algorithm, with separate and independent computations occurring for each pattern, lends itself readily to parallel processing.

PAST AND FUTURE OF THE PROGRAM

This program was derived in version 3.6 by Lucas Mix from Dnaml, with which it shares many of its data structures and much of its strategy.


TEST DATA SET

(Note that although these may look like DNA sequences, they are being treated as protein sequences consisting entirely of alanine, cysteine, glycine, and threonine).

   5   13
Alpha     AACGTGGCCAAAT
Beta      AAGGTCGCCAAAC
Gamma     CATTTCGTCACAA
Delta     GGTATTTCGGCCT
Epsilon   GGGATCTCGGCCC


CONTENTS OF OUTPUT FILE (with all numerical options on)

(It was run with HMM rates having gamma-distributed rates approximated by 5 rate categories, with coefficient of variation of rates 1.0, and with patch length parameter = 1.5. Two user-defined rate categories were used, one for the first 6 positions, the other for the last 7, with rates 1.0 : 2.0. Weights were used, with sites 1 and 13 given weight 0, and all others weight 1.)


Amino acid sequence Maximum Likelihood method, version 3.69

 5 species,  13  sites

    Site categories are:

             1111112222 222


    Sites are weighted as follows:

             01111 11111 110

Jones-Taylor-Thornton model of amino acid change


Name            Sequences
----            ---------

Alpha        AACGTGGCCA AAT
Beta         ..G..C.... ..C
Gamma        C.TT.C.T.. C.A
Delta        GGTA.TT.GG CC.
Epsilon      GGGA.CT.GG CCC



Discrete approximation to gamma distributed rates
 Coefficient of variation of rates = 1.000000  (alpha = 1.000000)

States in HMM   Rate of change    Probability

        1           0.264            0.522
        2           1.413            0.399
        3           3.596            0.076
        4           7.086            0.0036
        5          12.641            0.000023

Expected length of a patch of sites having the same rate =    1.500


Site category   Rate of change

        1           1.000
        2           2.000



  +Beta      
  |  
  |                                             +Epsilon   
  |       +-------------------------------------3  
  1-------2                                     +--------Delta     
  |       |  
  |       +----------Gamma     
  |  
  +------Alpha     


remember: this is an unrooted tree!

Ln Likelihood =  -104.53314

 Between        And            Length      Approx. Confidence Limits
 -------        ---            ------      ------- ---------- ------

     1          Alpha             0.46548     (     zero,     1.16234) **
     1          Beta              0.00010     (     zero,     0.56371)
     1             2              0.53585     (     zero,     1.53611) *
     2             3              2.52202     (     zero,     5.51952) **
     3          Epsilon           0.00010     (     zero,     0.70102)
     3          Delta             0.56179     (     zero,     1.37921) **
     2          Gamma             0.72465     (     zero,     1.87900) **

     *  = significantly positive, P < 0.05
     ** = significantly positive, P < 0.01

Combination of categories that contributes the most to the likelihood:

             1122111111 111

Most probable category at each site if > 0.95 probability ("." otherwise)

             ....1....1 1..

Probable sequences at interior nodes:

  node       Reconstructed sequence (caps if > 0.95)

    1        .AGGTCGCCA AA.
 Beta        AAGGTCGCCA AAC
    2        .AggTCGCCA CA.
    3        .GGATCTCGG CC.
 Epsilon     GGGATCTCGG CCC
 Delta       GGTATTTCGG CCT
 Gamma       CATTTCGTCA CAA
 Alpha       AACGTGGCCA AAT

promlk

version 3.696

Promlk -- Protein maximum likelihood program
with molecular clock

© Copyright 2000-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program implements the maximum likelihood method for protein amino acid sequences under the constraint that the trees estimated must be consistent with a molecular clock. The molecular clock is the assumption that the tips of the tree are all equidistant, in branch length, from its root. This program is indirectly related to Proml. It uses the Dayhoff probability model of change between amino acids. Its algorithmic details are not yet published, but many of them are similar to Dnamlk.

The assumptions of the model are:

  1. Each position in the sequence evolves independently.
  2. Different lineages evolve independently.
  3. Each position undergoes substitution at an expected rate which is chosen from a series of rates (each with a probability of occurrence) which we specify.
  4. All relevant positions are included in the sequence, not just those that have changed or those that are "phylogenetically informative".
  5. The probabilities of change between amino acids are given by the model of Jones, Taylor, and Thornton (1992), the PMB model of Veerassamy, Smith and Tillier (2003), or the DCMut version (Kosiol and Goldman, 2005) of the PAM model of Dayhoff (Dayhoff and Eck, 1968; Dayhoff et al., 1979).

Note the assumption that we are looking at all positions, including those that have not changed at all. It is important not to restrict attention to some positions based on whether or not they have changed; doing that would bias branch lengths by making them too long, and that in turn would cause the method to misinterpret the meaning of those positions that had changed.

This program uses a Hidden Markov Model (HMM) method of inferring different rates of evolution at different amino acid positions. This was described in a paper by me and Gary Churchill (1996). It allows us to specify to the program that there will be a number of different possible evolutionary rates, what the prior probability of occurrence of each is, and what the average length of a patch of positions all having the same rate is. The rates can also be chosen by the program to approximate a Gamma distribution of rates, or a Gamma distribution plus a class of invariant positions. The program computes the likelihood by summing it over all possible assignments of rates to positions, weighting each by its prior probability of occurrence.

For example, if we have used the C and A options (described below) to specify that there are three possible rates of evolution, 1.0, 2.4, and 0.0, that the prior probabilities of a position having these rates are 0.4, 0.3, and 0.3, and that the average patch length (number of consecutive positions with the same rate) is 2.0, the program will sum the likelihood over all possibilities, but giving less weight to those that (say) assign all positions to rate 2.4, or that fail to have consecutive positions that have the same rate.
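The sum over all assignments of rates to positions does not have to be enumerated one assignment at a time. The following Python sketch shows the standard HMM forward recursion and checks it against brute-force enumeration. It is only an illustration under stated assumptions, not the program's own code: the per-position likelihoods in site_lik are made-up numbers, the function names are hypothetical, and the transition rule (keep the current rate with probability 1 - 1/L, otherwise redraw a rate from the prior probabilities) is the one described for the A option in this document.

```python
from itertools import product


def transition(priors, patch_length):
    """P[i][j]: stay put with prob 1 - 1/L, otherwise redraw from the priors."""
    lam = 1.0 / patch_length
    n = len(priors)
    return [[(1.0 - lam) * (i == j) + lam * priors[j] for j in range(n)]
            for i in range(n)]


def hmm_likelihood(site_lik, priors, patch_length):
    """Forward recursion: sums over all rate assignments in O(sites * rates^2)."""
    trans = transition(priors, patch_length)
    n = len(priors)
    fwd = [priors[k] * site_lik[0][k] for k in range(n)]
    for lik in site_lik[1:]:
        fwd = [lik[j] * sum(fwd[i] * trans[i][j] for i in range(n))
               for j in range(n)]
    return sum(fwd)


def brute_force(site_lik, priors, patch_length):
    """Same quantity by explicit enumeration of every assignment (slow)."""
    trans = transition(priors, patch_length)
    total = 0.0
    for path in product(range(len(priors)), repeat=len(site_lik)):
        p = priors[path[0]] * site_lik[0][path[0]]
        for s in range(1, len(path)):
            p *= trans[path[s - 1]][path[s]] * site_lik[s][path[s]]
        total += p
    return total


# Hypothetical inputs: three rates with the priors from the example above,
# patch length 2.0, and invented likelihoods for four positions.
priors = [0.4, 0.3, 0.3]
site_lik = [[0.02, 0.05, 0.001],
            [0.03, 0.01, 0.2],
            [0.01, 0.04, 0.0],
            [0.05, 0.02, 0.1]]
fast = hmm_likelihood(site_lik, priors, 2.0)
slow = brute_force(site_lik, priors, 2.0)
```

The two values agree up to rounding, but the forward recursion grows linearly with the number of positions while the enumeration grows exponentially.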

The Hidden Markov Model framework for rate variation among positions was independently developed by Yang (1993, 1994, 1995). We have implemented a general scheme for a Hidden Markov Model of rates; we allow the rates and their prior probabilities to be specified arbitrarily by the user, or by a discrete approximation to a Gamma distribution of rates (Yang, 1995), or by a mixture of a Gamma distribution and a class of invariant positions.

This feature effectively removes the artificial assumption that all positions have the same rate, and also means that we need not know in advance the identities of the positions that have a particular rate of evolution.

Another layer of rate variation also is available. The user can assign categories of rates to each position (for example, we might want amino acid positions in the active site of a protein to change more slowly than other positions). This is done with the categories input file and the C option. We then specify (using the menu) the relative rates of evolution of amino acid positions in the different categories. For example, we might specify that positions in the active site evolve at relative rates of 0.2 compared to 1.0 at other positions. If we are assuming that a particular position maintains a cysteine bridge to another, we may want to put it in a category of positions (including perhaps the initial position of the protein sequence which maintains methionine) which changes at a rate of 0.0.

If both user-assigned rate categories and Hidden Markov Model rates are allowed, the program assumes that the actual rate at a position is the product of the user-assigned category rate and the Hidden Markov Model regional rate. (This may not always make perfect biological sense: it would be more natural to assume some upper bound to the rate, as we have discussed in the Felsenstein and Churchill paper). Nevertheless you may want to use both types of rate variation.

INPUT FORMAT AND OPTIONS

Subject to these assumptions, the program is a correct maximum likelihood method. The input is fairly standard, with one addition. As usual the first line of the file gives the number of species and the number of amino acid positions.

Next come the species data. Each sequence starts on a new line, has a ten-character species name that must be blank-filled to be of that length, followed immediately by the species data in the one-letter amino acid code. The sequences must either be in the "interleaved" or "sequential" formats described in the Molecular Sequence Programs document. The I option selects between them. The sequences can have internal blanks in the sequence but there must be no extra blanks at the end of the terminated line. Note that a blank is not a valid symbol for a deletion.
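As a rough illustration of the layout just described, here is a small Python sketch that reads the "sequential" format. It makes simplifying assumptions that the real programs do not: there are no blank lines, no options, and the function name read_sequential is hypothetical. It uses the test data set given later in this document.

```python
def read_sequential(text):
    """Read a sequential-format file: header line, then one species at a time."""
    lines = text.splitlines()
    n_species, n_sites = map(int, lines[0].split())
    seqs = {}
    i = 1
    for _ in range(n_species):
        name = lines[i][:10].strip()          # ten-character, blank-filled name
        seq = lines[i][10:].replace(" ", "")  # internal blanks are ignored
        i += 1
        while len(seq) < n_sites:             # sequence may continue on later lines
            seq += lines[i].replace(" ", "")
            i += 1
        if len(seq) != n_sites:
            raise ValueError("wrong number of positions for " + name)
        seqs[name] = seq
    return n_species, n_sites, seqs


test_data = """   5   13
Alpha     AACGTGGCCAAAT
Beta      AAGGTCGCCAAAC
Gamma     CATTTCGTCACAA
Delta     GGTATTTCGGCCT
Epsilon   GGGATCTCGGCCC
"""
n_species, n_sites, seqs = read_sequential(test_data)
```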

The options are selected using an interactive menu. The menu looks like this:


Amino acid sequence
   Maximum Likelihood method with molecular clock, version 3.69

Settings for this run:
  U                 Search for best tree?  Yes
  P    JTT, PMB or PAM probability model?  Jones-Taylor-Thornton
  C   One category of substitution rates?  Yes
  R           Rate variation among sites?  constant rate of change
  G                Global rearrangements?  No
  W                       Sites weighted?  No
  J   Randomize input order of sequences?  No. Use input order
  M           Analyze multiple data sets?  No
  I          Input sequences interleaved?  Yes
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4       Write out trees onto tree file?  Yes
  5   Reconstruct hypothetical sequences?  No

Are these settings correct? (type Y or the letter for one to change)

The user either types "Y" (followed, of course, by a carriage-return) if the settings shown are to be accepted, or the letter or digit corresponding to an option that is to be changed.

The options U, W, J, M, and 0 are the usual ones. They are described in the main documentation file of this package. Option I is the same as in other molecular sequence programs and is described in the documentation file for the sequence programs.

The P option toggles between three models of amino acid change. One is the Jones-Taylor-Thornton model, another the PMB (Probability Matrix from Blocks) model of Veerassamy, Smith and Tillier (2003), another the DCMut model (Kosiol and Goldman, 2005) based on the Dayhoff PAM matrix model. These are all based on Margaret Dayhoff's (Dayhoff and Eck, 1968; Dayhoff et al., 1979) method of empirical tabulation of changes of amino acid sequences, and conversion of these to a probability model of amino acid change which is used to make a transition probability matrix which allows prediction of the probability of changing from any one amino acid to any other, and also predicts equilibrium amino acid composition.

The R (Hidden Markov Model rates) option allows the user to approximate a Gamma distribution of rates among positions, or a Gamma distribution plus a class of invariant positions, or to specify how many categories of substitution rates there will be in a Hidden Markov Model of rate variation, and what are the rates and probabilities for each. By repeatedly selecting the R option one toggles among no rate variation, the Gamma, Gamma+I, and general HMM possibilities.

If you choose Gamma or Gamma+I the program will ask how many rate categories you want. If you have chosen Gamma+I, keep in mind that one rate category will be set aside for the invariant class and only the remaining ones used to approximate the Gamma distribution. For the approximation we do not use the quantile method of Yang (1995) but instead use a quadrature method using generalized Laguerre polynomials. This should give a good approximation to the Gamma distribution with as few as 5 or 6 categories.

In the Gamma and Gamma+I cases, the user will be asked to supply the coefficient of variation of the rate of substitution among positions. This is different from the parameters used by Nei and Jin (1990) but related to them: their parameter a is also known as "alpha", the shape parameter of the Gamma distribution. It is related to the coefficient of variation by

     CV = 1 / a^{1/2}

or

     a = 1 / (CV)^{2}

(their parameter b is absorbed here by the requirement that time is scaled so that the mean rate of evolution is 1 per unit time, which means that a = b). As we consider cases in which the rates are less variable we should set a larger and larger, as CV gets smaller and smaller.
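The conversion between the coefficient of variation and the shape parameter can be written as two one-line Python functions. The function names are hypothetical; the formulas are just the two equations above.

```python
def alpha_from_cv(cv):
    """Shape parameter a ("alpha") of the Gamma distribution from the CV."""
    if cv <= 0:
        raise ValueError("coefficient of variation must be positive")
    return 1.0 / cv ** 2


def cv_from_alpha(alpha):
    """Coefficient of variation from the shape parameter a."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 1.0 / alpha ** 0.5
```

For instance, a coefficient of variation of 1.0 corresponds to alpha = 1.0, and halving the coefficient of variation to 0.5 raises alpha to 4.0, in line with the remark that less variable rates mean a larger a.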

If the user instead chooses the general Hidden Markov Model option, they are first asked how many HMM rate categories there will be (for the moment there is an upper limit of 9, which should not be restrictive). Then the program asks for the rates for each category. These rates are only meaningful relative to each other, so that rates 1.0, 2.0, and 2.4 have the exact same effect as rates 2.0, 4.0, and 4.8. Note that an HMM rate category can have rate of change 0, so that this allows us to take into account that there may be a category of amino acid positions that are invariant. Note that the run time of the program will be proportional to the number of HMM rate categories: twice as many categories means twice as long a run. Finally the program will ask for the probabilities of a random amino acid position falling into each of these regional rate categories. These probabilities must be nonnegative and sum to 1. Default for the program is one category, with rate 1.0 and probability 1.0 (actually the rate does not matter in that case).
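One way to see why only the relative rates matter is to rescale them so that their average, weighted by the category probabilities, is 1. The Python sketch below does this. It is an assumption made for illustration that the rescaling is done this way; the function name normalize_rates and the probabilities used are hypothetical.

```python
def normalize_rates(rates, probs):
    """Rescale rates so that the probability-weighted mean rate is 1."""
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must be nonnegative and sum to 1")
    if any(r < 0 for r in rates):
        raise ValueError("rates must be nonnegative")
    mean = sum(r * p for r, p in zip(rates, probs))
    if mean <= 0:
        raise ValueError("at least one category with nonzero probability needs a nonzero rate")
    return [r / mean for r in rates]


probs = [0.5, 0.3, 0.2]                          # invented probabilities
a = normalize_rates([1.0, 2.0, 2.4], probs)
b = normalize_rates([2.0, 4.0, 4.8], probs)      # same rates, doubled
```

Both calls give the same rescaled rates, which is the sense in which 1.0, 2.0, 2.4 and 2.0, 4.0, 4.8 are equivalent.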

If more than one HMM rate category is specified, then another option, A, becomes visible in the menu. This allows us to specify that we want to assume that positions that have the same HMM rate category are expected to be clustered so that there is autocorrelation of rates. The program asks for the value of the average patch length. This is an expected length of patches that have the same rate. If it is 1, the rates of successive positions will be independent. If it is, say, 10.25, then the chance of change to a new rate will be 1/10.25 after every position. However the "new rate" is randomly drawn from the mix of rates, and hence could even be the same. So the actual observed length of patches with the same rate will be a bit larger than 10.25. Note below that if you choose multiple patches, there will be an estimate in the output file as to which combination of rate categories contributed most to the likelihood.
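The remark that observed patches are a bit longer than the nominal patch length can be made concrete. If the chance of redrawing the rate after each position is 1/L, and the redrawn rate is the same category with probability p (its prior probability), then the chance of actually leaving that category is (1/L)(1 - p). The Python sketch below computes the resulting expected run length; the function name and the prior probability 0.3 are hypothetical.

```python
def expected_run_length(patch_length, prior):
    """Expected number of consecutive positions spent in one rate category.

    patch_length: the average patch length L entered for the A option.
    prior: prior probability of the category (chance a redraw picks it again).
    """
    if patch_length < 1.0:
        raise ValueError("patch length must be at least 1")
    if not 0.0 <= prior < 1.0:
        raise ValueError("prior must be in [0, 1)")
    leave = (1.0 / patch_length) * (1.0 - prior)  # chance of a real change
    return 1.0 / leave                            # mean of a geometric run


nominal = 10.25
observed = expected_run_length(nominal, 0.3)
```

With these numbers the expected run is L / (1 - p), which is longer than 10.25; it equals L only when the category's prior probability is negligible.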

Note that the autocorrelation scheme we use is somewhat different from Yang's (1995) autocorrelated Gamma distribution. I am unsure whether this difference is of any importance -- our scheme is chosen for the ease with which it can be implemented.

The C option allows user-defined rate categories. The user is prompted for the number of user-defined rates, and for the rates themselves. These numbers, which must be nonnegative (some could be 0), are defined relative to each other, so that if rates for three categories are set to 1 : 3 : 2.5 this would have the same meaning as setting them to 2 : 6 : 5. The assignment of rates to amino acid positions is then made by reading a file whose default name is "categories". It should contain a string of digits 1 through 9. A new line or a blank can occur after any character in this string. Thus the categories file might look like this:

122231111122411155
1155333333444
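A reader for a file of this kind can be sketched in a few lines of Python. This is not the program's own parser; the function name parse_categories and the optional length check are assumptions made for illustration.

```python
def parse_categories(text, n_positions=None):
    """Turn a categories file into a list of category numbers (1 through 9)."""
    cats = []
    for ch in text:
        if ch.isspace():            # blanks and new lines may occur anywhere
            continue
        if ch not in "123456789":
            raise ValueError("unexpected character in categories file: %r" % ch)
        cats.append(int(ch))
    # Assumed check: one category per position when the length is known.
    if n_positions is not None and len(cats) != n_positions:
        raise ValueError("expected %d categories, found %d"
                         % (n_positions, len(cats)))
    return cats


example = "122231111122411155\n1155333333444\n"
categories = parse_categories(example)
```

The example file above assigns categories to 31 positions and uses categories 1 through 5.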

With the current options R, A, and C the program has a good ability to infer different rates at different positions and estimate phylogenies under a more realistic model. Note that Likelihood Ratio Tests can be used to test whether one combination of rates is significantly better than another, provided one rate scheme represents a restriction of another with fewer parameters. The number of parameters needed for rate variation is the number of regional rate categories, plus the number of user-defined rate categories less 2, plus one if the regional rate categories have a nonzero autocorrelation.

The G (global search) option causes, after the last species is added to the tree, each possible group to be removed and re-added. This improves the result, since the position of every species is reconsidered. It approximately triples the run-time of the program.

The User tree (option U) is read from a file whose default name is intree. The trees can be multifurcating. This allows us to test the hypothesis that a given branch has zero length.

If the U (user tree) option is chosen another option appears in the menu, the L option. If it is selected, it signals the program that it should take any branch lengths that are in the user tree and simply evaluate the likelihood of that tree, without further altering those branch lengths. In the case of a clock, if some branches have lengths and others do not, the program does not try to estimate only the missing ones. Instead, if any of the branches lack lengths, the program re-estimates the lengths of all of them. This is done because estimating some and not others is hard in the case of a clock.

The W (Weights) option is invoked in the usual way, with only weights 0 and 1 allowed. It selects a set of positions to be analyzed, ignoring the others. The positions selected are those with weight 1. If the W option is not invoked, all positions are analyzed. The Weights (W) option takes the weights from a file whose default name is "weights". The weights follow the format described in the main documentation file.

The M (multiple data sets) option will ask you whether you want to use multiple sets of weights (from the weights file) or multiple data sets from the input file. The ability to use a single data set with multiple weights means that much less disk space will be used for this input data. The bootstrapping and jackknifing tool Seqboot has the ability to create a weights file with multiple weights. Note also that when we use multiple weights for bootstrapping we can also then maintain different rate categories for different sites in a meaningful way. If you use the multiple data sets option rather than multiple weights, you should not at the same time use the user-defined rate categories option (option C), because the user-defined rate categories could then be associated with the wrong sites. This is not a concern when the M option is used with multiple weights.

The algorithm used for searching among trees is faster than it was in version 3.5, thanks to using a technique invented by David Swofford and J. S. Rogers. This involves not iterating most branch lengths on most trees when searching among tree topologies. This is of necessity a "quick-and-dirty" search but it saves much time.

OUTPUT FORMAT

The output starts by giving the number of species and the number of amino acid positions.

If the R (HMM rates) option is used a table of the relative rates of expected substitution at each category of positions is printed, as well as the probabilities of each of those rates.

There then follow the data sequences, if the user has selected the menu option to print them out, with the sequences printed in groups of ten amino acids. The trees found are printed as a rooted tree topology. The internal nodes are numbered arbitrarily for the sake of identification. The number of trees evaluated so far and the log likelihood of the tree are also given. The branch lengths in the diagram are roughly proportional to the estimated branch lengths, except that very short branches are printed out at least three characters in length so that the connections can be seen. The unit of branch length is the expected fraction of amino acids changed (so that 1.0 is 100 PAMs).

A table is printed showing the length of each tree segment, and the time (in units of expected amino acid substitutions per position) of each fork in the tree, measured from the root of the tree. I have not attempted to include code for approximate confidence limits on branch points, as I have done for branch lengths in Proml, both because of the extreme crudeness of that test, and because the variation of times for different forks would be highly correlated.

The log likelihood printed out with the final tree can be used to perform various likelihood ratio tests. One can, for example, compare runs with different values of the relative rate of change in the active site and in the rest of the protein to determine which value is the maximum likelihood estimate, and what is the allowable range of values (using a likelihood ratio test, which you will find described in mathematical statistics books). One could also estimate the base frequencies in the same way. Both of these, particularly the latter, require multiple runs of the program to evaluate different possible values, and this might get expensive.

This program makes possible a (reasonably) legitimate statistical test of the molecular clock. To do such a test, run Proml and Promlk on the same data. If the trees obtained are of the same topology (when considered as unrooted), it is legitimate to compare their likelihoods by the likelihood ratio test. In Proml the likelihood has been computed by estimating 2n-3 branch lengths, if there are n tips on the tree. In Promlk it has been computed by estimating n-1 branching times (in effect, n-1 branch lengths). The difference in the number of parameters is (2n-3)-(n-1) = n-2. To perform the test take the difference in log likelihoods between the two runs (Proml should be the higher of the two, barring numerical iteration difficulties) and double it. Look this up on a chi-square distribution with n-2 degrees of freedom. If the result is significant, the log likelihood has been significantly increased by allowing all 2n-3 branch lengths to be estimated instead of just n-1, and the molecular clock may be rejected.
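The arithmetic of this test is easy to script. The Python sketch below doubles the difference in log likelihoods and looks it up on a chi-square distribution with n-2 degrees of freedom, using a closed-form tail probability for integer degrees of freedom so that only the standard library is needed. The function names and the two log likelihoods in the example are hypothetical, not output from either program.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability of a chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum_{i < df/2} h^i / i!
        term = math.exp(-h)
        total = term
        for i in range(1, df // 2):
            term *= h / i
            total += term
        return total
    # Odd df: erfc(sqrt(h)) + exp(-h) * sum_i h^(i - 1/2) / Gamma(i + 1/2)
    total = math.erfc(math.sqrt(h))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-h) * h ** (i - 0.5) / math.gamma(i + 0.5)
    return total


def clock_lrt(lnl_no_clock, lnl_clock, n_tips):
    """Likelihood ratio test of the clock: returns (statistic, df, P value)."""
    if n_tips < 3:
        raise ValueError("need at least 3 tips for n-2 >= 1 degrees of freedom")
    stat = 2.0 * (lnl_no_clock - lnl_clock)
    df = n_tips - 2                      # (2n-3) - (n-1)
    return stat, df, chi2_sf(stat, df)


# Invented log likelihoods for a 5-species data set.
stat, df, p = clock_lrt(-100.0, -104.0, 5)
```

A negative statistic (the clock run having the higher likelihood) would point to the numerical iteration difficulties mentioned above rather than to evidence about the clock.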

If the U (User Tree) option is used and more than one tree is supplied, and the program is not told to assume autocorrelation between the rates at different amino acid positions, the program also performs a statistical test of each of these trees against the one with highest likelihood. If there are two user trees, the test done is one which is due to Kishino and Hasegawa (1989), a version of a test originally introduced by Templeton (1983). In this implementation it uses the mean and variance of log-likelihood differences between trees, taken across amino acid positions. If the two trees' means are more than 1.96 standard deviations different then the trees are declared significantly different. This use of the empirical variance of log-likelihood differences is more robust and nonparametric than the classical likelihood ratio test, and may to some extent compensate for any lack of realism in the model underlying this program.
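The two-tree comparison can be sketched as follows in Python. This is a simplified illustration rather than the program's implementation: it assumes that per-position log likelihoods for both trees are available as plain lists, it ignores weights, and the function name kh_test and the numbers are hypothetical.

```python
import math


def kh_test(site_lnl_a, site_lnl_b, z_crit=1.96):
    """Compare two trees from their per-position log likelihoods.

    Returns (total difference, its estimated standard deviation, significant?).
    """
    if len(site_lnl_a) != len(site_lnl_b) or len(site_lnl_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 positions")
    n = len(site_lnl_a)
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    total = sum(diffs)
    mean = total / n
    # Variance of the summed difference, from the spread across positions.
    var_sum = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    sd = math.sqrt(var_sum)
    return total, sd, abs(total) > z_crit * sd


# Invented per-position log likelihoods for two trees.
tree_a = [-1.0, -3.0, -1.0, -3.0]
tree_b = [-2.0, -2.0, -2.0, -2.0]
total, sd, significant = kh_test(tree_a, tree_b)
```

In this invented example the differences cancel across positions, so the trees are not declared significantly different.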

If there are more than two trees, the test done is an extension of the KHT test, due to Shimodaira and Hasegawa (1999). They pointed out that a correction for the number of trees was necessary, and they introduced a resampling method to make this correction. In the version used here the variances and covariances of the sum of log likelihoods across amino acid positions are computed for all pairs of trees. To test whether the difference between each tree and the best one is larger than could have been expected if they all had the same expected log-likelihood, log-likelihoods for all trees are sampled with these covariances and equal means (Shimodaira and Hasegawa's "least favorable hypothesis"), and a P value is computed from the fraction of times the difference between the tree's value and the highest log-likelihood exceeds that actually observed. Note that this sampling needs random numbers, and so the program will prompt the user for a random number seed if one has not already been supplied. With the two-tree KHT test no random numbers are used.

In either the KHT or the SH test the program prints out a table of the log-likelihoods of each tree, the differences of each from the highest one, the variance of that quantity as determined by the log-likelihood differences at individual sites, and a conclusion as to whether that tree is or is not significantly worse than the best one. However the test is not available if we assume that there is autocorrelation of rates at neighboring positions (option A) and is not done in those cases.

The branch lengths printed out are scaled in terms of 100 times the expected numbers of amino acid substitutions, scaled so that the average rate of change, averaged over all the positions analyzed, is set to 100.0, if there are multiple categories of positions. This means that whether or not there are multiple categories of positions, the expected percentage of change for very small branches is equal to the branch length. Of course, when a branch is twice as long this does not mean that there will be twice as much net change expected along it, since some of the changes occur in the same position and overlie or even reverse each other. The branch lengths are in terms of the expected underlying numbers of changes. That means that a branch of length 26 is 26 times as long as one which would show a 1% difference between the amino acid sequences at the beginning and end of the branch, but we would not expect the sequences at the beginning and end of the branch to be 26% different, as there would be some overlaying of changes.

Because of limitations of the numerical algorithm, branch length estimates of zero will often print out as small numbers such as 0.00001. If you see a branch length that small, it is really estimated to be of zero length.

Another possible source of confusion is the existence of negative values for the log likelihood. This is not really a problem; the log likelihood is not a probability but the logarithm of a probability. When it is negative it simply means that the corresponding probability is less than one (since we are seeing its logarithm). The log likelihood is maximized by being made more positive: -30.23 is worse than -29.14.

At the end of the output, if the R option is in effect with multiple HMM rates, the program will print a list of what amino acid position categories contributed the most to the final likelihood. This combination of HMM rate categories need not have contributed a majority of the likelihood, just a plurality. Still, it will be helpful as a view of where the program infers that the higher and lower rates are. Note that the use in this calculation of the prior probabilities of different rates, and the average patch length, gives this inference a "smoothed" appearance: some other combination of rates might make a greater contribution to the likelihood, but be discounted because it conflicts with this prior information. See the example output below to see what this printout of rate categories looks like. A second list will also be printed out, showing for each position which rate accounted for the highest fraction of the likelihood. If the fraction of the likelihood accounted for is less than 95%, a dot is printed instead.
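The second list can be mimicked with a short Python function. It assumes, for illustration only, that the fraction of the likelihood accounted for by each rate category at each position is already available as a list of lists; the function name and the numbers are hypothetical.

```python
def category_line(fractions, threshold=0.95):
    """One character per position: the best category, or "." if not clear-cut."""
    out = []
    for probs in fractions:
        best = max(range(len(probs)), key=probs.__getitem__)
        # Categories are numbered from 1 in the printout.
        out.append(str(best + 1) if probs[best] > threshold else ".")
    return "".join(out)


# Invented fractions for three positions and two rate categories.
line = category_line([[0.97, 0.03], [0.60, 0.40], [0.01, 0.99]])
```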

Option 3 in the menu controls whether the tree is printed out into the output file. This is on by default, and usually you will want to leave it this way. However for runs with multiple data sets such as bootstrapping runs, you will primarily be interested in the trees which are written onto the output tree file, rather than the trees printed on the output file. To keep the output file from becoming too large, it may be wisest to use option 3 to prevent trees being printed onto the output file.

Option 4 in the menu controls whether the tree estimated by the program is written onto a tree file. The default name of this output tree file is "outtree". If the U option is in effect, all the user-defined trees are written to the output tree file.

Option 5 in the menu controls whether ancestral states are estimated at each node in the tree. If it is in effect, a table of ancestral sequences is printed out (including the sequences in the tip species which are the input sequences). The symbol printed out is for the amino acid which accounts for the largest fraction of the likelihood at that position. In that table, if a position has an amino acid which accounts for more than 95% of the likelihood, its symbol is printed in capital letters (W rather than w). One limitation of the current version of the program is that when there are multiple HMM rates (option R) the reconstructed amino acids are based on only the single assignment of rates to positions which accounts for the largest amount of the likelihood. Thus the assessment of 95% of the likelihood, in tabulating the ancestral states, refers to 95% of the likelihood that is accounted for by that particular combination of rates.

PROGRAM CONSTANTS

The constants defined at the beginning of the program include "maxtrees", the maximum number of user trees that can be processed. It is small (100) at present to save some further memory but the cost of increasing it is not very great. Other constants include "maxcategories", the maximum number of position categories, "namelength", the length of species names in characters, and three others, "smoothings", "iterations", and "epsilon", that help "tune" the algorithm and define the compromise between execution speed and the quality of the branch lengths found by iteratively maximizing the likelihood. Reducing iterations and smoothings, and increasing epsilon, will result in faster execution but a worse result. These values will not usually have to be changed.

The program spends most of its time doing real arithmetic. The algorithm, with separate and independent computations occurring for each pattern, lends itself readily to parallel processing.

PAST AND FUTURE OF THE PROGRAM

This program was developed in version 3.6 by Lucas Mix by combining code from Dnamlk and from Proml.


TEST DATA SET

   5   13
Alpha     AACGTGGCCAAAT
Beta      AAGGTCGCCAAAC
Gamma     CATTTCGTCACAA
Delta     GGTATTTCGGCCT
Epsilon   GGGATCTCGGCCC


CONTENTS OF OUTPUT FILE (with all numerical options on)

(It was run with HMM rates having gamma-distributed rates approximated by 5 rate categories, with coefficient of variation of rates 1.0, and with patch length parameter = 1.5. Two user-defined rate categories were used, one for the first 6 positions, the other for the last 7, with rates 1.0 : 2.0. Weights were used, with sites 1 and 13 given weight 0, and all others weight 1.)


Amino acid sequence Maximum Likelihood method, version 3.69

 5 species,  13  sites

    Site categories are:

             1111112222 222


    Sites are weighted as follows:

             01111 11111 110

Jones-Taylor-Thornton model of amino acid change


Name            Sequences
----            ---------

Alpha        AACGTGGCCA AAT
Beta         ..G..C.... ..C
Gamma        C.TT.C.T.. C.A
Delta        GGTA.TT.GG CC.
Epsilon      GGGA.CT.GG CCC



Discrete approximation to gamma distributed rates
 Coefficient of variation of rates = 1.000000  (alpha = 1.000000)

States in HMM   Rate of change    Probability

        1           0.264            0.522
        2           1.413            0.399
        3           3.596            0.076
        4           7.086            0.0036
        5          12.641            0.000023

Expected length of a patch of sites having the same rate =    1.500


Site category   Rate of change

        1           1.000
        2           2.000



  +Beta      
  |  
  |                                             +Epsilon   
  |       +-------------------------------------3  
  1-------2                                     +--------Delta     
  |       |  
  |       +----------Gamma     
  |  
  +------Alpha     


remember: this is an unrooted tree!

Ln Likelihood =  -104.53314

 Between        And            Length      Approx. Confidence Limits
 -------        ---            ------      ------- ---------- ------

     1          Alpha             0.46548     (     zero,     1.16234) **
     1          Beta              0.00010     (     zero,     0.56371)
     1             2              0.53585     (     zero,     1.53611) *
     2             3              2.52202     (     zero,     5.51952) **
     3          Epsilon           0.00010     (     zero,     0.70102)
     3          Delta             0.56179     (     zero,     1.37921) **
     2          Gamma             0.72465     (     zero,     1.87900) **

     *  = significantly positive, P < 0.05
     ** = significantly positive, P < 0.01

Combination of categories that contributes the most to the likelihood:

             1122111111 111

Most probable category at each site if > 0.95 probability ("." otherwise)

             ....1....1 1..

Probable sequences at interior nodes:

  node       Reconstructed sequence (caps if > 0.95)

    1        .AGGTCGCCA AA.
 Beta        AAGGTCGCCA AAC
    2        .AggTCGCCA CA.
    3        .GGATCTCGG CC.
 Epsilon     GGGATCTCGG CCC
 Delta       GGTATTTCGG CCT
 Gamma       CATTTCGTCA CAA
 Alpha       AACGTGGCCA AAT

protdist

version 3.696

Protdist -- Program to compute distance matrix
from protein sequences

© Copyright 1983, 2000-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program uses protein sequences to compute a distance matrix, under five different models of amino acid replacement. It can also compute a table of similarity between the amino acid sequences. The distance for each pair of species estimates the total branch length between the two species, and can be used in the distance matrix programs Fitch, Kitsch or Neighbor. This is an alternative to using the sequence data itself in the parsimony program Protpars.

The program reads in protein sequences and writes an output file containing the distance matrix or similarity table. The five models of amino acid substitution are one which is based on the Jones, Taylor and Thornton (1992) model of amino acid change, the PMB model (Veerassamy, Smith and Tillier, 2003) which is derived from the Blocks database of conserved protein motifs, the DCMut model (Kosiol and Goldman, 2005) based on the PAM matrices of Margaret Dayhoff, one due to Kimura (1983, p.75) which approximates it based simply on the fraction of similar amino acids, and one based on a model in which the amino acids are divided up into groups, with change occurring based on the genetic code but with greater difficulty of changing between groups. The program correctly takes into account a variety of sequence ambiguities.

The five methods are:

(1) The Dayhoff PAM matrix. This uses the DCMut model (Kosiol and Goldman, 2005) which is a version of the PAM model of Margaret Dayhoff. The PAM model is an empirical one that scales probabilities of change from one amino acid to another in terms of a unit which is an expected 1% change between two amino acid sequences. The PAM 001 matrix is used to make a transition probability matrix which allows prediction of the probability of changing from any one amino acid to any other, and also predicts equilibrium amino acid composition. The program assumes that these probabilities are correct and bases its computations of distance on them. The distance that is computed is scaled in units of expected fraction of amino acids changed. This is a unit such that 1.0 is 100 PAM's.

(2) The Jones-Taylor-Thornton model. This is similar to the Dayhoff PAM model, except that it is based on a recounting of the number of observed changes in amino acids by Jones, Taylor, and Thornton (1992). They used a much larger sample of protein sequences than did Dayhoff. The distance is scaled in units of the expected fraction of amino acids changed (100 PAM's). Because its sample is so much larger this model is to be preferred over the original Dayhoff PAM model. It is the default model in this program.

(3) The PMB (Probability Matrix from Blocks) model. This is derived using the Blocks database of conserved protein motifs. It is described in a paper by Veerassamy, Smith and Tillier (2003). Elisabeth Tillier kindly made the matrices available for this model.

(4) Kimura's distance. This is a rough-and-ready distance formula for approximating PAM distance by simply measuring the fraction of amino acids, p, that differs between two sequences and computing the distance as (Kimura, 1983)

     D = - log_e ( 1 - p - 0.2 p^2 ).

This is very quick to do but has some obvious limitations. It does not take into account which amino acids differ or to what amino acids they change, so some information is lost. The units of the distance measure are the fraction of amino acids differing, as also in the case of the PAM distance. If the fraction of amino acids differing gets larger than about 0.8541 the distance becomes infinite. Note that this can happen with bootstrapped sequences even when the original sequences are below this level of difference.
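
The formula and its limit can be checked with a few lines of Python. This is only a sketch of the arithmetic described above, not the program's own code; the function name `kimura_protein_distance` and the constant `P_LIMIT` are illustrative, and returning an infinite value is a choice made for this sketch.

```python
import math

def kimura_protein_distance(p):
    """Kimura (1983) distance D = -ln(1 - p - 0.2 p^2), where p is the
    fraction of amino acid positions that differ between two sequences."""
    inside = 1.0 - p - 0.2 * p * p
    if inside <= 0.0:
        # The logarithm is undefined here: the distance is infinite.
        return math.inf
    return -math.log(inside)

# Largest usable p: the positive root of 0.2 p^2 + p - 1 = 0.
P_LIMIT = (-1.0 + math.sqrt(1.8)) / 0.4
```

The value of `P_LIMIT` agrees with the 0.8541 quoted above to four decimal places.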

(5) The Categories distance. This is my own concoction. I imagined a nucleotide sequence changing according to Kimura's 2-parameter model, with the exception that some changes of amino acids are less likely than others. The amino acids are grouped into a series of categories. Any base change that does not change which category the amino acid is in is allowed, but if an amino acid changes category this is allowed only a certain fraction of the time. The fraction is called the "ease" and there is a parameter for it, which is 1.0 when all changes are allowed and near 0.0 when changes between categories are nearly impossible.

In this option I have allowed the user to select the Transition/Transversion ratio, which of several genetic codes to use, and which categorization of amino acids to use. There are three of them, a somewhat random sample:

(a)
The George-Hunt-Barker (1988) classification of amino acids,
(b)
A classification provided by my colleague Ben Hall when I asked him for one,
(c)
One I found in an old "baby biochemistry" book (Conn and Stumpf, 1963), which contains most of the biochemistry I was ever taught, and all that I ever learned.

Interestingly enough, all of them are consistent with the same linear ordering of amino acids, which they divide up in different ways. For the Categories model I have set as default the George/Hunt/Barker classification with the "ease" parameter set to 0.457 which is approximately the value implied by the empirical rates in the Dayhoff PAM matrix.

The method uses, as I have noted, Kimura's (1980) 2-parameter model of DNA change. The Kimura "2-parameter" model allows for a difference between transition and transversion rates. Its transition probability matrix for a short interval of time is:

              To:     A        G        C        T
                   ---------------------------------
               A  | 1-a-2b     a         b       b
       From:   G  |   a      1-a-2b      b       b
               C  |   b        b       1-a-2b    a
               T  |   b        b         a     1-a-2b

where a is u dt, the product of the rate of transitions per unit time and the length dt of the time interval, and b is v dt, the product of half the rate of transversions (i.e., the rate of a specific transversion) and the length dt of the time interval.
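
The table above can be built mechanically from a and b. The following Python sketch is only an illustration of the matrix as written here; the function name `k2p_short_interval` and the dictionary layout are assumptions of the sketch, not part of the program.

```python
BASES = "AGCT"
# The four transitions: purine <-> purine and pyrimidine <-> pyrimidine.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def k2p_short_interval(a, b):
    """Return the short-interval matrix as matrix[from_base][to_base].

    a is the probability of the transition, b the probability of each
    of the two possible transversions, during the interval dt."""
    matrix = {}
    for x in BASES:
        row = {}
        for y in BASES:
            if x == y:
                row[y] = 1.0 - a - 2.0 * b   # no change
            elif (x, y) in TRANSITIONS:
                row[y] = a                   # transition
            else:
                row[y] = b                   # transversion
        matrix[x] = row
    return matrix
```

Each row sums to one, and in this matrix the ratio of transitions to transversions expected from any base is a / (2b).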

Each distance that is calculated is an estimate, from that particular pair of species, of the divergence time between those two species. The Kimura distance is straightforward to compute. The other distances are considerably slower: they look at all positions and find the distance which makes the likelihood highest. This distance is in effect the length of the internal branch in a two-species tree that connects these two species. Its likelihood is just the product, under the model, of the probabilities of each position having the (one or) two amino acids that are actually found. This is fairly slow to compute.

The computation proceeds from an eigenanalysis (spectral decomposition) of the transition probability matrix. In the case of the PAM 001 matrix the eigenvalues and eigenvectors are precomputed and are hard-coded into the program in over 400 statements. In the case of the Categories model the program computes the eigenvalues and eigenvectors itself, which will add a delay. But the delay is independent of the number of species as the calculation is done only once, at the outset.

The actual algorithm for estimating the distance is in both cases a bisection algorithm which tries to find the point at which the derivative of the likelihood is zero. Some of the kinds of ambiguous amino acids like "glx" are correctly taken into account. However, gaps are treated as if they are unknown amino acids, which means those positions get dropped from that particular comparison. They are not, however, dropped from the whole analysis. You need not eliminate regions containing gaps, as long as you are reasonably sure of the alignment there.
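
The bisection idea can be illustrated on a model simple enough to check by hand. The sketch below is not the program's code and does not use the JTT, PMB, PAM or Categories matrices: it assumes a hypothetical model in which all 20 amino acids are equally frequent and every change is equally likely, so that the likelihood depends only on the counts of identical and differing positions. The names `dlogl_dt` and `ml_distance`, the starting bracket, and the default tolerance are all assumptions of the sketch.

```python
import math

STATES = 20  # amino acids in the illustrative equal-rates model

def dlogl_dt(t, n_same, n_diff):
    """Derivative of the log-likelihood with respect to the distance t
    (expected substitutions per position)."""
    e = math.exp(-STATES * t / (STATES - 1))
    p_same = (1.0 + (STATES - 1) * e) / STATES  # position keeps its amino acid
    p_diff = (1.0 - e) / STATES                 # position shows one specific other amino acid
    return n_same * (-e) / p_same + n_diff * (e / (STATES - 1)) / p_diff

def ml_distance(n_same, n_diff, lo=1e-9, hi=10.0, epsilon=1e-9):
    """Bisect on the sign of the derivative until the bracket is
    narrower than epsilon, then return its midpoint."""
    if n_diff == 0:
        return 0.0
    if dlogl_dt(hi, n_same, n_diff) > 0.0:
        return math.inf  # likelihood is still rising at the upper bound
    while hi - lo > epsilon:
        mid = 0.5 * (lo + hi)
        if dlogl_dt(mid, n_same, n_diff) > 0.0:
            lo = mid   # maximum lies to the right
        else:
            hi = mid   # maximum lies to the left
    return 0.5 * (lo + hi)
```

For this toy model the answer is also available in closed form, t = -(19/20) log_e(1 - 20p/19) with p the fraction of differing positions, which makes it easy to confirm that the bisection converges to the right value.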

Note that there is an assumption that we are looking at all positions, including those that have not changed at all. It is important not to restrict attention to some positions based on whether or not they have changed; doing that would bias the distances by making them too large, and that in turn would cause the distances to misinterpret the meaning of those positions that had changed.

The program can now correct distances for unequal rates of change at different amino acid positions. This correction, which was introduced for DNA sequences by Jin and Nei (1990), assumes that the distribution of rates of change among amino acid positions follows a Gamma distribution. The user is asked for the value of a parameter that determines the amount of variation of rates among amino acid positions. Instead of the more widely-known coefficient alpha, Protdist uses the coefficient of variation (ratio of the standard deviation to the mean) of rates among amino acid positions. So if there is 20% variation in rates, the CV is 0.20. The square of the C.V. is also the reciprocal of the better-known "shape parameter", alpha, of the Gamma distribution, so in this case the shape parameter alpha = 1/(0.20*0.20) = 25. If you want to achieve a particular value of alpha, such as 10, you will want to use a CV of 1/sqrt(10) = 1/3.162 = 0.3162.
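
The conversion between the two parameterizations is a one-line calculation in each direction. A small Python sketch of it follows; the function names are illustrative only.

```python
import math

def alpha_from_cv(cv):
    """Gamma shape parameter alpha from the coefficient of variation."""
    return 1.0 / (cv * cv)

def cv_from_alpha(alpha):
    """Coefficient of variation to type at the prompt for a desired alpha."""
    return 1.0 / math.sqrt(alpha)
```

With these, `alpha_from_cv(0.20)` gives 25 and `cv_from_alpha(10)` gives 0.3162 to four places, matching the two worked values in the paragraph above.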

In addition to the five distance calculations, the program can also compute a table of similarities between amino acid sequences. These values are the fractions of amino acid positions identical between the sequences. The diagonal values are 1.0000. No attempt is made to count similarity of nonidentical amino acids, so that no credit is given for having (for example) different hydrophobic amino acids at the corresponding positions in the two sequences. This option has been requested by many users, who need it for descriptive purposes. It is not intended that the table be used for inferring the tree.
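
A similarity value of this kind is simply a count of matching positions divided by the alignment length. The Python sketch below compares symbols literally; how the program itself treats gaps or ambiguity codes in this table is not described here, so that part is left out, and the function name is illustrative.

```python
def fraction_identical(seq_a, seq_b):
    """Fraction of aligned positions carrying the same symbol.

    No credit is given for similar but nonidentical amino acids."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(seq_a, seq_b) if x == y)
    return matches / len(seq_a)
```

Comparing a sequence with itself gives 1.0, which is why the diagonal of the table is 1.0000.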

INPUT FORMAT AND OPTIONS

Input is fairly standard, with one addition. As usual the first line of the file gives the number of species and the number of sites. There follows the character W if the Weights option is being used.

Next come the species data. Each sequence starts on a new line, has a ten-character species name that must be blank-filled to be of that length, followed immediately by the species data in the one-letter code. The sequences must either be in the "interleaved" or "sequential" formats described in the Molecular Sequence Programs document. The I option selects between them. The sequences can have internal blanks in the sequence but there must be no extra blanks at the end of the terminated line. Note that a blank is not a valid symbol for a deletion.
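
For readers who prepare input files with scripts, the layout just described can be read back with a short routine. The Python sketch below handles only the "sequential" format with no options, and is an assumption-laden illustration rather than a description of how the program parses its input: it takes the first two numbers on the first line, treats the first ten characters of each record as the name, and discards internal blanks.

```python
def read_sequential(text):
    """Parse a sequential-format file into a {name: sequence} dictionary."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_species, n_sites = (int(tok) for tok in lines[0].split()[:2])
    records = {}
    pos = 1
    for _ in range(n_species):
        name = lines[pos][:10].strip()            # ten-character, blank-filled name
        data = lines[pos][10:].replace(" ", "")   # internal blanks are ignored
        pos += 1
        while len(data) < n_sites:                # sequence continues on later lines
            data += lines[pos].replace(" ", "")
            pos += 1
        records[name] = data
    return records
```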

After that are the lines (if any) containing the information for the W option, as described below.

The options are selected using an interactive menu. The menu looks like this:


Protein distance algorithm, version 3.69

Settings for this run:
  P  Use JTT, PMB, PAM, Kimura, categories model?  Jones-Taylor-Thornton matrix
  G  Gamma distribution of rates among positions?  No
  C           One category of substitution rates?  Yes
  W                    Use weights for positions?  No
  M                   Analyze multiple data sets?  No
  I                  Input sequences interleaved?  Yes
  0                 Terminal type (IBM PC, ANSI)?  ANSI
  1            Print out the data at start of run  No
  2          Print indications of progress of run  Yes

Are these settings correct? (type Y or the letter for one to change)

The user either types "Y" (followed, of course, by a carriage-return) if the settings shown are to be accepted, or the letter or digit corresponding to an option that is to be changed.

The P option selects one of the five distance methods, or the similarity table. It toggles among these six methods. The default method, if none is specified, is the Jones-Taylor-Thornton model. If the Categories distance is selected another menu option, T, will appear allowing the user to supply the Transition/Transversion ratio that should be assumed at the underlying DNA level, and another one, C, which allows the user to select among various nuclear and mitochondrial genetic codes. The transition/transversion ratio can be any number from 0.5 upwards.

The G option chooses Gamma distributed rates of evolution across amino acid positions. The program will prompt you for the Coefficient of Variation of rates. As is noted above, this is 1/sqrt(alpha) if alpha is the more familiar "shape coefficient" of the Gamma distribution. If the G option is not selected, the program defaults to having no variation of rates among sites.

The C option allows user-defined rate categories. The user is prompted for the number of user-defined rates, and for the rates themselves, which cannot be negative but can be zero. These numbers are defined relative to each other, so that if rates for three categories are set to 1 : 3 : 2.5 this would have the same meaning as setting them to 2 : 6 : 5. The assignment of rates to sites is then made by reading a file whose default name is "categories". It should contain a string of digits 1 through 9. A new line or a blank can occur after any character in this string. Thus the categories file might look like this:

122231111122411155
1155333333444
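
The two ideas in this description, that only the ratios of the rates matter and that the categories file is a string of digits with optional blanks and line breaks, can be sketched in Python as follows. The function names and the choice of rescaling by the sum are assumptions of the sketch; the program's own internal scaling is not described here.

```python
def relative_rates(rates):
    """Rescale rates so they sum to one; only their ratios carry meaning."""
    if any(r < 0 for r in rates):
        raise ValueError("rates cannot be negative")
    total = sum(rates)
    if total <= 0:
        raise ValueError("at least one rate must be positive")
    return [r / total for r in rates]

def site_rates(category_text, rates):
    """Map each digit in the categories file to the rate of that category.

    Blanks and line breaks between digits are skipped."""
    digits = [ch for ch in category_text if not ch.isspace()]
    return [rates[int(d) - 1] for d in digits]
```

With the example file above, which uses categories 1 through 5 and covers 31 sites, `site_rates` needs a list of at least five rates.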

If both user-assigned rate categories and Gamma-distributed rates are allowed, the program assumes that the actual rate at a site is the product of the user-assigned category rate and the Gamma-distributed rate. This allows you to specify that certain sites have higher or lower rates of change while also allowing the program to allow variation of rates in addition to that.

The M (multiple data sets) option will ask you whether you want to use multiple sets of weights (from the weights file) or multiple data sets from the input file. The ability to use a single data set with multiple weights means that much less disk space will be used for this input data. The bootstrapping and jackknifing tool Seqboot has the ability to create a weights file with multiple weights. Note also that when we use multiple weights for bootstrapping we can also then maintain different rate categories for different sites in a meaningful way. If you use the multiple data sets option rather than multiple weights, you should not at the same time use the user-defined rate categories option (option C), because the user-defined rate categories could then be associated with the wrong sites. This is not a concern when the M option is used by using multiple weights.

Option 0 is the usual one. It is described in the main documentation file of this package. Option I is the same as in other molecular sequence programs and is described in the documentation file for the sequence programs.

The W (Weights) option is invoked in the usual way, with only weights 0 and 1 allowed. It selects a set of sites to be analyzed, ignoring the others. The sites selected are those with weight 1. If the W option is not invoked, all sites are analyzed.

OUTPUT FORMAT

As the distances are computed, the program prints on your screen or terminal the names of the species in turn, followed by one dot (".") for each other species for which the distance to that species has been computed. Thus if there are ten species, the first species name is printed out, followed by one dot, then on the next line the next species name is printed out followed by two dots, then the next followed by three dots, and so on. The pattern of dots should form a triangle. When the distance matrix has been written out to the output file, the user is notified of that.

The output file contains on its first line the number of species. The distance matrix is then printed in standard form, with each species starting on a new line with the species name, followed by the distances to the species in order. These continue onto a new line after every nine distances. The distance matrix is square with zero distances on the diagonal. In general the format of the distance matrix is such that it can serve as input to any of the distance matrix programs.
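
If you want to read such a matrix into a script of your own, the main thing to allow for is that a row may continue onto further lines. The Python sketch below is an illustration under stated assumptions, not a specification of the file format: it assumes the name occupies the first ten characters of the line that starts each row, and that everything after it, on that line and any continuation lines, is a distance.

```python
def read_distance_matrix(text):
    """Read a square distance matrix; returns (names, rows)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n = int(lines[0].split()[0])          # first line: number of species
    names, rows = [], []
    pos = 1
    for _ in range(n):
        names.append(lines[pos][:10].strip())
        values = [float(tok) for tok in lines[pos][10:].split()]
        pos += 1
        while len(values) < n:            # row wrapped onto following lines
            values.extend(float(tok) for tok in lines[pos].split())
            pos += 1
        rows.append(values)
    return names, rows
```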

If the similarity table is selected, the table that is produced is not in a format that can be used as input to the distance matrix programs. It has a heading, and the species names are also put at the tops of the columns of the table (or rather, the first 8 characters of each species name are there, the other two characters omitted to save space). There is not an option to put the table into a format that can be read by the distance matrix programs, nor is there one to make it into a table of fractions of difference by subtracting the similarity values from 1. This is done deliberately to make it more difficult to use these values to construct trees. The similarity values are not corrected for multiple changes, and their use to construct trees (even after converting them to fractions of difference) would be wrong, as it would lead to severe conflict between the distant pairs of sequences and the close pairs of sequences.

If the option to print out the data is selected, the output file will precede the data by more complete information on the input and the menu selections. The output file begins by giving the number of species and the number of characters, and the identity of the distance measure that is being used.

In the Categories model of substitution, the distances printed out are scaled in terms of expected numbers of substitutions, counting both transitions and transversions but not replacements of a base by itself, and scaled so that the average rate of change is set to 1.0. For the Dayhoff PAM and Kimura models the distances are scaled in terms of the expected numbers of amino acid substitutions per site. Of course, when a branch is twice as long this does not mean that there will be twice as much net change expected along it, since some of the changes may occur in the same site and overlie or even reverse each other. The branch length estimates here are in terms of the expected underlying numbers of changes. That means that a branch of length 0.26 is 26 times as long as one which would show a 1% difference between the protein (or nucleotide) sequences at the beginning and end of the branch. But we would not expect the sequences at the beginning and end of the branch to be 26% different, as there would be some overlaying of changes.

One problem that can arise is that two or more of the species can be so dissimilar that the distance between them would have to be infinite, as the likelihood rises indefinitely as the estimated divergence time increases. For example, with the Kimura model, if the two sequences differ in 85.41% or more of their positions then the estimate of divergence time would be infinite. Since there is no way to represent an infinite distance in the output file, the program regards this as an error, issues a warning message indicating which pair of species are causing the problem, and computes a distance of -1.0.

PROGRAM CONSTANTS

The constants that are available to be changed by the user at the beginning of the program include "namelength", the length of species names in characters, and "epsilon", a parameter which controls the accuracy of the results of the iterations which estimate the distances. Making "epsilon" smaller will increase run times but result in more decimal places of accuracy. This should not be necessary.

The program spends most of its time doing real arithmetic. Any software or hardware changes that speed up that arithmetic will speed it up by a nearly proportional amount.


TEST DATA SET

(Note that although these may look like DNA sequences, they are being treated as protein sequences consisting entirely of alanine, cysteine, glycine, and threonine).

   5   13
Alpha     AACGTGGCCACAT
Beta      AAGGTCGCCACAC
Gamma     CAGTTCGCCACAA
Delta     GAGATTTCCGCCT
Epsilon   GAGATCTCCGCCC


CONTENTS OF OUTPUT FILE (with all numerical options on)

(Note that when the numerical options are not on, the output file produced is in the correct format to be used as an input file in the distance matrix programs).


  Jones-Taylor-Thornton model distance

Name            Sequences
----            ---------

Alpha        AACGTGGCCA CAT
Beta         ..G..C.... ..C
Gamma        C.GT.C.... ..A
Delta        G.GA.TT..G .C.
Epsilon      G.GA.CT..G .CC



Alpha       0.000000  0.331834  0.628142  1.036660  1.365098
Beta        0.331834  0.000000  0.377406  1.102689  0.682218
Gamma       0.628142  0.377406  0.000000  0.979550  0.866781
Delta       1.036660  1.102689  0.979550  0.000000  0.227515
Epsilon     1.365098  0.682218  0.866781  0.227515  0.000000

version 3.696

Protpars -- Protein Sequence Parsimony Method

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program infers an unrooted phylogeny from protein sequences, using a new method intermediate between the approaches of Eck and Dayhoff (1966) and Fitch (1971). Eck and Dayhoff (1966) allowed any amino acid to change to any other, and counted the number of such changes needed to evolve the protein sequences on each given phylogeny. This has the problem that it allows replacements which are not consistent with the genetic code, counting them equally with replacements that are consistent. Fitch, on the other hand, counted the minimum number of nucleotide substitutions that would be needed to achieve the given protein sequences. This counts silent changes equally with those that change the amino acid.

The present method insists that any changes of amino acid be consistent with the genetic code so that, for example, lysine is allowed to change to methionine but not to proline. However, changes between two amino acids via a third are allowed and counted as two changes if each of the two replacements is individually allowed. This sometimes allows changes that at first sight you would think should be outlawed. Thus we can change from phenylalanine to glutamine via leucine in two steps total. Consulting the genetic code, you will find that there is a leucine codon one step away from a phenylalanine codon, and a leucine codon one step away from glutamine. But they are not the same leucine codon. It actually takes three base substitutions to get from either of the phenylalanine codons TTT and TTC to either of the glutamine codons CAA or CAG. Why then does this program count only two? The answer is that recent DNA sequence comparisons seem to show that synonymous changes are considerably faster and easier than ones that change the amino acid. We are assuming that, in effect, synonymous changes occur so much more readily that they need not be counted. Thus, in the chain of changes TTT (Phe) --> CTT (Leu) --> CTA (Leu) --> CAA (Gln), the middle one is not counted because it does not change the amino acid (leucine).
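
The codon arithmetic in this paragraph is easy to verify. The Python sketch below lists the universal-code codons for the three amino acids involved and counts base differences; it is only a check of the example, not the program's step-counting code, and the function names are illustrative.

```python
from itertools import product

# Universal genetic code, DNA alphabet.
PHE = ["TTT", "TTC"]
LEU = ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"]
GLN = ["CAA", "CAG"]

def base_differences(codon_a, codon_b):
    """Number of positions at which two codons differ."""
    return sum(1 for x, y in zip(codon_a, codon_b) if x != y)

def min_differences(codons_a, codons_b):
    """Fewest base substitutions between any codon of each amino acid."""
    return min(base_differences(a, b) for a, b in product(codons_a, codons_b))

def neighbours(codons, targets):
    """Codons in `codons` one base away from some codon in `targets`."""
    return {c for c in codons
            if any(base_differences(c, t) == 1 for t in targets)}
```

Here `min_differences(PHE, GLN)` is 3, while `neighbours(LEU, PHE)` and `neighbours(LEU, GLN)` are both nonempty but share no codon, which is exactly the situation described: two counted steps, joined by an uncounted synonymous change within leucine.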

To maintain consistency with the genetic code, it is necessary for the program internally to treat serine as two separate states (ser1 and ser2) since the two groups of serine codons are not adjacent in the code. Changes to the state "deletion" are counted as three steps to prevent the algorithm from assuming unnecessary deletions. The state "unknown" is simply taken to mean that the amino acid, which has not been determined, will in each part of a tree that is evaluated be assumed to be whichever one causes the fewest steps.

The assumptions of this method (which has not been described in the literature) are thus something like this:

  1. Changes in different sites are independent.
  2. Changes in different lineages are independent.
  3. The probability of a base substitution that changes the amino acid sequence is small over the lengths of time involved in a branch of the phylogeny.
  4. The expected amounts of change in different branches of the phylogeny do not vary by so much that two changes in a high-rate branch are more probable than one change in a low-rate branch.
  5. The expected amounts of change do not vary enough among sites that two changes in one site are more probable than one change in another.
  6. The probability of a base change that is synonymous is much higher than the probability of a change that is not synonymous.

That these are the assumptions of parsimony methods has been documented in a series of papers of mine: (1973a, 1978b, 1979, 1981b, 1983b, 1988b). For an opposing view arguing that the parsimony methods make no substantive assumptions such as these, see the papers by Farris (1983) and Sober (1983a, 1983b, 1988), but also read the exchange between Felsenstein and Sober (1986).

The input for the program is fairly standard. The first line contains the number of species and the number of amino acid positions (counting any stop codons that you want to include).

Next come the species data. Each sequence starts on a new line, has a ten-character species name that must be blank-filled to be of that length, followed immediately by the species data in the one-letter code. The sequences must either be in the "interleaved" or "sequential" formats described in the Molecular Sequence Programs document. The I option selects between them. The sequences can have internal blanks in the sequence but there must be no extra blanks at the end of the terminated line. Note that a blank is not a valid symbol for a deletion.

The protein sequences are given by the one-letter codes described in the Molecular Sequence Programs documentation file. Note that if two polypeptide chains are being used that are of different lengths owing to one terminating before the other, they should be coded as (say)

             HIINMA*????
             HIPNMGVWABT

since after the stop codon we do not definitely know that there has been a deletion, and do not know what amino acid would have been there. If DNA studies tell us that there is DNA sequence in that region, then we could use "X" rather than "?". Note that "X" means an unknown amino acid, but definitely an amino acid, while "?" could mean either that or a deletion. The distinction is often significant in regions where there are deletions: one may want to encode a six-base deletion as "-?????" since that way the program will only count one deletion, not six deletion events, when the deletion arises. However, if there are overlapping deletions it may not be so easy to know what coding is correct.

One will usually want to use "?" after a stop codon, if one does not know what amino acid is there. If the DNA sequence has been observed there, one probably ought to resist putting in the amino acids that this DNA would code for, and one should use "X" instead, because under the assumptions implicit in this parsimony method, changes to any noncoding sequence are much easier than changes in a coding region that change the amino acid, so that they shouldn't be counted anyway!

The form of this information is the standard one described in the main documentation file. For the U option the tree provided must be a rooted bifurcating tree, with the root placed anywhere you want, since that root placement does not affect anything.

The options are selected using an interactive menu. The menu looks like this:


Protein parsimony algorithm, version 3.69

Setting for this run:
  U                 Search for best tree?  Yes
  J   Randomize input order of sequences?  No. Use input order
  O                        Outgroup root?  No, use as outgroup species  1
  T              Use Threshold parsimony?  No, use ordinary parsimony
  C               Use which genetic code?  Universal
  W                       Sites weighted?  No
  M           Analyze multiple data sets?  No
  I          Input sequences interleaved?  Yes
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4          Print out steps in each site  No
  5  Print sequences at all nodes of tree  No
  6       Write out trees onto tree file?  Yes

Are these settings correct? (type Y or the letter for one to change)

The user either types "Y" (followed, of course, by a carriage-return) if the settings shown are to be accepted, or the letter or digit corresponding to an option that is to be changed.

The options U, J, O, T, W, M, and 0 are the usual ones. They are described in the main documentation file of this package. Option I is the same as in other molecular sequence programs and is described in the documentation file for the sequence programs. Option C allows the user to select among various nuclear and mitochondrial genetic codes. There is no provision for coping with data where different genetic codes have been used in different organisms.

In the U (User tree) option, the trees should not be preceded by a line with the number of trees on it.

Output is standard: if option 1 is toggled on, the data is printed out, with the convention that "." means "the same as in the first species". Then comes a list of equally parsimonious trees, and (if option 4 is toggled on) a table of the number of changes of state required in each position. If option 5 is toggled on, a table is printed out after each tree, showing for each branch whether there are known to be changes in the branch, and what the states are inferred to have been at the top end of the branch. This is a reconstruction of the ancestral sequences in the tree. If you choose option 5, a menu item "." appears which gives you the opportunity to turn off dot-differencing so that complete ancestral sequences are shown. If the inferred state is a "?" there will be multiple equally-parsimonious assignments of states; the user must work these out for themselves by hand. If option 6 is left in its default state the trees found will be written to a tree file, so that they are available to be used in other programs. If the program finds multiple trees tied for best, all of these are written out onto the output tree file. Each is followed by a numerical weight in square brackets (such as [0.25000]). This is needed when we use the trees to make a consensus tree of the results of bootstrapping or jackknifing, to avoid overrepresenting replicates that find many tied trees.

If the U (User Tree) option is used and more than one tree is supplied, the program also performs a statistical test of each of these trees against the best tree. This test is a version of the test proposed by Alan Templeton (1983), and evaluated in a test case by me (1985a). It is closely parallel to a test using log likelihood differences due to Kishino and Hasegawa (1989), and uses the mean and variance of step differences between trees, taken across positions. If the mean is more than 1.96 standard deviations different then the trees are declared significantly different. The program prints out a table of the steps for each tree, the differences of each from the best one, the variance of that quantity as determined by the step differences at individual positions, and a conclusion as to whether that tree is or is not significantly worse than the best one.
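
The general shape of such a test can be sketched in Python. The text above does not give the exact variance formula the program uses, so the estimator below (the sample variance of the per-position differences, scaled up to the variance of their sum) is an assumption of this sketch, as are the function name and the return layout.

```python
import math

def compare_trees(steps_best, steps_other):
    """Compare two trees from their per-position step counts.

    Returns (total difference, variance of that total, significant?)."""
    n = len(steps_best)
    if n != len(steps_other) or n < 2:
        raise ValueError("need two step lists of equal length, at least 2 positions")
    diffs = [o - b for b, o in zip(steps_best, steps_other)]
    total = sum(diffs)
    mean = total / n
    sum_sq = sum((d - mean) ** 2 for d in diffs)
    variance_of_total = n * sum_sq / (n - 1)   # assumed estimator, see lead-in
    sd = math.sqrt(variance_of_total)
    significant = sd > 0.0 and abs(total) > 1.96 * sd
    return total, variance_of_total, significant
```

For example, a tree that needs one extra step at 10 of 100 positions and the same number elsewhere has a total difference of 10 against a standard deviation of about 3.0, and would be declared significantly worse under this sketch.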

The program is derived from Mix but has had some rather elaborate bookkeeping using sets of bits installed. It is not a very fast program but is speeded up substantially over version 3.2.


TEST DATA SET

     5    10
Alpha     ABCDEFGHIK
Beta      AB--EFGHIK
Gamma     ?BCDSFG*??
Delta     CIKDEFGHIK
Epsilon   DIKDEFGHIK


CONTENTS OF OUTPUT FILE (with all numerical options on)


Protein parsimony algorithm, version 3.69

 5 species,  10  sites


Name          Sequences
----          ---------

Alpha        ABCDEFGHIK 
Beta         ..--...... 
Gamma        ?...S..*?? 
Delta        CIK....... 
Epsilon      DIK....... 




     3 trees in all found




     +--------Gamma     
     !  
  +--2     +--Epsilon   
  !  !  +--4  
  !  +--3  +--Delta     
  1     !  
  !     +-----Beta      
  !  
  +-----------Alpha     

  remember: this is an unrooted tree!


requires a total of     16.000

steps in each position:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       3   1   5   3   2   0   0   2   0
   10!   0                                    

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)


         1                ANCDEFGHIK 
  1      2         no     .......... 
  2   Gamma        yes    ?B..S..*?? 
  2      3         yes    ..?....... 
  3      4         yes    ?IK....... 
  4   Epsilon     maybe   D......... 
  4   Delta        yes    C......... 
  3   Beta         yes    .B--...... 
  1   Alpha       maybe   .B........ 





           +--Epsilon   
        +--4  
     +--3  +--Delta     
     !  !  
  +--2  +-----Gamma     
  !  !  
  1  +--------Beta      
  !  
  +-----------Alpha     

  remember: this is an unrooted tree!


requires a total of     16.000

steps in each position:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       3   1   5   3   2   0   0   2   0
   10!   0                                    

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)


         1                ANCDEFGHIK 
  1      2         no     .......... 
  2      3        maybe   ?......... 
  3      4         yes    .IK....... 
  4   Epsilon     maybe   D......... 
  4   Delta        yes    C......... 
  3   Gamma        yes    ?B..S..*?? 
  2   Beta         yes    .B--...... 
  1   Alpha       maybe   .B........ 





           +--Epsilon   
     +-----4  
     !     +--Delta     
  +--3  
  !  !     +--Gamma     
  1  +-----2  
  !        +--Beta      
  !  
  +-----------Alpha     

  remember: this is an unrooted tree!


requires a total of     16.000

steps in each position:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       3   1   5   3   2   0   0   2   0
   10!   0                                    

From    To     Any Steps?    State at upper node
                             ( . means same as in the node below it on tree)


         1                ANCDEFGHIK 
  1      3         no     .......... 
  3      4         yes    ?IK....... 
  4   Epsilon     maybe   D......... 
  4   Delta        yes    C......... 
  3      2         no     .......... 
  2   Gamma        yes    ?B..S..*?? 
  2   Beta         yes    .B--...... 
  1   Alpha       maybe   .B........ 



version 3.696

Restdist -- Program to compute distance matrix
from restriction sites or fragments

© Copyright 2000-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Restdist reads the same restriction sites format as Restml and computes a restriction sites distance. It can also compute a restriction fragments distance. The original restriction fragments and restriction sites distance methods were introduced by Nei and Li (1979). Their original method for restriction fragments is also available in this program, although its default methods are my modifications of the original Nei and Li methods.

These two distances assume that the restriction sites are accidental byproducts of random change of nucleotide sequences. For my restriction sites distance the DNA sequences are assumed to be changing according to the Kimura 2-parameter model of DNA change (Kimura, 1980). The user can set the transition/transversion ratio for the model. For my restriction fragments distance there is an implicit assumption of a Jukes-Cantor (1969) model of change. The user can also set the parameter of a correction for unequal rates of evolution between sites in the DNA sequences, using a Gamma distribution of rates among sites. The Jukes-Cantor model is also implicit in the restriction fragments distance of Nei and Li (1979). Their fragments distance does not allow us to correct for a Gamma distribution of rates among sites.

Restriction Sites Distance

The restriction sites distances use data coded for the presence or absence of individual restriction sites (usually as + and - or 0 and 1). My distance is based on the proportion, out of all sites observed in one species or the other, which are present in both species. This is done to correct for the ascertainment of sites, for the fact that we are not aware of many sites because they do not appear in any species.

My distance starts by computing from the particular pair of species the fraction

   f = \frac{n_{++}}{n_{++} + \frac{1}{2}(n_{+-} + n_{-+})}
where n_{++} is the number of sites contained in both species, n_{+-} is the number of sites contained in the first of the two species but not in the second, and n_{-+} is the number of sites contained in the second of the two species but not in the first. This is the fraction of sites that are present in one species which are present in both. Since the number of sites present in the two species will often differ, the denominator is the average of the number of sites found in the two species.
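
As a minimal sketch of this calculation (written in Python for brevity, not taken from the program's source; the function name shared_site_fraction is hypothetical), the fraction f can be computed from two site strings. Skipping positions where either species has an unknown state is an assumption of this sketch, not a statement of how Restdist treats "?":

```python
def shared_site_fraction(sites_a, sites_b):
    """Return f = n++ / (n++ + (n+- + n-+) / 2) for two site strings."""
    if len(sites_a) != len(sites_b):
        raise ValueError("site strings must have the same length")
    present = {"+", "1"}   # "1" is accepted as a synonym for "+"
    absent = {"-", "0"}    # "0" is accepted as a synonym for "-"
    known = present | absent
    n_pp = n_pm = n_mp = 0
    for a, b in zip(sites_a, sites_b):
        if a not in known or b not in known:
            continue  # assumption: positions with "?" are skipped
        if a in present and b in present:
            n_pp += 1
        elif a in present:
            n_pm += 1
        elif b in present:
            n_mp += 1
    denominator = n_pp + 0.5 * (n_pm + n_mp)
    if denominator == 0:
        raise ValueError("no site is present in either species")
    return n_pp / denominator


# Alpha and Beta from the test data set: 7 shared sites, 1 site unique to each.
print(shared_site_fraction("++-+-++--+++-", "++++--+--+++-"))  # 7 / 8 = 0.875
```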

If each restriction site is s nucleotides long, the probability that a restriction site is present in the other species, given that it is present in a species, is

      Q^s,
where Q is the probability that a nucleotide has no net change as one goes from the one species to the other. It may have changed in between; we are interested in the probability that that nucleotide site is the same base in both species, irrespective of what has happened in between.

The distance is then computed by finding the branch length of a two-species tree (connecting these two species with a single branch) such that Q equals the s-th root of f. For this the program computes Q for various values of branch length, iterating them by a Newton-Raphson algorithm until the two quantities are equal.
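
The iteration can be sketched for the simpler Jukes-Cantor model, under which Q(t) = 1/4 + (3/4) exp(-4t/3). That choice of model is an assumption made to keep the example short: the sites distance described here uses the Kimura 2-parameter model, so this sketch illustrates the Newton-Raphson step rather than reproducing the program's output. The function name and the starting value of zero are also hypothetical.

```python
import math


def jc_sites_distance(f, site_length=6, iterations=20):
    """Find t such that Q(t) equals the s-th root of f (Jukes-Cantor sketch)."""
    if not 0.0 < f <= 1.0:
        raise ValueError("f must lie in (0, 1]")
    target = f ** (1.0 / site_length)      # the s-th root of f
    if target <= 0.25:
        raise ValueError("f is too small for a finite distance under this model")
    t = 0.0                                # assumed starting value
    for _ in range(iterations):
        decay = math.exp(-4.0 * t / 3.0)
        q = 0.25 + 0.75 * decay            # Q(t)
        dq = -decay                        # dQ/dt
        t -= (q - target) / dq             # Newton-Raphson update
    return t


print(jc_sites_distance(0.875, site_length=6))
```

Under Jukes-Cantor the same value is available in closed form as t = -(3/4) log((4Q - 1)/3), which is a convenient check on the iteration.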

The resulting distance should be numerically close to the original restriction sites distance of Nei and Li (1979) when divergence is small. Theirs computes the probability of retention of a site in a way that assumes that the site is present in the common ancestor of the two species. Ours does not make this assumption. It is inspired by theirs, but differs in this detail. Their distance also assumes a Jukes-Cantor (1969) model of base change, and does not allow for transitions being more frequent than transversions. In this sense mine generalizes theirs somewhat. Their distance does include, as mine does as well, a correction for Gamma distribution of rate of change among nucleotide sites.

I have made their original distance available here (option N).

Restriction Fragments Distance

For restriction fragments data we use a different distance. If we average over all restriction fragment lengths, each at its own expected frequency, the probability that the fragment will still be in existence after a certain amount of branch length, we must take into account the probability that the two restriction sites at the ends of the fragment do not mutate, and the probability that no new restriction site occurs within the fragment in that amount of branch length. The result for a restriction site length of s is:

   f = \frac{Q^{2s}}{2 - Q^s}
(The details of the derivation are given in my book Inferring Phylogenies (Felsenstein, 2004).) Given the observed fraction of restriction sites retained, f, we can solve a quadratic equation from the above expression for Q^s. That makes it easy to obtain a value of Q, and the branch length can then be estimated by adjusting it so the probability of a base not changing is equal to that value.
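
Writing x for the s-th power of Q, the expression above rearranges to x^2 + f x - 2f = 0, whose positive root is x = (-f + sqrt(f^2 + 8f)) / 2. A minimal Python sketch of that step follows; the function name fragment_q is hypothetical and the code is an illustration, not the program's own routine:

```python
import math


def fragment_q(f, site_length=6):
    """Return (x, Q), where x = Q**s is the positive root of x**2 + f*x - 2*f = 0."""
    if not 0.0 < f <= 1.0:
        raise ValueError("f must lie in (0, 1]")
    x = (-f + math.sqrt(f * f + 8.0 * f)) / 2.0
    q = x ** (1.0 / site_length)
    return x, q


x, q = fragment_q(0.5, site_length=6)
print(x, q)
print(x * x / (2.0 - x))  # recovers f, up to rounding
```

The remaining step, turning Q into a branch length, depends on the model of base change and is not shown.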

Alternatively, if we use the Nei and Li (1979) restriction fragments distance (available in this program using menu option N), this involves solving for g in the nonlinear equation

       g  =  [ f (3 - 2g) ]^{1/4}
and then the distance is given by
       d  =  - (2/r) \log_e(g)
where r is the length of the restriction site.
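
One simple way to solve the nonlinear equation for g is repeated substitution, starting from the fourth root of f. The Python sketch below uses that approach as an assumption; it is not claimed to be the method the program uses, and the function name, tolerance, and iteration cap are hypothetical.

```python
import math


def nei_li_fragment_distance(f, site_length=6, tol=1e-12, max_iter=200):
    """Solve g = [f (3 - 2g)]**(1/4) by substitution, then return d = -(2/r) ln g."""
    if not 0.0 < f <= 1.0:
        raise ValueError("f must lie in (0, 1]")
    g = f ** 0.25                          # starting guess
    for _ in range(max_iter):
        g_next = (f * (3.0 - 2.0 * g)) ** 0.25
        converged = abs(g_next - g) < tol
        g = g_next
        if converged:
            break
    return -(2.0 / site_length) * math.log(g)


print(nei_li_fragment_distance(0.5, site_length=6))
```

When all fragments are shared (f = 1) the solution is g = 1 and the distance is zero.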

Comparing these two restriction fragments distances in a case where their underlying DNA model is the same (which is when the transition/transversion ratio of the modified model is set to 0.5), you will find that they are very close to each other, differing very little at small distances, with the modified distance becoming smaller than the Nei/Li distance at larger distances. It will therefore matter very little which one you use.

A Comment About RAPDs and AFLPs

Although these distances are designed for restriction sites and restriction fragments data, they can be applied to RAPD and AFLP data as well. RAPD (Randomly Amplified Polymorphic DNA) and AFLP (Amplified Fragment Length Polymorphism) data consist of presence or absence of individual bands on a gel. The bands are segments of DNA with PCR primers at each end. These primers are defined sequences of known length (often about 10 nucleotides each). For AFLPs the relevant length is the primer length, plus three nucleotides. Mutation in these sequences stops them from serving as primers, just as mutation destroys a restriction site. Thus a pair of 10-nucleotide primers will behave much the same as a 20-nucleotide restriction site, for RAPDs (26 for AFLPs). You can use the restriction sites distance as the distance between RAPD or AFLP patterns if you set the proper value for the total length of the site to the total length of the primers (plus 6 in the case of AFLPs). Of course there are many possible sources of noise in these data, including confusing fragments of similar length for each other and having primers near each other in the genome, and these are not taken into account in the statistical model used here.

INPUT FORMAT AND OPTIONS

The input is fairly standard, with one addition. As usual the first line of the file gives the number of species and the number of sites, but there is also a third number, which is the number of different restriction enzymes that were used to detect the restriction sites. Thus a data set with 10 species and 35 different sites, representing digestion with 4 different enzymes, would have the first line of the data file look like this:

   10   35    4

The site data are in standard form. Each species starts with a species name whose maximum length is given by the constant "nmlngth" (whose value in the program as distributed is 10 characters). The name should, as usual, be padded out to that length with blanks if necessary. The sites data then follows, one character per site (any blanks will be skipped and ignored). Like the DNA and protein sequence data, the restriction sites data may be either in the "interleaved" form or the "sequential" form. Note that if you are analyzing restriction sites data with the programs Dollop or Mix or other discrete character programs, at the moment those programs do not use the "aligned" or "interleaved" data format. Therefore you may want to avoid that format when you have restriction sites data that you will want to feed into those programs.

The presence of a site is indicated by a "+" and the absence by a "-". I have also allowed the use of "1" and "0" as synonyms for "+" and "-", for compatibility with Mix and Dollop which do not allow "+" and "-". If the presence of the site is unknown (for example, if the DNA containing it has been deleted so that one does not know whether it would have contained the site) then the state "?" can be used to indicate that the state of this site is unknown.
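
The format just described can be read with a few lines of code. The following Python sketch handles only the sequential form and assumes a 10-character name field; the function name parse_restriction_sites is hypothetical, and this is an illustration rather than the program's own input routine.

```python
def parse_restriction_sites(text, name_length=10):
    """Parse sequential-format restriction sites data into a dict of site strings."""
    lines = [line for line in text.splitlines() if line.strip()]
    n_species, n_sites, n_enzymes = (int(tok) for tok in lines[0].split()[:3])
    synonyms = {"1": "+", "0": "-"}
    data = {}
    row = 1
    for _ in range(n_species):
        line = lines[row]
        row += 1
        name = line[:name_length].strip()
        chars = [c for c in line[name_length:] if not c.isspace()]
        while len(chars) < n_sites:        # sites may continue on later lines
            chars.extend(c for c in lines[row] if not c.isspace())
            row += 1
        sites = "".join(synonyms.get(c, c) for c in chars)
        if len(sites) != n_sites or set(sites) - set("+-?"):
            raise ValueError("bad site data for species " + name)
        data[name] = sites
    return n_species, n_sites, n_enzymes, data


example = """   2   5   1
Alpha     ++-+-
Beta      1101?
"""
print(parse_restriction_sites(example))
```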

The options are selected using an interactive menu. The menu looks like this:


Restriction site or fragment distances, version 3.69

Settings for this run:
  R           Restriction sites or fragments?  Sites
  N        Original or modified Nei/Li model?  Modified
  G  Gamma distribution of rates among sites?  No
  T            Transition/transversion ratio?  2.000000
  S                              Site length?  6.0
  L                  Form of distance matrix?  Square
  M               Analyze multiple data sets?  No
  I              Input sequences interleaved?  Yes
  0       Terminal type (IBM PC, ANSI, none)?  ANSI
  1       Print out the data at start of run?  No
  2     Print indications of progress of run?  Yes

  Y to accept these or type the letter for one to change

The user either types "Y" (followed, of course, by a carriage-return) if the settings shown are to be accepted, or the letter or digit corresponding to an option that is to be changed.

The R option toggles between a restriction sites distance, which is the default setting, and a restriction fragments distance. In both cases, another option appears, the N (Nei/Li) option. This allows the user to choose the original Nei and Li (1979) restriction sites or fragments distances rather than my modified versions, which are the defaults.

If the G (Gamma distribution) option is selected, the user will be asked to supply the coefficient of variation of the rate of substitution among sites. This is different from the parameters used by Nei and Jin, who introduced Gamma distribution of rates in DNA distances, but related to their parameters: their parameter a is also known as "alpha", the shape parameter of the Gamma distribution. It is related to the coefficient of variation by

     CV = 1 / a^{1/2}

or

     a = 1 / (CV)^2

(their parameter b is absorbed here by the requirement that time is scaled so that the mean rate of evolution is 1 per unit time, which means that a = b). As we consider cases in which the rates are less variable we should set a larger and larger, as CV gets smaller and smaller.
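
For users who have a value of "alpha" from another program, the conversion is a one-line calculation. A small Python sketch (the function names are hypothetical):

```python
import math


def alpha_from_cv(cv):
    """Shape parameter a of the Gamma distribution from the coefficient of variation."""
    if cv <= 0.0:
        raise ValueError("CV must be positive")
    return 1.0 / (cv * cv)


def cv_from_alpha(alpha):
    """Coefficient of variation from the shape parameter a."""
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    return 1.0 / math.sqrt(alpha)


print(alpha_from_cv(0.5))   # 4.0: less variable rates, larger a
print(cv_from_alpha(4.0))   # 0.5
```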

The Gamma distribution option is not available when using the original Nei/Li restriction fragments distance.

The T option is the Transition/transversion option. The user is prompted for a real number greater than 0.0, as the expected ratio of transitions to transversions. Note that this is the resulting expected ratio of transitions to transversions. The default value of the T parameter if you do not use the T option is 2.0. The T option is not available when you choose the original Nei/Li restriction fragment distance, which assumes a Jukes-Cantor (1969) model of DNA change, for which the transition/transversion ratio is in effect fixed at 0.5.

The S option selects the site length. This is set to a default value of 6. It can be set to any positive integer. While in the Restml program there is an upper limit on the restriction site length (set by memory limitations), in Restdist there is no effective limit on the size of the restriction sites. A value of 20, which might be appropriate in many cases for RAPD or AFLP data, is typically not practical in Restml, but it is useable in Restdist.

Option L specifies that the output file will have a square matrix of distances. It can be used to change to lower-triangular data matrices. This will usually not be necessary, but if the distance matrices are going to be very large, this alternative can reduce their size by half. The programs which are to use them should then of course be informed that they can expect lower-triangular distance matrices.

The M, I, and 0 options are the usual Multiple data set, Interleaved input, and screen terminal type options. These are described in the main documentation file.

Option 1 specifies that the input data will be written out on the output file before the distances. This is off by default. If it is done, it will make the output file unusable as input to our distance matrix programs.

Option 2 turns off or on the indications of the progress of the run. The program prints out a row of dots (".") indicating the calculation of individual distances. Since the distance matrix is symmetrical, the program only computes the distances for the upper triangle of the distance matrix, and then duplicates the distance to the other corner of the matrix. Thus the rows of dots start out at full length, and then get shorter and shorter.

OUTPUT FORMAT

The output file contains on its first line the number of species. The distance matrix is then printed in standard form, with each species starting on a new line with the species name, followed by the distances to the species in order. These continue onto a new line after every nine distances. If the L option is used, the matrix of distances is in lower triangular form, so that only the distances to the other species that precede each species are printed. Otherwise the distance matrix is square with zero distances on the diagonal. In general the format of the distance matrix is such that it can serve as input to any of the distance matrix programs.
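
A reader for this layout can be sketched as follows, in Python. It assumes that species names occupy the first 10 characters of the line on which each species starts and that continuation lines hold only numbers; the function name read_distance_matrix is hypothetical.

```python
def read_distance_matrix(text, lower_triangular=False, name_length=10):
    """Read a square or lower-triangular distance matrix into a full square list."""
    lines = [line for line in text.splitlines() if line.strip()]
    n = int(lines[0].split()[0])
    names, rows = [], []
    row = 1
    for i in range(n):
        expected = i if lower_triangular else n
        line = lines[row]
        row += 1
        names.append(line[:name_length].strip())
        values = [float(tok) for tok in line[name_length:].split()]
        while len(values) < expected:      # distances continue on later lines
            values.extend(float(tok) for tok in lines[row].split())
            row += 1
        rows.append(values[:expected])
    matrix = [[0.0] * n for _ in range(n)]
    for i, values in enumerate(rows):
        for j, d in enumerate(values):
            matrix[i][j] = d
            if lower_triangular:
                matrix[j][i] = d           # fill in the mirrored entry
    return names, matrix


lower = """    3
A
B           0.100000
C           0.200000  0.300000
"""
print(read_distance_matrix(lower, lower_triangular=True))
```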

If the option to print out the data is selected, the data in the output file will be preceded by more complete information on the input and the menu selections. The output file begins by giving the number of species and the number of characters.

The distances printed out are scaled in terms of expected numbers of substitutions per DNA site, counting both transitions and transversions but not replacements of a base by itself, and scaled so that the average rate of change, averaged over all sites analyzed, is set to 1.0. Thus when the G option is used, the rate of change at one site may be higher than at another, but their mean is expected to be 1.

PROGRAM CONSTANTS

The constants available to be changed are "initialv" and "iterationsr". The constant "initialv" is the starting value of the distance in the iterations. This will typically not need to be changed. The constant "iterationsr" is the number of times that the Newton-Raphson method which is used to solve the equations for the distances is iterated. The program can be speeded up by reducing the number of iterations from the default value of 20, but at the possible risk of computing the distance less accurately.


TEST DATA SET

   5   13   2
Alpha     ++-+-++--+++-
Beta      ++++--+--+++-
Gamma     -+--+-++-+-++
Delta     ++-+----++---
Epsilon   ++++----++---


CONTENTS OF OUTPUT FILE (with all numerical options on)

(Note that when the options for displaying the input data are turned off, the output is in a form suitable for use as an input file in the distance matrix programs).


    5 Species,   13 Sites

Name            Sites
----            -----

Alpha        ++-+-++--+ ++-
Beta         ++++--+--+ ++-
Gamma        -+--+-++-+ -++
Delta        ++-+----++ ---
Epsilon      ++++----++ ---


Alpha       0.000000  0.022368  0.107681  0.082634  0.095581
Beta        0.022368  0.000000  0.107681  0.082634  0.056895
Gamma       0.107681  0.107681  0.000000  0.192466  0.207319
Delta       0.082634  0.082634  0.192466  0.000000  0.015949
Epsilon     0.095581  0.056895  0.207319  0.015949  0.000000
restml

version 3.696

Restml -- Restriction sites Maximum Likelihood program

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program implements a maximum likelihood method for restriction sites data (not restriction fragment data). This program is one of the slowest programs in this package, and can be very tedious to run. It is possible to have the program search for the maximum likelihood tree. It will be more practical for some users (those that do not have fast machines) to use the U (User Tree) option, which takes less run time, optimizing branch lengths and computing likelihoods for particular tree topologies suggested by the user. The model used here is essentially identical to that used by Smouse and Li (1987) who give explicit expressions for computing the likelihood for three-species trees. It does not place prior probabilities on trees as they do. The present program extends their approach to multiple species by a technique which, while it does not give explicit expressions for likelihoods, does enable their computation and the iterative improvement of branch lengths. It also allows for multiple restriction enzymes. The algorithm has been described in a paper (Felsenstein, 1992). Another relevant paper is that of DeBry and Slade (1985).

The assumptions of the present model are:

  1. Each restriction site evolves independently.
  2. Different lineages evolve independently.
  3. Each site undergoes substitution at an expected rate which we specify.
  4. Substitutions consist of replacement of a nucleotide by one of the other three nucleotides, chosen at random.

Note that if the existing base is, say, an A, the chance of it being replaced by a G is 1/3, and so is the chance that it is replaced by a T. This means that there can be no difference in the (expected) rate of transitions and transversions. Users who are upset at this might ponder the fact that a version allowing different rates of transitions and transversions would run an estimated 16 times slower. If it also allowed for unequal frequencies of the four bases, it would run about 300,000 times slower! For the moment, until a better method is available, I guess I'll stick with this one!

INPUT FORMAT AND OPTIONS

Subject to these assumptions, the program is an approximately correct maximum likelihood method. The input is fairly standard, with one addition. As usual the first line of the file gives the number of species and the number of sites, but there is also a third number, which is the number of different restriction enzymes that were used to detect the restriction sites. Thus a data set with 10 species and 35 different sites, representing digestion with 4 different enzymes, would have the first line of the data file look like this:

   10   35    4

The site data are in standard form. Each species starts with a species name whose maximum length is given by the constant "nmlngth" (whose value in the program as distributed is 10 characters). The name should, as usual, be padded out to that length with blanks if necessary. The sites data then follows, one character per site (any blanks will be skipped and ignored). Like the DNA and protein sequence data, the restriction sites data may be either in the "interleaved" form or the "sequential" form. Note that if you are analyzing restriction sites data with the programs Dollop or Mix or other discrete character programs, at the moment those programs do not use the "aligned" or "interleaved" data format. Therefore you may want to avoid that format when you have restriction sites data that you will want to feed into those programs.

The presence of a site is indicated by a "+" and the absence by a "-". I have also allowed the use of "1" and "0" as synonyms for "+" and "-", for compatibility with Mix and Dollop which do not allow "+" and "-". If the presence of the site is unknown (for example, if the DNA containing it has been deleted so that one does not know whether it would have contained the site) then the state "?" can be used to indicate that the state of this site is unknown.

User-defined trees may follow the data in the usual way. The trees must be unrooted, which means that at their base they must have a trifurcation.

The options are selected by a menu, which looks like this:


Restriction site Maximum Likelihood method, version 3.69

Settings for this run:
  U                 Search for best tree?  Yes
  A               Are all sites detected?  No
  S        Speedier but rougher analysis?  Yes
  G                Global rearrangements?  No
  J   Randomize input order of sequences?  No. Use input order
  L                          Site length?  6
  O                        Outgroup root?  No, use as outgroup species  1
  M           Analyze multiple data sets?  No
  I          Input sequences interleaved?  Yes
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4       Write out trees onto tree file?  Yes

  Y to accept these or type the letter for one to change

The U, J, O, M, and 0 options are the usual ones, described in the main documentation file. The user trees for option U are read from a file whose default name is intree. The I option selects between Interleaved and Sequential input data formats, and is described in the documentation file for the molecular sequences programs.

The G (global search) option causes, after the last species is added to the tree, each possible group to be removed and re-added. This improves the result, since the position of every species is reconsidered. It approximately triples the run-time of the program.

The two options specific to this program are the A and L options. The L (Length) option allows the user to specify the length in bases of the restriction sites. At the moment the program assumes that all sites have the same length (for example, that all enzymes are 6-base-cutters). The default value for this parameter is 6, which will be used if the L option is not invoked. A desirable future development for the package would be allowing the L parameter to be different for every site. It would also be desirable to allow for ambiguities in the recognition site, since some enzymes recognize 2 or 4 sequences. Both of these would require fairly complicated programming or else slower execution times.

The A (All) option specifies that all sites are detected, even those for which all of the species have the recognition sequence absent (character state "-"). The default condition is that it is assumed that such sites will not occur in the data. The likelihood computed when the A option is not used is the probability of the pattern of sites given that tree and conditional on the pattern not being all absences. This will be realistic for most data, except for cases in which the data are extracted from sites data for a larger number of species, in which case some of the site positions could have all absences in the subset of species. In such cases an effective way of analyzing the data would be to omit those sites and not use the A option, as such positions, even if not absolutely excluded, are nevertheless less likely than random to have been incorporated in the data set.

The W (Weights) option, which is invoked in the input file rather than in the menu, allows the user to select a subset of sites to be analyzed. It is invoked in the usual way, except that only weights 0 and 1 are allowed. If the W option is not used, all sites will be analyzed. If the Weights option is used, there must be a W in the first line of the input file.

OUTPUT FORMAT

The output starts by giving the number of species, and the number of sites. If the default condition is used instead of the A option the program states that it is assuming that sites absent in all species have been omitted. The value of the site length (6 bases, for example) is also given.

If option 1 (print out data) has been selected, there then follow the restriction site sequences, printed in groups of ten sites. The trees found are printed as an unrooted tree topology (possibly rooted by outgroup if so requested). The internal nodes are numbered arbitrarily for the sake of identification. The number of trees evaluated so far and the log likelihood of the tree are also given.

A table is printed showing the length of each tree segment, as well as (very) rough confidence limits on the length. As with Dnaml, if a confidence limit is negative, this indicates that rearrangement of the tree in that region is not excluded, while if both limits are positive, rearrangement is still not necessarily excluded because the variance calculation on which the confidence limits are based results in an underestimate, which makes the confidence limits too narrow.

In addition to the confidence limits, the program performs a crude Likelihood Ratio Test (LRT) for each branch of the tree. The program computes the ratio of likelihoods with and without this branch length forced to zero length. This is done by comparing the likelihoods changing only that branch length. A truly correct LRT would force that branch length to zero and also allow the other branch lengths to adjust to that. The result would be a likelihood ratio closer to 1. Therefore the present LRT will err on the side of being too significant.

One should also realize that if you are looking not at a previously-chosen branch but at all branches, you are seeing the results of multiple tests. With 20 tests, one is expected to reach significance at the P = .05 level purely by chance. You should therefore use a much more conservative significance level, such as .05 divided by the number of tests. The significance of these tests is shown by printing asterisks next to the confidence interval on each branch length. It is important to keep in mind that both the confidence limits and the tests are very rough and approximate, and probably indicate more significance than they should. Nevertheless, maximum likelihood is one of the few methods that can give you any indication of its own error; most other methods simply fail to warn the user that there is any error! (In fact, whole philosophical schools of taxonomists exist whose main point seems to be that there isn't any error, that the "most parsimonious" tree is the best tree by definition and that's that).
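
The more conservative level suggested above is easy to work out. An unrooted bifurcating tree with n species has 2n - 3 branches, so if every branch is tested the number of tests is known in advance. A small Python sketch (the function name is hypothetical):

```python
def per_test_level(n_tests, overall_level=0.05):
    """Significance level for each test: the overall level divided by the number of tests."""
    if n_tests < 1:
        raise ValueError("need at least one test")
    return overall_level / n_tests


n_species = 5
n_branches = 2 * n_species - 3     # 7 branches on an unrooted bifurcating tree
print(n_branches, per_test_level(n_branches))
```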

The log likelihood printed out with the final tree can be used to perform various likelihood ratio tests. Remember that testing one tree topology against another is not a simple matter, because two different tree topologies are not hypotheses that are nested one within the other. If the trees differ by only one branch swap, it seems to be conservative to test the difference between their likelihoods with one degree of freedom, but other than that little is known and more work on this is needed.

If the U (User Tree) option is used and more than one tree is supplied, and the program is not told to assume autocorrelation between the rates at different sites, the program also performs a statistical test of each of these trees against the one with highest likelihood. If there are two user trees, the test done is one which is due to Kishino and Hasegawa (1989), a version of a test originally introduced by Templeton (1983). In this implementation it uses the mean and variance of log-likelihood differences between trees, taken across sites. If the two trees' means are more than 1.96 standard deviations different then the trees are declared significantly different. This use of the empirical variance of log-likelihood differences is more robust and nonparametric than the classical likelihood ratio test, and may to some extent compensate for any lack of realism in the model underlying this program.
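
The two-tree comparison can be sketched in Python as follows. The inputs are per-site log-likelihoods for the two trees, which are hypothetical here (the program computes them internally), and the details, such as the n/(n-1) variance correction, are assumptions of this sketch rather than a transcription of the program's code.

```python
import math


def kh_test(site_lnl_a, site_lnl_b, critical=1.96):
    """Compare two trees using the per-site differences in log-likelihood.

    Returns (total difference, its estimated standard deviation, significant?).
    """
    n = len(site_lnl_a)
    if n != len(site_lnl_b) or n < 2:
        raise ValueError("need two equal-length lists with at least two sites")
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    total = sum(diffs)
    mean = total / n
    # Variance of the summed difference, estimated from the spread across sites.
    var_total = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    sd = math.sqrt(var_total)
    significant = sd > 0.0 and abs(total) > critical * sd
    return total, sd, significant


tree_a = [-1.0, -2.0, -3.0, -4.0]
tree_b = [-2.0, -2.0, -3.0, -4.0]
print(kh_test(tree_a, tree_b))
```

In this made-up example the whole difference comes from a single site, so the estimated standard deviation is as large as the difference itself and the trees are not declared significantly different.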

If there are more than two trees, the test done is an extension of the KHT test, due to Shimodaira and Hasegawa (1999). They pointed out that a correction for the number of trees was necessary, and they introduced a resampling method to make this correction. In the version used here the variances and covariances of the sum of log likelihoods across sites are computed for all pairs of trees. To test whether the difference between each tree and the best one is larger than could have been expected if they all had the same expected log-likelihood, log-likelihoods for all trees are sampled with these covariances and equal means (Shimodaira and Hasegawa's "least favorable hypothesis"), and a P value is computed from the fraction of times the difference between the tree's value and the highest log-likelihood exceeds that actually observed. Note that this sampling needs random numbers, and so the program will prompt the user for a random number seed if one has not already been supplied. With the two-tree KHT test no random numbers are used.

In either the KHT or the SH test the program prints out a table of the log-likelihoods of each tree, the differences of each from the highest one, the variance of that quantity as determined by the log-likelihood differences at individual sites, and a conclusion as to whether that tree is or is not significantly worse than the best one.

The branch lengths printed out are scaled in terms of expected numbers of base substitutions, not counting replacements of a base by itself. Of course, when a branch is twice as long this does not mean that there will be twice as much net change expected along it, since some of the changes occur in the same site and overlie or even reverse each other. Confidence limits on the branch lengths are also given. Of course a negative value of the branch length is meaningless, and a confidence limit overlapping zero simply means that the branch length is not necessarily significantly different from zero. Because of limitations of the numerical algorithm, branch length estimates of zero will often print out as small numbers such as 0.00001. If you see a branch length that small, it is really estimated to be of zero length.

Another possible source of confusion is the existence of negative values for the log likelihood. This is not really a problem; the log likelihood is not a probability but the logarithm of a probability, and since probabilities never exceed 1.0 this logarithm will typically be negative. The log likelihood is maximized by being made more positive: -30.23 is worse than -29.14. The log likelihood will not always be negative since a combinatorial constant has been left out of the expression for the likelihood. This does not affect the tree found or the likelihood ratios (or log likelihood differences) between trees.

THE ALGORITHM

The program uses a Newton-Raphson algorithm to update one branch length at a time. This is faster than the EM algorithm which was described in my paper on restriction sites maximum likelihood (Felsenstein, 1992). The likelihood that is being maximized is the same one used by Smouse and Li (1987) extended for multiple species.

PROGRAM CONSTANTS

The constants include "maxcutter" (set in phylip.h), the maximum length of an enzyme recognition site. The memory used by the program will be approximately proportional to this value, which is 8 in the distribution copy. The program also uses constants "iterations", "smoothings", and "epsilon". Reducing "iterations" and "smoothings" or increasing "epsilon" will result in faster execution but a worse result. These values will not usually have to be changed.

The program spends most of its time doing real arithmetic. The algorithm, with separate and independent computations occurring at each site, lends itself readily to parallel processing.

A feature of the algorithm is that it saves time by recognizing sites at which the pattern of presence/absence is the same, and does that computation only once. Thus if we have only four species but a large number of sites, there are only about (ignoring ambiguous bases) 16 different patterns of presence/absence (2 x 2 x 2 x 2) that can occur. The program automatically counts occurrences of each and does the computation for each pattern only once, so that it only needs to do as much computation as would be needed with at most 16 sites, even though the number of sites is actually much larger. Thus the program will run very effectively with few species and many sites.
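
The counting of patterns can be illustrated with a short Python sketch (not the program's own code; the function name count_site_patterns is hypothetical). Each site is a column across the species, and identical columns are tallied together:

```python
from collections import Counter


def count_site_patterns(sequences):
    """Count how often each presence/absence pattern (column) occurs."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all species must have the same number of sites")
    return Counter("".join(column) for column in zip(*sequences))


test_data = [
    "++-+-++--+++-",  # Alpha
    "++++--+--+++-",  # Beta
    "-+--+-++-+-++",  # Gamma
    "++-+----++---",  # Delta
    "++++----++---",  # Epsilon
]
patterns = count_site_patterns(test_data)
print(len(patterns), sum(patterns.values()))  # 8 distinct patterns among 13 sites
```

The log likelihood of the data is then the sum, over distinct patterns, of the count times the log likelihood of that pattern, so each pattern's likelihood needs to be computed only once.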

PAST AND FUTURE OF THE PROGRAM

This program was developed by modifying Dnaml version 3.1 and also adding some of the modifications that were added to Dnaml version 3.2, with which it shares many of its data structures and much of its strategy. Version 3.6 changed from EM iterations of branch lengths, which involved arbitrary extrapolation factors, to the Newton-Raphson algorithm, which improved the speed of the program (though only from "very slow" to "slow").

There are a number of obvious directions in which the program needs to be modified in the future. Extension to allow for different rates of transition and transversion is straightforward, but would slow down the program considerably, as I have mentioned above. I have not included in the program any provision for saving and printing out multiple trees tied for highest likelihood, in part because an exact tie is unlikely.


TEST DATA SET

   5   13   2
Alpha     ++-+-++--+++-
Beta      ++++--+--+++-
Gamma     -+--+-++-+-++
Delta     ++-+----++---
Epsilon   ++++----++---


CONTENTS OF OUTPUT FILE (if all numerical options are on)


Restriction site Maximum Likelihood method, version 3.69

   5 Species,   13 Sites,   2 Enzymes

  Recognition sequences all 6 bases long

Sites absent from all species are assumed to have been omitted


Name            Sites
----            -----

Alpha        ++-+-++--+ ++-
Beta         ++++--+--+ ++-
Gamma        -+--+-++-+ -++
Delta        ++-+----++ ---
Epsilon      ++++----++ ---





  +Beta      
  |  
  |      +Epsilon   
  |  +---3  
  2--1   +Delta     
  |  |  
  |  +-----Gamma     
  |  
  +Alpha     


remember: this is an unrooted tree!

Ln Likelihood =   -40.49476

 
 Between        And            Length      Approx. Confidence Limits
 -------        ---            ------      ------- ---------- ------
   2          Beta            0.00100     (     zero,    infinity)
   2             1            0.00010     (     zero,     0.04003)
   1             3            0.05898     (     zero,     0.12733) **
   3          Epsilon         0.00100     (     zero,    infinity)
   3          Delta           0.01458     (     zero,     0.04490) **
   1          Gamma           0.11470     (  0.01732,     0.22664) **
   2          Alpha           0.02473     (     zero,     0.06180) **

     *  = significantly positive, P < 0.05
     ** = significantly positive, P < 0.01


retree

version 3.696

Retree -- Interactive Tree Rearrangement

© Copyright 1993-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Retree is a tree editor. It reads in a tree, or allows the user to construct one, and displays this tree on the screen. The user then can specify how the tree is to be rearranged, rerooted or written out to a file.

The input trees are in one file (with default file name intree), the output trees are written into another (outtree). The user can reroot, flip branches, change names of species, change or remove branch lengths, and move around to look at various parts of the tree if it is too large to fit on the screen. The trees can be multifurcating at any level, although the user is warned that many PHYLIP programs still cannot handle multifurcations above the root, or even at the root.

A major use for this program will be to change rootedness of trees so that a rooted tree derived from one program can be fed in as an unrooted tree to another (you are asked about this when you give the command to write out the tree onto the tree output file). It will also be useful for specifying the length of a branch in a tree where you want a program like Dnaml, Dnamlk, Fitch, or Contml to hold that branch length constant (see the L suboption of the User Tree option in those programs). It will also be useful for changing the order of species for purely cosmetic reasons for Drawgram and Drawtree, including using the Midpoint method of rooting the tree. It can also be used to write out a tree file in the Nexus format used by Paup and MacClade or in our XML tree file format.

This program uses graphic characters that show the tree to best advantage on some computer systems. Its graphic characters will work best on MSDOS systems or MSDOS windows in Windows, and on any system whose screen or terminals emulate ANSI standard terminals such as old Digital VT100 terminals, Telnet programs, or VT100-compatible windows in the X windowing system. For any other screen types (such as Macintosh windows) there is a generic option which does not make use of screen graphics characters. The program will work well in those cases, but the tree it displays will look a bit uglier.

As we will see below, the initial menu of the program allows you to choose among three screen types (PCDOS, Ansi, and none). If you want to avoid having to make this choice every time, you can change some of the constants in the file phylip.h to have the terminal type initialize itself in the proper way, and recompile. We have tried to have the default values be correct for PC, Macintosh, and Unix screens. If the setting is "none" (which is necessary on Macintosh MacOS 9 screens), the special graphics characters will not be used to indicate nucleotide states, but only letters will be used for the four nucleotides. This is less easy to look at.

The constants that need attention are ANSICRT and IBMCRT. Currently these are both set to "false" on Macintosh MacOS 9 systems, to "true" on MacOS X and on Unix/Linux systems, and IBMCRT is set to "true" on Windows systems. If your system has an ANSI compatible terminal, you might want to find the definition of ANSICRT in phylip.h and set it to "true", and IBMCRT to "false".

The user interaction starts with the program presenting a menu. The menu looks like this:


Tree Rearrangement, version 3.69

Settings for this run:
  U          Initial tree (arbitrary, user, specify)?  User tree from tree file
  N   Format to write out trees (PHYLIP, Nexus, XML)?  PHYLIP
  0                     Graphics type (IBM PC, ANSI)?  ANSI
  W       Width of terminal screen, of plotting area?  80, 80
  L                        Number of lines on screen?  24

Are these settings correct? (type Y or the letter for one to change)

The 0 (Graphics type) option is the usual one and is described in the main documentation file. The U (initial tree) option allows the user to choose whether the initial tree is to be arbitrary, interactively specified by the user, or read from a tree file. Typing U causes the program to change among the three possibilities in turn. Usually we will want to use a User Tree from a file. It requires that you have available a tree file with the tree topology of the initial tree. If you wish to set up some other particular tree you can either use the "specify" choice in the initial tree option (which is somewhat clumsy to use) or rearrange a User Tree of an arbitrary tree into the shape you want by using the rearrangement commands given below.

The L (screen Lines) option allows the user to change the height of the screen (in lines of characters) that is assumed to be available on the display. This may be particularly helpful when displaying large trees on displays that have more than 24 lines per screen, or on workstation or X-terminal screens that can emulate the ANSI terminals with more than 24 lines.

The N (output file format) option allows the user to specify that the tree files that are written by the program will be in one of three formats:

  1. The PHYLIP default file format (the Newick standard) used by the programs in this package.
  2. The Nexus format defined by David Swofford and by Wayne Maddison and David Maddison for their programs PAUP and MacClade. A tree file written in Nexus format should be directly readable by those programs (they also have options to read a regular PHYLIP tree file).
  3. An XML tree file format which we have defined.

The XML tree file format is fairly simple. The tree file, which may have multiple trees, is enclosed in a pair of <PHYLOGENIES> ... </PHYLOGENIES> tags. Each tree is included in tags <PHYLOGENY> ... </PHYLOGENY>. Each branch of the tree is enclosed in a pair of tags <CLADE> ... </CLADE>, which enclose the branch and all its descendants. If the branch has a length, this is given by the LENGTH attribute of the CLADE tag, so that the pair of tags looks like this:

<CLADE LENGTH="0.09362"> ... </CLADE>

A tip of the tree is at the end of a branch (and hence that branch is enclosed in a pair of <CLADE> ... </CLADE> tags). Its name is enclosed by <NAME> ... </NAME> tags. Here is an XML tree:

<phylogenies>
  <phylogeny>
    <clade>
      <clade length="0.87231"><name>Mouse</name></clade>
      <clade length="0.49807"><name>Bovine</name></clade>
      <clade length="0.39538">
        <clade length="0.25930"><name>Gibbon</name></clade>
        <clade length="0.10815">
          <clade length="0.24166"><name>Orang</name></clade>
          <clade length="0.04405">
            <clade length="0.12322"><name>Gorilla</name></clade>
            <clade length="0.06026">
              <clade length="0.13846"><name>Chimp</name></clade>
              <clade length="0.0857"><name>Human</name></clade>
            </clade>
          </clade>
        </clade>
      </clade>
    </clade>
  </phylogeny>
</phylogenies>
  

The indentation is for readability but is not part of the XML tree standard, which ignores that kind of white space.
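
As an illustration of how simple the format is to process, here is a minimal sketch that reads such a tree with Python's standard xml.etree.ElementTree module and writes it as a Newick string. It is not part of PHYLIP; the function names are hypothetical, the input is a fragment of the example tree above, and the sketch assumes lower-case tag and attribute names as in that example (XML names are case-sensitive).

```python
import xml.etree.ElementTree as ET

# A fragment of the example tree above (Gorilla, Chimp, Human only).
XML_TREE = """
<phylogenies>
  <phylogeny>
    <clade>
      <clade length="0.12322"><name>Gorilla</name></clade>
      <clade length="0.06026">
        <clade length="0.13846"><name>Chimp</name></clade>
        <clade length="0.0857"><name>Human</name></clade>
      </clade>
    </clade>
  </phylogeny>
</phylogenies>
"""

def clade_to_newick(clade):
    """Convert one <clade> element, and all its descendants, to Newick."""
    children = clade.findall("clade")
    if children:
        label = "(" + ",".join(clade_to_newick(c) for c in children) + ")"
    else:
        # A tip: take its name, using underscores where Newick needs them.
        label = clade.findtext("name", default="").strip().replace(" ", "_")
    length = clade.get("length")  # None when the branch has no length
    return label if length is None else f"{label}:{length}"

def xml_to_newick(text):
    """Return one Newick string per <phylogeny> element in the file."""
    root = ET.fromstring(text)
    return [clade_to_newick(p.find("clade")) + ";"
            for p in root.iter("phylogeny")]

trees = xml_to_newick(XML_TREE)
print(trees[0])
```

Because each CLADE element encloses its whole subtree, a single recursive function is enough; branch lengths are copied as text so that no precision is lost.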

What programs can read an XML tree? None right now, not even PHYLIP programs! But soon our lab's LAMARC package will have programs that can read an XML tree. XML is rapidly becoming the standard for representing and interchanging complex data -- it is time to have an XML tree standard. Certain extensions are obvious (to represent the bootstrap proportion for a branch, use BOOTP=0.83 in the CLADE tag, for example).

There are other proposals for an XML tree standard. They have many similarities to this one, but are not identical to it. At the moment there is no mechanism in place for deciding between them other than seeing which get widely used. Here are links to other proposals:

  1. Taxonomic Markup Language, at http://www.albany.edu/~gilmr/pubxml/, with a preprint at xml.coverpages.org/gilmour-TML.pdf; published in the paper by Ron Gilmour (2000).
  2. Andrew Rambaut's BEAST XML phylogeny format. See page 9 of the PDF of the BEAST manual at http://evolve.zoo.ox.ac.uk/beast/; an XML format for phylogenies is briefly described there.
  3. treeml, Jean-Daniel Fekete's DTD for a tree XML file, at http://www.nomencurator.org/InfoVis2003/download/treeml.dtd (see also the example at http://www.cs.umd.edu/hcil/iv03contest/datasets/treeml-sample.xml).

The W (screen and window Width) option specifies the width in characters of the area which the trees will be plotted to fit into. This is by default 80 characters so that they will fit on a normal width terminal. The actual width of the display on the terminal (normally 80 characters) will be regarded as a window displaying part of the tree. Thus you could set the "plotting area" to 132 characters, and inform the program that the screen width is 80 characters. Then the program will display only part of the tree at any one time. Below we will show how to move the "window" and see other parts of the tree.

After the initial menu is displayed and the choices are made, the program then sets up an initial tree and displays it. Below it will be a one-line menu of possible commands. Here is what the tree and the menu look like (this is the tree specified by the example input tree given at the bottom of this page, as it displays when the terminal type is "none"):

                                      ,>>1:Human
                                   ,>22
                                ,>21  `>>2:Chimp
                                !  !
                             ,>20  `>>>>>3:Gorilla
                             !  !
                 ,>>>>>>>>>>19  `>>>>>>>>4:Orang
                 !           !
              ,>18           `>>>>>>>>>>>5:Gibbon
              !  !
              !  !              ,>>>>>>>>6:Barbary Ma
              !  `>>>>>>>>>>>>>23
              !                 !  ,>>>>>7:Crab-e. Ma
     ,>>>>>>>17                 `>24
     !        !                    !  ,>>8:Rhesus Mac
     !        !                    `>25
     !        !                       `>>9:Jpn Macaq
  ,>16        !
  !  !        `>>>>>>>>>>>>>>>>>>>>>>>>>10:Squir. Mon
  !  !
  !  !                                ,>11:Tarsier
** 7 lines below screen **

NEXT? (Options: R . U W O T F D B N H J K L C + ? X Q) (? for Help)

The tree that was read in had no branch lengths on its branches. The absence of a branch length is indicated by drawing the branch with ">" characters (>>>>>>>). When branches have branch lengths, they are drawn with "-" characters (-------) and their lengths on the screen are approximately proportional to the branch length.

If you type "?" you will get a single screen showing a description of each of these commands in a few words. Here are slightly more detailed descriptions of the commands:

R
("Rearrange"). This command asks for the number of a node which is to be removed from the tree. It and everything to the right of it on the tree is to be removed (by breaking the branch immediately below it). (This is also everything "above" it on the tree when the tree grows upwards, but as the tree grows from left to right on the screen we use "right" rather than "above"). The command also asks whether that branch is to be inserted At a node or Before a node. The first will insert it as an additional branch coming out of an existing node (creating a more multifurcating tree), and the second will insert it so that a new internal node is created in the tree, located in the branch that precedes the node (to the left of it), with the branch that is inserted coming off from that new node. In both cases the program asks you for the number of a node at (or before) which that group is to be inserted. If an impossible number is given, the program refuses to carry out the rearrangement and asks for a new command. The rearranged tree is displayed: it will often have a different number of steps than the original. If you wish to undo a rearrangement, use the Undo command, for which see below.

.
(dot) This command simply causes the current tree to be redisplayed. It is of use when the tree has partly disappeared off of the top of the screen owing to too many responses to commands being printed out at the bottom of the screen.

=
(toggle display of branch lengths). This option is available whenever the tree has a full set of branch lengths. It toggles on and off whether the tree displayed on the screen is shown with the relative branch lengths roughly correct. (It cannot be better than roughly correct because the display is in units of length of whole character widths on the screen). It does not actually remove any branch lengths from the tree: if the tree showing on the screen seems to have no branch lengths after use of the "=" option, if it were written out at that point, it would still have a full set of branch lengths.

U
("Undo"). This command reverses the effect of the most recent rearrangement, outgroup re-rooting, or flipping of branches. It returns to the previous tree topology. It will be of great use when rearranging the tree, and when one makes a mistake, it permits you to abandon the new one and return to the previous one without remembering its topology in detail. Some operations, such as the simultaneous removal of lengths from all branches, cannot be reversed.

W
("Write"). This command writes out the current tree onto a tree output file. If the file already has been written to by this run of Retree, it will ask you whether you want to replace the contents of the file, add the tree to the end of the file, or not write out the tree to the file. It will also ask you whether you want the tree to be written out as Rooted or Unrooted. If you choose Unrooted, it will write the outermost split of the tree as a three-way split with the three branches being those that issue from one of the nodes. This node will be the left (upper) interior node which is next to the root, or the other one if there is no interior node to the left (above) the root. The tree is written in the standard format used by PHYLIP (a subset of the Newick standard), in the Nexus format, or in an XML tree file format. A normal PHYLIP tree is in the proper format to serve as the User-Defined Tree for setting up the initial tree in a subsequent run of the program. However, some programs also require a line in the tree input file that gives the number of trees in the file. You may have to add this line using an editor such as vi, Emacs, Windows Notepad, or MacOS's Simpletext.

O
("Outgroup"). This asks for the number of a node which is to be the outgroup. The tree will be redisplayed with that node as the left descendant of the root fork. Note that it is possible to use this to make a multi-species group the outgroup (i.e., you can give the number of an interior node of the tree as the outgroup, and the program will re-root the tree properly with that on the left of the bottom fork).

M
("Midpoint root"). This reroots a tree that has a complete set of branches using the Midpoint rooting method. That rooting method finds the centroid of the tree -- the point that is equidistant from the two farthest points of the tree, and roots the tree there. This is the point in the middle of the longest path from one tip to another in the tree. This has the effect of making the two farthest tips stick out an equal distance to the right. Note that as the tree is rerooted, the scale may change on the screen so that it looks like it has suddenly gotten a bit longer. It will not have actually changed in total length. This option is not in the menu if the tree does not have a full set of branch lengths.

T
("Transpose"). This asks for a node number and then flips the two branches at that node, so that the left-right order of branches at that node is changed. This also does not actually change the tree topology but it does change the appearance of the tree. However, unlike the F option discussed below, the individual subtrees defined by those branches do not have the order of any branches reversed in them.

F
("Flip"). This asks for a node number and then flips the entire subtree at that node, so that the left-right order of branches in the whole subtree is changed. This does not actually change the tree topology but it does change the appearance of the tree. Note that it works differently than the F option in the programs Move, Dnamove, and Dolmove, which is actually like the T option mentioned above.
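
The difference between the T and F commands may be easier to see in a small sketch. This is only an illustration, not Retree's code: the tree is represented as nested Python lists (an assumption made here for brevity), and the function names transpose and flip are hypothetical.

```python
def transpose(node):
    """Reverse the order of the branches at this node only (like T)."""
    if isinstance(node, str):      # a tip has no branches to reorder
        return node
    return list(reversed(node))

def flip(node):
    """Reverse the branch order throughout the whole subtree (like F)."""
    if isinstance(node, str):
        return node
    return [flip(child) for child in reversed(node)]

# The subtree ((Human,Chimp),Gorilla) from the test input tree.
subtree = [["Human", "Chimp"], "Gorilla"]

transposed = transpose(subtree)    # inner order Human, Chimp is kept
flipped = flip(subtree)            # inner order becomes Chimp, Human

print(transposed)
print(flipped)
```

In both cases the topology is unchanged; only the order in which the branches would be drawn differs.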

B
("Branch length"). This asks you for the number of a node which is at the end of a branch length, then asks you whether you want to enter a branch length for that branch, change the branch length for that branch (if there is one already) or remove the branch length from the branch.

N
("Name"). This asks you which species you want to change the name for (referring to it by the number for that branch), then gives you the option of either removing the name, typing a new name, or leaving the name as is. Be sure not to enter a parenthesis ("(" or ")"), a colon (":"), a comma (",") or a semicolon (";") in a name, as those may be mistaken for structural information about the tree when the tree file is read by another program.
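
If names are being prepared outside the program, a check along the following lines could be used to catch the problem characters before they reach a tree file. This is a hypothetical helper, not something Retree provides.

```python
# Characters that have structural meaning in a Newick tree file.
FORBIDDEN = set("():,;")

def name_is_safe(name):
    """Return True if the name contains none of the structural characters."""
    return not (FORBIDDEN & set(name))

print(name_is_safe("Barbary Ma"))      # a name from the test tree
print(name_is_safe("Homo (sapiens)"))  # contains parentheses
```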

H, J, K, or L.
These are the movement commands for scrolling the "window" across a tree. H moves the "window" leftwards (though not beyond column 1), J moves it down, K up, and L right. The "window" will move 20 columns or rows at a time, and the tree will be redrawn in the new "window". Note that this amount of movement is not a full screen.

C
("Clade"). The C command instructs the program to print out only that part of the tree (the "clade") from a certain node on up. The program will prompt you for the number of this node. Remember that thereafter you are not looking at the whole tree. To go back to looking at the whole tree give the C command again and enter "0" for the node number when asked. Most users will not want to use this option unless forced to, as much can be accomplished with the window movement commands H, J, K, and L.

+
("next tree"). This causes the program to read in the next tree in the input file, if there is one.

?
("Help"). Prints a one-screen summary of what the commands do, a few words for each command.

X
("Exit"). Exit from program. If the current tree has not yet been saved into a file, the program will first ask you whether it should be saved.

Q
("Quit"). A synonym for X. Same as the eXit command.

The program was written by Andrew Keeffe, using some code from Dnamove, which he also wrote.

Below is a test tree file. We have already shown above what the resulting tree display looks like when the terminal type is "none". For ANSI or IBM PC screens it will look better, using the graphics characters of those screens, which we do not attempt to show here.


TEST INPUT TREE FILE

((((((((Human,Chimp),Gorilla),Orang),Gibbon),(Barbary_Ma,(Crab-e._Ma,
(Rhesus_Mac,Jpn_Macaq)))),Squir._Mon),((Tarsier,Lemur),Bovine)),Mouse);
seqboot

version 3.696

Seqboot -- Bootstrap, Jackknife, or Permutation Resampling
of Molecular Sequence, Restriction Site,
Gene Frequency or Character Data

© Copyright 1991-2014 by Joseph Felsenstein. All rights reserved. License terms here.

Seqboot is a general bootstrapping and data set translation tool. It is intended to allow you to generate multiple data sets that are resampled versions of the input data set. Since almost all programs in the package can analyze these multiple data sets, this allows almost anything in this package to be bootstrapped, jackknifed, or permuted. Seqboot can handle molecular sequences, binary characters, restriction sites, or gene frequencies. It can also convert data sets between Sequential and Interleaved format, and into the NEXUS format or into a new XML sequence alignment format.

To carry out a bootstrap (or jackknife, or permutation test) with some method in the package, you may need to use three programs. First, you need to run Seqboot to take the original data set and produce a large number of bootstrapped or jackknifed data sets (somewhere between 100 and 1000 is usually adequate). Then you need to find the phylogeny estimate for each of these, using the particular method of interest. For example, if you were using Dnapars you would first run Seqboot and make a file with 100 bootstrapped data sets. Then you would give this file the proper name to have it be the input file for Dnapars. Running Dnapars with the M (Multiple Data Sets) menu choice and informing it to expect 100 data sets, you would generate a big output file as well as a treefile with the trees from the 100 data sets. This treefile could be renamed so that it would serve as the input for Consense. When Consense is run the majority rule consensus tree will result, showing the outcome of the analysis.

This may sound tedious, but the run of Consense is fast, and that of Seqboot is fairly fast, so that it will not actually take any longer than a run of a single bootstrap program with the same original data and the same number of replicates. This is not very hard and allows bootstrapping or jackknifing on many of the methods in this package. The same steps are necessary with all of them. Doing things this way some of the intermediate files (the tree file from the Dnapars run, for example) can be used to summarize the results of the bootstrap in other ways than the majority rule consensus method does.

If you are using the Distance Matrix programs, you will have to add one extra step to this, calculating distance matrices from each of the replicate data sets, using Dnadist or Gendist. So (for example) you would run Seqboot, then run Dnadist using the output of Seqboot as its input, then run (say) Neighbor using the output of Dnadist as its input, and then run Consense using the tree file from Neighbor as its input.

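The resampling methods available, each selected through the menu options described below, are:

  1. The bootstrap, including the block-bootstrap (B option) and sampling of an altered fraction of the characters (% option).
  2. The delete-half jackknife, which can also retain an altered fraction of the characters (% option).
  3. Permutation of species within characters (the Archie-Faith method).
  4. Permutation of the order of the characters.
  5. Shuffling of the character order separately within each species.
  6. Rewriting of the data set without resampling.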

The data input file is of standard form for molecular sequences (either in interleaved or sequential form), restriction sites, gene frequencies, or binary morphological characters.

When the program runs it first asks you for a random number seed. This should be an integer greater than zero (and probably less than 32767) and which is of the form 4n+1, that is, it leaves a remainder of 1 when divided by 4. This can be judged by looking at the last two digits of the integer (for instance 7651 is not of form 4n+1 as 51, when divided by 4, leaves the remainder 3). The random number seed is used to start the random number generator. If the random number seed is not odd, the program will request it again. Any odd number can be used, but may result in a random number sequence that repeats itself after less than the full one billion numbers. Usually this is not a problem. As the random numbers appear to be unpredictable, there is no such thing as a "good" seed -- the numbers produced from one seed are statistically indistinguishable from those produced by another, and it is not true that the numbers produced from one seed (say 4533) are similar to those produced from a nearby seed (say 4537).
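
The 4n+1 rule, and the shortcut of looking only at the last two digits, can be checked with a few lines. The function names here are hypothetical and the sketch is independent of the program itself; the shortcut works because 100 is a multiple of 4.

```python
def is_4n_plus_1(seed):
    """True if seed is a positive integer leaving remainder 1 on division by 4."""
    return seed > 0 and seed % 4 == 1

def is_4n_plus_1_by_last_two_digits(seed):
    """Same test, looking only at the last two digits of the seed."""
    return seed > 0 and (seed % 100) % 4 == 1

for seed in (7651, 4533, 4537):
    print(seed, is_4n_plus_1(seed), is_4n_plus_1_by_last_two_digits(seed))
```

Of the three seeds mentioned above, 4533 and 4537 are of the form 4n+1, while 7651 is odd but leaves a remainder of 3.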

Then the program shows you a menu to allow you to choose options. The menu looks like this:


Bootstrapping algorithm, version 3.69

Settings for this run:
  D      Sequence, Morph, Rest., Gene Freqs?  Molecular sequences
  J  Bootstrap, Jackknife, Permute, Rewrite?  Bootstrap
  %    Regular or altered sampling fraction?  regular
  B      Block size for block-bootstrapping?  1 (regular bootstrap)
  R                     How many replicates?  100
  W              Read weights of characters?  No
  C                Read categories of sites?  No
  S     Write out data sets or just weights?  Data sets
  I             Input sequences interleaved?  Yes
  0      Terminal type (IBM PC, ANSI, none)?  ANSI
  1       Print out the data at start of run  No
  2     Print indications of progress of run  Yes

  Y to accept these or type the letter for one to change

The user selects options by typing one of the letters in the left column, and continues to do so until all options are correctly set. Then the program can be run by typing Y.

It is important to select the correct data type (the D selection). Each time D is typed the program will change data type, proceeding successively through Molecular Sequences, Discrete Morphological Characters, Restriction Sites, and Gene Frequencies. Some of these will cause additional entries to appear in the menu. If Molecular Sequences or Restriction Sites settings are chosen the I (Interleaved) option appears in the menu (and as Molecular Sequences are also the default, it therefore appears in the first menu). It is the usual I option discussed in the Molecular Sequences document file and in the main documentation files for the package, and is on by default.

If the Restriction Sites option is chosen the menu option E appears, which asks whether the input file contains a third number on the first line of the file, for the number of restriction enzymes used to detect these sites. This is necessary because data sets for Restml need this third number, but other programs do not, and Seqboot needs to know what to expect.

If the Gene Frequencies option is chosen a menu option A appears which allows the user to specify that all alleles at each locus are in the input file. The default setting is that one allele is absent at each locus. Note that for sampling methods such as the bootstrap and jackknife, whole loci are sampled, not individual alleles.

The J option allows the user to select Bootstrapping, Delete-Half-Jackknifing, the Archie-Faith permutation of species within characters, permutation of character order, shuffling character order separately within each species, or Rewriting. It changes successively among these each time J is typed.

The P menu option appears if the data are molecular sequences and the J option is used to choose the Rewrite option. It gives you the choice between our normal PHYLIP format or a new (and nonstandard) XML sequence alignment format. This encloses the alignment between <ALIGNMENT> ... </ALIGNMENT> tags. Each sequence is enclosed between <SEQUENCE> ... </SEQUENCE> tags, and has a TYPE attribute which is either "dna", "rna" or "protein". This is set by default to "dna" but can be changed by the user in an S Sequence type menu option. Each sequence has its name, enclosed between <NAME> ... </NAME> tags, and the data itself, enclosed between <DATA> ... </DATA> tags. The XML option is not available unless the data are molecular sequences. It is a new format -- no programs yet read it. In other cases the P menu option does not appear and the PHYLIP output format is assumed. Here is a simple example of this XML sequence alignment format, for the (silly) data set used in our main documentation file:

<alignment>
   <sequence type="dna">
      <name>Archaeopt</name>
      <data>CGATGCTTAC CGC</data>
   </sequence>

   <sequence type="dna">
      <name>Hesperorni</name>
      <data>CGTTACTCGT TGT</data>
   </sequence>

   <sequence type="dna">
      <name>Baluchithe</name>
      <data>TAATGTTAAT TGT</data>
   </sequence>

   <sequence type="dna">
      <name>B. virgini</name>
      <data>TAATGTTCGT TGT</data>
   </sequence>

   <sequence type="dna">
      <name>Brontosaur</name>
      <data>CAAAACCCAT CAT</data>
   </sequence>

   <sequence type="dna">
      <name>B.subtilis</name>
      <data>GGCAGCCAAT CAC</data>
   </sequence>

</alignment>
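
Since no programs yet read this format, the following is only a sketch of how a reader might look, using Python's standard xml.etree.ElementTree module. The function name read_alignment is hypothetical, the input is the first two sequences of the alignment above, and lower-case tag names are assumed as in that example.

```python
import xml.etree.ElementTree as ET

# The first two sequences of the example alignment above.
ALIGNMENT = """
<alignment>
   <sequence type="dna">
      <name>Archaeopt</name>
      <data>CGATGCTTAC CGC</data>
   </sequence>
   <sequence type="dna">
      <name>Hesperorni</name>
      <data>CGTTACTCGT TGT</data>
   </sequence>
</alignment>
"""

def read_alignment(text):
    """Return a list of (name, type, sequence) tuples from the XML text."""
    records = []
    for seq in ET.fromstring(text).findall("sequence"):
        name = seq.findtext("name", default="").strip()
        # Blanks inside <data> only group the sites; remove them.
        data = "".join(seq.findtext("data", default="").split())
        records.append((name, seq.get("type"), data))
    return records

records = read_alignment(ALIGNMENT)
for name, seq_type, data in records:
    print(name, seq_type, data)
```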

For the gene frequencies and restriction sites data types, this Rewrite option does not change the data set. The option will be useful mostly to write the data out in a standard format, in cases where the input file is messy-looking.

The B option selects the Block Bootstrap. When you select option B the program will ask you to enter the block length. When the block length is 1, this means that we are doing regular bootstrapping rather than block-bootstrapping.
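
A block bootstrap can be sketched as follows. This is an illustration under stated assumptions, not Seqboot's code: it assumes that about n/B blocks of B consecutive characters are drawn, that each block starts at a random position, and that a block running past the last character wraps around to the first. The function name and the use of Python's random module are also choices made only for this sketch.

```python
import random

def block_bootstrap_weights(n_sites, block_size=1, rng=None):
    """Return, for each site, how many times it was sampled."""
    rng = rng or random.Random()
    weights = [0] * n_sites
    n_blocks = n_sites // block_size       # assumption: n/B blocks are drawn
    for _ in range(n_blocks):
        start = rng.randrange(n_sites)     # random starting position
        for k in range(block_size):
            # Assumption: blocks wrap around the end of the data set.
            weights[(start + k) % n_sites] += 1
    return weights

regular = block_bootstrap_weights(13, block_size=1, rng=random.Random(4533))
blocked = block_bootstrap_weights(13, block_size=3, rng=random.Random(4533))
print(regular)
print(blocked)
```

With a block size of 1 each draw picks a single character, which is the regular bootstrap. The list of counts corresponds to what is written when weights rather than data sets are requested.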

The % option allows the user control over what fraction of the characters are sampled in the bootstrap and jackknife methods. Normally the bootstrap samples a number of times equal to the number of characters, and the jackknife samples half that number. This option permits you to specify a smaller fraction of characters to be sampled. Note that doing so is "statistically incorrect", but it is available here for whatever other purposes you may have in mind. Note that the fraction you will be asked to enter is the fraction of characters sampled, not the fraction left out. If you specify 100 as the fraction of sites retained and are using the jackknife, the data set will simply be rewritten. Note (as mentioned below) that this can be used together with the W (Weights) option to rewrite a data set while omitting a particular set of sites.
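
The effect of the sampling fraction on the jackknife can be seen in a short sketch. It is not the program's code; the function name is hypothetical, and rounding the number of retained characters to the nearest integer is an assumption.

```python
import random

def jackknife_sites(n_sites, percent=50.0, rng=None):
    """Return the (sorted) indices of the characters that are retained."""
    rng = rng or random.Random()
    n_keep = round(n_sites * percent / 100.0)   # assumption: nearest integer
    # Sampling without replacement: no character is retained twice.
    return sorted(rng.sample(range(n_sites), n_keep))

half = jackknife_sites(12, percent=50.0, rng=random.Random(4533))
full = jackknife_sites(12, percent=100.0, rng=random.Random(4533))
print(half)
print(full)
```

With the fraction set to 100 every character is retained exactly once and in its original order, which is why the data set is then simply rewritten.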

The R option allows the user to set the number of replicate data sets. This defaults to 100. Most statisticians would be happiest with 1000 to 10,000 replicates in a bootstrap, but 100 gives a rough picture. You will have to decide this based on how long a running time of the tree programs you are willing to tolerate. (The time needed to do the sampling in this program is not much of an issue).

The W (Weights) option allows weights to be read from a file whose default name is "weights". The weights follow the format described in the main documentation file. Weights can only be 0 or 1, and act to select the characters (or sites) that will be used in the resampling, the others being ignored and always omitted from the output data sets. If you use W together with the S (just weights) option, you write a file of weights (whose default name is "outweights"). In that file, any character whose original weight is 0 will have weight 0, the other weights varying according to the resampling. Note that if you write out data sets rather than weights (not using the S option), this output weights file is not written, as the characters are written different numbers of times in the data output file. Note that with restriction sites, the weights are not used by some of the programs. Writing out files of weights will not be helpful with those programs. For the moment, with all gene frequencies programs the weights are also not used.

Note that it is possible to use Seqboot to rewrite a data set while omitting certain sites. This can be done, not with the rewrite choice in option J, but with its jackknife choice. Choose the delete-half jackknife, but then use the % option to set the fraction of sites sampled to 100%. Also use the W option to read a set of weights that select which sites to retain (those with weights 1 instead of 0). Use the R option to set the number of replicates to 1. The program will write one data set, with all the sites that have weights 1, in order.
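
The effect of this recipe can be pictured with a short Python sketch. It only illustrates the selection that results, and is not Seqboot's own code; the function name `select_sites` and the dictionary representation of the data set are assumptions made for this example.

```python
def select_sites(sequences, weights):
    """Keep only the sites whose weight is '1', preserving their order.

    sequences: dict mapping species name -> sequence string (hypothetical layout)
    weights:   string of '0' and '1' characters, one per site
    """
    return {
        name: "".join(base for base, w in zip(seq, weights) if w == "1")
        for name, seq in sequences.items()
    }


data = {"Alpha": "AACAAC", "Beta": "AACCCC"}
# Sites 3 and 4 have weight 0 and are dropped.
print(select_sites(data, "110011"))  # -> {'Alpha': 'AAAC', 'Beta': 'AACC'}
```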

The C (Categories) option can be used with molecular sequence programs to allow assignment of sites or amino acid positions to user-defined rate categories. The assignment of rates to sites is then made by reading a file whose default name is "categories". It should contain a string of digits 1 through 9. A new line or a blank can occur after any character in this string. Thus the categories file might look like this:

122231111122411155
1155333333444

The only use of the Categories information in Seqboot is that they are sampled along with the sites (or amino acid positions) and are written out onto a file whose default name is "outcategories", which has one set of categories information for each bootstrap or jackknife replicate.
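
As a rough illustration of how such a file could be read and resampled along with the sites, here is a small Python sketch. The function names `read_categories` and `resample_categories` are hypothetical, and the sketch makes no claim about the exact layout of the "outcategories" file that Seqboot writes.

```python
def read_categories(text):
    """Parse the contents of a categories file into a list of integers.

    Blanks and new lines may occur anywhere and are skipped; every other
    character must be a digit from 1 to 9.
    """
    categories = []
    for ch in text:
        if ch.isspace():
            continue
        if ch not in "123456789":
            raise ValueError(f"invalid category symbol: {ch!r}")
        categories.append(int(ch))
    return categories


def resample_categories(categories, sampled_sites):
    """Return the categories of the sampled sites (0-based site indices)."""
    return [categories[i] for i in sampled_sites]


cats = read_categories("122231111122411155\n1155333333444\n")
print(len(cats))  # 31 sites in the example file above
```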

In the discrete characters data type, three more options appear in the menu. These are the N (aNcestors), X (miXture of methods), and F (Factors) options. They may be useful with the program Mix, which allows input of ancestors information and information specifying the mixture of parsimony methods to be used. Factors information is also read and used by programs Move, Dolmove, and Clique, in calculating how many multistate characters are compatible with a tree. The mixture, ancestors, and factors information for the characters are specified in input files whose default names are "ancestors", "mixture", and "factors". Seqboot produces output files that properly reflect what the resampling implies for these files. The corresponding output files have default file names "outancestors", "outmixture", and "outfactors".

For further description of the mixture, ancestors, and factors file formats and contents see the Discrete Characters Programs documentation file.

The S option is a particularly important one. It determines whether the program writes multiple data sets or multiple sets of weights. If your data set is large, a file with (say) 1000 such data sets can be very large and may use up too much space on your system. If you choose the S option, the program will instead produce a weights file with multiple sets of weights. The default name of this file is "outweights". Except for some programs that cannot handle multiple sets of weights, PHYLIP programs have an M (multiple data sets) option that asks the user whether to use multiple data sets or multiple sets of weights. If the latter is selected when running those programs, they read one data set, but analyze it multiple times, each time reading a new set of weights. As both bootstrapping and jackknifing can be thought of as reweighting the characters, this accomplishes the same thing (the multiple weights option is not available for the various kinds of permutation). As the file with multiple sets of weights is much smaller than a file with multiple data sets, this can be an attractive way to save file space. When multiple sets of weights are chosen, they reflect the sampling as well as any set of weights that was read in, so that you can use Seqboot's W option as well.
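
The idea that a bootstrap replicate is just a reweighting of the characters can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the program's algorithm: the function name `bootstrap_weights` is hypothetical, Python's random number generator differs from the one in Seqboot (so the seed will not reproduce Seqboot's output), and the on-disk format of "outweights" is not modelled.

```python
import random


def bootstrap_weights(input_weights, rng):
    """Return one set of bootstrap weights, one count per character.

    input_weights: list of 0/1 weights as read by the W option; characters
                   with weight 0 are never sampled and keep weight 0.
    rng:           a random.Random instance
    """
    included = [i for i, w in enumerate(input_weights) if w == 1]
    counts = [0] * len(input_weights)
    # Draw as many characters as are included, with replacement.
    for _ in range(len(included)):
        counts[rng.choice(included)] += 1
    return counts


rng = random.Random(4333)
weights = bootstrap_weights([1, 1, 0, 1, 1, 1], rng)
print(weights)  # the third entry is always 0; the entries sum to 5
```

A program that accepts multiple sets of weights can then analyze the single original data set once per set of weights, which is equivalent to analyzing the resampled data sets.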

The 0 (Terminal type) option is the usual one.

Saving time by combining results of separate runs

Often runs of distance programs, or of phylogeny programs, on large numbers of bootstrap replicates are very time-consuming. If you have multiple computers, you can save time by splitting up these runs among multiple machines. For example, if you have 1000 replicate data sets (or weights) from bootstrapping, you could divide these into ten files of 100 data sets (or you could simply use Seqboot ten times with different random number seeds). If these are run on ten separate computers, the execution time is reduced by as much as a factor of 10. Each input file of 100 data sets results in an output tree file. These can be concatenated end-to-end using a word processor program or using a command such as the Unix/Linux cat command. Make sure that these files are not turned into Microsoft Word format when this is done. The consensus tree program Consense will handle the concatenated tree file properly.
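
If you prefer a script to a word processor or cat, the concatenation can be done with a few lines of Python. This is a minimal sketch; the function name `concatenate_tree_files` and the file names in the usage comment are hypothetical.

```python
from pathlib import Path


def concatenate_tree_files(paths, destination):
    """Write the contents of several plain-text tree files end-to-end."""
    with open(destination, "w") as out:
        for path in paths:
            text = Path(path).read_text()
            # Make sure each file ends with a new line so that the last
            # tree of one file is not joined to the first tree of the next.
            out.write(text if text.endswith("\n") else text + "\n")


# Example use, assuming ten tree files named outtree1 ... outtree10 exist:
# concatenate_tree_files([f"outtree{i}" for i in range(1, 11)], "intree")
```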

If a distance matrix method is being used, you can also produce the distance matrices on different machines, and concatenate them end-to-end to produce an input file of distance matrices for Fitch, Kitsch, or Neighbor. This is particularly relevant for Neighbor, which in most cases makes trees more quickly than the distance matrices can be produced.

Input File

The data files read by Seqboot are the standard ones for the various kinds of data. For molecular sequences the sequences may be either interleaved or sequential, and similarly for restriction sites. Restriction sites data may either have or not have the third argument, the number of restriction enzymes used. Discrete morphological characters are always assumed to be in sequential format. Gene frequencies data start with the number of species and the number of loci, and then follow that by a line with the number of alleles at each locus. The data for each locus may either have one entry for each allele, or omit one allele at each locus. The details of the formats are given in the main documentation file, and in the documentation files for the groups of programs.

Output

The output file will contain the data sets generated by the resampling process. Note that, when Gene Frequencies data is used or when Discrete Morphological characters with the Factors option are used, the number of characters in each data set may vary. It may also vary if there are an odd number of characters or sites and the Delete-Half-Jackknife resampling method is used, for then there will be a 50% chance of choosing (n+1)/2 characters and a 50% chance of choosing (n-1)/2 characters.
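
The behaviour with an odd number of sites can be illustrated with a short Python sketch. It shows one way of obtaining the stated 50% chances and is not claimed to be how Seqboot does it; the function name `delete_half_jackknife` is hypothetical.

```python
import random


def delete_half_jackknife(n_sites, rng):
    """Return the sorted 0-based indices of the sites that are retained."""
    keep = n_sites // 2
    # With an odd number of sites, keep (n+1)/2 or (n-1)/2 with equal chance.
    if n_sites % 2 == 1 and rng.random() < 0.5:
        keep += 1
    # Sampling is without replacement, so no site is kept twice.
    return sorted(rng.sample(range(n_sites), keep))


rng = random.Random(1)
sizes = {len(delete_half_jackknife(7, rng)) for _ in range(200)}
print(sizes)  # only the sizes 3 and 4 can occur for 7 sites
```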

The Factors option causes the characters to be resampled together. If (say) three adjacent characters all have the same factor, so that they all are understood to be recoding one multistate character, they will be resampled together as a group.
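
A small Python sketch may make the grouping concrete. It assumes, for illustration only, that the factors information is a string with one symbol per character in which adjacent characters carrying the same symbol belong to the same multistate character; the function names are hypothetical and the sketch is not Seqboot's code.

```python
import random
from itertools import groupby


def factor_groups(factors):
    """Split character positions into groups of adjacent equal factor symbols."""
    groups, start = [], 0
    for _, run in groupby(factors):
        size = sum(1 for _ in run)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


def bootstrap_with_factors(factors, rng):
    """Sample whole groups with replacement; return the sampled positions."""
    groups = factor_groups(factors)
    drawn = sorted(rng.randrange(len(groups)) for _ in groups)
    return [site for g in drawn for site in groups[g]]


print(factor_groups("11122"))  # -> [[0, 1, 2], [3, 4]]
print(bootstrap_with_factors("11122", random.Random(7)))
```

Because the groups can differ in size, the number of binary characters in a replicate is not fixed, which is why the number of characters in each output data set may vary.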

The numerical options 1 and 2 in the menu also affect the output file. If 1 is chosen (it is off by default) the program will print the original input data set on the output file before the resampled data sets. I cannot actually see why anyone would want to do this. Option 2 toggles the feature (on by default) that prints out up to 20 times during the resampling process a notification that the program has completed a certain number of data sets. Thus if 100 resampled data sets are being produced, every 5 data sets a line is printed saying which data set has just been completed. This option should be turned off if the program is running in background and silence is desirable. At the end of execution the program will always (whatever the setting of option 2) print a couple of lines saying that output has been written to the output file.

Size and Speed

The program runs moderately quickly, though more slowly when the Permutation resampling method is used than with the others.


TEST DATA SET

    5    6
Alpha     AACAAC
Beta      AACCCC
Gamma     ACCAAC
Delta     CCACCA
Epsilon   CCAAAC


CONTENTS OF OUTPUT FILE

(If Replicates are set to 10 and seed to 4333)

    5     6
Alpha      ACAAAC
Beta       ACCCCC
Gamma      ACAAAC
Delta      CACCCA
Epsilon    CAAAAC
    5     6
Alpha      AAAACC
Beta       AACCCC
Gamma      CCAACC
Delta      CCCCAA
Epsilon    CCAACC
    5     6
Alpha      ACAAAC
Beta       ACCCCC
Gamma      CCAAAC
Delta      CACCCA
Epsilon    CAAAAC
    5     6
Alpha      ACCAAA
Beta       ACCCCC
Gamma      ACCAAA
Delta      CAACCC
Epsilon    CAAAAA
    5     6
Alpha      ACAAAC
Beta       ACCCCC
Gamma      ACAAAC
Delta      CACCCA
Epsilon    CAAAAC
    5     6
Alpha      AAAACA
Beta       AAAACC
Gamma      AAACCA
Delta      CCCCAC
Epsilon    CCCCAA
    5     6
Alpha      AAACCC
Beta       CCCCCC
Gamma      AAACCC
Delta      CCCAAA
Epsilon    AAACCC
    5     6
Alpha      AAAACC
Beta       AACCCC
Gamma      AAAACC
Delta      CCCCAA
Epsilon    CCAACC
    5     6
Alpha      AAAAAC
Beta       AACCCC
Gamma      CCAAAC
Delta      CCCCCA
Epsilon    CCAAAC
    5     6
Alpha      AACCAC
Beta       AACCCC
Gamma      AACCAC
Delta      CCAACA
Epsilon    CCAAAC


version 3.696

Molecular Sequence Programs

© Copyright 1986-2014 by Joseph Felsenstein. All rights reserved. License terms here.

These programs estimate phylogenies from protein sequence or nucleic acid sequence data. Protpars uses a parsimony method intermediate between Eck and Dayhoff's method (1966) of allowing transitions between all amino acids and counting those, and Fitch's (1971) method of counting the number of nucleotide changes that would be needed to evolve the protein sequence. Dnapars uses the parsimony method allowing changes between all bases and counting the number of those. Dnamove is an interactive parsimony program allowing the user to rearrange trees by hand and see where character states change. Dnapenny uses the branch-and-bound method to search for all most parsimonious trees in the nucleic acid sequence case. Dnacomp adapts to nucleotide sequences the compatibility (largest clique) approach. Dnainvar does not directly estimate a phylogeny, but computes Lake's (1987) and Cavender's (Cavender and Felsenstein, 1987) phylogenetic invariants, which are quantities whose values depend on the phylogeny. Dnaml does a maximum likelihood estimate of the phylogeny (Felsenstein, 1981a). Dnamlk is similar to Dnaml but assumes a molecular clock. Dnadist computes distance measures between pairs of species from nucleotide sequences, distances that can then be used by the distance matrix programs Fitch and Kitsch. Restml does a maximum likelihood estimate from restriction sites data. Seqboot allows you to read in a data set and then produce multiple data sets from it by bootstrapping, delete-half jackknifing, or by permuting within sites. This then allows most of these methods to be bootstrapped or jackknifed, and for the Permutation Tail Probability Test of Archie (1989) and Faith and Cranston (1991) to be carried out.

The input and output format for Restml is described in its document file. In general its input format is similar to those described here, except that the one-letter codes for restriction sites are specific to that program and are described in that document file. Since the input formats for the eight DNA sequence and two protein sequence programs apply to more than one program, they are described here. Their input formats are standard, making use of the IUPAC standards.

INTERLEAVED AND SEQUENTIAL FORMATS

The sequences can continue over multiple lines; when this is done the sequences must be either in "interleaved" format, similar to the output of alignment programs, or "sequential" format. These are described in the main documentation file. In sequential format all of one sequence is given, possibly on multiple lines, before the next starts. In interleaved format the first part of the file should contain the first part of each of the sequences, then optionally a line containing nothing but a carriage-return character, then the second part of each sequence, and so on. Only the first parts of the sequences should be preceded by names. Here is a hypothetical example of interleaved format:

  5    42
Turkey    AAGCTNGGGC ATTTCAGGGT
Salmo gairAAGCCTTGGC AGTGCAGGGT
H. SapiensACCGGTTGGC CGTTCAGGGT
Chimp     AAACCCTTGC CGTTACGCTT
Gorilla   AAACCCTTGC CGGTACGCTT

GAGCCCGGGC AATACAGGGT AT
GAGCCGTGGC CGGGCACGGT AT
ACAGGTTGGC CGTTCAGGGT AA
AAACCGAGGC CGGGACACTC AT
AAACCATTGC CGGTACGCTT AA

while in sequential format the same sequences would be:

  5    42
Turkey    AAGCTNGGGC ATTTCAGGGT
GAGCCCGGGC AATACAGGGT AT
Salmo gairAAGCCTTGGC AGTGCAGGGT
GAGCCGTGGC CGGGCACGGT AT
H. SapiensACCGGTTGGC CGTTCAGGGT
ACAGGTTGGC CGTTCAGGGT AA
Chimp     AAACCCTTGC CGTTACGCTT
AAACCGAGGC CGGGACACTC AT
Gorilla   AAACCCTTGC CGGTACGCTT
AAACCATTGC CGGTACGCTT AA

Note, of course, that a portion of a sequence like this:

300 AAGCGTGAAC GTTGTACTAA TRCAG

is perfectly legal, assuming that the species name has gone before, and is filled out to full length by blanks. The above digits and blanks will be ignored, the sequence being taken as starting at the first base symbol (in this case an A). This should enable you to use output from many multiple-sequence alignment programs with only minimal editing.
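
To make the two layouts concrete, here is a minimal Python sketch that reads the interleaved example above into one string per species, ignoring blanks and digits as described. It is an illustration only: the function names `clean_chunk` and `read_interleaved` are hypothetical, the sketch assumes that lines after the first block carry no species names, and it does not handle options information on the first line.

```python
def clean_chunk(chunk):
    """Drop blanks and numerical digits, which the programs ignore."""
    return "".join(ch for ch in chunk if not (ch.isspace() or ch.isdigit()))


def read_interleaved(text, name_width=10):
    """Return {species name: full sequence} from interleaved input."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_species, n_sites = (int(x) for x in lines[0].split()[:2])
    names, seqs = [], []
    for i, line in enumerate(lines[1:]):
        if i < n_species:
            # First block: the first ten characters are the species name.
            names.append(line[:name_width].strip())
            seqs.append(clean_chunk(line[name_width:]))
        else:
            # Later blocks: lines cycle through the species in order.
            seqs[i % n_species] += clean_chunk(line)
    for name, seq in zip(names, seqs):
        if len(seq) != n_sites:
            raise ValueError(f"{name}: expected {n_sites} sites, got {len(seq)}")
    return dict(zip(names, seqs))


EXAMPLE = """\
  5    42
Turkey    AAGCTNGGGC ATTTCAGGGT
Salmo gairAAGCCTTGGC AGTGCAGGGT
H. SapiensACCGGTTGGC CGTTCAGGGT
Chimp     AAACCCTTGC CGTTACGCTT
Gorilla   AAACCCTTGC CGGTACGCTT

GAGCCCGGGC AATACAGGGT AT
GAGCCGTGGC CGGGCACGGT AT
ACAGGTTGGC CGTTCAGGGT AA
AAACCGAGGC CGGGACACTC AT
AAACCATTGC CGGTACGCTT AA
"""

sequences = read_interleaved(EXAMPLE)
for name, seq in sequences.items():
    # Writing each name followed by its whole sequence gives sequential format.
    print(f"{name:<10}{seq}")
```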

INPUT FOR THE DNA SEQUENCE PROGRAMS

The input format for the DNA sequence programs is standard: the data have A's, G's, C's and T's (or U's). The first line of the input file contains the number of species and the number of sites. As with the other programs, options information may follow this. Following this, each species starts on a new line. The first 10 characters of that line are the species name. There then follows the base sequence of that species, each character being one of the letters A, B, C, D, G, H, K, M, N, O, R, S, T, U, V, W, X, Y, ?, or - (a period was also previously allowed but it is no longer allowed, because it sometimes is used in different senses in other programs). Blanks will be ignored, and so will numerical digits. This allows GENBANK and EMBL sequence entries to be read with minimum editing.

These characters can be either upper or lower case. The algorithms convert all input characters to upper case (which is how they are treated). The characters constitute the IUPAC (IUB) nucleic acid code plus some slight extensions. They enable input of nucleic acid sequences taking full account of any ambiguities in the sequence.

Symbol    Meaning
A         Adenine
G         Guanine
C         Cytosine
T         Thymine
U         Uracil
Y         pYrimidine    (C or T)
R         puRine        (A or G)
W         "Weak"        (A or T)
S         "Strong"      (C or G)
K         "Keto"        (T or G)
M         "aMino"       (C or A)
B         not A         (C or G or T)
D         not C         (A or G or T)
H         not G         (A or C or T)
V         not T         (A or C or G)
X,N,?     unknown       (A or C or G or T)
O         deletion
-         deletion
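
For readers who want to handle these symbols in their own scripts, the table can be written as a Python lookup. This is a sketch rather than anything taken from the programs: the names `IUPAC_BASES` and `possible_bases` are hypothetical, treating U as equivalent to T is an assumption, and representing a deletion by an empty set is a choice made only for this example.

```python
# Each symbol maps to the bases it may stand for.
IUPAC_BASES = {
    "A": "A", "G": "G", "C": "C", "T": "T",
    "U": "T",                      # assumption: uracil treated like thymine
    "Y": "CT", "R": "AG", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "X": "ACGT", "N": "ACGT", "?": "ACGT",
    "O": "", "-": "",              # deletion: no base present
}


def possible_bases(symbol):
    """Return the set of bases a symbol can represent (case-insensitive)."""
    return set(IUPAC_BASES[symbol.upper()])


print(sorted(possible_bases("y")))  # -> ['C', 'T']
```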

INPUT FOR THE PROTEIN SEQUENCE PROGRAMS

The input for the protein sequence programs is fairly standard. The first line contains the number of species and the number of amino acid positions (counting any stop codons that you want to include). These are followed on the same line by the options. The only options which need information in the input file are U (User Tree) and W (Weights). They are as described in the main documentation file. If the W (Weights) option is used there must be a W in the first line of the input file.

Next come the species data. Each sequence starts on a new line, has a ten-character species name that must be blank-filled to be of that length, followed immediately by the species data in the one-letter code. The sequences must either be in the "interleaved" or "sequential" formats. The I option selects between them. The sequences can have internal blanks in the sequence but there must be no extra blanks at the end of the terminated line. Note that a blank is not a valid symbol for a deletion.

The protein sequences are given by the one-letter code used by the late Margaret Dayhoff's group in the Atlas of Protein Sequences, and consistent with the IUB standard abbreviations. In the present version it is:

Symbol    Stands for
A         ala
B         asx
C         cys
D         asp
E         glu
F         phe
G         gly
H         his
I         ileu
J         (not used)
K         lys
L         leu
M         met
N         asn
O         (not used)
P         pro
Q         gln
R         arg
S         ser
T         thr
U         (not used)
V         val
W         trp
X         unknown amino acid
Y         tyr
Z         glx
*         nonsense (stop)
?         unknown amino acid or deletion
-         deletion

where "nonsense" and "unknown" mean, respectively, a nonsense (chain termination) codon and an amino acid whose identity has not been determined. The state "asx" means "either asn or asp", the state "glx" means "either gln or glu", and the state "deletion" means that alignment studies indicate a deletion has happened in the ancestry of this position, so that it is no longer present. Note that if two polypeptide chains are being used that are of different length owing to one terminating before the other, they can be coded as (say)

             HIINMA*????
             HIPNMGVWABT
since after the stop codon we do not definitely know that there has been a deletion, and do not know what amino acid would have been there. If DNA studies tell us that there is DNA sequence in that region, then we could use "X" rather than "?". Note that "X" means an unknown amino acid, but definitely an amino acid, while "?" could mean either that or a deletion. Otherwise one will usually want to use "?" after a stop codon, if one does not know what amino acid is there. If the DNA sequence has been observed there, one probably ought to resist putting in the amino acids that this DNA would code for, and one should use "X" instead, because under the assumptions implicit in either the parsimony or the distance methods, changes to any noncoding sequence are much easier than changes in a coding region that change the amino acid.

Here are the same one-letter codes tabulated the other way 'round:

Amino acid                  One-letter code
ala                         A
arg                         R
asn                         N
asp                         D
asx                         B
cys                         C
gln                         Q
glu                         E
gly                         G
glx                         Z
his                         H
ileu                        I
leu                         L
lys                         K
met                         M
phe                         F
pro                         P
ser                         S
thr                         T
trp                         W
tyr                         Y
val                         V
deletion                    -
nonsense (stop)             *
unknown amino acid          X
unknown (incl. deletion)    ?

THE OPTIONS

The programs allow options chosen from their menus. Many of these are as described in the main documentation file, particularly the options J, O, U, T, W, and Y (although T has a different meaning in the programs Dnaml and Dnadist than in the others).

The U option indicates that user-defined trees are provided at the end of the input file. This happens in the usual way, except that for Protpars, Dnapars, Dnacomp, and Dnamlk, the trees must be strictly bifurcating, containing only two-way splits, e. g.: ((A,B),(C,(D,E)));. For Dnaml and Restml it must have a trifurcation at its base, e. g.: ((A,B),C,(D,E));. The root of the tree may in those cases be placed arbitrarily, since the trees needed are actually unrooted, though they look different when printed out. The program Retree should enable you to reroot the trees without having to hand-edit or retype them. For Dnamove the U option is not available (although there is an equivalent feature which uses rooted user trees).

A feature of the nucleotide sequence programs other than Dnamove is that they save time and computer memory space by recognizing sites at which the pattern of bases is the same, and doing their computation only once. Thus if we have only four species but a large number of sites, there are (ignoring ambiguous bases) only 256 different patterns of nucleotides (4 x 4 x 4 x 4) that can occur. The programs automatically count how many occurrences there are of each and then need to do only as much computation as would be needed with 256 sites, even though the number of sites is actually much larger. If there are ambiguities (such as Y or R nucleotides), these are also handled correctly, and do not cause trouble. The programs store the full sequences but reserve other space for bookkeeping only for the distinct patterns. This saves space. Thus the programs will run very effectively with few species and many sites. On larger numbers of species, if rates of evolution are small, many of the sites will be invariant (such as having all A's) and thus will mostly have one of four patterns. The programs will in this way automatically avoid doing duplicate computations for such sites.
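
The bookkeeping described above can be sketched in Python with a counter over alignment columns. This shows the idea only and is not the data structure the programs use; the function name `site_patterns` and the small five-species alignment are made up for the example.

```python
from collections import Counter


def site_patterns(sequences):
    """Count how often each column (site pattern) occurs in an alignment.

    sequences: list of equal-length strings, one per species
    """
    return Counter("".join(column) for column in zip(*sequences))


alignment = ["AACAAC", "AACCCC", "ACCAAC", "CCACCA", "CCAAAC"]
patterns = site_patterns(alignment)
# Sites 4 and 5 share the pattern ACACA, so 6 sites give 5 distinct patterns.
print(len(patterns), patterns["ACACA"])  # -> 5 2
```

A computation that is done once per distinct pattern and multiplied (or weighted) by the pattern's count gives the same result as doing it for every site.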

version 3.696

Treedist -- distances between trees

© Copyright 2000-2014 by Joseph Felsenstein. All rights reserved. License terms here.

This program computes distances between trees. Two distances are computed, the Branch Score Distance of Kuhner and Felsenstein (1994), and the more widely known Symmetric Difference of Robinson and Foulds (1981). The Branch Score Distance uses branch lengths, and can only be calculated when the trees have lengths on all branches. The Symmetric Difference does not use branch length information, only the tree topologies. It must also be borne in mind that neither distance has any immediate statistical interpretation -- we cannot say whether a larger distance is significantly larger than a smaller one.

These distances are computed by considering all possible branches that could exist on the two trees. Each branch divides the set of species into two groups -- the ones connected to one end of the branch and the ones connected to the other. This makes a partition of the full set of species. The following tree (in Newick notation)

  ((A,C),(D,(B,E))) 
has two internal branches. One induces the partition {A, C  |  B, D, E} and the other induces the partition {A, C, D  |  B, E}. A different tree with the same set of species,
  (((A,D),C),(B,E)) 
has internal branches that correspond to the two partitions {A, C, D  |  B, E} and {A, D  |  B, C, E}. Note that the other branches, all of which are external branches, induce partitions that separate one species from all the others. Thus there are 5 partitions like this: {C  |  A, B, D, E} on each of these trees. These are always present on all trees, provided that each tree has each species at the end of its own branch.
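
The internal-branch partitions of the two example trees can be reproduced with a short Python sketch. It is a minimal illustration, not Treedist's code: the function names are hypothetical, and the small parser handles only species names without branch lengths and removes all blanks.

```python
def parse_newick(text):
    """Parse a Newick string (names only) into nested tuples of names."""
    s = "".join(text.split()).rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        if s[pos] == "(":
            pos += 1                      # skip "("
            children = [node()]
            while s[pos] == ",":
                pos += 1                  # skip ","
                children.append(node())
            pos += 1                      # skip ")"
            return tuple(children)
        start = pos
        while pos < len(s) and s[pos] not in ",()":
            pos += 1
        return s[start:pos]

    return node()


def leaves(tree):
    """Return the set of species names at or below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def internal_partitions(tree):
    """Return the partitions induced by internal branches (unrooted view)."""
    everyone = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        other = everyone - side
        if len(side) > 1 and len(other) > 1:
            found.add(frozenset([side, other]))
        for child in node:
            visit(child)

    visit(tree)
    return found


for newick in ("((A,C),(D,(B,E)))", "(((A,D),C),(B,E))"):
    for part in internal_partitions(parse_newick(newick)):
        print(newick, [sorted(side) for side in part])
```

Storing each partition as an unordered pair of sets means that the two branches next to the root, which induce the same partition, are counted only once.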

In the case of the Branch Score distance, each partition that does exist on a tree also has a branch length associated with it. Thus if the tree is

  (((A:0.1,D:0.25):0.05,C:0.01):0.2,(B:0.3,E:0.8):0.2) 
the list of partitions and their branch lengths is:
{A  |  B, C, D, E}    0.1
{D  |  A, B, C, E}    0.25
{A, D  |  B, C, E}    0.05
{C  |  A, B, D, E}    0.01
{A, D, C  |  B, E}    0.4
{B  |  A, C, D, E}    0.3
{E  |  A, B, C, D}    0.8

Note that the tree is being treated as unrooted here, so that the branch lengths on either side of the rootmost node are summed up to get a branch length of 0.4.

The Branch Score Distance imagines us as having made a list of all possible partitions, the ones shown above and also all 8 other possible partitions (five species allow 15 partitions in all), which correspond to branches that are not found in this tree. These are assigned branch lengths of 0. For two trees, we imagine constructing these lists, and then summing the squared differences between the branch lengths. Thus if both trees have branches {A, D  |  B, C, E}, the sum contains the square of the difference between the branch lengths. If one tree has the branch and the other doesn't, it contains the square of the difference between the branch length and zero (in other words, the square of that branch length). If both trees do not have a particular branch, nothing is added to the sum because the difference is then between 0 and 0.

The Branch Score Distance takes this sum of squared differences and computes its square root. Note that it has some desirable properties. When small branches differ in tree topology, it is not very big. When branches are both present but differ in length, it is affected.
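
As a minimal sketch of this calculation, the following Python code computes the Branch Score Distance from two tables of partitions and branch lengths. The first table is the one listed above; the second tree is hypothetical, invented for the example, and differs only in having the internal branch {A, C | B, D, E} in place of {A, D | B, C, E}. The function names are also assumptions.

```python
import math


def split(side_a, side_b):
    """Represent a partition as an unordered pair of species sets."""
    return frozenset([frozenset(side_a), frozenset(side_b)])


def branch_score_distance(lengths1, lengths2):
    """Square root of the summed squared differences in branch length.

    A partition that is absent from a tree counts as having length 0.
    """
    total = 0.0
    for part in set(lengths1) | set(lengths2):
        diff = lengths1.get(part, 0.0) - lengths2.get(part, 0.0)
        total += diff * diff
    return math.sqrt(total)


tree1 = {
    split("A", "BCDE"): 0.1, split("D", "ABCE"): 0.25,
    split("AD", "BCE"): 0.05, split("C", "ABDE"): 0.01,
    split("ADC", "BE"): 0.4, split("B", "ACDE"): 0.3,
    split("E", "ABCD"): 0.8,
}
# Hypothetical second tree: same lengths, but {A, C | B, D, E} replaces
# {A, D | B, C, E}.
tree2 = dict(tree1)
del tree2[split("AD", "BCE")]
tree2[split("AC", "BDE")] = 0.05

print(branch_score_distance(tree1, tree2))  # sqrt(0.05**2 + 0.05**2)
```

Only the two short internal branches differ, so the distance is small, which is the property noted above.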

The Symmetric Difference is simply a count of how many partitions there are, among the two trees, that are on one tree and not on the other. In the example above there are two partitions, {A, C  |  B, D, E} and {A, D  |  B, C, E}, each of which is present on only one of the two trees. The Symmetric Difference between the two trees is therefore 2. When the two trees are fully resolved bifurcating trees, their symmetric distance must be an even number; it can range from 0 to twice the number of internal branches, so that for n species it can be as large as 2n-6 (for 3 species or more).
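
The count for the two example trees can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the partitions are typed in by hand rather than read from tree files. The external-branch partitions are left out because they occur on both trees and so cannot contribute.

```python
def split(side_a, side_b):
    """Represent a partition as an unordered pair of species sets."""
    return frozenset([frozenset(side_a), frozenset(side_b)])


def symmetric_difference(partitions1, partitions2):
    """Count the partitions present on exactly one of the two trees."""
    return len(partitions1 ^ partitions2)


# Internal-branch partitions of ((A,C),(D,(B,E))) and (((A,D),C),(B,E)).
tree1 = {split("AC", "BDE"), split("ACD", "BE")}
tree2 = {split("ACD", "BE"), split("AD", "BCE")}

print(symmetric_difference(tree1, tree2))  # -> 2

n_species = 5
print(2 * n_species - 6)  # largest possible value for 5 species -> 4
```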

Note the relationship between the two distances. If all branches in the two trees have length 1.0, the Branch Score Distance is the square root of the Symmetric Difference, as each branch that is present in one but not in the other results in 1.0 being added to the sum of squared differences.
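
Written out, with b_{1,p} and b_{2,p} denoting the lengths assigned to partition p on the two trees (notation introduced here for illustration only), the relationship is:

```latex
d_{BS}(T_1, T_2) = \sqrt{\sum_{p} \left( b_{1,p} - b_{2,p} \right)^{2}}

% If every branch that is present has length 1:
\left( b_{1,p} - b_{2,p} \right)^{2} =
\begin{cases}
  0 & \text{if } p \text{ is on both trees or on neither} \\
  1 & \text{if } p \text{ is on exactly one tree}
\end{cases}
\qquad\Longrightarrow\qquad
d_{BS}(T_1, T_2) = \sqrt{d_{S}(T_1, T_2)}
```

For the two example trees above, whose Symmetric Difference is 2, unit branch lengths would give a Branch Score Distance of the square root of 2, about 1.414.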

We have assumed that nothing is lost if the trees are treated as unrooted trees. It is easy to define a counterpart to the Branch Score Distance and one to the Symmetric Difference for rooted trees. Each branch then defines a set of species, namely the clade defined by that branch. Thus if the first of the two trees above were considered as a rooted tree it would define the three clades {A, C}, {B, D, E}, and {B, E}. The Branch Score Distance is computed from the branch lengths for all possible sets of species, with 0 put for each set that does not occur on that tree. The table above will be nearly the same, but with two entries instead of one for the sets on either side of the root, {A, C, D} and {B, E}. The Symmetric Difference between two rooted trees is simply the count of the number of clades that are defined by one but not by the other. For the second tree the clades would be {A, D}, {A, C, D}, and {B, E}. The Symmetric Difference between these two rooted trees would then be 4.
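
The rooted count can be verified with a small Python sketch in which each tree is typed in as nested tuples that mirror its Newick string. The function name `clades` is hypothetical, and single-species clades are omitted because every tree has them.

```python
def clades(tree):
    """Return the multi-species clades below the root, as frozensets."""
    found = set()

    def visit(node, is_root):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(visit(child, False) for child in node))
        if not is_root:
            found.add(members)
        return members

    visit(tree, True)
    return found


tree1 = (("A", "C"), ("D", ("B", "E")))      # ((A,C),(D,(B,E)))
tree2 = ((("A", "D"), "C"), ("B", "E"))      # (((A,D),C),(B,E))

print(sorted(sorted(c) for c in clades(tree1)))
print(sorted(sorted(c) for c in clades(tree2)))
print(len(clades(tree1) ^ clades(tree2)))  # -> 4
```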

Although the examples we have discussed have involved fully bifurcating trees, the input trees can have multifurcations. This does not cause any complication for the Branch Score Distance. For the Symmetric Difference, it can lead to distances that are odd numbers.

However, note one strong restriction. The trees should all have the same list of species. If you use one set of species in the first two trees, and another in the second two, and choose distances for adjacent pairs, the distances will be incorrect and will depend on the order of these pairs in the input tree file, in odd ways.

INPUT AND OPTIONS

The program reads one or two input tree files. If there is one input tree file, its default name is intree. If there are two their default names are intree and intree2. The tree files may either have the number of trees on their first line, or not. If the number of trees is given, it is actually ignored and all trees in the tree file are considered, even if there are more trees than indicated by the number. There is no maximum number of trees that can be processed but, if you feed in too many, there may be an error message about running out of memory. The problem is particularly acute if you choose the option to examine all possible pairs of trees in an input tree file, or all possible pairs of trees one from each of two input tree files. Thus if there are 1,000 trees in the input tree file, keep in mind that all possible pairs means 1,000,000 pairs to be examined!

Earlier versions of this program had an option to change the value of the variable maxgrp to overcome a hard-wired limit on the number of species the program can handle. This feature has been removed as it is no longer necessary.

The options are selected from a menu, which looks like this:


Tree distance program, version 3.69

Settings for this run:
 D                         Distance Type:  Branch Score Distance
 O                         Outgroup root:  No, use as outgroup species  1
 R         Trees to be treated as Rooted:  No
 T    Terminal type (IBM PC, ANSI, none):  ANSI
 1  Print indications of progress of run:  Yes
 2                 Tree distance submenu:  Distance between adjacent pairs

Are these settings correct? (type Y or the letter for one to change)

The D option chooses which distance measure to use. The Branch Score Distance is the default. If it is in force, and any of the trees which are read in have even one branch that fails to have a length, the program will terminate with an error. If the Symmetric Difference option is chosen, no check of branch lengths is made.

The O option allows you to root the trees using an outgroup. It is specified by giving its number, where the species are numbered in the order they appear in the first tree. Outgroup-rooting all the trees does not affect the distances if the trees are treated as unrooted, and if it is done and trees are treated as rooted, the distances turn out to be the same as the unrooted ones. Thus it is unlikely that you will find this option of interest.

The R option controls whether the Symmetric Difference is computed treating the trees as unrooted or as rooted. Unrooted is the default.

The terminal type (T) and progress (1) options do not need description here.

Option 2 controls how many tree files are read in, which trees are to be compared, and how the output is to be presented. It causes another menu to appear:


Tree Pairing Submenu:
 A     Distances between adjacent pairs in tree file.
 P     Distances between all possible pairs in tree file.
 C     Distances between corresponding pairs in one tree file and another.
 L     Distances between all pairs in one tree file and another.

 Choose one: (A,P,C,L)

Option A computes the distances between successive pairs of trees in the tree input file -- between trees 1 and 2, trees 3 and 4, trees 5 and 6, and so on. If there are an odd number of trees in the input tree file the last tree will be ignored and a warning message printed to remind the user that nothing was done with it.

Option P computes distances between all pairs of trees in the input tree file. Thus with 10 trees 10 x 10 = 100 distances will be computed, including distances between each tree and itself.

Option C takes input from two tree files and computes distances between corresponding members of the two tree files. Thus distances will be computed between tree 1 of the first tree file and tree 1 of the second one, between tree 2 of the first file and tree 2 of the second one, and so on. If the number of trees in the two files differs, the extra trees in the file that has more of them are ignored and a warning is printed out.

Option L computes distances between all pairs of trees, where one tree is taken from one tree file and the other from the other tree file. Thus if the first tree file has 7 trees and the second has 5 trees, 7 x 5 = 35 different distances will be computed.

If option 2 is not selected, the program defaults to looking at one tree file and computing distances of adjacent pairs (so that option A is the default).
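
The four pairing schemes can be summarized in a short Python sketch that lists which tree numbers are compared. It is an illustration of the descriptions above and not the program's code; the function name `tree_pairs` and its arguments are hypothetical.

```python
from itertools import product


def tree_pairs(mode, n_first, n_second=0):
    """Return the (tree, tree) number pairs compared under each submenu choice.

    Trees are numbered from 1. For modes C and L the second number of each
    pair refers to the second tree file.
    """
    if mode == "A":   # adjacent pairs; a leftover last tree is ignored
        return [(i, i + 1) for i in range(1, n_first, 2)]
    if mode == "P":   # all pairs within one file, including each tree with itself
        return list(product(range(1, n_first + 1), repeat=2))
    if mode == "C":   # corresponding trees; extra trees in the longer file ignored
        return [(i, i) for i in range(1, min(n_first, n_second) + 1)]
    if mode == "L":   # every tree of the first file with every tree of the second
        return list(product(range(1, n_first + 1), range(1, n_second + 1)))
    raise ValueError(f"unknown pairing mode: {mode!r}")


print(tree_pairs("A", 5))           # -> [(1, 2), (3, 4)]
print(len(tree_pairs("P", 10)))     # -> 100
print(len(tree_pairs("L", 7, 5)))   # -> 35
```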

OUTPUT

The results of the analysis are written onto an output file whose default file name is outfile.

If any of the four types of analysis are selected, the program asks the user how they want the results presented. Here is that menu for options P or L:


Distances output options:
 F     Full matrix.
 V     One pair per line, verbose.
 S     One pair per line, sparse.

 Choose one: (F,V,S)

The Full matrix (choice F) is a table showing all distances. It is written onto the output file. The table is presented as groups of 10 columns. Here is the Full matrix for the 12 trees in the input tree file which is given as an example at the end of this page.


Tree distance program, version 3.69

Symmetric differences between all pairs of trees in tree file:



          1     2     3     4     5     6     7     8     9    10 
      \------------------------------------------------------------
    1 |   0     4     2    10    10    10    10    10    10    10  
    2 |   4     0     2    10     8    10     8    10     8    10  
    3 |   2     2     0    10    10    10    10    10    10    10  
    4 |  10    10    10     0     2     2     4     2     4     0  
    5 |  10     8    10     2     0     4     2     4     2     2  
    6 |  10    10    10     2     4     0     2     2     4     2  
    7 |  10     8    10     4     2     2     0     4     2     4  
    8 |  10    10    10     2     4     2     4     0     2     2  
    9 |  10     8    10     4     2     4     2     2     0     4  
   10 |  10    10    10     0     2     2     4     2     4     0  
   11 |   2     2     0    10    10    10    10    10    10    10  
   12 |  10    10    10     2     4     2     4     0     2     2  

         11    12 
      \------------
    1 |   2    10  
    2 |   2    10  
    3 |   0    10  
    4 |  10     2  
    5 |  10     4  
    6 |  10     2  
    7 |  10     4  
    8 |  10     0  
    9 |  10     2  
   10 |  10     2  
   11 |   0    10  
   12 |  10     0  


The Full matrix is only available for analyses P and L (not for A or C).
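
The grouping into blocks of 10 columns can be sketched as follows. This is a minimal illustration, not Treedist's own output routine: the function name, the row-major array argument, and the exact field widths are assumptions chosen to approximate the layout shown above.

```c
#include <assert.h>
#include <stdio.h>

#define COLS_PER_GROUP 10   /* the table is presented in groups of 10 columns */

/* Hypothetical helper, not Treedist's own code: writes an n-by-n matrix of
   integer distances (stored row by row in d) in column groups, and returns
   the number of groups written. */
long print_full_matrix(FILE *out, const long *d, long n)
{
  long start, end, i, j, groups = 0;

  assert(out != NULL && d != NULL && n > 0);
  for (start = 0; start < n; start += COLS_PER_GROUP) {
    end = start + COLS_PER_GROUP;
    if (end > n)
      end = n;
    /* heading line: the column (tree) numbers of this group */
    fprintf(out, "      ");
    for (j = start; j < end; j++)
      fprintf(out, "%5ld ", j + 1);
    /* rule under the heading, six dashes per column */
    fprintf(out, "\n      \\");
    for (j = start; j < end; j++)
      fprintf(out, "------");
    putc('\n', out);
    /* one row per tree, restricted to the columns of this group */
    for (i = 0; i < n; i++) {
      fprintf(out, "%5ld |", i + 1);
      for (j = start; j < end; j++)
        fprintf(out, "%4ld  ", d[i * n + j]);
      putc('\n', out);
    }
    putc('\n', out);
    groups++;
  }
  return groups;
}
```

For the 12 example trees this writes two groups, columns 1 to 10 and columns 11 to 12, as in the table above.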

Option V (Verbose) writes one distance per line. The Verbose output is the default. Here it is for the example data set given below:


Tree distance program, version 3.69

Symmetric differences between adjacent pairs of trees:

Trees 1 and 2:    4
Trees 3 and 4:    10
Trees 5 and 6:    4
Trees 7 and 8:    4
Trees 9 and 10:    4
Trees 11 and 12:    10

Option S (Sparse or terse) is similar, except that each line gives only the numbers of the two trees and the distance, separated by blanks. This may be a convenient format if you want to write a program to read these numbers in and want to spare it the effort of wading through the words on each line of the Verbose output. The first four lines of the Sparse output are titles that your program would want to skip past. Here is the Sparse output for the example trees.

1 2 4
3 4 10
5 6 4
7 8 4
9 10 4
11 12 10
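
A program that reads the Sparse output might look something like the sketch below. The function name is hypothetical. Rather than counting title lines, it simply rejects any line that does not consist of exactly two tree numbers and a distance, which is an assumption made for robustness and not a description of the file format.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of Treedist: try to read one Sparse-format
   record ("tree1 tree2 distance") from a line of text.
   Returns 1 on success, 0 if the line is not such a record (for example a
   title line, a blank line, or a Verbose-format line). */
int parse_sparse_line(const char *line, long *tree1, long *tree2, double *dist)
{
  char extra;

  assert(line != NULL && tree1 != NULL && tree2 != NULL && dist != NULL);
  /* Exactly three fields must convert. The trailing " %c" catches any
     non-blank text left over after the distance and rejects the line. */
  return sscanf(line, "%ld %ld %lf %c", tree1, tree2, dist, &extra) == 3;
}
```

A caller would typically read the output file line by line with `fgets` and pass each line to this function, skipping the lines for which it returns 0. The distance is read as a `double` so that the same routine also accepts Branch Score values.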

CREDITS AND FUTURE

Treedist was originally written by Dan Fineman, with fixes by Doug Buxton. In the future we hope to compute a distance based on quartets shared and not shared by trees (implicit in the work of Estabrook, McMorris, and Meacham, 1985). We will also implement the tree distance of Robinson and Foulds (1979), which is like the Branch Score Distance but uses absolute values of differences between branch lengths rather than sums of squares of differences.


TEST DATA SET 1

(A,(B,(H,(D,(J,(((G,E),(F,I)),C))))));
(A,(B,(D,((J,H),(((G,E),(F,I)),C)))));
(A,(B,(D,(H,(J,(((G,E),(F,I)),C))))));
(A,(B,(E,(G,((F,I),((J,(H,D)),C))))));
(A,(B,(E,(G,((F,I),(((J,H),D),C))))));
(A,(B,(E,((F,I),(G,((J,(H,D)),C))))));
(A,(B,(E,((F,I),(G,(((J,H),D),C))))));
(A,(B,(E,((G,(F,I)),((J,(H,D)),C)))));
(A,(B,(E,((G,(F,I)),(((J,H),D),C)))));
(A,(B,(E,(G,((F,I),((J,(H,D)),C))))));
(A,(B,(D,(H,(J,(((G,E),(F,I)),C))))));
(A,(B,(E,((G,(F,I)),((J,(H,D)),C)))));

The output for this test set, with the Symmetric Difference selected in the D menu choice, is given above (it is the Verbose output example).


TEST DATA SET 2

This data set is the first part of the previous one, but with branch lengths on the trees, to serve as an example for the Branch Score distance.

(A:0.1,(B:0.1,(H:0.1,(D:0.1,(J:0.1,(((G:0.1,E:0.1):0.1,(F:0.1,I:0.1):0.1):0.1,
C:0.1):0.1):0.1):0.1):0.1):0.1);
(A:0.1,(B:0.1,(D:0.1,((J:0.1,H:0.1):0.1,(((G:0.1,E:0.1):0.1,
(F:0.1,I:0.1):0.1):0.1,C:0.1):0.1):0.1):0.1):0.1);
(A:0.1,(B:0.1,(D:0.1,(H:0.1,(J:0.1,(((G:0.1,E:0.1):0.1,(F:0.1,I:0.1):0.1):0.1,
C:0.1):0.1):0.1):0.1):0.1):0.1);
(A:0.1,(B:0.1,(E:0.1,(G:0.1,((F:0.1,I:0.1):0.1,((J:0.1,(H:0.1,D:0.1):0.1):0.1,
C:0.1):0.1):0.1):0.1):0.1):0.1);
(A:0.1,(B:0.1,(E:0.1,(G:0.1,((F:0.1,I:0.1):0.1,(((J:0.1,H:0.1):0.1,D:0.1):0.1,
C:0.1):0.1):0.1):0.1):0.1):0.1);
(A:0.1,(B:0.1,(E:0.1,((F:0.1,I:0.1):0.1,(G:0.1,((J:0.1,(H:0.1,D:0.1):0.1):0.1,
C:0.1):0.1):0.1):0.1):0.1):0.1);
(A:0.1,(B:0.1,(E:0.1,((F:0.1,I:0.1):0.1,(G:0.1,(((J:0.1,H:0.1):0.1,D:0.1):0.1,
C:0.1):0.1):0.1):0.1):0.1):0.1);
(A:0.1,(B:0.1,(E:0.1,((G:0.1,(F:0.1,I:0.1):0.1):0.1,((J:0.1,(H:0.1,
D:0.1):0.1):0.1,C:0.1):0.1):0.1):0.1):0.1);
(A:0.1,(B:0.1,(E:0.1,((G:0.1,(F:0.1,I:0.1):0.1):0.1,(((J:0.1,H:0.1):0.1,
D:0.1):0.1,C:0.1):0.1):0.1):0.1):0.1);
(A:0.1,(B:0.1,(E:0.1,(G:0.1,((F:0.1,I:0.1):0.1,((J:0.1,(H:0.1,D:0.1):0.1):0.1,
C:0.1):0.1):0.1):0.1):0.1):0.1);
(A:0.1,(B:0.1,(D:0.1,(H:0.1,(J:0.1,(((G:0.1,E:0.1):0.1,(F:0.1,I:0.1):0.1):0.1,
C:0.1):0.1):0.1):0.1):0.1):0.1);
(A:0.1,(B:0.1,(E:0.1,((G:0.1,(F:0.1,I:0.1):0.1):0.1,((J:0.1,(H:0.1,
D:0.1):0.1):0.1,C:0.1):0.1):0.1):0.1):0.1);


TEST SET 2 OUTPUT

This was run using the default Branch Score distance, and asking in option 2 for the P (all pairs in file) setting and the F (Full matrix output) setting.


Tree distance program, version 3.69

Branch score distances between all pairs of trees in tree file:



                1           2           3           4           5           6           7 
      \------------------------------------------------------------------------------------
    1 |         0         0.2    0.141421    0.316228    0.316228    0.316228    0.316228  
    2 |       0.2           0    0.141421    0.316228    0.282843    0.316228    0.282843  
    3 |  0.141421    0.141421           0    0.316228    0.316228    0.316228    0.316228  
    4 |  0.316228    0.316228    0.316228           0    0.141421    0.141421         0.2  
    5 |  0.316228    0.282843    0.316228    0.141421           0         0.2    0.141421  
    6 |  0.316228    0.316228    0.316228    0.141421         0.2           0    0.141421  
    7 |  0.316228    0.282843    0.316228         0.2    0.141421    0.141421           0  
    8 |  0.316228    0.316228    0.316228    0.141421         0.2    0.141421         0.2  
    9 |  0.316228    0.282843    0.316228         0.2    0.141421         0.2    0.141421  
   10 |  0.316228    0.316228    0.316228           0    0.141421    0.141421         0.2  
   11 |  0.141421    0.141421           0    0.316228    0.316228    0.316228    0.316228  
   12 |  0.316228    0.316228    0.316228    0.141421         0.2    0.141421         0.2  

                8           9          10          11          12 
      \------------------------------------------------------------
    1 |  0.316228    0.316228    0.316228    0.141421    0.316228  
    2 |  0.316228    0.282843    0.316228    0.141421    0.316228  
    3 |  0.316228    0.316228    0.316228           0    0.316228  
    4 |  0.141421         0.2           0    0.316228    0.141421  
    5 |       0.2    0.141421    0.141421    0.316228         0.2  
    6 |  0.141421         0.2    0.141421    0.316228    0.141421  
    7 |       0.2    0.141421         0.2    0.316228         0.2  
    8 |         0    0.141421    0.141421    0.316228           0  
    9 |  0.141421           0         0.2    0.316228    0.141421  
   10 |  0.141421         0.2           0    0.316228    0.141421  
   11 |  0.316228    0.316228    0.316228           0    0.316228  
   12 |         0    0.141421    0.141421    0.316228           0  



v3.696

PHYLIP programs and documentation

PHYLIP, the PHYLogeny Inference Package, consists of 35 programs. There are documentation files for each program, in the form of web pages in HTML 3.2. There are also documentation web pages for each group of programs, and a main documentation file that is the basic introduction to the package. Before running any of the programs you should read it.

Below you will find a list of the programs and the documentation files. The names of the documentation files are highlighted as links that will take you to those documentation files.

Introduction to PHYLIP

main documentation file

Molecular sequence methods

molecular sequence programs documentation file
protpars    protein parsimony documentation file
dnapars     DNA sequence parsimony documentation file
dnapenny    DNA parsimony branch and bound documentation file
dnamove     interactive DNA parsimony documentation file
dnacomp     DNA compatibility documentation file
dnaml       DNA maximum likelihood documentation file
dnamlk      DNA maximum likelihood with clock documentation file
proml       Protein sequence maximum likelihood documentation file
promlk      Protein sequence maximum likelihood with clock documentation file
dnainvar    DNA invariants documentation file
dnadist     DNA distance documentation file
protdist    Protein sequence distance documentation file
restdist    Restriction sites and fragments distances documentation file
restml      Restriction sites maximum likelihood documentation file
seqboot     Bootstrapping/Jackknifing documentation file

Distance matrix methods

Distance matrix programs documentation file
fitch       Fitch-Margoliash distance matrix method documentation file
kitsch      Fitch-Margoliash distance matrix with clock documentation file
neighbor    Neighbor-Joining and UPGMA method documentation file

Gene frequencies and continuous characters

Continuous characters and gene frequencies documentation file
contml      Maximum likelihood continuous characters and gene frequencies documentation file
contrast    Contrast method documentation file
gendist     Genetic distance documentation file

Discrete characters methods

Discrete characters methods documentation file
pars        Unordered multistate parsimony documentation file
mix         Mixed method parsimony documentation file
penny       Branch and bound mixed method parsimony documentation file
move        Interactive mixed method parsimony documentation file
dollop      Dollo and polymorphism parsimony documentation file
dolpenny    Dollo and polymorphism branch and bound parsimony documentation file
dolmove     Dollo and polymorphism interactive parsimony documentation file
clique      0/1 characters compatibility method documentation file
factor      Character recoding program documentation file

Tree drawing, consensus, tree editing, tree distances

Tree drawing programs documentation file
drawgram    Rooted tree drawing program documentation file
drawtree    Unrooted tree drawing program documentation file
  
consense    Consensus tree program documentation file
treedist    Tree distance program documentation file
retree      interactive tree rearrangement program documentation file

#include "phylip.h"
#include "disc.h"

/* Written by Joseph Felsenstein, Jerry Shurman, Hisashi Horino,
   Akiko Fuseki, Sean Lamont, and Andrew Keeffe.

   Copyright (c) 1993-2014, Joseph Felsenstein
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions are met:

   1. Redistributions of source code must retain the above copyright notice,
      this list of conditions and the following disclaimer.

   2. Redistributions in binary form must reproduce the above copyright notice,
      this list of conditions and the following disclaimer in the documentation
      and/or other materials provided with the distribution.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
   AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
   SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE.
*/

#define FormWide        80   /* width of outfile page */

typedef boolean *aPtr;
typedef long *SpPtr, *ChPtr;

typedef struct vecrec {
  aPtr vec;
  struct vecrec *next;
} vecrec;

typedef vecrec **aDataPtr;
typedef vecrec **Matrix;

#ifndef OLDC
/* function prototypes */
void clique_gnu(vecrec **);
void clique_chuck(vecrec *);
void nunode(node **);
void getoptions(void);
void clique_setuptree(void);
void allocrest(void);
void doinit(void);
void clique_inputancestors(void);
void clique_printancestors(void);
void clique_inputfactors(void);
void inputoptions(void);
void clique_inputdata(void);
boolean Compatible(long, long);
void SetUp(vecrec **);
void Intersect(boolean *, boolean *, boolean *);
long CountStates(boolean *);
void Gen1(long, long, boolean *, boolean *, boolean *);
boolean Ingroupstate(long);
void makeset(void);
void Init(long *, long *, long *, aPtr);
void ChSort(long *, long *, long);
void PrintClique(boolean *);
void bigsubset(long *, long);
void recontraverse(node **, long *, long, long);
void reconstruct(long, long);
void reroot(node *);
void clique_coordinates(node *, long *, long);
void clique_drawline(long);
void clique_printree(void);
void DoAll(boolean *, boolean *, boolean *, long);
void Gen2(long, long, boolean *, boolean *, boolean *);
void GetMaxCliques(vecrec **);
void reallocchars(void);
/* function prototypes */
#endif

Char infilename[FNMLNGTH], outfilename[FNMLNGTH], outtreename[FNMLNGTH],
     ancfilename[FNMLNGTH], factfilename[FNMLNGTH], weightfilename[FNMLNGTH];
long ActualChars, Cliqmin, outgrno, col, ith, msets, setsz;
boolean ancvar, Clmin, Factors, outgropt, trout, weights, noroot, justwts,
        printcomp, progress, treeprint, mulsets, firstset;
long nodes;

aPtr ancone;
Char *Factor;
long *ActChar, *oldweight;
aDataPtr Data;
Matrix Comp;            /* the character compatibility matrix */
node *root;
long **grouping;
pointptr treenode;      /* pointers to all nodes in tree */
vecrec *garbage;

/* these variables are to DoAll in the pascal Version.
*/ aPtr aChars; boolean *Rarer; long n, MaxChars; SpPtr SpOrder; ChPtr ChOrder; /* variables for GetMaxCliques: */ vecrec **Comp2; long tcount; aPtr Temp, Processed, Rarer2; void clique_gnu(vecrec **p) { /* this and the following are do-it-yourself garbage collectors. Make a new node or pull one off the garbage list */ if (garbage != NULL) { *p = garbage; garbage = garbage->next; } else { *p = (vecrec *)Malloc((long)sizeof(vecrec)); (*p)->vec = (aPtr)Malloc((long)chars*sizeof(boolean)); } (*p)->next = NULL; } /* clique_gnu */ void clique_chuck(vecrec *p) { /* collect garbage on p -- put it on front of garbage list */ p->next = garbage; garbage = p; } /* clique_chuck */ void nunode(node **p) { /* replacement for NEW */ *p = (node *)Malloc((long)sizeof(node)); (*p)->next = NULL; (*p)->tip = false; } /* nunode */ void getoptions(void) { /* interactively set options */ long loopcount, loopcount2; Char ch, ch2; boolean done; fprintf(outfile, "\nLargest clique program, version %s\n\n",VERSION); putchar('\n'); ancvar = false; Clmin = false; Factors = false; outgrno = 1; outgropt = false; trout = true; weights = false; justwts = false; printdata = false; printcomp = false; progress = true; treeprint = true; loopcount = 0; do { cleerhome(); printf("\nLargest clique program, version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" A Use ancestral states in input file? %s\n", (ancvar ? "Yes" : "No")); printf(" F Use factors information? %s\n", Factors ? "Yes" : "No"); printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); printf(" C Specify minimum clique size?"); if (Clmin) printf(" Yes, at size%3ld\n", Cliqmin); else printf(" No\n"); printf(" O Outgroup root? %s%3ld\n", (outgropt ? "Yes, at species number" : "No, use as outgroup species"),outgrno); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld %s\n", msets, (justwts ? "sets of weights" : "data sets")); else printf(" No\n"); printf(" 0 Terminal type (IBM PC, ANSI, none)? 
%s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" 1 Print out the data at start of run %s\n", (printdata ? "Yes" : "No")); printf(" 2 Print indications of progress of run %s\n", (progress ? "Yes" : "No")); printf(" 3 Print out compatibility matrix %s\n", (printcomp ? "Yes" : "No")); printf(" 4 Print out tree %s\n", (treeprint ? "Yes" : "No")); printf(" 5 Write out trees onto tree file? %s\n", (trout ? "Yes" : "No")); if(weights && justwts){ printf( "WARNING: W option and Multiple Weights options are both on. "); printf( "The W menu option is unnecessary and has no additional effect. \n"); } printf("\n Y to accept these or type the letter for one to change\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); done = (ch == 'Y'); if (!done) { if (strchr("OACMFW012345",ch) != NULL){ switch (ch) { case 'A': ancvar = !ancvar; break; case 'F': Factors = !Factors; break; case 'W': weights = !weights; break; case 'C': Clmin = !Clmin; if (Clmin) { loopcount2 = 0; do { printf("Minimum clique size:\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%ld%*[^\n]", &Cliqmin); getchar(); countup(&loopcount2, 10); } while (Cliqmin < 0); } break; case 'O': outgropt = !outgropt; if (outgropt) initoutgroup(&outgrno, spp); break; case 'M': mulsets = !mulsets; if (mulsets){ printf("Multiple data sets or multiple weights?"); loopcount2 = 0; do { printf(" (type D or W)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch2); getchar(); if (ch2 == '\n') ch2 = ' '; uppercase(&ch2); countup(&loopcount2, 10); } while ((ch2 != 'W') && (ch2 != 'D')); justwts = (ch2 == 'W'); if (justwts) justweights(&msets); else initdatasets(&msets); } break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress = !progress; break; case '3': printcomp = !printcomp; break; case '4': treeprint = !treeprint; break; case '5': trout = !trout; 
break; } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } } while (!done); } /* getoptions */ void clique_setuptree(void) { /* initialization of tree pointers, variables */ long i; treenode = (pointptr)Malloc((long)spp*sizeof(node *)); for (i = 0; i < spp; i++) { treenode[i] = (node *)Malloc((long)sizeof(node)); treenode[i]->next = NULL; treenode[i]->back = NULL; treenode[i]->index = i + 1; treenode[i]->tip = false; } } /* clique_setuptree */ void reallocchars() { long i; Comp = (Matrix)Malloc((long)chars*sizeof(vecrec *)); for (i = 0; i < (chars); i++) clique_gnu(&Comp[i]); ancone = (aPtr)Malloc((long)chars*sizeof(boolean)); Factor = (Char *)Malloc((long)chars*sizeof(Char)); ActChar = (long *)Malloc((long)chars*sizeof(long)); oldweight = (long *)Malloc((long)chars*sizeof(long)); weight = (long *)Malloc((long)chars*sizeof(long)); ActualChars = chars; for (i = 1; i <= (chars); i++) ActChar[i - 1] = i; } void allocrest(void) { long i; Data = (aDataPtr)Malloc((long)spp*sizeof(vecrec *)); for (i = 0; i < (spp); i++) clique_gnu(&Data[i]); Comp = (Matrix)Malloc((long)chars*sizeof(vecrec *)); for (i = 0; i < (chars); i++) clique_gnu(&Comp[i]); setsz = (long)ceil(((double)spp+1.0)/(double)SETBITS); ancone = (aPtr)Malloc((long)chars*sizeof(boolean)); Factor = (Char *)Malloc((long)chars*sizeof(Char)); ActChar = (long *)Malloc((long)chars*sizeof(long)); oldweight = (long *)Malloc((long)chars*sizeof(long)); weight = (long *)Malloc((long)chars*sizeof(long)); nayme = (naym *)Malloc((long)spp*sizeof(naym)); } /* allocrest */ void doinit(void) { /* initializes variables */ inputnumbersold(&spp, &chars, &nonodes, 1); getoptions(); if (printdata) fprintf(outfile, "%2ld species, %3ld characters\n", spp, chars); clique_setuptree(); allocrest(); } /* doinit */ void clique_inputancestors(void) { /* reads the ancestral states for each character */ long i; Char ch; for (i = 0; i < (chars); i++) { do { if (eoln(ancfile)) scan_eoln(ancfile); ch = gettc(ancfile); if (ch 
== '\n') ch = ' '; } while (ch == ' '); switch (ch) { case '1': ancone[i] = true; break; case '0': ancone[i] = false; break; default: printf("BAD ANCESTOR STATE: %c AT CHARACTER %4ld\n", ch, i + 1); exxit(-1); } } scan_eoln(ancfile); } /* clique_inputancestors */ void clique_printancestors(void) { /* print out list of ancestral states */ long i; fprintf(outfile, "Ancestral states:\n"); for (i = 1; i <= nmlngth + 2; i++) putc(' ', outfile); for (i = 1; i <= (chars); i++) { newline(outfile, i, 55, (long)nmlngth + 1); if (ancone[i - 1]) putc('1', outfile); else putc('0', outfile); if (i % 5 == 0) putc(' ', outfile); } fprintf(outfile, "\n\n"); } /* clique_printancestors */ void clique_inputfactors(void) { /* reads the factor symbols */ long i; ActualChars = 1; for (i = 1; i <= (chars); i++) { if (eoln(factfile)) scan_eoln(factfile); Factor[i - 1] = gettc(factfile); if (i > 1) { if (Factor[i - 1] != Factor[i - 2]) ActualChars++; } ActChar[i - 1] = ActualChars; } scan_eoln(factfile); } /* clique_inputfactors */ void inputoptions(void) { /* reads the species names and character data */ long i; if(justwts){ if (!firstset) samenumsp(&chars, ith); if(firstset){ ActualChars = chars; for (i = 1; i <= (chars); i++) ActChar[i - 1] = i; scan_eoln(infile); } else reallocchars(); for (i = 0; i < (chars); i++) oldweight[i] = 1; inputweights(chars, oldweight, &weights); if(firstset && ancvar) clique_inputancestors(); if(firstset && Factors) clique_inputfactors(); if (printdata) printweights(outfile, 0, ActualChars, oldweight, "Characters"); if (Factors) printfactors(outfile, chars, Factor, ""); if (firstset && ancvar && printdata) clique_printancestors(); noroot = !(outgropt || ancvar); } else { ActualChars = chars; for (i = 1; i <= (chars); i++) ActChar[i - 1] = i; for (i = 0; i < (chars); i++) oldweight[i] = 1; scan_eoln(infile); if(weights) inputweights(chars, oldweight, &weights); if(ancvar) clique_inputancestors(); if(Factors) clique_inputfactors(); if (weights && printdata) 
printweights(outfile, 0, ActualChars, oldweight, "Characters"); if (Factors) printfactors(outfile, chars, Factor, ""); if (ancvar && printdata) clique_printancestors(); noroot = !(outgropt || ancvar); } } /* inputoptions */ void clique_inputdata(void) { long i, j; Char ch; j = chars / 2 + (chars / 5 - 1) / 2 - 5; if (j < 0) j = 0; if (j > 27) j = 27; if (printdata) { fprintf(outfile, "Species "); for (i = 1; i <= j; i++) putc(' ', outfile); fprintf(outfile, "Character states\n"); fprintf(outfile, "------- "); for (i = 1; i <= j; i++) putc(' ', outfile); fprintf(outfile, "--------- ------\n\n"); } for (i = 0; i < (spp); i++) { initname(i); if (printdata) for (j = 0; j < nmlngth; j++) putc(nayme[i][j], outfile); if (printdata) fprintf(outfile, " "); for (j = 1; j <= (chars); j++) { do { if (eoln(infile)) scan_eoln(infile); ch = gettc(infile); } while (ch == ' ' || ch == '\t'); if (printdata) { putc(ch, outfile); newline(outfile, j, 55, (long)nmlngth + 1); if (j % 5 == 0) putc(' ', outfile); } if (ch != '0' && ch != '1') { printf("\n\nERROR: Bad character state: %c (not 0 or 1)", ch); printf(" at character %ld of species %ld\n\n", j, i + 1); exxit(-1); } Data[i]->vec[j - 1] = (ch == '1'); } scan_eoln(infile); if (printdata) putc('\n', outfile); } putc('\n', outfile); for (i = 0; i < (chars); i++) { if (i + 1 == 1 || !Factors) weight[i] = oldweight[i]; else if (Factor[i] != Factor[i - 1]) weight[ActChar[i] - 1] = oldweight[i]; } } /* clique_inputdata */ boolean Compatible(long ch1, long ch2) { /* TRUE if two characters ch1 < ch2 are compatible */ long i, j, k; boolean Compt, Done1, Done2; boolean Info[4]; Compt = true; j = 1; while (ch1 > ActChar[j - 1]) j++; Done1 = (ch1 != ActChar[j - 1]); while (!Done1) { k = j; while (ch2 > ActChar[k - 1]) k++; Done2 = (ch2 != ActChar[k - 1]); while (!Done2) { for (i = 0; i <= 3; i++) Info[i] = false; if (ancvar) { if (ancone[j - 1] && ancone[k - 1]) Info[0] = true; else if (ancone[j - 1] && !ancone[k - 1]) Info[1] = true; else if 
(!ancone[j - 1] && ancone[k - 1]) Info[2] = true; else Info[3] = true; } for (i = 0; i < (spp); i++) { if (Data[i]->vec[j - 1] && Data[i]->vec[k - 1]) Info[0] = true; else if (Data[i]->vec[j - 1] && !Data[i]->vec[k - 1]) Info[1] = true; else if (!Data[i]->vec[j - 1] && Data[i]->vec[k - 1]) Info[2] = true; else Info[3] = true; } Compt = (Compt && !(Info[0] && Info[1] && Info[2] && Info[3])); k++; Done2 = (k > chars); if (!Done2) Done2 = (ch2 != ActChar[k - 1]); } j++; Done1 = (j > chars); if (!Done1) Done1 = (ch1 != ActChar[j - 1]); } return Compt; } /* Compatible */ void SetUp(vecrec **Comp) { /* sets up the compatibility matrix */ long i, j; if (printcomp) { if (Factors) fprintf(outfile, " (For original multistate characters)\n"); fprintf(outfile, "Character Compatibility Matrix (1 if compatible)\n"); fprintf(outfile, "--------- ------------- ------ -- -- -----------\n\n"); } for (i = 0; i < (ActualChars); i++) { if (printcomp) { for (j = 1; j <= ((48 - ActualChars) / 2); j++) putc(' ', outfile); for (j = 1; j < i + 1; j++) { if (Comp[i]->vec[j - 1]) putc('1', outfile); else putc('.', outfile); newline(outfile, j, 70, (long)nmlngth + 1); } } Comp[i]->vec[i] = true; if (printcomp) putc('1', outfile); for (j = i + 1; j < (ActualChars); j++) { Comp[i]->vec[j] = Compatible(i + 1, j + 1); if (printcomp) { if (Comp[i]->vec[j]) putc('1', outfile); else putc('.', outfile); } Comp[j]->vec[i] = Comp[i]->vec[j]; } if (printcomp) putc('\n', outfile); } putc('\n', outfile); } /* SetUp */ void Intersect(boolean *V1, boolean *V2, boolean *V3) { /* takes the logical intersection V1 AND V2 */ long i; for (i = 0; i < (ActualChars); i++) V3[i] = (V1[i] && V2[i]); } /* Intersect */ long CountStates(boolean *V) { /* counts the 1's in V */ long i, TempCount; TempCount = 0; for (i = 0; i < (ActualChars); i++) { if (V[i]) TempCount += weight[i]; } return TempCount; } /* CountStates */ void Gen1(long i, long CurSize, boolean *aChars, boolean *Candidates, boolean *Excluded) { /* finds 
largest size cliques and prints them out */ long CurSize2, j, k, Actual, Possible; boolean Futile; vecrec *Chars2, *Cands2, *Excl2, *Cprime, *Exprime; clique_gnu(&Chars2); clique_gnu(&Cands2); clique_gnu(&Excl2); clique_gnu(&Cprime); clique_gnu(&Exprime); CurSize2 = CurSize; memcpy(Chars2->vec, aChars, chars*sizeof(boolean)); memcpy(Cands2->vec, Candidates, chars*sizeof(boolean)); memcpy(Excl2->vec, Excluded, chars*sizeof(boolean)); j = i; while (j <= ActualChars) { if (Cands2->vec[j - 1]) { Chars2->vec[j - 1] = true; Cands2->vec[j - 1] = false; CurSize2 += weight[j - 1]; Possible = CountStates(Cands2->vec); Intersect(Cands2->vec, Comp2[j - 1]->vec, Cprime->vec); Actual = CountStates(Cprime->vec); Intersect(Excl2->vec, Comp2[j - 1]->vec, Exprime->vec); Futile = false; for (k = 0; k <= j - 2; k++) { if (Exprime->vec[k] && !Futile) { Intersect(Cprime->vec, Comp2[k]->vec, Temp); Futile = (CountStates(Temp) == Actual); } } if (CurSize2 + Actual >= Cliqmin && !Futile) { if (Actual > 0) Gen1(j + 1,CurSize2,Chars2->vec,Cprime->vec,Exprime->vec); else if (CurSize2 > Cliqmin) { Cliqmin = CurSize2; if (tcount >= 0) tcount = 1; } else if (CurSize2 == Cliqmin) tcount++; } if (Possible > Actual) { Chars2->vec[j - 1] = false; Excl2->vec[j - 1] = true; CurSize2 -= weight[j - 1]; } else j = ActualChars; } j++; } clique_chuck(Chars2); clique_chuck(Cands2); clique_chuck(Excl2); clique_chuck(Cprime); clique_chuck(Exprime); } /* Gen1 */ boolean Ingroupstate(long i) { /* the ingroup state for the i-th character */ boolean outstate; if (noroot) { outstate = Data[0]->vec[i - 1]; return (!outstate); } if (ancvar) outstate = ancone[i - 1]; else outstate = Data[outgrno - 1]->vec[i - 1]; return (!outstate); } /* Ingroupstate */ void makeset(void) { /* make up set of species for given set of characters */ long i, j, k, m; boolean instate; long *st; st = (long *)Malloc(setsz*sizeof(long)); n = 0; for (i = 0; i < (MaxChars); i++) { for (j = 0; j < setsz; j++) st[j] = 0; instate = 
Ingroupstate(ChOrder[i]); for (j = 0; j < (spp); j++) { if (Data[SpOrder[j] - 1]->vec[ChOrder[i] - 1] == instate) { m = (long)(SpOrder[j]/SETBITS); st[m] = ((long)st[m]) | (1L << (SpOrder[j] % SETBITS)); } } memcpy(grouping[++n - 1], st, setsz*sizeof(long)); } for (i = 0; i < (spp); i++) { k = (long)(SpOrder[i]/SETBITS); grouping[++n - 1][k] = 1L << (SpOrder[i] % SETBITS); } free(st); } /* makeset */ void Init(long *ChOrder, long *Count, long *MaxChars, aPtr aChars) { /* initialize vectors and character count */ long i, j, temp; boolean instate; *MaxChars = 0; for (i = 1; i <= (chars); i++) { if (aChars[ActChar[i - 1] - 1]) { (*MaxChars)++; ChOrder[*MaxChars - 1] = i; instate = Ingroupstate(i); temp = 0; for (j = 0; j < (spp); j++) { if (Data[j]->vec[i - 1] == instate) temp++; } Count[i - 1] = temp; } } } /*Init */ void ChSort(long *ChOrder, long *Count, long MaxChars) { /* sorts the characters by number of ingroup states */ long j, temp; boolean ordered; ordered = false; while (!ordered) { ordered = true; for (j = 1; j < MaxChars; j++) { if (Count[ChOrder[j - 1] - 1] < Count[ChOrder[j] - 1]) { ordered = false; temp = ChOrder[j - 1]; ChOrder[j - 1] = ChOrder[j]; ChOrder[j] = temp; } } } } /* ChSort */ void PrintClique(boolean *aChars) { /* prints the characters in a clique */ long i, j; fprintf(outfile, "\n\n"); if (Factors) { fprintf(outfile, "Actual Characters: ("); j = 0; for (i = 1; i <= (ActualChars); i++) { if (aChars[i - 1]) { fprintf(outfile, "%3ld", i); j++; newline(outfile, j, (long)((FormWide - 22) / 3), (long)nmlngth + 1); } } fprintf(outfile, ")\n"); } if (Factors) fprintf(outfile, "Binary "); fprintf(outfile, "Characters: ("); j = 0; for (i = 1; i <= (chars); i++) { if (aChars[ActChar[i - 1] - 1]) { fprintf(outfile, "%3ld", i); j++; if (Factors) newline(outfile, j, (long)((FormWide - 22) / 3), (long)nmlngth + 1); else newline(outfile, j, (long)((FormWide - 15) / 3), (long)nmlngth + 1); } } fprintf(outfile, ")\n\n"); } /* PrintClique */ void 
bigsubset(long *st, long n) { /* find a maximal subset of st among the groupings */ long i, j; long *su; boolean max, same; su = (long *)Malloc(setsz*sizeof(long)); for (i = 0; i < setsz; i++) su[i] = 0; for (i = 0; i < n; i++) { max = true; for (j = 0; j < setsz; j++) if ((grouping[i][j] & ~st[j]) != 0) max = false; if (max) { same = true; for (j = 0; j < setsz; j++) if (grouping[i][j] != st[j]) same = false; if (!same) { for (j = 0; j < setsz; j++) if ((su[j] & ~grouping[i][j]) != 0) max = false; if (max) { same = true; for (j = 0; j < setsz; j++) if (grouping[i][j] != su[j]) same = false; if (!same) memcpy(su, grouping[i], setsz*sizeof(long)); } } } } memcpy(st, su, setsz*sizeof(long)); free(su); } /* bigsubset */ void recontraverse(node **p, long *st, long n, long MaxChars) { /* traverse to reconstruct the tree from the characters */ long i, j, k, maxpos; long *tempset, *st2; boolean found, zero, zero2, same; node *q; j = k = 0; for (i = 1; i <= (spp); i++) { if (((1L << (i % SETBITS)) & st[(long)(i / SETBITS)]) != 0) { k++; j = i; } } if (k == 1) { *p = treenode[j - 1]; (*p)->tip = true; (*p)->index = j; return; } nunode(p); (*p)->index = 0; tempset = (long*)Malloc(setsz*sizeof(long)); memcpy(tempset, st, setsz*sizeof(long)); q = *p; zero = true; for (i = 0; i < setsz; i++) if (tempset[i] != 0) zero = false; if (!zero) bigsubset(tempset, n); zero = true; zero2 = true; for (i = 0; i < setsz; i++) if (st[i] != 0) zero = false; if (!zero) { for (i = 0; i < setsz; i++) if (tempset[i] != 0) zero2 = false; } st2 = (long *)Malloc(setsz*sizeof(long)); memcpy(st2, st, setsz*sizeof(long)); while (!zero2) { nunode(&q->next); q = q->next; recontraverse(&q->back, tempset, n,MaxChars); i = 1; maxpos = 0; while (i <= MaxChars) { same = true; for (j = 0; j < setsz; j++) if (grouping[i - 1][j] != tempset[j]) same = false; if (same) maxpos = i; i++; } q->back->maxpos = maxpos; q->back->back = q; for (j = 0; j < setsz; j++) st2[j] &= ~tempset[j]; memcpy(tempset, st2, 
setsz*sizeof(long)); found = false; i = 1; while (!found && i <= n) { same = true; for (j = 0; j < setsz; j++) if (grouping[i - 1][j] != tempset[j]) same = false; if (same) found = true; else i++; } zero = true; for (j = 0; j < setsz; j++) if (tempset[j] != 0) zero = false; if (!zero && !found) bigsubset(tempset, n); zero = true; zero2 = true; for (j = 0; j < setsz; j++) if (st2[j] != 0) zero = false; if (!zero) for (j = 0; j < setsz; j++) if (tempset[j] != 0) zero2 = false; } q->next = *p; free(tempset); free(st2); } /* recontraverse */ void reconstruct(long n, long MaxChars) { /* reconstruct tree from the subsets */ long i; long *s; s = (long *)Malloc(setsz*sizeof(long)); for (i = 0; i < setsz; i++) { if (i+1 == setsz) { s[i] = 1L << ((spp % SETBITS) + 1); if (setsz > 1) s[i] -= 1; else s[i] -= 1L << 1; } else if (i == 0) { if (setsz > 1) s[i] = ~0L - 1; } else { if (setsz > 2) s[i] = ~0L; } } recontraverse(&root,s,n,MaxChars); free(s); } /* reconstruct */ void reroot(node *outgroup) { /* reorients tree, putting outgroup in desired position. 
*/ long i; boolean nroot; node *p, *q; nroot = false; p = root->next; while (p != root) { if (outgroup->back == p) { nroot = true; p = root; } else p = p->next; } if (nroot) return; p = root; i = 0; while (p->next != root) { p = p->next; i++; } if (i == 2) { root->next->back->back = p->back; p->back->back = root->next->back; q = root->next; } else { p->next = root->next; nunode(&root->next); q = root->next; nunode(&q->next); p = q->next; p->next = root; q->tip = false; p->tip = false; } q->back = outgroup; p->back = outgroup->back; outgroup->back->back = p; outgroup->back = q; } /* reroot */ void clique_coordinates(node *p, long *tipy, long MaxChars) { /* establishes coordinates of nodes */ node *q, *first, *last; long maxx; if (p->tip) { p->xcoord = 0; p->ycoord = *tipy; p->ymin = *tipy; p->ymax = *tipy; (*tipy) += down; return; } q = p->next; maxx = 0; while (q != p) { clique_coordinates(q->back, tipy, MaxChars); if (!q->back->tip) { if (q->back->xcoord > maxx) maxx = q->back->xcoord; } q = q->next; } first = p->next->back; q = p; while (q->next != p) q = q->next; last = q->back; p->xcoord = (MaxChars - p->maxpos) * 3 - 2; if (p == root) p->xcoord += 2; p->ycoord = (first->ycoord + last->ycoord) / 2; p->ymin = first->ymin; p->ymax = last->ymax; } /* clique_coordinates */ void clique_drawline(long i) { /* draws one row of the tree diagram by moving up tree */ node *p, *q; long n, m, j, k, l, sumlocpos, size, locpos, branchpos; long *poslist; boolean extra, done, plus, found, same; node *r, *first = NULL, *last = NULL; poslist = (long *)Malloc((long)(spp + MaxChars)*sizeof(long)); branchpos = 0; p = root; q = root; fprintf(outfile, " "); extra = false; plus = false; do { if (!p->tip) { found = false; r = p->next; while (r != p && !found) { if (i >= r->back->ymin && i <= r->back->ymax) { q = r->back; found = true; } else r = r->next; } first = p->next->back; r = p; while (r->next != p) r = r->next; last = r->back; } done = (p->tip || p == q); n = p->xcoord - 
q->xcoord; m = n; if (extra) { n--; extra = false; } if ((long)q->ycoord == i && !done) { if (!q->tip) { putc('+', outfile); plus = true; j = 1; for (k = 1; k <= (q->maxpos); k++) { same = true; for (l = 0; l < setsz; l++) if (grouping[k - 1][l] != grouping[q->maxpos - 1][l]) same = false; if (same) { poslist[j - 1] = k; j++; } } size = j - 1; if (size == 0) { for (k = 1; k < n; k++) putc('-', outfile); sumlocpos = n; } else { sumlocpos = 0; j = 1; while (j <= size) { locpos = poslist[j - 1] * 3; if (j != 1) locpos -= poslist[j - 2] * 3; else locpos -= branchpos; for (k = 1; k < locpos; k++) putc('-', outfile); if (Rarer[ChOrder[poslist[j - 1] - 1] - 1]) putc('1', outfile); else putc('0', outfile); sumlocpos += locpos; j++; } for (j = sumlocpos + 1; j < n; j++) putc('-', outfile); putc('+', outfile); if (m > 0) branchpos += m; extra = true; } } else { if (!plus) { putc('+', outfile); plus = false; } else n++; j = 1; for (k = 1; k <= (q->maxpos); k++) { same = true; for (l = 0; l < setsz; l++) if (grouping[k - 1][l] != grouping[q->maxpos - 1][l]) same = false; if (same) { poslist[j - 1] = k; j++; } } size = j - 1; if (size == 0) { for (k = 1; k <= n; k++) putc('-', outfile); sumlocpos = n; } else { sumlocpos = 0; j = 1; while (j <= size) { locpos = poslist[j - 1] * 3; if (j != 1) locpos -= poslist[j - 2] * 3; else locpos -= branchpos; for (k = 1; k < locpos; k++) putc('-', outfile); if (Rarer[ChOrder[poslist[j - 1] - 1] - 1]) putc('1', outfile); else putc('0', outfile); sumlocpos += locpos; j++; } for (j = sumlocpos + 1; j <= n; j++) putc('-', outfile); if (m > 0) branchpos += m; } putc('-', outfile); } } else if (!p->tip && (long)last->ycoord > i && (long)first->ycoord < i && (i != (long)p->ycoord || p == root)) { putc('!', outfile); for (j = 1; j < n; j++) putc(' ', outfile); plus = false; if (m > 0) branchpos += m; } else { for (j = 1; j <= n; j++) putc(' ', outfile); plus = false; if (m > 0) branchpos += m; } if (q != p) p = q; } while (!done); if (p->ycoord == 
i && p->tip) { for (j = 0; j < nmlngth; j++) putc(nayme[p->index - 1][j], outfile); } putc('\n', outfile); free(poslist); } /* clique_drawline */ void clique_printree(void) { /* prints out diagram of the tree */ long tipy, i; if (!treeprint) return; tipy = 1; clique_coordinates(root, &tipy, MaxChars); fprintf(outfile, "\n Tree and"); if (Factors) fprintf(outfile, " binary"); fprintf(outfile, " characters:\n\n"); fprintf(outfile, " "); for (i = 0; i < (MaxChars); i++) fprintf(outfile, "%3ld", ChOrder[i]); fprintf(outfile, "\n "); for (i = 0; i < (MaxChars); i++) { if (Rarer[ChOrder[i] - 1]) fprintf(outfile, "%3c", '1'); else fprintf(outfile, "%3c", '0'); } fprintf(outfile, "\n\n"); for (i = 1; i <= (tipy - down); i++) clique_drawline(i); fprintf(outfile, "\nremember: this is an unrooted tree!\n\n"); } /* clique_printree */ void DoAll(boolean *Chars_,boolean *Processed,boolean *Rarer_,long tcount) { /* print out a clique and its tree */ long i, j; ChPtr Count; aChars = (aPtr)Malloc((long)chars*sizeof(boolean)); SpOrder = (SpPtr)Malloc((long)spp*sizeof(long)); ChOrder = (ChPtr)Malloc((long)chars*sizeof(long)); Count = (ChPtr)Malloc((long)chars*sizeof(long)); memcpy(aChars, Chars_, chars*sizeof(boolean)); Rarer = Rarer_; Init(ChOrder, Count, &MaxChars, aChars); ChSort(ChOrder, Count, MaxChars); for (i = 1; i <= (spp); i++) SpOrder[i - 1] = i; for (i = 1; i <= (chars); i++) { if (aChars[ActChar[i - 1] - 1]) { if (!Processed[ActChar[i - 1] - 1]) { Rarer[i - 1] = Ingroupstate(i); Processed[ActChar[i - 1] - 1] = true; } } } PrintClique(aChars); grouping = (long **)Malloc((long)(spp + MaxChars)*sizeof(long *)); for (i = 0; i < spp + MaxChars; i++) { grouping[i] = (long *)Malloc(setsz*sizeof(long)); for (j = 0; j < setsz; j++) grouping[i][j] = 0; } makeset(); clique_setuptree(); reconstruct(n,MaxChars); if (noroot) reroot(treenode[outgrno - 1]); clique_printree(); if (trout) { col = 0; treeout(root, tcount+1, &col, root); } free(SpOrder); free(ChOrder); free(Count); for (i = 
0; i < spp + MaxChars; i++) free(grouping[i]); free(grouping); } /* DoAll */ void Gen2(long i, long CurSize, boolean *aChars, boolean *Candidates, boolean *Excluded) { /* finds largest size cliques and prints them out */ long CurSize2, j, k, Actual, Possible; boolean Futile; vecrec *Chars2, *Cands2, *Excl2, *Cprime, *Exprime; clique_gnu(&Chars2); clique_gnu(&Cands2); clique_gnu(&Excl2); clique_gnu(&Cprime); clique_gnu(&Exprime); CurSize2 = CurSize; memcpy(Chars2->vec, aChars, chars*sizeof(boolean)); memcpy(Cands2->vec, Candidates, chars*sizeof(boolean)); memcpy(Excl2->vec, Excluded, chars*sizeof(boolean)); j = i; while (j <= ActualChars) { if (Cands2->vec[j - 1]) { Chars2->vec[j - 1] = true; Cands2->vec[j - 1] = false; CurSize2 += weight[j - 1]; Possible = CountStates(Cands2->vec); Intersect(Cands2->vec, Comp2[j - 1]->vec, Cprime->vec); Actual = CountStates(Cprime->vec); Intersect(Excl2->vec, Comp2[j - 1]->vec, Exprime->vec); Futile = false; for (k = 0; k <= j - 2; k++) { if (Exprime->vec[k] && !Futile) { Intersect(Cprime->vec, Comp2[k]->vec, Temp); Futile = (CountStates(Temp) == Actual); } } if (CurSize2 + Actual >= Cliqmin && !Futile) { if (Actual > 0) Gen2(j + 1,CurSize2,Chars2->vec,Cprime->vec,Exprime->vec); else DoAll(Chars2->vec,Processed,Rarer2,tcount); } if (Possible > Actual) { Chars2->vec[j - 1] = false; Excl2->vec[j - 1] = true; CurSize2 -= weight[j - 1]; } else j = ActualChars; } j++; } clique_chuck(Chars2); clique_chuck(Cands2); clique_chuck(Excl2); clique_chuck(Cprime); clique_chuck(Exprime); } /* Gen2 */ void GetMaxCliques(vecrec **Comp_) { /* recursively generates the largest cliques */ long i; aPtr aChars, Candidates, Excluded; Temp = (aPtr)Malloc((long)chars*sizeof(boolean)); Processed = (aPtr)Malloc((long)chars*sizeof(boolean)); Rarer2 = (aPtr)Malloc((long)chars*sizeof(boolean)); aChars = (aPtr)Malloc((long)chars*sizeof(boolean)); Candidates = (aPtr)Malloc((long)chars*sizeof(boolean)); Excluded = (aPtr)Malloc((long)chars*sizeof(boolean)); Comp2 = 
Comp_; putc('\n', outfile); if (Clmin) { fprintf(outfile, "Cliques with at least%3ld characters\n", Cliqmin); fprintf(outfile, "------- ---- -- ----- -- ----------\n"); } else { Cliqmin = 0; fprintf(outfile, "Largest Cliques\n"); fprintf(outfile, "------- -------\n"); for (i = 0; i < (ActualChars); i++) { aChars[i] = false; Excluded[i] = false; Candidates[i] = true; } tcount = 0; Gen1(1, 0, aChars, Candidates, Excluded); } for (i = 0; i < (ActualChars); i++) { aChars[i] = false; Candidates[i] = true; Processed[i] = false; Excluded[i] = false; } Gen2(1, 0, aChars, Candidates, Excluded); putc('\n', outfile); free(Temp); free(Processed); free(Rarer2); free(aChars); free(Candidates); free(Excluded); } /* GetMaxCliques */ int main(int argc, Char *argv[]) { /* Main Program */ #ifdef MAC argc = 1; /* macsetup("Clique","Clique"); */ argv[0] = "Clique"; #endif init(argc, argv); openfile(&infile,INFILE,"input file", "r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename); openfile(&outtree,OUTTREE,"output tree file", "w",argv[0],outtreename); ibmpc = IBMCRT; ansi = ANSICRT; mulsets = false; firstset = true; msets = 1; doinit(); if(ancvar) openfile(&ancfile,ANCFILE,"ancestors file", "r",argv[0],ancfilename); if(Factors) openfile(&factfile,FACTFILE,"factors file", "r",argv[0],factfilename); if (weights || justwts) openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename); for (ith = 1; ith <= (msets); ith++) { inputoptions(); if(!justwts || firstset) clique_inputdata(); firstset = false; SetUp(Comp); if (msets > 1 && !justwts) { fprintf(outfile, "Data set # %ld:\n\n",ith); if (progress) printf("\nData set # %ld:\n",ith); } if (justwts){ fprintf(outfile, "Weights set # %ld:\n\n", ith); if (progress) printf("\nWeights set # %ld:\n\n", ith); } GetMaxCliques(Comp); if (progress) { printf("\nOutput written to file \"%s\"\n",outfilename); if (trout) printf("\nTree"); if (tcount > 1) printf("s"); printf(" written on file 
\"%s\"\n\n", outtreename); } } FClose(infile); FClose(outfile); FClose(outtree); #ifdef MAC fixmacfile(outfilename); fixmacfile(outtreename); #endif #ifdef WIN32 phyRestoreConsoleAttributes(); #endif printf("Done.\n\n"); return 0; } phylip-3.697/src/cons.c0000644004732000473200000011713113212364161014445 0ustar joefelsenst_g#include "phylip.h" #include "cons.h" int tree_pairing; Char outfilename[FNMLNGTH], intreename[FNMLNGTH], intree2name[FNMLNGTH], outtreename[FNMLNGTH]; node *root; long numopts, outgrno, col, setsz; long maxgrp; /* max. no. of groups in all trees found */ boolean trout, firsttree, noroot, outgropt, didreroot, prntsets, progress, treeprint, goteof, strict, mr=false, mre=false, ml=false; /* initialized all false for Treedist */ pointarray nodep; pointarray treenode; group_type **grouping, **grping2, **group2;/* to store groups found */ double *lengths, *lengths2; long **order, **order2, lasti; group_type *fullset; node *grbg; long tipy; double **timesseen, **tmseen2, **times2 ; double *tchange2; double trweight, ntrees, mlfrac; /* prototypes */ void censor(void); boolean compatible(long, long); void elimboth(long); void enterpartition (group_type*, long*); void reorient(node* n); /* begin hash table code */ #define NUM_BUCKETS 100 typedef struct namenode { struct namenode *next; plotstring naym; int hitCount; } namenode; typedef namenode **hashtype; hashtype hashp; long namesGetBucket(plotstring); void namesAdd(plotstring); boolean namesSearch(plotstring); void namesDelete(plotstring); void namesClearTable(void); void namesCheckTable(void); void missingnameRecurs(node *p); /** * namesGetBucket - return the bucket for a given name */ long namesGetBucket(plotstring searchname) { long i; long sum = 0; for (i = 0; (i < MAXNCH) && (searchname[i] != '\0'); i++) { sum += searchname[i]; } return (sum % NUM_BUCKETS); } /** * namesAdd - add a name to the hash table * * The argument is added at the head of the appropriate linked list. 
No * checking is done for duplicates. The caller can call * namesSearch to check for an existing name prior to calling * namesAdd. */ void namesAdd(plotstring addname) { long bucket = namesGetBucket(addname); namenode *hp, *temp; temp = hashp[bucket]; hashp[bucket] = (namenode *)Malloc(sizeof(namenode)); hp = hashp[bucket]; strcpy(hp->naym, addname); hp->next = temp; hp->hitCount = 0; } /** * namesSearch - search for a name in the hash table * * Return true if the name is found, else false. */ boolean namesSearch(plotstring searchname) { long i = namesGetBucket(searchname); namenode *p; p = hashp[i]; if (p == NULL) { return false; } do { if (strcmp(searchname,p->naym) == 0) { p->hitCount++; return true; } p = p->next; } while (p != NULL); return false; } /** * Go through hash table and check that the hit count on all entries is one. * If it is zero, then a species was missed, if it is two, then there is a * duplicate species. */ void namesCheckTable(void) { namenode *p; long i; for (i=0; i< NUM_BUCKETS; i++) { p = hashp[i]; while (p != NULL){ if(p->hitCount >1){ printf("\n\nERROR in user tree: duplicate name found: "); puts(p->naym); printf("\n\n"); exxit(-1); } else if(p->hitCount == 0){ printf("\n\nERROR in user tree: name %s not found\n\n\n", p->naym); exxit(-1); } p->hitCount = 0; p = p->next; } } } /** * namesClearTable - empty names out of the table and * return allocated memory */ void namesClearTable(void) { long i; namenode *p, *temp; for (i=0; i< NUM_BUCKETS; i++) { p = hashp[i]; if (p != NULL) { do { temp = p; p = p->next; free(temp); } while (p != NULL); hashp[i] = NULL; } } } /* end hash table code */ void initconsnode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ long i; char c; boolean minusread; double valyew, divisor, fracchange; switch (whichinit) { case bottom: gnu(grbg, p); (*p)->index = 
nodei; (*p)->tip = false; for (i = 0; i < MAXNCH; i++) (*p)->nayme[i] = '\0'; nodep[(*p)->index - 1] = (*p); (*p)->v = 0; break; case nonbottom: gnu(grbg, p); (*p)->index = nodei; (*p)->v = 0; break; case tip: (*ntips)++; gnu(grbg, p); nodep[(*ntips) - 1] = *p; setupnode(*p, *ntips); (*p)->tip = true; strncpy ((*p)->nayme, str, MAXNCH); if (firsttree && prntsets) { fprintf(outfile, " %ld. ", *ntips); for (i = 0; i < len; i++) putc(str[i], outfile); putc('\n', outfile); if ((*ntips > 0) && (((*ntips) % 10) == 0)) putc('\n', outfile); } (*p)->v = 0; break; case length: processlength(&valyew, &divisor, ch, &minusread, intree, parens); fracchange = 1.0; (*p)->v = valyew / divisor / fracchange; break; case treewt: if (!eoln(intree)) { if (fscanf(intree, "%lf", &trweight) == 1) { getch(ch, parens, intree); if (*ch != ']') { printf("\n\nERROR: Missing right square bracket\n\n"); exxit(-1); } else { getch(ch, parens, intree); if (*ch != ';') { printf("\n\nERROR: Missing semicolon after square brackets\n\n"); exxit(-1); } } } else { printf("\n\nERROR: Expecting tree weight in last comment field\n\n"); exxit(-1); } } break; case unittrwt: /* This comes not only when setting trweight but also at the end of * any tree. The following code saves the current position in a * file and reads to a new line. If there is a new line then we're * at the end of tree, otherwise warn the user. 
This function should * really leave the file alone, so once we're done with 'intree' * we seek the position back so that it doesn't look like we did * anything */ trweight = 1.0 ; i = ftell (intree); c = ' '; while (c == ' ') { if (eoff(intree)) { fseek(intree,i,SEEK_SET); return; } c = gettc(intree); } fseek(intree,i,SEEK_SET); if ( c != '\n' && c!= '\r') printf("WARNING: Tree weight set to 1.0\n"); if ( c == '\r' ) if ( (c = gettc(intree)) != '\n') ungetc(c, intree); break; case hsnolength: (*p)->v = -1; /* signal value that a length is missing */ break; default: /* cases hslength, iter */ break; /* should there be an error message here?*/ } } /* initconsnode */ void censor(void) { /* delete groups that are too rare to be in the consensus tree */ long i; i = 1; do { if (timesseen[i-1]) if (!(mre || (mr && (2*(*timesseen[i-1]) > ntrees)) || (ml && ((*timesseen[i-1]) > mlfrac*ntrees)) || (strict && ((*timesseen[i-1]) == ntrees)))) { free(grouping[i - 1]); free(timesseen[i - 1]); grouping[i - 1] = NULL; timesseen[i - 1] = NULL; } i++; } while (i < maxgrp); } /* censor */ void compress(long *n) { /* push all the nonempty subsets to the front end of their array */ long i, j; i = 1; j = 1; do { while (grouping[i - 1] != NULL) i++; if (j <= i) j = i + 1; while ((grouping[j - 1] == NULL) && (j < maxgrp)) j++; if (j < maxgrp) { grouping[i - 1] = (group_type *)Malloc(setsz * sizeof(group_type)); timesseen[i - 1] = (double *)Malloc(sizeof(double)); memcpy(grouping[i - 1], grouping[j - 1], setsz * sizeof(group_type)); *timesseen[i - 1] = *timesseen[j - 1]; free(grouping[j - 1]); free(timesseen[j - 1]); grouping[j - 1] = NULL; timesseen[j - 1] = NULL; } } while (j != maxgrp); (*n) = i - 1; } /* compress */ void sort(long n) { /* Shell sort keeping grouping, timesseen in same order */ long gap, i, j; group_type *stemp; double rtemp; gap = n / 2; stemp = (group_type *)Malloc(setsz * sizeof(group_type)); while (gap > 0) { for (i = gap + 1; i <= n; i++) { j = i - 
gap; while (j > 0) { if (*timesseen[j - 1] < *timesseen[j + gap - 1]) { memcpy(stemp, grouping[j - 1], setsz * sizeof(group_type)); memcpy(grouping[j - 1], grouping[j + gap - 1], setsz * sizeof(group_type)); memcpy(grouping[j + gap - 1], stemp, setsz * sizeof(group_type)); rtemp = *timesseen[j - 1]; *timesseen[j - 1] = *timesseen[j + gap - 1]; *timesseen[j + gap - 1] = rtemp; } j -= gap; } } gap /= 2; } free(stemp); } /* sort */ boolean compatible(long i, long j) { /* are groups i and j compatible? */ boolean comp; long k; comp = true; for (k = 0; k < setsz; k++) if ((grouping[i][k] & grouping[j][k]) != 0) comp = false; if (!comp) { comp = true; for (k = 0; k < setsz; k++) if ((grouping[i][k] & ~grouping[j][k]) != 0) comp = false; if (!comp) { comp = true; for (k = 0; k < setsz; k++) if ((grouping[j][k] & ~grouping[i][k]) != 0) comp = false; if (!comp) { comp = noroot; if (comp) { for (k = 0; k < setsz; k++) if ((fullset[k] & ~grouping[i][k] & ~grouping[j][k]) != 0) comp = false; } } } } return comp; } /* compatible */ void eliminate(long *n, long *n2) { /* eliminate groups incompatible with preceding ones */ long i, j, k; boolean comp; for (i = 2; i <= (*n); i++) { comp = true; for (j = 0; comp && (j <= i - 2); j++) { if ((timesseen[j] != NULL) && *timesseen[j] > 0) { comp = compatible(i-1,j); if (!comp) { (*n2)++; times2[(*n2) - 1] = (double *)Malloc(sizeof(double)); group2[(*n2) - 1] = (group_type *)Malloc(setsz * sizeof(group_type)); *times2[(*n2) - 1] = *timesseen[i - 1]; memcpy(group2[(*n2) - 1], grouping[i - 1], setsz * sizeof(group_type)); *timesseen[i - 1] = 0.0; for (k = 0; k < setsz; k++) grouping[i - 1][k] = 0; } } } if (*timesseen[i - 1] == 0.0) { free(grouping[i - 1]); free(timesseen[i - 1]); timesseen[i - 1] = NULL; grouping[i - 1] = NULL; } } } /* eliminate */ void printset(long n) { /* print out the n sets of species */ long i, j, k, size; boolean noneprinted; fprintf(outfile, "\nSet (species in order) "); for (i = 1; i <= spp - 25; i++) putc(' ', 
outfile); fprintf(outfile, " How many times out of %7.2f\n\n", ntrees); noneprinted = true; for (i = 0; i < n; i++) { if ((timesseen[i] != NULL) && (*timesseen[i] > 0)) { size = 0; k = 0; for (j = 1; j <= spp; j++) { if (j == ((k+1)*SETBITS+1)) k++; if (((1L << (j - 1 - k*SETBITS)) & grouping[i][k]) != 0) size++; } if (size != 1 && !(noroot && size >= (spp-1))) { noneprinted = false; k = 0; for (j = 1; j <= spp; j++) { if (j == ((k+1)*SETBITS+1)) k++; if (((1L << (j - 1 - k*SETBITS)) & grouping[i][k]) != 0) putc('*', outfile); else putc('.', outfile); if (j % 10 == 0) putc(' ', outfile); } for (j = 1; j <= 23 - spp; j++) putc(' ', outfile); fprintf(outfile, " %5.2f\n", *timesseen[i]); } } } if (noneprinted) fprintf(outfile, " NONE\n"); } /* printset */ void bigsubset(group_type *st, long n) { /* Find a maximal subset of st among the n groupings, to be the set at the base of the tree. */ long i, j; group_type *su; boolean max, same; su = (group_type *)Malloc(setsz * sizeof(group_type)); for (i = 0; i < setsz; i++) su[i] = 0; for (i = 0; i < n; i++) { max = true; for (j = 0; j < setsz; j++) if ((grouping[i][j] & ~st[j]) != 0) max = false; if (max) { same = true; for (j = 0; j < setsz; j++) if (grouping[i][j] != st[j]) same = false; max = !same; } if (max) { for (j = 0; j < setsz; j ++) if ((su[j] & ~grouping[i][j]) != 0) max = false; if (max) { same = true; for (j = 0; j < setsz; j ++) if (su[j] != grouping[i][j]) same = false; max = !same; } if (max) memcpy(su, grouping[i], setsz * sizeof(group_type)); } } memcpy(st, su, setsz * sizeof(group_type)); free(su); } /* bigsubset */ void recontraverse(node **p, group_type *st, long n, long *nextnode) { /* traverse to add next node to consensus tree */ long i, j = 0, k = 0, l = 0; boolean found, same = 0, zero, zero2; group_type *tempset, *st2; node *q, *r; for (i = 1; i <= spp; i++) { /* count species in set */ if (i == ((l+1)*SETBITS+1)) l++; if (((1L << (i - 1 - l*SETBITS)) & st[l]) != 0) { k++; /* k is the number of 
species in the set */ j = i; /* j is set to last species in the set */ } } if (k == 1) { /* if only 1, set up that tip */ *p = nodep[j - 1]; (*p)->tip = true; (*p)->index = j; return; } gnu(&grbg, p); /* otherwise make interior node */ (*p)->tip = false; (*p)->index = *nextnode; nodep[*nextnode - 1] = *p; (*nextnode)++; (*p)->deltav = 0.0; for (i = 0; i < n; i++) { /* go through all sets */ same = true; /* to find one which is this one */ for (j = 0; j < setsz; j++) if (grouping[i][j] != st[j]) same = false; if (same) (*p)->deltav = *timesseen[i]; } tempset = (group_type *)Malloc(setsz * sizeof(group_type)); memcpy(tempset, st, setsz * sizeof(group_type)); q = *p; st2 = (group_type *)Malloc(setsz * sizeof(group_type)); memcpy(st2, st, setsz * sizeof(group_type)); zero = true; /* having made two copies of the set ... */ for (j = 0; j < setsz; j++) /* see if they are empty */ if (tempset[j] != 0) zero = false; if (!zero) bigsubset(tempset, n); /* find biggest set within it */ zero = zero2 = false; /* ... 
tempset is that subset */ while (!zero && !zero2) { zero = zero2 = true; for (j = 0; j < setsz; j++) { if (st2[j] != 0) zero = false; if (tempset[j] != 0) zero2 = false; } if (!zero && !zero2) { gnu(&grbg, &q->next); q->next->index = q->index; q = q->next; q->tip = false; r = *p; recontraverse(&q->back, tempset, n, nextnode); /* put it on tree */ *p = r; q->back->back = q; for (j = 0; j < setsz; j++) st2[j] &= ~tempset[j]; /* remove that subset from the set */ memcpy(tempset, st2, setsz * sizeof(group_type)); /* that becomes set */ found = false; i = 1; while (!found && i <= n) { if (grouping[i - 1] != 0) { same = true; for (j = 0; j < setsz; j++) if (grouping[i - 1][j] != tempset[j]) same = false; } if ((grouping[i - 1] != 0) && same) found = true; else i++; } zero = true; for (j = 0; j < setsz; j++) if (tempset[j] != 0) zero = false; if (!zero && !found) bigsubset(tempset, n); } } q->next = *p; free(tempset); free(st2); } /* recontraverse */ void reconstruct(long n) { /* reconstruct tree from the subsets */ long nextnode; group_type *s; nextnode = spp + 1; s = (group_type *)Malloc(setsz * sizeof(group_type)); memcpy(s, fullset, setsz * sizeof(group_type)); recontraverse(&root, s, n, &nextnode); free(s); } /* reconstruct */ void coordinates(node *p, long *tipy) { /* establishes coordinates of nodes */ node *q, *first, *last; long maxx; if (p->tip) { p->xcoord = 0; p->ycoord = *tipy; p->ymin = *tipy; p->ymax = *tipy; (*tipy) += down; return; } q = p->next; maxx = 0; while (q != p) { coordinates(q->back, tipy); if (!q->back->tip) { if (q->back->xcoord > maxx) maxx = q->back->xcoord; } q = q->next; } first = p->next->back; q = p; while (q->next != p) q = q->next; last = q->back; p->xcoord = maxx + OVER; p->ycoord = (long)((first->ycoord + last->ycoord) / 2); p->ymin = first->ymin; p->ymax = last->ymax; } /* coordinates */ void drawline(long i) { /* draws one row of the tree diagram by moving up tree */ node *p, *q; long n, j; boolean extra, done, trif; node *r, 
*first = NULL, *last = NULL; boolean found; p = root; q = root; fprintf(outfile, " "); extra = false; trif = false; do { if (!p->tip) { found = false; r = p->next; while (r != p && !found) { if (i >= r->back->ymin && i <= r->back->ymax) { q = r->back; found = true; } else r = r->next; } first = p->next->back; r = p; while (r->next != p) r = r->next; last = r->back; } done = (p->tip || p == q); n = p->xcoord - q->xcoord; if (extra) { n--; extra = false; } if (q->ycoord == i && !done) { if (trif) putc('-', outfile); else putc('+', outfile); trif = false; if (!q->tip) { for (j = 1; j <= n - 8; j++) putc('-', outfile); if (noroot && (root->next->next->next == root) && (((root->next->back == q) && root->next->next->back->tip) || ((root->next->next->back == q) && root->next->back->tip))) fprintf(outfile, "-------|"); else { if (!strict) { /* write number of times seen */ if (q->deltav >= 10000) fprintf(outfile, "-%5.0f-|", (double)q->deltav); else if (q->deltav >= 1000) fprintf(outfile, "--%4.0f-|", (double)q->deltav); else if (q->deltav >= 100) fprintf(outfile, "-%5.1f-|", (double)q->deltav); else if (q->deltav >= 10) fprintf(outfile, "--%4.1f-|", (double)q->deltav); else fprintf(outfile, "--%4.2f-|", (double)q->deltav); } else fprintf(outfile, "-------|"); } extra = true; trif = true; } else { for (j = 1; j < n; j++) putc('-', outfile); } } else if (!p->tip && last->ycoord > i && first->ycoord < i && (i != p->ycoord || p == root)) { putc('|', outfile); for (j = 1; j < n; j++) putc(' ', outfile); } else { for (j = 1; j <= n; j++) putc(' ', outfile); if (trif) trif = false; } if (q != p) p = q; } while (!done); if (p->ycoord == i && p->tip) { for (j = 0; (j < MAXNCH) && (p->nayme[j] != '\0'); j++) putc(p->nayme[j], outfile); } putc('\n', outfile); } /* drawline */ void printree() { /* prints out diagram of the tree */ long i; long tipy; if (treeprint) { fprintf(outfile, "\nCONSENSUS TREE:\n"); if (mr || mre || ml) { if (noroot) { fprintf(outfile, "the numbers on the 
branches indicate the number\n"); fprintf(outfile, "of times the partition of the species into the two sets\n"); fprintf(outfile, "which are separated by that branch occurred\n"); } else { fprintf(outfile, "the numbers at the forks indicate the number\n"); fprintf(outfile, "of times the group consisting of the species\n"); fprintf(outfile, "which are to the right of that fork occurred\n"); } fprintf(outfile, "among the trees, out of %6.2f trees\n", ntrees); if (ntrees <= 1.001) fprintf(outfile, "(trees had fractional weights)\n"); } tipy = 1; coordinates(root, &tipy); putc('\n', outfile); for (i = 1; i <= tipy - down; i++) drawline(i); putc('\n', outfile); } if (noroot) { fprintf(outfile, "\n remember:"); if (didreroot) fprintf(outfile, " (though rerooted by outgroup)"); fprintf(outfile, " this is an unrooted tree!\n"); } putc('\n', outfile); } /* printree */ void enterpartition (group_type *s1, long *n) { /* try to put this partition in list of partitions. If implied by others, don't bother. If others implied by it, replace them. 
If this one vacuous because only one element in s1, forget it */ long i, j; boolean found; /* this stuff all to be rewritten but left here so pieces can be used */ found = false; for (i = 0; i < (*n); i++) { /* go through looking whether it is there */ found = true; for (j = 0; j < setsz; j++) { /* check both parts of partition */ found = found && (grouping[i][j] == s1[j]); found = found && (group2[i][j] == (fullset[j] & (~s1[j]))); } if (found) break; } if (!found) { /* if not, add it to the slot after the end, which must be empty */ grouping[i] = (group_type *)Malloc(setsz * sizeof(group_type)); timesseen[i] = (double *)Malloc(sizeof(double)); group2[i] = (group_type *)Malloc(setsz * sizeof(group_type)); for (j = 0; j < setsz; j++) grouping[i][j] = s1[j]; *timesseen[i] = 1; (*n)++; } } /* enterpartition */ void elimboth(long n) { /* for Adams case: eliminate pairs of groups incompatible with each other */ long i, j; boolean comp; for (i = 0; i < n-1; i++) { for (j = i+1; j < n; j++) { comp = compatible(i,j); if (!comp) { *timesseen[i] = 0.0; *timesseen[j] = 0.0; } } if (*timesseen[i] == 0.0) { free(grouping[i]); free(timesseen[i]); timesseen[i] = NULL; grouping[i] = NULL; } } if (*timesseen[n-1] == 0.0) { free(grouping[n-1]); free(timesseen[n-1]); timesseen[n-1] = NULL; grouping[n-1] = NULL; } } /* elimboth */ void consensus(pattern_elm ***pattern_array, long trees_in) { long i, n, n2, tipy; group2 = (group_type **) Malloc(maxgrp*sizeof(group_type *)); for (i = 0; i < maxgrp; i++) group2[i] = NULL; times2 = (double **)Malloc(maxgrp*sizeof(double *)); for (i = 0; i < maxgrp; i++) times2[i] = NULL; n2 = 0; censor(); /* drop groups that are too rare */ compress(&n); /* push everybody to front of array */ if (!strict) { /* drop those incompatible, if any */ sort(n); eliminate(&n, &n2); compress(&n); } reconstruct(n); tipy = 1; coordinates(root, &tipy); if (prntsets) { fprintf(outfile, "\nSets included in the consensus tree\n"); printset(n); for (i = 0; i < n2; i++) { 
if (!grouping[i]) { grouping[i] = (group_type *)Malloc(setsz * sizeof(group_type)); timesseen[i] = (double *)Malloc(sizeof(double)); } memcpy(grouping[i], group2[i], setsz * sizeof(group_type)); *timesseen[i] = *times2[i]; } n = n2; fprintf(outfile, "\n\nSets NOT included in consensus tree:"); if (n2 == 0) fprintf(outfile, " NONE\n"); else { putc('\n', outfile); printset(n); } } putc('\n', outfile); if (strict) fprintf(outfile, "\nStrict consensus tree\n"); if (mre) fprintf(outfile, "\nExtended majority rule consensus tree\n"); if (ml) { fprintf(outfile, "\nM consensus tree (l = %4.2f)\n", mlfrac); fprintf(outfile, " l\n"); } if (mr) fprintf(outfile, "\nMajority rule consensus tree\n"); printree(); free(nayme); for (i = 0; i < maxgrp; i++) free(grouping[i]); free(grouping); for (i = 0; i < maxgrp; i++) free(order[i]); free(order); for (i = 0; i < maxgrp; i++) if (timesseen[i] != NULL) free(timesseen[i]); free(timesseen); } /* consensus */ void rehash() { group_type *s; long i, j; double temp, ss, smult; boolean done; long old_maxgrp = maxgrp; long new_maxgrp = maxgrp*2; tmseen2 = (double **)Malloc(new_maxgrp*sizeof(double *)); grping2 = (group_type **)Malloc(new_maxgrp*sizeof(group_type *)); order2 = (long **)Malloc(new_maxgrp*sizeof(long *)); lengths2 = (double *)Malloc(new_maxgrp*sizeof(double)); tchange2 = (double *)Malloc(new_maxgrp*sizeof(double)); for (i = 0; i < new_maxgrp; i++) { tmseen2[i] = NULL; grping2[i] = NULL; order2[i] = NULL; lengths2[i] = 0.0; tchange2[i] = 0.0; } smult = (sqrt(5.0) - 1) / 2; s = (group_type *)Malloc(setsz * sizeof(group_type)); for (i = 0; i < old_maxgrp; i++) { long old_index = *order[i]; long new_index = -1; memcpy(s, grouping[old_index], setsz * sizeof(group_type)); ss = 0.0; for (j = 0; j < setsz; j++) ss += s[j] * smult; /* pow(2, SETBITS*j)*/; temp = ss; new_index = (long)(new_maxgrp * (temp - floor(temp))); done = false; while (!done) { if (!grping2[new_index]) { grping2[new_index] = (group_type *)Malloc(setsz * 
sizeof(group_type)); memcpy(grping2[new_index], grouping[old_index], setsz * sizeof(group_type)); order2[i] = (long *)Malloc(sizeof(long)); *order2[i] = new_index; tmseen2[new_index] = (double *)Malloc(sizeof(double)); *tmseen2[new_index] = *timesseen[old_index]; lengths2[new_index] = lengths[old_index]; free(grouping[old_index]); free(timesseen[old_index]); free(order[i]); grouping[old_index] = NULL; timesseen[old_index] = NULL; order[i] = NULL; done = true; /* successfully found place for this item */ } else { new_index++; if (new_index >= new_maxgrp) new_index -= new_maxgrp; } } } free(lengths); free(timesseen); free(grouping); free(order); free(s); timesseen = tmseen2; grouping = grping2; lengths = lengths2; order = order2; maxgrp = new_maxgrp; } /* rehash */ void enternodeset(node* r) { /* enter a set of species into the hash table */ long i, j, start; double ss, n; boolean done, same; double times ; group_type *s; s = r->nodeset; /* do not enter full sets */ same = true; for (i = 0; i < setsz; i++) if (s[i] != fullset[i]) same = false; if (same) return; times = trweight; ss = 0.0; /* compute the hashcode for the set */ n = ((sqrt(5.0) - 1.0) / 2.0); /* use an irrational multiplier */ for (i = 0; i < setsz; i++) ss += s[i] * n; i = (long)(maxgrp * (ss - floor(ss))) + 1; /* use fractional part of code */ start = i; done = false; /* go through seeing if it is there */ while (!done) { if (grouping[i - 1]) { /* ... i.e. 
if group is absent, or */ same = false; /* (will be false if timesseen = 0) */ if (!(timesseen[i-1] == 0)) { same = true; for (j = 0; j < setsz; j++) { if (s[j] != grouping[i - 1][j]) same = false; } } else { /* if group is present but timesseen = 0 */ for (j = 0; j < setsz; j++) /* replace by correct group */ grouping[i - 1][j] = s[j]; *timesseen[i-1] = 1; } } if (grouping[i - 1] && same) { /* if it is there, increment timesseen */ *timesseen[i - 1] += times; lengths[i - 1] = nodep[r->index - 1]->v; done = true; } else if (!grouping[i - 1]) { /* if not there and slot empty ... */ grouping[i - 1] = (group_type *)Malloc(setsz * sizeof(group_type)); lasti++; order[lasti] = (long *)Malloc(sizeof(long)); timesseen[i - 1] = (double *)Malloc(sizeof(double)); memcpy(grouping[i - 1], s, setsz * sizeof(group_type)); *timesseen[i - 1] = times; *order[lasti] = i - 1; done = true; lengths[i - 1] = nodep[r->index -1]->v; } else { /* otherwise look to put it in next slot ... */ i++; if (i > maxgrp) i -= maxgrp; } if (!done && i == start) { /* if no place to put it, expand hash table */ rehash(); done = true; enternodeset(r); /* calls this procedure again, but now there should be space */ } } } /* enternodeset */ /* recursively crawls through tree, setting nodeset values to be the * bitwise OR of bits from downstream nodes */ void accumulate(node *r) { node *q; long i; /* zero out nodeset values. since we are re-using tree nodes, * the malloc only happens the first time we encounter a node. 
*/ if (!r->nodeset) { r->nodeset = (group_type *)Malloc(setsz * sizeof(group_type)); } for (i = 0; i < setsz; i++) { r->nodeset[i] = 0L; } if (r->tip) { /* tip nodes should have a single bit set corresponding to index-1 */ i = (r->index-1) / (long)SETBITS; r->nodeset[i] = 1L << (r->index - 1 - i*SETBITS); } else { /* for loop should not visit r->back -- we've likely come from there */ for (q = r->next; q != r; q = q->next) { /* recursive call to this function */ accumulate(q->back); /* bitwise OR of bits from downstream nodes */ for (i = 0; i < setsz; i++) r->nodeset[i] |= q->back->nodeset[i]; } } if ((!r->tip && (r->next->next != r)) || r->tip) enternodeset(r); } /* accumulate */ void dupname2(Char *name, node *p, node *this) { /* search for a duplicate name recursively */ node *q; if (p->tip) { if (p != this) { if (namesSearch(p->nayme)) { printf("\n\nERROR in user tree: duplicate name found: "); puts(p->nayme); printf("\n\n"); exxit(-1); } else { namesAdd(p->nayme); } } } else { q = p; while (p->next != q) { dupname2(name, p->next->back, this); p = p->next; } } } /* dupname2 */ void dupname(node *p) { /* Recursively searches tree, starting at p, to verify that * each tip name occurs only once. When called with root as * its argument, at final recursive exit, all tip names should * be in the hash "hashp". 
*/ node *q; if (p->tip) { if (namesSearch(p->nayme)) { printf("\n\nERROR in user tree: duplicate name found: "); puts(p->nayme); printf("\n\n"); exxit(-1); } else { namesAdd(p->nayme); } } else { q = p; while (p->next != q) { dupname(p->next->back); p = p->next; } } } /* dupname */ void missingnameRecurs(node *p) { /* search for missing names in first tree */ node *q; if (p->tip) { if (!namesSearch(p->nayme)) { printf("\n\nERROR in user tree: name %s not found in first tree\n\n\n", p->nayme); exxit(-1); } } else { q = p; while (p->next != q) { missingnameRecurs(p->next->back); p = p->next; } } } /* missingnameRecurs */ /** * wrapper for recursive missingname function */ void missingname(node *p){ missingnameRecurs(p); namesCheckTable(); } /* missingname */ void gdispose(node *p) { /* go through tree throwing away nodes */ node *q, *r; if (p->tip) { chuck(&grbg, p); return; } q = p->next; while (q != p) { gdispose(q->back); r = q; q = q->next; chuck(&grbg, r); } chuck(&grbg, p); } /* gdispose */ void initreenode(node *p) { /* traverse tree and assign species names to tip nodes */ node *q; if (p->tip) { memcpy(nayme[p->index - 1], p->nayme, MAXNCH); } else { q = p->next; while (q && q != p) { initreenode(q->back); q = q->next; } } } /* initreenode */ void reroot(node *outgroup, long *nextnode) { /* reroots and reorients tree, placing root at outgroup */ long i; node *p, *q; double newv; /* count root's children & find last */ p = root; i = 0; while (p->next != root) { p = p->next; i++; } if (i == 2) { /* 2 children: */ q = root->next; newv = q->back->v + p->back->v; /* if outgroup is already here, just move * its length to the other branch and finish */ if (outgroup == p->back) { /* flip branch order at root so that outgroup * is first, just to be consistent */ root->next = p; p->next = q; q->next = root; q->back->v = newv; p->back->v = 0; return; } if (outgroup == q) { p->back->v = newv; q->back->v = 0; return; } /* detach root by linking child nodes */ 
q->back->back = p->back; p->back->back = q->back; p->back->v = newv; q->back->v = newv; } else { /* 3+ children */ p->next = root->next; /* join old root nodes */ nodep[root->index-1] = root->next; /* make root->next the primary node */ /* create new root nodes */ gnu(&grbg, &root->next); q = root->next; gnu(&grbg, &q->next); p = q->next; p->next = root; q->tip = false; p->tip = false; nodep[*nextnode] = root; (*nextnode)++; root->index = *nextnode; root->next->index = root->index; root->next->next->index = root->index; } newv = outgroup->v; /* root is 3 "floating" nodes */ /* q == root->next */ /* p == root->next->next */ /* attach root at outgroup */ q->back = outgroup; p->back = outgroup->back; outgroup->back->back = p; outgroup->back = q; outgroup->v = 0; outgroup->back->v = 0; root->v = 0; p->v = newv; p->back->v = newv; reorient(root); } /* reroot */ void reorient(node* n) { node* p; if ( n->tip ) return; if ( nodep[n->index - 1] != n ) { nodep[n->index - 1] = n; if ( n->back ) n->v = n->back->v; } for ( p = n->next ; p != n ; p = p->next) reorient(p->back); } void store_pattern (pattern_elm ***pattern_array, int trees_in_file) { /* put a tree's groups into a pattern array. Don't forget that when not Adams, grouping[] is not compressed. . . */ long i, total_groups=0, j=0, k; /* First, find out how many groups exist in the given tree. */ for (i = 0 ; i < maxgrp ; i++) if ((grouping[i] != NULL) && (*timesseen[i] > 0)) /* If this group exists and is present in the current tree, */ total_groups++ ; /* Then allocate a space to store the bit patterns. . . 
*/ for (i = 0 ; i < setsz ; i++) { pattern_array[i][trees_in_file] = (pattern_elm *) Malloc(sizeof(pattern_elm)) ; pattern_array[i][trees_in_file]->apattern = (group_type *) Malloc (total_groups * sizeof (group_type)) ; pattern_array[i][trees_in_file]->length = (double *) Malloc (maxgrp * sizeof (double)) ; for ( j = 0 ; j < maxgrp ; j++ ) { pattern_array[i][trees_in_file]->length[j] = -1; } pattern_array[i][trees_in_file]->patternsize = (long *)Malloc(sizeof(long)); } j = 0; /* Then go through groupings again, and copy in each element appropriately. */ for (i = 0 ; i < maxgrp ; i++) if (grouping[i] != NULL) { if (*timesseen[i] > 0) { for (k = 0 ; k < setsz ; k++) pattern_array[k][trees_in_file]->apattern[j] = grouping[i][k] ; pattern_array[0][trees_in_file]->length[j] = lengths[i]; j++ ; /* EWFIX.BUG.756 updates timesseen_changes to the current value pointed to by timesseen treedist uses this to determine if group i has been seen by comparing timesseen_changes[i] (the count now) with timesseen[i] (the count after reading next tree) We could make treedist more efficient by not keeping timesseen (and groupings, etc) around, but doing it this way allows us to share code between treedist and consense. */ *timesseen[i] = 0; } } *pattern_array[0][trees_in_file]->patternsize = total_groups; } /* store_pattern */ boolean samename(naym name1, plotstring name2) { return !(strncmp(name1, name2, MAXNCH)); } /* samename */ void reordertips() { /* Reorders nodep[] and indexing to match species order from first tree */ /* Assumes tree has spp tips and nayme[] has spp elements, and that there is a * one-to-one mapping between tip names and the names in nayme[]. */ long i, j; node *t; for (i = 0; i < spp-1; i++) { for (j = i + 1; j < spp; j++) { if (samename(nayme[i], nodep[j]->nayme)) { /* switch the pointers in * nodep[] and set index accordingly for each node. 
*/ t = nodep[i]; nodep[i] = nodep[j]; nodep[i]->index = i+1; nodep[j] = t; nodep[j]->index = j+1; break; /* next i */ } } } } /* reordertips */ void read_groups (pattern_elm ****pattern_array, long total_trees, long tip_count, FILE *intree) { /* read the trees. Accumulate sets. */ int i, j, k; boolean haslengths, initial; long nextnode, trees_read = 0; /* do allocation first *****************************************/ grouping = (group_type **) Malloc(maxgrp*sizeof(group_type *)); lengths = (double *) Malloc(maxgrp*sizeof(double)); for (i = 0; i < maxgrp; i++) grouping[i] = NULL; order = (long **) Malloc(maxgrp*sizeof(long *)); for (i = 0; i < maxgrp; i++) order[i] = NULL; timesseen = (double **)Malloc(maxgrp*sizeof(double *)); for (i = 0; i < maxgrp; i++) timesseen[i] = NULL; nayme = (naym *)Malloc(tip_count*sizeof(naym)); hashp = (hashtype)Malloc(sizeof(namenode) * NUM_BUCKETS); for (i=0;i= 1); if (!done1) { printf("ERROR: Bad outgroup number: %ld\n", outgrno); printf(" Must be greater than zero\n"); } countup(&loopcount2, 10); } while (done1 != true); } break; case 'R': noroot = !noroot; break; case 'T': initterminal(&ibmpc, &ansi); break; case '1': prntsets = !prntsets; break; case '2': progress = !progress; break; case '3': treeprint = !treeprint; break; case '4': trout = !trout; break; } } else printf("Not a possible option!\n"); } countup(&loopcount, 100); } while (!done); if (ml) { do { printf("\nFraction (l) of times a branch must appear\n"); fflush(stdout); scanf("%lf%*[^\n]", &mlfrac); getchar(); } while ((mlfrac < 0.5) || (mlfrac > 1.0)); } } /* getoptions */ void count_siblings(node **p) { node *tmp_node; int i; if (!(*p)) { /* This is a leaf, */ return; } else { tmp_node = (*p)->next; } for (i = 0 ; i < 1000; i++) { if (tmp_node == (*p)) { /* When we've gone through all the siblings, */ break; } else if (tmp_node) { tmp_node = tmp_node->next; } else { /* Should this be executed? 
*/ return ; } } } /* count_siblings */ void treeout(node *p) { /* write out file with representation of final tree */ long i, n = 0; Char c; node *q; double x; count_siblings (&p); if (p->tip) { /* If we're at a node which is a leaf, figure out how long the name is and print it out. */ for (i = 1; i <= MAXNCH; i++) { if (p->nayme[i - 1] != '\0') n = i; } for (i = 0; i < n; i++) { c = p->nayme[i]; if (c == ' ') c = '_'; putc(c, outtree); } col += n; } else { /* If we're at a furcation, print out the proper formatting, loop through all the children, calling the procedure recursively. */ putc('(', outtree); col++; q = p->next; while (q != p) { /* This should terminate when we've gone through all the siblings, */ treeout(q->back); q = q->next; if (q == p) break; putc(',', outtree); col++; if (col > 60) { putc('\n', outtree); col = 0; } } putc(')', outtree); col++; } if (p->tip) x = ntrees; else x = (double)p->deltav; if (p == root) { /* When we're all done with this tree, */ fprintf(outtree, ";\n"); return; } /* Figure out how many characters the branch length requires: */ else { if (!strict) { if (x >= 100.0) { fprintf(outtree, ":%5.1f", x); col += 4; } else if (x >= 10.0) { fprintf(outtree, ":%4.1f", x); col += 3; } else if (x >= 1.00) { fprintf(outtree, ":%4.2f", x); col += 3; } } } } /* treeout */ int main(int argc, Char *argv[]) { /* Local variables added by Dan F. */ pattern_elm ***pattern_array; long trees_in = 0; long i, j; long tip_count = 0; node *p, *q; #ifdef MAC argc = 1; /* macsetup("Consense", ""); */ argv[0] = "Consense"; #endif init(argc, argv); /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree, INTREE, "input tree file", "rb", argv[0], intreename); openfile(&outfile, OUTFILE, "output file", "w", argv[0], outfilename); /* Initialize option-based variables, then ask for changes regarding their values. 
*/ getoptions(); ntrees = 0.0; maxgrp = 32767; /* initial size of set hash table */ lasti = -1; if (trout) openfile(&outtree, OUTTREE, "output tree file", "w", argv[0], outtreename); if (prntsets) fprintf(outfile, "Species in order: \n\n"); trees_in = countsemic(&intree); countcomma(&intree,&tip_count); tip_count++; /* countcomma does a raw comma count, tips is one greater */ /* Read the tree file and put together grouping, order, and timesseen */ read_groups (&pattern_array, trees_in, tip_count, intree); /* Compute the consensus tree. */ putc('\n', outfile); nodep = (pointarray)Malloc(2*(1+spp)*sizeof(node *)); for (i = 0; i < spp; i++) { nodep[i] = (node *)Malloc(sizeof(node)); for (j = 0; j < MAXNCH; j++) nodep[i]->nayme[j] = '\0'; strncpy(nodep[i]->nayme, nayme[i], MAXNCH); } for (i = spp; i < 2*(1+spp); i++) nodep[i] = NULL; consensus(pattern_array, trees_in); printf("\n"); if (trout) { treeout(root); if (progress) printf("Consensus tree written to file \"%s\"\n\n", outtreename); } if (progress) printf("Output written to file \"%s\"\n\n", outfilename); for (i = 0; i < spp; i++) free(nodep[i]); for (i = spp; i < 2*(1 + spp); i++) { if (nodep[i] != NULL) { p = nodep[i]->next; do { q = p->next; free(p); p = q; } while (p != nodep[i]); free(p); } } free(nodep); FClose(outtree); FClose(intree); FClose(outfile); #ifdef MAC fixmacfile(outfilename); fixmacfile(outtreename); #endif printf("Done.\n\n"); #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } /* main */ phylip-3.697/src/cont.c0000644004732000473200000001774112407037161014456 0ustar joefelsenst_g/* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1999-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. 
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/

#include <float.h>

#include "phylip.h"
#include "cont.h"


void alloctree(pointarray *treenode, long nonodes)
{
  /* allocate treenode dynamically */
  /* used in contml & contrast */
  long i, j;
  node *p, *q;

  *treenode = (pointarray)Malloc(nonodes*sizeof(node *));
  for (i = 0; i < spp; i++)
    (*treenode)[i] = (node *)Malloc(sizeof(node));
  for (i = spp; i < nonodes; i++) {
    q = NULL;
    for (j = 1; j <= 3; j++) {
      p = (node *)Malloc(sizeof(node));
      p->next = q;
      q = p;
    }
    p->next->next->next = p;
    (*treenode)[i] = p;
  }
} /* alloctree */


void freetree(pointarray *treenode, long nonodes)
{
  long i, j;
  node *p, *q;

  for (i = 0; i < spp; i++)
    free((*treenode)[i]);
  for (i = spp; i < nonodes; i++) {
    p = (*treenode)[i];
    for (j = 1; j <= 3; j++) {
      q = p;
      p = p->next;
      free(q);
    }
  }
  free(*treenode);
} /* freetree */


void setuptree(tree *a, long nonodes)
{
  /* initialize a tree */
  /* used in contml & contrast */
  long i, j;
  node *p;

  for (i = 1; i <= spp; i++) {
    a->nodep[i - 1]->back = NULL;
    a->nodep[i - 1]->tip = (i <= spp);
    a->nodep[i - 1]->iter = true;
    a->nodep[i - 1]->index = i;
  }
  for (i = spp + 1; i <= nonodes; i++) {
    p = a->nodep[i - 1];
    for (j = 1; j <= 3; j++) {
      p->back = NULL;
      p->tip = false;
      p->iter = true;
      p->index = i;
      p = p->next;
    }
  }
  a->likelihood = -DBL_MAX;
  a->start = a->nodep[0];
} /* setuptree */


void allocview(tree *a, long nonodes, long totalleles)
{
  /* allocate view */
  /* used in contml */
  long i, j;
  node *p;

  for (i = 0; i < spp; i++)
    a->nodep[i]->view = (phenotype3)Malloc(totalleles*sizeof(double));
  for (i = spp; i < nonodes; i++) {
    p = a->nodep[i];
    for (j = 1; j <= 3; j++) {
      p->view = (phenotype3)Malloc(totalleles*sizeof(double));
      p = p->next;
    }
  }
} /* allocview */


void freeview(tree *a, long nonodes)
{
  /* deallocate view */
  /* used in contml */
  long i, j;
  node *p;

  for (i = 0; i < spp; i++)
    free(a->nodep[i]->view);
  for (i = spp; i < nonodes; i++) {
    p = a->nodep[i];
    for (j = 1; j <= 3; j++) {
      free(p->view);
      p = p->next;
    }
  }
} /* freeview */


void standev2(long numtrees, long maxwhich,
long a, long b, double maxlogl, double *l0gl, double **l0gf, longer seed) { /* do paired sites test (KHT or SH) on user-defined trees */ /* used in contml */ double **covar, *P, *f, *r; long i, j, k; double sumw, sum, sum2, sd; double temp; #define SAMPLES 1000 if (numtrees == 2) { fprintf(outfile, "Kishino-Hasegawa-Templeton test\n\n"); fprintf(outfile, "Tree logL Diff logL Its S.D."); fprintf(outfile, " Significantly worse?\n\n"); i = 1; while (i <= numtrees) { fprintf(outfile, "%3ld%10.1f", i, l0gl[i - 1]); if (maxwhich == i) fprintf(outfile, " <------ best\n"); else { sumw = 0.0; sum = 0.0; sum2 = 0.0; for (j = a; j <= b; j++) { sumw += 1; temp = l0gf[i - 1][j] - l0gf[maxwhich - 1][j]; sum += temp; sum2 += temp * temp; } temp = sum / sumw; sd = sqrt(sumw / (sumw - 1.0) * (sum2 - temp * temp)); fprintf(outfile, "%10.1f%12.4f", (l0gl[i - 1])-maxlogl, sd); if (sum > 1.95996 * sd) fprintf(outfile, " Yes\n"); else fprintf(outfile, " No\n"); } i++; } fprintf(outfile, "\n\n"); } else { /* Shimodaira-Hasegawa test using normal approximation */ if(numtrees > MAXSHIMOTREES){ fprintf(outfile, "Shimodaira-Hasegawa test on first %d of %ld trees\n\n" , MAXSHIMOTREES, numtrees); numtrees = MAXSHIMOTREES; } else { fprintf(outfile, "Shimodaira-Hasegawa test\n\n"); } covar = (double **)Malloc(numtrees*sizeof(double *)); sumw = b-a+1; for (i = 0; i < numtrees; i++) covar[i] = (double *)Malloc(numtrees*sizeof(double)); for (i = 0; i < numtrees; i++) { /* compute covariances of trees */ sum = l0gl[i]/sumw; for (j = 0; j <=i; j++) { sum2 = l0gl[j]/sumw; temp = 0.0; for (k = a; k <= b ; k++) { temp = temp + (l0gf[i][k]-sum)*(l0gf[j][k]-sum2); } covar[i][j] = temp; if (i != j) covar[j][i] = temp; } } for (i = 0; i < numtrees; i++) { /* in-place Cholesky decomposition of trees x trees covariance matrix */ sum = 0.0; for (j = 0; j <= i-1; j++) sum = sum + covar[i][j] * covar[i][j]; temp = sqrt(covar[i][i] - sum); covar[i][i] = temp; for (j = i+1; j < numtrees; j++) { sum = 0.0; for (k = 
0; k < i; k++) sum = sum + covar[i][k] * covar[j][k]; if (fabs(temp) < 1.0E-12) covar[j][i] = 0.0; else covar[j][i] = (covar[j][i] - sum)/temp; } } f = (double *)Malloc(numtrees*sizeof(double)); /* resampled likelihoods */ P = (double *)Malloc(numtrees*sizeof(double)); /* vector of P's of trees */ r = (double *)Malloc(numtrees*sizeof(double)); /* store Normal variates */ for (i = 0; i < numtrees; i++) P[i] = 0.0; for (i = 1; i <= SAMPLES; i++) { /* loop over resampled trees */ for (j = 0; j < numtrees; j++) /* draw Normal variates */ r[j] = normrand(seed); for (j = 0; j < numtrees; j++) { /* compute vectors */ sum = 0.0; for (k = 0; k <= j; k++) sum += covar[j][k]*r[k]; f[j] = sum; } sum = f[0]; for (j = 1; j < numtrees; j++) /* get max of vector */ if (f[j] > sum) sum = f[j]; for (j = 0; j < numtrees; j++) /* accumulate P's */ if (maxlogl-l0gl[j] < sum-f[j]) P[j] += 1.0/SAMPLES; } fprintf(outfile, "Tree logL Diff logL P value"); fprintf(outfile, " Significantly worse?\n\n"); for (i = 0; i < numtrees; i++) { fprintf(outfile, "%3ld%10.1f", i+1, l0gl[i]); if ((maxwhich-1) == i) fprintf(outfile, " <------ best\n"); else { fprintf(outfile, " %9.1f %10.3f", l0gl[i]-maxlogl, P[i]); if (P[i] < 0.05) fprintf(outfile, " Yes\n"); else fprintf(outfile, " No\n"); } } fprintf(outfile, "\n"); free(P); /* free the variables we Malloc'ed */ free(f); free(r); for (i = 0; i < numtrees; i++) free(covar[i]); free(covar); } } /* standev2 */ phylip-3.697/src/cont.h0000644004732000473200000000346512407037226014463 0ustar joefelsenst_g /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. 
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* cont.h: included in contml & contrast */ #ifndef OLDC /*function prototypes*/ void alloctree(pointarray *, long); void freetree(pointarray *, long); void setuptree(tree *, long); void allocview(tree *, long, long); void freeview(tree *, long); void standev2(long, long, long, long, double, double *, double **, longer); /*function prototypes*/ #endif phylip-3.697/src/contml.c0000644004732000473200000012037112406201116014771 0ustar joefelsenst_g/* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. 
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include #include "phylip.h" #include "cont.h" #define epsilon1 0.000001 /* small number */ #define epsilon2 0.02 /* not such a small number */ #define smoothings 4 /* number of passes through smoothing algorithm */ #define over 60 #ifndef OLDC /* function prototypes */ void getoptions(void); void allocrest(void); void doinit(void); void getalleles(void); void inputdata(void); void transformgfs(void); void getinput(void); void sumlikely(node *, node *, double *); double evaluate(tree *); double distance(node *, node *); void makedists(node *); void makebigv(node *, boolean *); void correctv(node *); void littlev(node *); void nuview(node *); void update(node *); void smooth(node *); void insert_(node *, node *); void copynode(node *, node *); void copy_(tree *, tree *); void inittip(long, tree *); void buildnewtip(long, tree *, long); void buildsimpletree(tree *); void addtraverse(node *, node *, boolean); void re_move(node **, node **); void rearrange(node *); void coordinates(node *, double, long *, double *); void 
drawline(long, double); void printree(void); void treeout(node *); void describe(node *, double, double); void summarize(void); void nodeinit(node *); void initrav(node *); void treevaluate(void); void maketree(void); void globrearrange(void); /* function prototypes */ #endif Char infilename[FNMLNGTH], outfilename[FNMLNGTH], intreename[FNMLNGTH], outtreename[FNMLNGTH]; long nonodes2, loci, totalleles, df, outgrno, col, datasets, ith, njumble, jumb=0; long inseed, inseed0; long *alleles, *locus, *weight; phenotype3 *x; boolean all, contchars, global, jumble, lengths, outgropt, trout, usertree, printdata, progress, treeprint, mulsets, firstset; longer seed; long *enterorder; tree curtree, priortree, bestree, bestree2; long nextsp, numtrees, which, maxwhich, shimotrees; /* From maketree, propagated to global */ boolean succeeded; double maxlogl; double l0gl[MAXSHIMOTREES]; double *pbar, *sqrtp, *l0gf[MAXSHIMOTREES]; Char ch; char *progname; double trweight; /* added to make treeread happy */ boolean goteof; boolean haslengths; /* end of ones added to make treeread happy */ node *addwhere; void getoptions() { /* interactively set options */ long inseed0, loopcount; Char ch; boolean done; fprintf(outfile, "\nContinuous character Maximum Likelihood"); fprintf(outfile, " method version %s\n\n",VERSION); putchar('\n'); global = false; jumble = false; njumble = 1; lengths = false; outgrno = 1; outgropt = false; all = false; contchars = false; trout = true; usertree = false; printdata = false; progress = true; treeprint = true; loopcount = 0; do { cleerhome(); printf("\nContinuous character Maximum Likelihood"); printf(" method version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" U Search for best tree? %s\n", (usertree ? "No, use user trees in input" : "Yes")); if (usertree) { printf(" L Use lengths from user trees?%s\n", (lengths ? " Yes" : " No")); } printf(" C Gene frequencies or continuous characters? %s\n", (contchars ? 
"Continuous characters" : "Gene frequencies")); if (!contchars) printf(" A Input file has all alleles at each locus? %s\n", (all ? "Yes" : "No, one allele missing at each")); printf(" O Outgroup root? %s %ld\n", (outgropt ? "Yes, at species number" : "No, use as outgroup species"),outgrno); if (!usertree) { printf(" G Global rearrangements? %s\n", (global ? "Yes" : "No")); printf(" J Randomize input order of species?"); if (jumble) printf(" Yes (seed=%8ld,%3ld times)\n", inseed0, njumble); else printf(" No. Use input order\n"); } printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld sets\n", datasets); else printf(" No\n"); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" 1 Print out the data at start of run %s\n", (printdata ? "Yes" : "No")); printf(" 2 Print indications of progress of run %s\n", (progress ? "Yes" : "No")); printf(" 3 Print out tree %s\n", (treeprint ? "Yes" : "No")); printf(" 4 Write out trees onto tree file? %s\n", (trout ? 
"Yes" : "No")); printf("\n Y to accept these or type the letter for one to change\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); done = (ch == 'Y'); if (!done) { if (((!usertree) && (strchr("JOUGACM12340", ch) != NULL)) || (usertree && ((strchr("LOUACM12340", ch) != NULL)))){ switch (ch) { case 'A': if (!contchars) all = !all; break; case 'C': contchars = !contchars; break; case 'G': global = !global; break; case 'J': jumble = !jumble; if (jumble) initjumble(&inseed, &inseed0, seed, &njumble); else njumble = 1; break; case 'L': lengths = !lengths; break; case 'O': outgropt = !outgropt; if (outgropt) initoutgroup(&outgrno, spp); break; case 'U': usertree = !usertree; break; case 'M': mulsets = !mulsets; if (mulsets) initdatasets(&datasets); break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress = !progress; break; case '3': treeprint = !treeprint; break; case '4': trout = !trout; break; } } else printf("Not a possible option!\n"); } countup(&loopcount, 100); } while (!done); } /* getoptions */ void allocrest() { /* allocate arrays for number of alleles, the data coordinates, names etc */ alleles = (long *)Malloc(loci*sizeof(long)); if (contchars) locus = (long *)Malloc(loci*sizeof(long)); x = (phenotype3 *)Malloc(spp*sizeof(phenotype3)); nayme = (naym *)Malloc(spp*sizeof(naym)); enterorder = (long *)Malloc(spp*sizeof(long)); } /* allocrest */ void doinit() { /* initializes variables */ inputnumbers(&spp, &loci, &nonodes2, 1); getoptions(); if(!usertree) nonodes2--; if (printdata) fprintf(outfile, "\n%4ld Populations, %4ld Loci\n", spp, loci); alloctree(&curtree.nodep, nonodes2); if (!usertree) { alloctree(&bestree.nodep, nonodes2); alloctree(&priortree.nodep, nonodes2); if (njumble > 1) { alloctree(&bestree2.nodep, nonodes2); } } allocrest(); } /* doinit */ void getalleles() { /* set up number of alleles at loci */ long i, j, m; if 
(!firstset) samenumsp(&loci, ith); if (contchars ) { totalleles = loci; for (i = 1; i <= loci; i++) { locus[i - 1] = i; alleles[i - 1] = 1; } df = loci; } else { totalleles = 0; scan_eoln(infile); if (printdata) { fprintf(outfile, "\nNumbers of alleles at the loci:\n"); fprintf(outfile, "------- -- ------- -- --- -----\n\n"); } for (i = 1; i <= loci; i++) { if (eoln(infile)) scan_eoln(infile); if (fscanf(infile, "%ld", &alleles[i - 1]) != 1) { printf("ERROR: Unable to read number of alleles at locus %ld\n", i); exxit(-1); } if (alleles[i - 1] <= 0) { printf("ERROR: Bad number of alleles: %ld at locus %ld\n", alleles[i-1], i); exxit(-1); } totalleles += alleles[i - 1]; if (printdata) fprintf(outfile, "%4ld", alleles[i - 1]); } locus = (long *)Malloc(totalleles*sizeof(long)); m = 0; for (i = 1; i <= loci; i++) { for (j = 0; j < alleles[i - 1]; j++) locus[m+j] = i; m += alleles[i - 1]; } df = totalleles - loci; } allocview(&curtree, nonodes2, totalleles); if (!usertree) { allocview(&bestree, nonodes2, totalleles); allocview(&priortree, nonodes2, totalleles); if (njumble > 1) allocview(&bestree2, nonodes2, totalleles); } for (i = 0; i < spp; i++) x[i] = (phenotype3)Malloc(totalleles*sizeof(double)); pbar = (double *)Malloc(totalleles*sizeof(double)); if (usertree) for (i = 0; i < MAXSHIMOTREES; i++) l0gf[i] = (double *)Malloc(totalleles*sizeof(double)); if (printdata) putc('\n', outfile); } /* getalleles */ void inputdata() { /* read species data */ long i, j, k, l, m, m0, n, p; double sum; if (printdata) { fprintf(outfile, "\nName"); if (contchars) fprintf(outfile, " Phenotypes\n"); else fprintf(outfile, " Gene Frequencies\n"); fprintf(outfile, "----"); if (contchars) fprintf(outfile, " ----------\n"); else fprintf(outfile, " ---- -----------\n"); putc('\n', outfile); if (!contchars) { for (j = 1; j <= nmlngth - 8; j++) putc(' ', outfile); fprintf(outfile, "locus:"); p = 1; for (j = 1; j <= loci; j++) { if (all) n = alleles[j - 1]; else n = alleles[j - 1] - 1; for (k 
= 1; k <= n; k++) { fprintf(outfile, "%10ld", j); if (p % 6 == 0 && (all || p < df)) { putc('\n', outfile); for (l = 1; l <= nmlngth - 2; l++) putc(' ', outfile); } p++; } } fprintf(outfile, "\n\n"); } } for (i = 0; i < spp; i++) { scan_eoln(infile); initname(i); if (printdata) for (j = 0; j < nmlngth; j++) putc(nayme[i][j], outfile); m = 1; p = 1; for (j = 1; j <= loci; j++) { m0 = m; sum = 0.0; if (contchars) n = 1; else if (all) n = alleles[j - 1]; else n = alleles[j - 1] - 1; for (k = 1; k <= n; k++) { if (eoln(infile)) scan_eoln(infile); if (fscanf(infile, "%lf", &x[i][m - 1]) != 1) { printf("ERROR: unable to read allele frequency for species %ld, locus %ld\n", i+1, j); exxit(-1); } sum += x[i][m - 1]; if (!contchars && x[i][m - 1] < 0.0) { printf("\n\nERROR: locus %ld in species %ld: an allele", j, i+1); printf(" frequency is negative\n"); exxit(-1); } if (printdata) { fprintf(outfile, "%10.5f", x[i][m - 1]); if (p % 6 == 0 && (all || p < df)) { putc('\n', outfile); for (l = 1; l <= nmlngth; l++) putc(' ', outfile); } } p++; m++; } if (all && !contchars) { if (fabs(sum - 1.0) > epsilon2) { printf( "\n\nERROR: Locus %ld in species %ld: frequencies do not add up to 1\n", j, i + 1); printf("\nFrequencies are:\n"); for (l = m0; l <= m-3; l++) printf("%f+", x[i][l]); printf("%f = %f\n\n", x[i][m-2], sum); exxit(-1); } else { for (l = 0; l <= m-2; l++) x[i][l] /= sum; } } if (!all && !contchars) { x[i][m-1] = 1.0 - sum; if (x[i][m-1] < 0.0) { if (x[i][m-1] > -epsilon2) { for (l = 0; l <= m-2; l++) x[i][l] /= sum; x[i][m-1] = 0.0; } else { printf("\n\nERROR: Locus %ld in species %ld: ", j, i + 1); printf("frequencies add up to more than 1\n"); printf("\nFrequencies are:\n"); for (l = m0-1; l <= m-3; l++) printf("%f+", x[i][l]); printf("%f = %f\n\n", x[i][m-2], sum); exxit(-1); } } m++; } } if (printdata) putc('\n', outfile); } scan_eoln(infile); if (printdata) putc('\n', outfile); } /* inputdata */ void transformgfs() { /* do stereographic projection 
transformation on gene frequencies to get variables that come closer to independent Brownian motions */ long i, j, k, l, m, n, maxalleles; double f, sum; double *sumprod, *sqrtp, *pbar; phenotype3 *c; sumprod = (double *)Malloc(loci*sizeof(double)); sqrtp = (double *)Malloc(totalleles*sizeof(double)); pbar = (double *)Malloc(totalleles*sizeof(double)); for (i = 0; i < totalleles; i++) { /* get mean gene frequencies */ pbar[i] = 0.0; for (j = 0; j < spp; j++) pbar[i] += x[j][i]; pbar[i] /= spp; if (pbar[i] == 0.0) sqrtp[i] = 0.0; else sqrtp[i] = sqrt(pbar[i]); } for (i = 0; i < spp; i++) { for (j = 0; j < loci; j++) /* for each locus, sum of root(p*x) */ sumprod[j] = 0.0; for (j = 0; j < totalleles; j++) if ((pbar[j]*x[i][j]) >= 0.0) sumprod[locus[j]-1] += sqrtp[j]*sqrt(x[i][j]); for (j = 0; j < totalleles; j++) { /* the projection to tangent plane */ f = (1.0 + sumprod[locus[j]-1])/2.0; if (x[i][j] == 0.0) x[i][j] = (2.0/f - 1.0)*sqrtp[j]; else x[i][j] = (1.0/f)*sqrt(x[i][j]) + (1.0/f - 1.0)*sqrtp[j]; } } maxalleles = 0; for (i = 0; i < loci; i++) if (alleles[i] > maxalleles) maxalleles = alleles[i]; c = (phenotype3 *)Malloc(maxalleles*sizeof(phenotype3)); for (i = 0; i < maxalleles; i++) /* enough room for any locus's contrasts */ c[i] = (double *)Malloc(maxalleles*sizeof(double)); m = 0; for (j = 0; j < loci; j++) { /* do this for each locus */ for (k = 0; k < alleles[j]-1; k++) { /* one fewer than # of alleles */ c[k][0] = 1.0; for (l = 0; l < k; l++) { /* for contrasts 1 to k make it ... */ sum = 0.0; for (n = 0; n <= l; n++) sum += c[k][n]*c[l][n]; if (fabs(c[l][l+1]) > 0.000000001) /* ... orthogonal to those ones */ c[k][l+1] = -sum / c[l][l+1]; /* set coeff to make orthogonal */ else c[k][l+1] = 1.0; } sum = 0.0; for (l = 0; l <= k; l++) /* make it orthogonal to vector of sqrtp's */ sum += c[k][l]*sqrtp[m+l]; if (sqrtp[m+k+1] > 0.0000000001) c[k][k+1] = - sum / sqrtp[m+k+1]; /* ... 
setting last coeff */ else { for (l = 0; l <= k; l++) c[k][l] = 0.0; c[k][k+1] = 1.0; } sum = 0.0; for (l = 0; l <= k+1; l++) sum += c[k][l]*c[k][l]; sum = sqrt(sum); for (l = 0; l <= k+1; l++) if (sum > 0.0000000001) c[k][l] /= sum; } for (i = 0; i < spp; i++) { /* the orthonormal axes in the plane */ for (l = 0; l < alleles[j]-1; l++) { /* compute the l-th one */ c[maxalleles-1][l] = 0.0; /* temporarily store it ... */ for (n = 0; n <= l+1; n++) c[maxalleles-1][l] += c[l][n]*x[i][m+n]; } for (l = 0; l < alleles[j]-1; l++) x[i][m+l] = c[maxalleles-1][l]; /* replace the gene freqs by it */ } m += alleles[j]; } for (i = 0; i < maxalleles; i++) free(c[i]); free(c); free(sumprod); free(sqrtp); free(pbar); } /* transformgfs */ void getinput() { /* reads the input data */ getalleles(); inputdata(); if (!contchars) { transformgfs(); } } /* getinput */ void sumlikely(node *p, node *q, double *sum) { /* sum contribution to likelihood over forks in tree */ long i, j, m; double term, sumsq, vee; double temp; if (!p->tip) sumlikely(p->next->back, p->next->next->back, sum); if (!q->tip) sumlikely(q->next->back, q->next->next->back, sum); if (p->back == q) vee = p->v; else vee = p->v + q->v; vee += p->deltav + q->deltav; if (vee <= 1.0e-10) { printf("ERROR: check for two identical species "); printf("and eliminate one from the data\n"); exxit(-1); } sumsq = 0.0; if (usertree && which <= MAXSHIMOTREES) { for (i = 0; i < loci; i++) l0gf[which - 1][i] += (1 - alleles[i]) * log(vee) / 2.0; } if (contchars) { m = 0; for (i = 0; i < loci; i++) { temp = p->view[i] - q->view[i]; term = temp * temp; if (usertree && which <= MAXSHIMOTREES) l0gf[which - 1][i] -= term / (2.0 * vee); sumsq += term; } } else { m = 0; for (i = 0; i < loci; i++) { for (j = 1; j < alleles[i]; j++) { temp = p->view[m+j-1] - q->view[m+j-1]; term = temp * temp; if (usertree && which <= MAXSHIMOTREES) l0gf[which - 1][i] -= term / (2.0 * vee); sumsq += term; } m += alleles[i]; } } (*sum) += df * log(vee) / -2.0 - 
sumsq / (2.0 * vee); } /* sumlikely */ double evaluate(tree *t) { /* evaluate likelihood of a tree */ long i; double sum; sum = 0.0; if (usertree && which <= MAXSHIMOTREES) { for (i = 0; i < loci; i++) l0gf[which - 1][i] = 0.0; } sumlikely(t->start->back, t->start, &sum); if (usertree && which <= MAXSHIMOTREES) { l0gl[which - 1] = sum; if (which == 1) { maxwhich = 1; maxlogl = sum; } else if (sum > maxlogl) { maxwhich = which; maxlogl = sum; } } t->likelihood = sum; return sum; } /* evaluate */ double distance(node *p, node *q) { /* distance between two nodes */ long i, j, m; double sum, temp; sum = 0.0; if (!contchars) { m = 0; for (i = 0; i < loci; i++) { for (j = 0; j < alleles[i]-1; j++) { temp = p->view[m+j] - q->view[m+j]; sum += temp * temp; } m += alleles[i]; } } else { for (i = 0; i < totalleles; i++) { temp = p->view[i] - q->view[i]; sum += temp * temp; } } return sum; } /* distance */ void makedists(node *p) { /* compute distances among three neighbors of a node */ long i; node *q; for (i = 1; i <= 3; i++) { q = p->next; p->dist = distance(p->back, q->back); p = q; } } /* makedists */ void makebigv(node *p, boolean *negatives) { /* make new branch length */ long i; node *temp, *q, *r; q = p->next; r = q->next; *negatives = false; for (i = 1; i <= 3; i++) { p->bigv = p->v + p->back->deltav; if (p->iter) { p->bigv = (p->dist + r->dist - q->dist) / (df * 2); p->back->bigv = p->bigv; if (p->bigv < p->back->deltav) *negatives = true; } temp = p; p = q; q = r; r = temp; } } /* makebigv */ void correctv(node *p) { /* iterate branch lengths if some are to be zero */ node *q, *r, *temp; long i, j; double f1, f2, vtot; q = p->next; r = q->next; for (i = 1; i <= smoothings; i++) { for (j = 1; j <= 3; j++) { vtot = q->bigv + r->bigv; if (vtot > 0.0) f1 = q->bigv / vtot; else f1 = 0.5; f2 = 1.0 - f1; p->bigv = (f1 * r->dist + f2 * p->dist - f1 * f2 * q->dist) / df; p->bigv -= vtot * f1 * f2; if (p->bigv < p->back->deltav) p->bigv = p->back->deltav; p->back->bigv = 
p->bigv;
      temp = p;
      p = q;
      q = r;
      r = temp;
    }
  }
}  /* correctv */


void littlev(node *p)
{
  /* remove part of it that belongs to other branches */
  long i;

  for (i = 1; i <= 3; i++) {
    if (p->iter)
      p->v = p->bigv - p->back->deltav;
    if (p->back->iter)
      p->back->v = p->v;
    p = p->next;
  }
}  /* littlev */


void nuview(node *p)
{
  /* renew information about subtrees */
  long i, j, k, m;
  node *q, *r, *a, *b, *temp;
  double v1, v2, vtot, f1, f2;

  q = p->next;
  r = q->next;
  for (i = 1; i <= 3; i++) {
    a = q->back;
    b = r->back;
    v1 = q->bigv;
    v2 = r->bigv;
    vtot = v1 + v2;
    if (vtot > 0.0)
      f1 = v2 / vtot;
    else
      f1 = 0.5;
    f2 = 1.0 - f1;
    m = 0;
    for (j = 0; j < loci; j++) {
      for (k = 1; k <= alleles[j]; k++)
        p->view[m+k-1] = f1 * a->view[m+k-1] + f2 * b->view[m+k-1];
      m += alleles[j];
    }
    p->deltav = v1 * f1;
    temp = p;
    p = q;
    q = r;
    r = temp;
  }
}  /* nuview */


void update(node *p)
{
  /* update branch lengths around a node */
  boolean negatives;

  if (p->tip)
    return;
  makedists(p);
  makebigv(p,&negatives);
  if (negatives)
    correctv(p);
  littlev(p);
  nuview(p);
}  /* update */


void smooth(node *p)
{
  /* go through tree getting new branch lengths and views */
  if (p->tip)
    return;
  update(p);
  smooth(p->next->back);
  smooth(p->next->next->back);
}  /* smooth */


void insert_(node *p, node *q)
{
  /* put p and q together and iterate info.
on resulting tree */ long i; hookup(p->next->next, q->back); hookup(p->next, q); for (i = 1; i <= smoothings; i++) { smooth(p); smooth(p->back); } } /* insert_ */ void copynode(node *c, node *d) { /* make a copy of a node */ memcpy(d->view, c->view, totalleles*sizeof(double)); d->v = c->v; d->iter = c->iter; d->deltav = c->deltav; d->bigv = c->bigv; d->dist = c->dist; d->xcoord = c->xcoord; d->ycoord = c->ycoord; d->ymin = c->ymin; d->ymax = c->ymax; } /* copynode */ void copy_(tree *a, tree *b) { /* make a copy of tree a to tree b */ long i, j; node *p, *q; for (i = 0; i < spp; i++) { copynode(a->nodep[i], b->nodep[i]); if (a->nodep[i]->back) { if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]; else if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]->next) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next; else b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next->next; } else b->nodep[i]->back = NULL; } for (i = spp; i < nonodes2; i++) { p = a->nodep[i]; q = b->nodep[i]; for (j = 1; j <= 3; j++) { copynode(p, q); if (p->back) { if (p->back == a->nodep[p->back->index - 1]) q->back = b->nodep[p->back->index - 1]; else if (p->back == a->nodep[p->back->index - 1]->next) q->back = b->nodep[p->back->index - 1]->next; else q->back = b->nodep[p->back->index - 1]->next->next; } else q->back = NULL; p = p->next; q = q->next; } } b->likelihood = a->likelihood; b->start = a->start; } /* copy_ */ void inittip(long m, tree *t) { /* initialize branch lengths and views in a tip */ node *tmp; tmp = t->nodep[m - 1]; memcpy(tmp->view, x[m - 1], totalleles*sizeof(double)); tmp->deltav = 0.0; tmp->v = 0.0; } /* inittip */ void buildnewtip(long m, tree *t, long nextsp) { /* initialize and hook up a new tip */ node *p; inittip(m, t); p = t->nodep[nextsp + spp - 3]; hookup(t->nodep[m - 1], p); } /* buildnewtip */ void buildsimpletree(tree *t) { /* make and initialize a 
three-species tree */ inittip(enterorder[0], t); inittip(enterorder[1], t); hookup(t->nodep[enterorder[0] - 1], t->nodep[enterorder[1] - 1]); buildnewtip(enterorder[2], t, nextsp); insert_(t->nodep[enterorder[2] - 1]->back, t->nodep[enterorder[0] - 1]); } /* buildsimpletree */ void addtraverse(node *p, node *q, boolean contin) { /* traverse through a tree, finding best place to add p */ insert_(p, q); numtrees++; if (evaluate(&curtree) > bestree.likelihood) { copy_(&curtree, &bestree); addwhere = q; } copy_(&priortree, &curtree); if (!q->tip && contin) { addtraverse(p, q->next->back, contin); addtraverse(p, q->next->next->back, contin); } } /* addtraverse */ void re_move(node **p, node **q) { /* remove p and record in q where it was */ *q = (*p)->next->back; hookup(*q, (*p)->next->next->back); (*p)->next->back = NULL; (*p)->next->next->back = NULL; update(*q); update((*q)->back); } /* re_move */ void globrearrange() { /* does global rearrangements */ tree globtree; tree oldtree; int i,j,k,num_sibs,num_sibs2; node *where,*sib_ptr,*sib_ptr2; double oldbestyet = curtree.likelihood; int success = false; alloctree(&globtree.nodep,nonodes2); alloctree(&oldtree.nodep,nonodes2); setuptree(&globtree,nonodes2); setuptree(&oldtree,nonodes2); allocview(&oldtree, nonodes2, totalleles); allocview(&globtree, nonodes2, totalleles); copy_(&curtree,&globtree); copy_(&curtree,&oldtree); for ( i = spp ; i < nonodes2 ; i++ ) { num_sibs = count_sibs(curtree.nodep[i]); sib_ptr = curtree.nodep[i]; if ( (i - spp) % (( nonodes2 / 72 ) + 1 ) == 0 ) putchar('.'); fflush(stdout); for ( j = 0 ; j <= num_sibs ; j++ ) { re_move(&sib_ptr,&where); copy_(&curtree,&priortree); if (where->tip) { copy_(&oldtree,&curtree); copy_(&oldtree,&bestree); sib_ptr = sib_ptr->next; continue; } else num_sibs2 = count_sibs(where); sib_ptr2 = where; for ( k = 0 ; k < num_sibs2 ; k++ ) { addwhere = NULL; addtraverse(sib_ptr,sib_ptr2->back,true); if ( addwhere && where != addwhere && where->back != addwhere && 
bestree.likelihood > globtree.likelihood) { copy_(&bestree,&globtree); success = true; } sib_ptr2 = sib_ptr2->next; } copy_(&oldtree,&curtree); copy_(&oldtree,&bestree); sib_ptr = sib_ptr->next; } } copy_(&globtree,&curtree); copy_(&globtree,&bestree); if (success && globtree.likelihood > oldbestyet) { succeeded = true; } else { succeeded = false; } freeview(&oldtree, nonodes2); freeview(&globtree, nonodes2); freetree(&globtree.nodep,nonodes2); freetree(&oldtree.nodep,nonodes2); } void rearrange(node *p) { /* rearranges the tree locally */ node *q, *r; if (!p->tip && !p->back->tip) { r = p->next->next; re_move(&r, &q ); copy_(&curtree, &priortree); addtraverse(r, q->next->back, false); addtraverse(r, q->next->next->back, false); copy_(&bestree, &curtree); } if (!p->tip) { rearrange(p->next->back); rearrange(p->next->next->back); } } /* rearrange */ void coordinates(node *p, double lengthsum, long *tipy, double *tipmax) { /* establishes coordinates of nodes */ node *q, *first, *last; if (p->tip) { p->xcoord = lengthsum; p->ycoord = *tipy; p->ymin = *tipy; p->ymax = *tipy; (*tipy) += down; if (lengthsum > (*tipmax)) (*tipmax) = lengthsum; return; } q = p->next; do { coordinates(q->back, lengthsum + q->v, tipy,tipmax); q = q->next; } while ((p == curtree.start || p != q) && (p != curtree.start || p->next != q)); first = p->next->back; q = p; while (q->next != p) q = q->next; last = q->back; p->xcoord = lengthsum; if (p == curtree.start) p->ycoord = p->next->next->back->ycoord; else p->ycoord = (first->ycoord + last->ycoord) / 2; p->ymin = first->ymin; p->ymax = last->ymax; } /* coordinates */ void drawline(long i, double scale) { /* draws one row of the tree diagram by moving up tree */ node *p, *q; long n, j; boolean extra; node *r, *first = NULL, *last = NULL; boolean done; p = curtree.start; q = curtree.start; extra = false; if (i == (long)p->ycoord && p == curtree.start) { if (p->index - spp >= 10) fprintf(outfile, " %2ld", p->index - spp); else fprintf(outfile, " 
%ld", p->index - spp); extra = true; } else fprintf(outfile, " "); do { if (!p->tip) { r = p->next; done = false; do { if (i >= (long)r->back->ymin && i <= (long)r->back->ymax) { q = r->back; done = true; } r = r->next; } while (!(done || (p != curtree.start && r == p) || (p == curtree.start && r == p->next))); first = p->next->back; r = p; while (r->next != p) r = r->next; last = r->back; if (p == curtree.start) last = p->back; } done = (p->tip || p == q); n = (long)(scale * (q->xcoord - p->xcoord) + 0.5); if (n < 3 && !q->tip) n = 3; if (extra) { n--; extra = false; } if ((long)q->ycoord == i && !done) { if ((long)p->ycoord != (long)q->ycoord) putc('+', outfile); else putc('-', outfile); if (!q->tip) { for (j = 1; j <= n - 2; j++) putc('-', outfile); if (q->index - spp >= 10) fprintf(outfile, "%2ld", q->index - spp); else fprintf(outfile, "-%ld", q->index - spp); extra = true; } else { for (j = 1; j < n; j++) putc('-', outfile); } } else if (!p->tip) { if ((long)last->ycoord > i && (long)first->ycoord < i && i != (long)p->ycoord) { putc('!', outfile); for (j = 1; j < n; j++) putc(' ', outfile); } else { for (j = 1; j <= n; j++) putc(' ', outfile); } } else { for (j = 1; j <= n; j++) putc(' ', outfile); } if (q != p) p = q; } while (!done); if ((long)p->ycoord == i && p->tip) { for (j = 0; j < nmlngth; j++) putc(nayme[p->index - 1][j], outfile); } putc('\n', outfile); } /* drawline */ void printree() { /* prints out diagram of the tree */ long i; long tipy; double tipmax,scale; if (!treeprint) return; putc('\n', outfile); tipy = 1; tipmax = 0.0; coordinates(curtree.start, 0.0, &tipy,&tipmax); scale = over / (tipmax + 0.0001); for (i = 1; i <= (tipy - down); i++) drawline(i,scale); putc('\n', outfile); } /* printree */ void treeout(node *p) { /* write out file with representation of final tree */ long i, n, w; Char c; double x; if (p->tip) { n = 0; for (i = 1; i <= nmlngth; i++) { if (nayme[p->index - 1][i - 1] != ' ') n = i; } for (i = 0; i < n; i++) { c = 
nayme[p->index - 1][i]; if (c == ' ') c = '_'; putc(c, outtree); } col += n; } else { putc('(', outtree); col++; treeout(p->next->back); putc(',', outtree); col++; if (col > 55) { putc('\n', outtree); col = 0; } treeout(p->next->next->back); if (p == curtree.start) { putc(',', outtree); col++; if (col > 45) { putc('\n', outtree); col = 0; } treeout(p->back); } putc(')', outtree); col++; } x = p->v; if (x > 0.0) w = (long)(0.43429448222 * log(x)); else if (x == 0.0) w = 0; else w = (long)(0.43429448222 * log(-x)) + 1; if (w < 0) w = 0; if (p == curtree.start) fprintf(outtree, ";\n"); else { fprintf(outtree, ":%*.8f", (int)w + 7, x); col += w + 8; } } /* treeout */ void describe(node *p, double chilow, double chihigh) { /* print out information for one branch */ long i; node *q; double bigv, delta; q = p->back; fprintf(outfile, "%3ld ", q->index - spp); if (p->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[p->index - 1][i], outfile); } else fprintf(outfile, "%4ld ", p->index - spp); fprintf(outfile, "%15.8f", q->v); delta = p->deltav + p->back->deltav; bigv = p->v + delta; if (p->iter) fprintf(outfile, " (%12.8f,%12.8f)", chilow * bigv - delta, chihigh * bigv - delta); fprintf(outfile, "\n"); if (!p->tip) { describe(p->next->back, chilow,chihigh); describe(p->next->next->back, chilow,chihigh); } } /* describe */ void summarize(void) { /* print out branch lengths etc. 
*/ double chilow,chihigh; fprintf(outfile, "\nremember: "); if (outgropt) fprintf(outfile, "(although rooted by outgroup) "); fprintf(outfile, "this is an unrooted tree!\n\n"); fprintf(outfile, "Ln Likelihood = %11.5f\n", curtree.likelihood); if (df == 1) { chilow = 0.000982; chihigh = 5.02389; } else if (df == 2) { chilow = 0.05064; chihigh = 7.3777; } else { chilow = 1.0 - 2.0 / (df * 9); chihigh = chilow; chilow -= 1.95996 * sqrt(2.0 / (df * 9)); chihigh += 1.95996 * sqrt(2.0 / (df * 9)); chilow *= chilow * chilow; chihigh *= chihigh * chihigh; } fprintf(outfile, "\nBetween And Length"); fprintf(outfile, " Approx. Confidence Limits\n"); fprintf(outfile, "------- --- ------"); fprintf(outfile, " ------- ---------- ------\n"); describe(curtree.start->next->back, chilow,chihigh); describe(curtree.start->next->next->back, chilow,chihigh); describe(curtree.start->back, chilow, chihigh); fprintf(outfile, "\n\n"); if (trout) { col = 0; treeout(curtree.start); } } /* summarize */ void nodeinit(node *p) { /* initialize a node */ node *q, *r; long i, j, m; if (p->tip) return; q = p->next->back; r = p->next->next->back; nodeinit(q); nodeinit(r); m = 0; for (i = 0; i < loci; i++) { for (j = 1; j < alleles[i]; j++) p->view[m+j-1] = 0.5 * q->view[m+j-1] + 0.5 * r->view[m+j-1]; m += alleles[i]; } if ((!lengths) || p->iter) p->v = 0.1; if ((!lengths) || p->back->iter) p->back->v = 0.1; } /* nodeinit */ void initrav(node *p) { /* traverse to initialize */ node* q; if (p->tip) nodeinit(p->back); else { q = p->next; while ( q != p) { initrav(q->back); q = q->next; } } } /* initrav */ void treevaluate() { /* evaluate user-defined tree, iterating branch lengths */ long i; unroot(&curtree,nonodes2); initrav(curtree.start); initrav(curtree.start->back); for (i = 1; i <= smoothings * 4; i++) smooth(curtree.start); evaluate(&curtree); } /* treevaluate */ void maketree() { /* construct the tree */ long i; if (usertree) { /* Open in binary: ftell() is broken for UNIX line-endings under 
WIN32 */ openfile(&intree,INTREE,"input tree file", "rb",progname,intreename); numtrees = countsemic(&intree); if(numtrees > MAXSHIMOTREES) shimotrees = MAXSHIMOTREES; else shimotrees = numtrees; if (numtrees > 2) initseed(&inseed, &inseed0, seed); if (treeprint) { fprintf(outfile, "User-defined tree"); if (numtrees > 1) putc('s', outfile); putc('\n', outfile); } setuptree(&curtree, nonodes2); for (which = 1; which <= spp; which++) inittip(which, &curtree); which = 1; while (which <= numtrees) { for (i = 0 ; i < nonodes2 ; i++) { if ( i > spp) { /* must do this since not all nodes may be used if an unrooted tree is read in after a rooted one */ curtree.nodep[i]->back = NULL; curtree.nodep[i]->next->back = NULL; curtree.nodep[i]->next->next->back = NULL; } else curtree.nodep[i]->back = NULL; } treeread2 (intree, &curtree.start, curtree.nodep, lengths, &trweight, &goteof, &haslengths, &spp,false,nonodes2); curtree.start = curtree.nodep[outgrno - 1]->back; treevaluate(); printree(); summarize(); which++; } FClose(intree); if (numtrees > 1 && loci > 1 ) { weight = (long *)Malloc(loci*sizeof(long)); for (i = 0; i < loci; i++) weight[i] = 1; standev2(numtrees, maxwhich, 0, loci-1, maxlogl, l0gl, l0gf, seed); free(weight); fprintf(outfile, "\n\n"); } } else { /* if ( !usertree ) */ if (jumb == 1) { setuptree(&curtree, nonodes2); setuptree(&priortree, nonodes2); setuptree(&bestree, nonodes2); if (njumble > 1) setuptree(&bestree2, nonodes2); } for (i = 1; i <= spp; i++) enterorder[i - 1] = i; if (jumble) randumize(seed, enterorder); nextsp = 3; buildsimpletree(&curtree); curtree.start = curtree.nodep[enterorder[0] - 1]->back; if (jumb == 1) numtrees = 1; nextsp = 4; if (progress) { printf("Adding species:\n"); writename(0, 3, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } while (nextsp <= spp) { buildnewtip(enterorder[nextsp - 1], &curtree, nextsp); copy_(&curtree, &priortree); bestree.likelihood = -DBL_MAX; addtraverse(curtree.nodep[enterorder[nextsp - 1] - 
1]->back, curtree.start, true ); copy_(&bestree, &curtree); if (progress) { writename(nextsp - 1, 1, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } if (global && nextsp == spp) { if (progress) { printf("\nDoing global rearrangements\n"); printf(" !"); for (i = 1; i <= spp - 2; i++) if ( (i - spp) % (( nonodes2 / 72 ) + 1 ) == 0 ) putchar('-'); printf("!\n"); printf(" "); } } succeeded = true; while (succeeded) { succeeded = false; if ( global && nextsp == spp ) globrearrange(); else rearrange(curtree.start); if (global && nextsp == spp) putc('\n', outfile); } if (global && nextsp == spp && progress) putchar('\n'); if (njumble > 1) { if (jumb == 1 && nextsp == spp) copy_(&bestree, &bestree2); else if (nextsp == spp) { if (bestree2.likelihood < bestree.likelihood) copy_(&bestree, &bestree2); } } if (nextsp == spp && jumb == njumble) { if (njumble > 1) copy_(&bestree2, &curtree); curtree.start = curtree.nodep[outgrno - 1]->back; printree(); summarize(); } nextsp++; } } if ( jumb < njumble) return; if (progress) { printf("\nOutput written to file \"%s\"\n", outfilename); if (trout) printf("\nTree also written onto file \"%s\"\n", outtreename); } freeview(&curtree, nonodes2); if (!usertree) { freeview(&bestree, nonodes2); freeview(&priortree, nonodes2); } for (i = 0; i < spp; i++) free(x[i]); if (!contchars) { free(locus); free(pbar); } } /* maketree */ int main(int argc, Char *argv[]) { /* main program */ long i; #ifdef MAC argc = 1; /* macsetup("Contml",""); */ argv[0] = "Contml"; #endif init(argc, argv); progname = argv[0]; openfile(&infile,INFILE,"input file", "r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename); ibmpc = IBMCRT; ansi = ANSICRT; mulsets = false; firstset = true; datasets = 1; doinit(); if (trout) openfile(&outtree,OUTTREE,"output tree file", "w",argv[0],outtreename); for (ith = 1; ith <= datasets; ith++) { getinput(); if (ith == 1) firstset = false; if (datasets > 1) { fprintf(outfile, "Data set # 
%ld:\n\n", ith);
      if (progress)
        printf("\nData set # %ld:\n", ith);
    }
    for (jumb = 1; jumb <= njumble; jumb++)
      maketree();
    if (usertree)
      for (i = 0; i < MAXSHIMOTREES; i++)
        free(l0gf[i]);
  }
  FClose(outfile);
  FClose(outtree);
  FClose(infile);
#ifdef MAC
  fixmacfile(outfilename);
  fixmacfile(outtreename);
#endif
  printf("\nDone.\n\n");
#ifdef WIN32
  phyRestoreConsoleAttributes();
#endif
  return 0;
}

phylip-3.697/src/contrast.c

/* version 3.696.
   Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe.

   Copyright (c) 1993-2014, Joseph Felsenstein
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions are met:

   1. Redistributions of source code must retain the above copyright notice,
      this list of conditions and the following disclaimer.

   2. Redistributions in binary form must reproduce the above copyright notice,
      this list of conditions and the following disclaimer in the documentation
      and/or other materials provided with the distribution.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
   AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
   SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE.
*/


#include "phylip.h"
#include "cont.h"

#ifndef OLDC
/* function prototypes */
void getoptions(void);
void getdata(void);
void allocrest(void);
void doinit(void);
void contwithin(void);
void contbetween(node *, node *);
void nuview(node *);
void makecontrasts(node *);
void writecontrasts(void);
void regressions(void);
double logdet(double **);
void invert(double **);
void initcovars(boolean);
double normdiff(boolean);
void matcopy(double **, double **);
void newcovars(boolean);
void printcovariances(boolean);
void emiterate(boolean);
void initcontrastnode(node **, node **, node *, long, long, long *, long *,
                      initops, pointarray, pointarray, Char *, Char *, FILE *);
void maketree(void);
/* function prototypes */
#endif

Char infilename[FNMLNGTH], outfilename[FNMLNGTH], intreename[FNMLNGTH];
long nonodes, chars, numtrees;
long *sample, contnum;
phenotype3 **x, **cntrast, *ssqcont;
double **vara, **vare, **oldvara, **oldvare,
       **Bax, **Bex, **temp1, **temp2, **temp3;
double logL, logLvara, logLnovara;
boolean nophylo, printdata, progress, reg, mulsets, varywithin, writecont,
        bifurcating;
Char ch;
long contno;
node *grbg;

/* Local variables for maketree, propagated globally for c version: */
tree curtree;

/* Variables declared just to make treeread happy */
boolean haslengths, goteof, first;
double trweight;


void getoptions()
{
  /* interactively set options */
  long loopcount, loopcount2;
  Char ch;
  boolean done, done1;

  mulsets = false;
  nophylo = false;
  printdata = false;
  progress = true;
  varywithin = false;
  writecont = false;
  loopcount = 0;
  do {
    cleerhome();
    printf("\nContinuous character comparative analysis, version %s\n\n",
           VERSION);
    printf("Settings for this run:\n");
    printf(" W Within-population variation in data?");
    if (varywithin)
      printf(" Yes, multiple individuals\n");
    else {
      printf(" No, species values are means\n");
      printf(" R Print out correlations and regressions? %s\n",
             (reg ? "Yes" : "No"));
    }
    if (varywithin) {
      printf(" A LRT test of no phylogenetic component?");
      if (nophylo)
        printf(" Yes, with and without VarA\n");
      else
        printf(" No, just assume it is there\n");
    }
    if (!varywithin)
      printf(" C Print out contrasts? %s\n",
             (writecont ? "Yes" : "No"));
    printf(" M Analyze multiple trees?");
    if (mulsets)
      printf(" Yes, %2ld trees\n", numtrees);
    else
      printf(" No\n");
    printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n",
           ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)");
    printf(" 1 Print out the data at start of run %s\n",
           (printdata ? "Yes" : "No"));
    printf(" 2 Print indications of progress of run %s\n",
           (progress ? "Yes" : "No"));
    printf("\n Y to accept these or type the letter for one to change\n");
#ifdef WIN32
    phyFillScreenColor();
#endif
    fflush(stdout);
    scanf("%c%*[^\n]", &ch);
    getchar();
    if (ch == '\n')
      ch = ' ';
    uppercase(&ch);
    done = (ch == 'Y');
    if (!done) {
      if (strchr("RAMWC120", ch) != NULL) {
        switch (ch) {

        case 'R':
          reg = !reg;
          break;

        case 'A':
          nophylo = !nophylo;
          break;

        case 'M':
          mulsets = !mulsets;
          if (mulsets) {
            loopcount2 = 0;
            do {
              printf("How many trees?\n");
#ifdef WIN32
              phyFillScreenColor();
#endif
              fflush(stdout);
              scanf("%ld%*[^\n]", &numtrees);
              getchar();
              /* any count of one or more trees is accepted */
              done1 = (numtrees >= 1);
              if (!done1)
                printf("BAD TREES NUMBER: it must be at least 1\n");
              countup(&loopcount2, 10);
            } while (done1 != true);
          }
          break;

        case 'C':
          writecont = !writecont;
          break;

        case 'W':
          varywithin = !varywithin;
          if (!nophylo)
            nophylo = true;
          break;

        case '0':
          initterminal(&ibmpc, &ansi);
          break;

        case '1':
          printdata = !printdata;
          break;

        case '2':
          progress = !progress;
          break;
        }
      } else
        printf("Not a possible option!\n");
    }
    countup(&loopcount, 100);
  } while (!done);
}  /* getoptions */


void getdata()
{
  /* read species data */
  long i, j, k, l;

  if (printdata) {
    fprintf(outfile,
       "\nContinuous character contrasts analysis, version %s\n\n",VERSION);
    fprintf(outfile, "%4ld Populations, %4ld Characters\n\n", spp, chars);
    fprintf(outfile, "Name");
    fprintf(outfile, " 
Phenotypes\n"); fprintf(outfile, "----"); fprintf(outfile, " ----------\n\n"); } x = (phenotype3 **)Malloc((long)spp*sizeof(phenotype3 *)); cntrast = (phenotype3 **)Malloc((long)spp*sizeof(phenotype3 *)); ssqcont = (phenotype3 *)Malloc((long)spp*sizeof(phenotype3 *)); contnum = spp-1; for (i = 0; i < spp; i++) { scan_eoln(infile); initname(i); if (varywithin) { if (fscanf(infile, "%ld", &sample[i]) == 1) { contnum += sample[i]-1; scan_eoln(infile); } else { printf("Error reading number of individuals for species %ld\n", i+1); exxit(-1); } } else sample[i] = 1; if (printdata) for(j = 0; j < nmlngth; j++) putc(nayme[i][j], outfile); x[i] = (phenotype3 *)Malloc((long)sample[i]*sizeof(phenotype3)); cntrast[i] = (phenotype3 *)Malloc((long)(sample[i]*sizeof(phenotype3))); ssqcont[i] = (double *)Malloc((long)(sample[i]*sizeof(double))); for (k = 0; k < sample[i]; k++) { x[i][k] = (phenotype3)Malloc((long)chars*sizeof(double)); cntrast[i][k] = (phenotype3)Malloc((long)chars*sizeof(double)); for (j = 1; j <= chars; j++) { if (eoln(infile)) scan_eoln(infile); if (fscanf(infile, "%lf", &x[i][k][j - 1]) != 1) { printf("Error in input file at species %ld\n", i+1); exxit(-1); } if (printdata) { fprintf(outfile, " %9.5f", x[i][k][j - 1]); if (j % 6 == 0) { putc('\n', outfile); for (l = 1; l <= nmlngth; l++) putc(' ', outfile); } } } /* got all the data we need for that member, */ /* read the next line if we still have more members to define */ if (k != sample[i]-1) scan_eoln(infile); } if (printdata) putc('\n', outfile); } scan_eoln(infile); if (printdata) putc('\n', outfile); } /* getdata */ void allocrest() { long i; /* otherwise if individual variation, these are allocated in getdata */ sample = (long *)Malloc((long)spp*sizeof(long)); nayme = (naym *)Malloc((long)spp*sizeof(naym)); vara = (double **)Malloc((long)chars*sizeof(double *)); oldvara = (double **)Malloc((long)chars*sizeof(double *)); vare = (double **)Malloc((long)chars*sizeof(double *)); oldvare = (double 
**)Malloc((long)chars*sizeof(double *)); Bax = (double **)Malloc((long)chars*sizeof(double *)); Bex = (double **)Malloc((long)chars*sizeof(double *)); temp1 = (double **)Malloc((long)chars*sizeof(double *)); temp2 = (double **)Malloc((long)chars*sizeof(double *)); temp3 = (double **)Malloc((long)chars*sizeof(double *)); for (i = 0; i < chars; i++) { vara[i] = (double *)Malloc((long)chars*sizeof(double)); oldvara[i] = (double *)Malloc((long)chars*sizeof(double)); vare[i] = (double *)Malloc((long)chars*sizeof(double)); oldvare[i] = (double *)Malloc((long)chars*sizeof(double)); Bax[i] = (double *)Malloc((long)chars*sizeof(double)); Bex[i] = (double *)Malloc((long)chars*sizeof(double)); temp1[i] = (double *)Malloc((long)chars*sizeof(double)); temp2[i] = (double *)Malloc((long)chars*sizeof(double)); temp3[i] = (double *)Malloc((long)chars*sizeof(double)); } } /* allocrest */ void doinit() { /* initializes variables */ inputnumbers(&spp, &chars, &nonodes, 1); getoptions(); allocrest(); } /* doinit */ void contwithin() { /* compute the within-species contrasts, if any */ long i, j, k; double *sumphen; sumphen = (double *)Malloc((long)chars*sizeof(double)); for (i = 0; i <= spp-1 ; i++) { for (j = 0; j < chars; j++) sumphen[j] = 0.0; for (k = 0; k <= (sample[i]-1); k++) { for (j = 0; j < chars; j++) { if (k > 0) cntrast[i][k][j] = (sumphen[j] - k*x[i][k][j])/sqrt((double)(k*(k+1))); sumphen[j] += x[i][k][j]; if (k == (sample[i]-1)) curtree.nodep[i]->view[j] = sumphen[j]/sample[i]; x[i][0][j] = sumphen[j]/sample[i]; } if (k == 0) curtree.nodep[i]->ssq = 1.0/sample[i]; /* sum of squares for sp. 
i */ else ssqcont[i][k] = 1.0; /* if a within contrast */ } } free(sumphen); contno = 1; } /* contwithin */ void contbetween(node *p, node *q) { /* compute one contrast */ long i; double v1, v2; if ((p->v < 0.0) || (q->v < 0.0)) { printf("\nERROR: input tree has a negative branch length,"); printf(" which is not allowed.\n\n"); exxit(-1); } for (i = 0; i < chars; i++) cntrast[contno - 1][0][i] = (p->view[i] - q->view[i])/sqrt(p->ssq+q->ssq); v1 = q->v + q->deltav; if (p->back != q) v2 = p->v + p->deltav; else v2 = p->deltav; ssqcont[contno - 1][0] = (v1 + v2)/(p->ssq + q->ssq); /* this is really the variance of the contrast */ contno++; } /* contbetween */ void nuview(node *p) { /* renew information about subtrees */ long j; node *q, *r; double v1, v2, vtot, f1, f2; q = p->next->back; r = p->next->next->back; v1 = q->v + q->deltav; v2 = r->v + r->deltav; vtot = v1 + v2; if (vtot > 0.0) f1 = v2 / vtot; else f1 = 0.5; f2 = 1.0 - f1; for (j = 0; j < chars; j++) p->view[j] = f1 * q->view[j] + f2 * r->view[j]; p->deltav = v1 * f1; p->ssq = f1*f1*q->ssq + f2*f2*r->ssq; } /* nuview */ void makecontrasts(node *p) { /* compute the contrasts, recursively */ if (p->tip) return; makecontrasts(p->next->back); makecontrasts(p->next->next->back); nuview(p); contbetween(p->next->back, p->next->next->back); } /* makecontrasts */ void writecontrasts() { /* write out the contrasts */ long i, j; if (printdata || reg) { fprintf(outfile, "\nContrasts (columns are different characters)\n"); fprintf(outfile, "--------- -------- --- --------- -----------\n\n"); } for (i = 0; i <= contno - 2; i++) { for (j = 0; j < chars; j++) fprintf(outfile, " %9.5f", cntrast[i][0][j]/sqrt(ssqcont[i][0])); putc('\n', outfile); } } /* writecontrasts */ void regressions() { /* compute regressions and correlations among contrasts */ long i, j, k; double **sumprod; sumprod = (double **)Malloc((long)chars*sizeof(double *)); for (i = 0; i < chars; i++) { sumprod[i] = (double 
*)Malloc((long)chars*sizeof(double)); for (j = 0; j < chars; j++) sumprod[i][j] = 0.0; } for (i = 0; i <= contno - 2; i++) { for (j = 0; j < chars; j++) { for (k = 0; k < chars; k++) sumprod[j][k] += cntrast[i][0][j] * cntrast[i][0][k] / ssqcont[i][0]; } } fprintf(outfile, "\nCovariance matrix\n"); fprintf(outfile, "---------- ------\n\n"); for (i = 0; i < chars; i++) { for (j = 0; j < chars; j++) sumprod[i][j] /= contno - 1; } for (i = 0; i < chars; i++) { for (j = 0; j < chars; j++) fprintf(outfile, " %9.4f", sumprod[i][j]); putc('\n', outfile); } fprintf(outfile, "\nRegressions (columns on rows)\n"); fprintf(outfile, "----------- -------- -- -----\n\n"); for (i = 0; i < chars; i++) { for (j = 0; j < chars; j++) fprintf(outfile, " %9.4f", sumprod[i][j] / sumprod[i][i]); putc('\n', outfile); } fprintf(outfile, "\nCorrelations\n"); fprintf(outfile, "------------\n\n"); for (i = 0; i < chars; i++) { for (j = 0; j < chars; j++) fprintf(outfile, " %9.4f", sumprod[i][j] / sqrt(sumprod[i][i] * sumprod[j][j])); putc('\n', outfile); } for (i = 0; i < chars; i++) free(sumprod[i]); free(sumprod); } /* regressions */ double logdet(double **a) { /* Gauss-Jordan log determinant calculation. in place, overwriting previous contents of a. On exit, matrix a contains the inverse. Works only for positive definite A */ long i, j, k; double temp, sum; sum = 0.0; for (i = 0; i < chars; i++) { if (fabs(a[i][i]) < 1.0E-37) { printf("ERROR: tried to invert singular matrix.\n"); exxit(-1); } sum += log(a[i][i]); temp = 1.0 / a[i][i]; a[i][i] = 1.0; for (j = 0; j < chars; j++) a[i][j] *= temp; for (j = 0; j < chars; j++) { if (j != i) { temp = a[j][i]; a[j][i] = 0.0; for (k = 0; k < chars; k++) a[j][k] -= temp * a[i][k]; } } } return(sum); } /* logdet */ void invert(double **a) { /* Gauss-Jordan reduction -- invert chars x chars matrix a in place, overwriting previous contents of a. 
On exit, matrix a contains the inverse.*/ long i, j, k; double temp; for (i = 0; i < chars; i++) { if (fabs(a[i][i]) < 1.0E-37) { printf("ERROR: tried to invert singular matrix.\n"); exxit(-1); } temp = 1.0 / a[i][i]; a[i][i] = 1.0; for (j = 0; j < chars; j++) a[i][j] *= temp; for (j = 0; j < chars; j++) { if (j != i) { temp = a[j][i]; a[j][i] = 0.0; for (k = 0; k < chars; k++) a[j][k] -= temp * a[i][k]; } } } } /*invert*/ void initcovars(boolean novara) { /* Initialize covariance estimates */ long i, j, k, l, contswithin; /* zero the matrices */ for (i = 0; i < chars; i++) for (j = 0; j < chars; j++) { vara[i][j] = 0.0; vare[i][j] = 0.0; } /* estimate VE from within contrasts -- unbiasedly */ contswithin = 0; for (i = 0; i < spp; i++) { for (j = 1; j < sample[i]; j++) { contswithin++; for (k = 0; k < chars; k++) for (l = 0; l < chars; l++) vare[k][l] += cntrast[i][j][k]*cntrast[i][j][l]; } } /* estimate VA from between contrasts -- biasedly: does not take out VE */ if (!novara) { /* leave VarA = 0 if no A component assumed present */ for (i = 0; i < spp-1; i++) { for (j = 0; j < chars; j++) for (k = 0; k < chars; k++) if (ssqcont[i][0] <= 0.0) vara[j][k] += cntrast[i][0][j]*cntrast[i][0][k]; else vara[j][k] += cntrast[i][0][j]*cntrast[i][0][k] / ((long)(spp-1)*ssqcont[i][0]); } } for (k = 0; k < chars; k++) for (l = 0; l < chars; l++) if (contswithin > 0) vare[k][l] /= contswithin; else { if (!novara) { vara[k][l] = 0.5 * vara[k][l]; vare[k][l] = vara[k][l]; } } } /* initcovars */ double normdiff(boolean novara) { /* Get relative norm of difference between old, new covariances */ double s; long i, j; s = 0.0; for (i = 0; i < chars; i++) for (j = 0; j < chars; j++) { if (!novara) { if (fabs(oldvara[i][j]) <= 0.00000001) s += vara[i][j]; else s += fabs(vara[i][j]/oldvara[i][j]-1.0); } if (fabs(oldvare[i][j]) <= 0.00000001) s += vare[i][j]; else s += fabs(vare[i][j]/oldvare[i][j]-1.0); } return s/((double)(chars*chars)); } /* normdiff */ void matcopy(double **a, 
double **b) { /* Copy matrices chars x chars: a to b */ long i; for (i = 0; i < chars; i++) { memcpy(b[i], a[i], chars*sizeof(double)); } } /* matcopy */ void newcovars(boolean novara) { /* one EM update of covariances, compute old likelihood too */ long i, j, k, l, m; double sum, sum2, sum3, sqssq; if (!novara) matcopy(vara, oldvara); matcopy(vare, oldvare); sum2 = 0.0; /* log likelihood of old parameters accumulates here */ for (i = 0; i < chars; i++) /* zero out vara and vare */ for (j = 0; j < chars; j++) { if (!novara) vara[i][j] = 0.0; vare[i][j] = 0.0; } for (i = 0; i < spp-1; i++) { /* accumulate over contrasts ... */ if (i <= spp-2) { /* E(aa'|x) and E(ee'|x) for "between" contrasts */ sqssq = sqrt(ssqcont[i][0]); for (k = 0; k < chars; k++) /* compute (dA+E) for this contrast */ for (l = 0; l < chars; l++) if (!novara) temp1[k][l] = ssqcont[i][0] * oldvara[k][l] + oldvare[k][l]; else temp1[k][l] = oldvare[k][l]; matcopy(temp1, temp2); invert(temp2); /* compute (dA+E)^(-1) */ /* sum of - x (dA+E)^(-1) x'/2 for old A, E */ for (k = 0; k < chars; k++) for (l = 0; l < chars; l++) sum2 -= cntrast[i][0][k]*temp2[k][l]*cntrast[i][0][l]/2.0; matcopy(temp1, temp3); sum2 -= 0.5 * logdet(temp3); /* log determinant term too */ if (!novara) { for (k = 0; k < chars; k++) for (l = 0; l < chars; l++) { sum = 0.0; for (j = 0; j < chars; j++) sum += temp2[k][j] * sqssq * oldvara[j][l]; Bax[k][l] = sum; /* Bax = (dA+E)^(-1) * sqrt(d) * A */ } } for (k = 0; k < chars; k++) for (l = 0; l < chars; l++) { sum = 0.0; for (j = 0; j < chars; j++) sum += temp2[k][j] * oldvare[j][l]; Bex[k][l] = sum; /* Bex = (dA+E)^(-1) * E */ } if (!novara) { for (k = 0; k < chars; k++) for (l = 0; l < chars; l++) { sum = 0.0; for (m = 0; m < chars; m++) sum += Bax[m][k] * (cntrast[i][0][m]*cntrast[i][0][l] -temp1[m][l]); temp2[k][l] = sum; /* Bax'*(xx'-(dA+E)) ... 
*/ } for (k = 0; k < chars; k++) for (l = 0; l < chars; l++) { sum = 0.0; for (m = 0; m < chars; m++) sum += temp2[k][m] * Bax[m][l]; vara[k][l] += sum; /* ... * Bax */ } } for (k = 0; k < chars; k++) for (l = 0; l < chars; l++) { sum = 0.0; for (m = 0; m < chars; m++) sum += Bex[m][k] * (cntrast[i][0][m]*cntrast[i][0][l] -temp1[m][l]); temp2[k][l] = sum; /* Bex'*(xx'-(dA+E)) ... */ } for (k = 0; k < chars; k++) for (l = 0; l < chars; l++) { sum = 0.0; for (m = 0; m < chars; m++) sum += temp2[k][m] * Bex[m][l]; vare[k][l] += sum; /* ... * Bex */ } } } matcopy(oldvare, temp2); invert(temp2); /* get E^(-1) */ matcopy(oldvare, temp3); sum3 = 0.5 * logdet(temp3); /* get 1/2 log det(E) */ for (i = 0; i < spp; i++) { if (sample[i] > 1) { for (j = 1; j < sample[i]; j++) { /* E(aa'|x) (invisibly) and E(ee'|x) for within contrasts */ for (k = 0; k < chars; k++) for (l = 0; l < chars; l++) { vare[k][l] += cntrast[i][j][k] * cntrast[i][j][l] - oldvare[k][l]; sum2 -= cntrast[i][j][k] * temp2[k][l] * cntrast[i][j][l] / 2.0; /* accumulate - x*E^(-1)*x'/2 for old E */ } sum2 -= sum3; /* log determinant term too */ } } } for (i = 0; i < chars; i++) /* complete EM by dividing by denom ... */ for (j = 0; j < chars; j++) { /* ... 
and adding old VA, VE */ if (!novara) { vara[i][j] /= (double)contnum; vara[i][j] += oldvara[i][j]; } vare[i][j] /= (double)contnum; vare[i][j] += oldvare[i][j]; } logL = sum2; /* log likelihood for old values */ } /* newcovars */ void printcovariances(boolean novara) { /* print out ML covariances and regressions in the error-covariance case */ long i, j; fprintf(outfile, "\n\n"); if (novara) fprintf(outfile, "Estimates when VarA is not in the model\n\n"); else fprintf(outfile, "Estimates when VarA is in the model\n\n"); if (!novara) { fprintf(outfile, "Estimate of VarA\n"); fprintf(outfile, "-------- -- ----\n"); fprintf(outfile, "\n"); for (i = 0; i < chars; i++) { for (j = 0; j < chars; j++) fprintf(outfile, " %12.6f ", vara[i][j]); fprintf(outfile, "\n"); } fprintf(outfile, "\n"); } fprintf(outfile, "Estimate of VarE\n"); fprintf(outfile, "-------- -- ----\n"); fprintf(outfile, "\n"); for (i = 0; i < chars; i++) { for (j = 0; j < chars; j++) fprintf(outfile, " %12.6f ", vare[i][j]); fprintf(outfile, "\n"); } fprintf(outfile, "\n"); if (!novara) { fprintf(outfile, "VarA Regressions (columns on rows)\n"); fprintf(outfile, "---- ----------- -------- -- -----\n\n"); for (i = 0; i < chars; i++) { for (j = 0; j < chars; j++) fprintf(outfile, " %9.4f", vara[i][j] / vara[i][i]); putc('\n', outfile); } fprintf(outfile, "\n"); fprintf(outfile, "VarA Correlations\n"); fprintf(outfile, "---- ------------\n\n"); for (i = 0; i < chars; i++) { for (j = 0; j < chars; j++) fprintf(outfile, " %9.4f", vara[i][j] / sqrt(vara[i][i] * vara[j][j])); putc('\n', outfile); } fprintf(outfile, "\n"); } fprintf(outfile, "VarE Regressions (columns on rows)\n"); fprintf(outfile, "---- ----------- -------- -- -----\n\n"); for (i = 0; i < chars; i++) { for (j = 0; j < chars; j++) fprintf(outfile, " %9.4f", vare[i][j] / vare[i][i]); putc('\n', outfile); } fprintf(outfile, "\n"); fprintf(outfile, "\nVarE Correlations\n"); fprintf(outfile, "---- ------------\n\n"); for (i = 0; i < chars; i++) { 
for (j = 0; j < chars; j++) fprintf(outfile, " %9.4f", vare[i][j] / sqrt(vare[i][i] * vare[j][j])); putc('\n', outfile); } fprintf(outfile, "\n\n"); } /* printcovariances */ void emiterate(boolean novara) { /* EM iteration of error and phylogenetic covariances */ /* How to handle missing values? */ long its; double relnorm; initcovars(novara); its = 1; do { newcovars(novara); relnorm = normdiff(novara); if (its % 100 == 0) printf("Iteration no. %ld: ln L = %f, Norm = %f\n", its, logL, relnorm); its++; } while ((relnorm > 0.00001) && (its < 10000)); if (its == 10000) { printf("\nWARNING: Iterations did not converge."); printf(" Results may be unreliable.\n"); } } /* emiterate */ void initcontrastnode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnu(grbg, p); (*p)->index = nodei; (*p)->tip = false; nodep[(*p)->index - 1] = (*p); (*p)->view = (phenotype3)Malloc((long)chars*sizeof(double)); break; case nonbottom: gnu(grbg, p); (*p)->index = nodei; (*p)->view = (phenotype3)Malloc((long)chars*sizeof(double)); break; case tip: match_names_to_data (str, nodep, p, spp); (*p)->view = (phenotype3)Malloc((long)chars*sizeof(double)); (*p)->deltav = 0.0; break; case length: processlength(&valyew, &divisor, ch, &minusread, intree, parens); (*p)->v = valyew / divisor; (*p)->iter = false; if ((*p)->back != NULL) { (*p)->back->v = (*p)->v; (*p)->back->iter = false; } break; default: /* cases of hslength,iter,hsnolength,treewt,unittrwt*/ break; /* not handled */ } } /* initcontrastnode */ void maketree() { /* set up the tree and use it */ long which, nextnode; node *q, *r; alloctree(&curtree.nodep, nonodes); setuptree(&curtree, nonodes); which = 1; while (which <= numtrees) { if ((printdata || reg) && numtrees > 1) { fprintf(outfile, "Tree number%4ld\n", 
which); fprintf(outfile, "==== ====== ====\n\n"); } nextnode = 0; treeread (intree, &curtree.start, curtree.nodep, &goteof, &first, curtree.nodep, &nextnode, &haslengths, &grbg, initcontrastnode,false,nonodes); q = curtree.start; r = curtree.start; while (!(q->next == curtree.start)) q = q->next; q->next = curtree.start->next; curtree.start = q; chuck(&grbg, r); curtree.nodep[spp] = q; bifurcating = (curtree.start->next->next == curtree.start); contwithin(); makecontrasts(curtree.start); if (!bifurcating) { makecontrasts(curtree.start->back); contbetween(curtree.start, curtree.start->back); } if (!varywithin) { if (writecont) writecontrasts(); if (reg) regressions(); putc('\n', outfile); } else { emiterate(false); printcovariances(false); if (nophylo) { logLvara = logL; emiterate(nophylo); printcovariances(nophylo); logLnovara = logL; fprintf(outfile, "\n\n\n Likelihood Ratio Test"); fprintf(outfile, " of no VarA component\n"); fprintf(outfile, " ---------- ----- ----"); fprintf(outfile, " -- -- ---- ---------\n\n"); fprintf(outfile, " Log likelihood with varA = %13.5f,", logLvara); fprintf(outfile, " %ld parameters\n\n", chars*(chars+1)); fprintf(outfile, " Log likelihood without varA = %13.5f,", logLnovara); fprintf(outfile, " %ld parameters\n\n", chars*(chars+1)/2); fprintf(outfile, " difference = %13.5f\n\n", logLvara-logLnovara); fprintf(outfile, " Chi-square value = %13.5f, ", 2.0*(logLvara-logLnovara)); fprintf(outfile, " %ld degrees of freedom\n\n", chars*(chars+1)/2); } } which++; } if (progress) printf("\nOutput written to file \"%s\"\n\n", outfilename); } /* maketree */ int main(int argc, Char *argv[]) { /* main program */ #ifdef MAC argc = 1; /* macsetup("Contrast","Contrast"); */ argv[0] = "Contrast"; #endif init(argc, argv); openfile(&infile,INFILE,"input data","r",argv[0],infilename); /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree,INTREE,"input tree", "rb",argv[0],intreename); 
openfile(&outfile,OUTFILE,"output", "w",argv[0],outfilename); ibmpc = IBMCRT; ansi = ANSICRT; reg = true; numtrees = 1; doinit(); getdata(); maketree(); FClose(infile); FClose(outfile); FClose(intree); printf("Done.\n\n"); #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } phylip-3.697/src/COPYRIGHT Copyright Notice for PHYLIP The following copyright notice is intended to cover all source code, all documentation, and all executable programs of the PHYLIP package. Copyright (c) 1980-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. phylip-3.697/src/disc.c #include "phylip.h" #include "disc.h" /* version 3.696. 
Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ long chars, nonodes, nextree, which; /* nonodes = number of nodes in tree * * chars = number of binary characters * * words = number of words needed to represent characters of one organism */ steptr weight, extras; boolean printdata; void inputdata(pointptr treenode,boolean dollo,boolean printdata,FILE *outfile) { /* input the names and character state data for species */ /* used in Dollop, Dolpenny, Dolmove, & Move */ long i, j, l; char k; Char charstate; /* possible states are '0', '1', 'P', 'B', and '?' 
*/ if (printdata) headings(chars, "Characters", "----------"); for (i = 0; i < (chars); i++) extras[i] = 0; for (i = 1; i <= spp; i++) { initname(i-1); if (printdata) { for (j = 0; j < nmlngth; j++) putc(nayme[i - 1][j], outfile); fprintf(outfile, " "); } for (j = 0; j < (words); j++) { treenode[i - 1]->stateone[j] = 0; treenode[i - 1]->statezero[j] = 0; } for (j = 1; j <= (chars); j++) { k = (j - 1) % bits + 1; l = (j - 1) / bits + 1; do { if (eoln(infile)) scan_eoln(infile); charstate = gettc(infile); } while (charstate == ' ' || charstate == '\t'); if (charstate == 'b') charstate = 'B'; if (charstate == 'p') charstate = 'P'; if (charstate != '0' && charstate != '1' && charstate != '?' && charstate != 'P' && charstate != 'B') { printf("\n\nERROR: Bad character state: %c ",charstate); printf("at character %ld of species %ld\n\n", j, i); exxit(-1); } if (printdata) { newline(outfile, j, 55, nmlngth + 3); putc(charstate, outfile); if (j % 5 == 0) putc(' ', outfile); } if (charstate == '1') treenode[i - 1]->stateone[l - 1] = ((long)treenode[i - 1]->stateone[l - 1]) | (1L << k); if (charstate == '0') treenode[i - 1]->statezero[l - 1] = ((long)treenode[i - 1]->statezero[l - 1]) | (1L << k); if (charstate == 'P' || charstate == 'B') { if (dollo) extras[j - 1] += weight[j - 1]; else { treenode[i - 1]->stateone[l - 1] = ((long)treenode[i - 1]->stateone[l - 1]) | (1L << k); treenode[i - 1]->statezero[l - 1] = ((long)treenode[i - 1]->statezero[l - 1]) | (1L << k); } } } scan_eoln(infile); if (printdata) putc('\n', outfile); } if (printdata) fprintf(outfile, "\n\n"); } /* inputdata */ void inputdata2(pointptr2 treenode) { /* input the names and character state data for species */ /* used in Mix & Penny */ long i, j, l; char k; Char charstate; /* possible states are '0', '1', 'P', 'B', and '?' 
*/ if (printdata) headings(chars, "Characters", "----------"); for (i = 0; i < (chars); i++) extras[i] = 0; for (i = 1; i <= spp; i++) { initname(i-1); if (printdata) { for (j = 0; j < nmlngth; j++) putc(nayme[i - 1][j], outfile); } fprintf(outfile, " "); for (j = 0; j < (words); j++) { treenode[i - 1]->fulstte1[j] = 0; treenode[i - 1]->fulstte0[j] = 0; treenode[i - 1]->empstte1[j] = 0; treenode[i - 1]->empstte0[j] = 0; } for (j = 1; j <= (chars); j++) { k = (j - 1) % bits + 1; l = (j - 1) / bits + 1; do { if (eoln(infile)) scan_eoln(infile); charstate = gettc(infile); if (charstate == '\n' || charstate == '\t') charstate = ' '; } while (charstate == ' '); if (charstate == 'b') charstate = 'B'; if (charstate == 'p') charstate = 'P'; if (charstate != '0' && charstate != '1' && charstate != '?' && charstate != 'P' && charstate != 'B') { printf("\n\nERROR: Bad character state: %c ",charstate); printf("at character %ld of species %ld\n\n", j, i); exxit(-1); } if (printdata) { newline(outfile, j, 55, nmlngth + 3); putc(charstate, outfile); if (j % 5 == 0) putc(' ', outfile); } if (charstate == '1') { treenode[i-1]->fulstte1[l-1] = ((long)treenode[i-1]->fulstte1[l-1]) | (1L << k); treenode[i-1]->empstte1[l-1] = treenode[i-1]->fulstte1[l-1]; } if (charstate == '0') { treenode[i-1]->fulstte0[l-1] = ((long)treenode[i-1]->fulstte0[l-1]) | (1L << k); treenode[i-1]->empstte0[l-1] = treenode[i-1]->fulstte0[l-1]; } if (charstate == 'P' || charstate == 'B') extras[j-1] += weight[j-1]; } scan_eoln(infile); if (printdata) putc('\n', outfile); } fprintf(outfile, "\n\n"); } /* inputdata2 */ void alloctree(pointptr *treenode) { /* allocate tree nodes dynamically */ /* used in dollop, dolmove, dolpenny, & move */ long i, j; node *p, *q; (*treenode) = (pointptr)Malloc(nonodes*sizeof(node *)); for (i = 0; i < (spp); i++) { (*treenode)[i] = (node *)Malloc(sizeof(node)); (*treenode)[i]->stateone = (bitptr)Malloc(words*sizeof(long)); (*treenode)[i]->statezero = 
(bitptr)Malloc(words*sizeof(long)); } for (i = spp; i < (nonodes); i++) { q = NULL; for (j = 1; j <= 3; j++) { p = (node *)Malloc(sizeof(node)); p->stateone = (bitptr)Malloc(words*sizeof(long)); p->statezero = (bitptr)Malloc(words*sizeof(long)); p->next = q; q = p; } p->next->next->next = p; (*treenode)[i] = p; } } /* alloctree */ void alloctree2(pointptr2 *treenode) { /* allocate tree nodes dynamically */ /* used in mix & penny */ long i, j; node2 *p, *q; (*treenode) = (pointptr2)Malloc(nonodes*sizeof(node2 *)); for (i = 0; i < (spp); i++) { (*treenode)[i] = (node2 *)Malloc(sizeof(node2)); (*treenode)[i]->fulstte1 = (bitptr)Malloc(words*sizeof(long)); (*treenode)[i]->fulstte0 = (bitptr)Malloc(words*sizeof(long)); (*treenode)[i]->empstte1 = (bitptr)Malloc(words*sizeof(long)); (*treenode)[i]->empstte0 = (bitptr)Malloc(words*sizeof(long)); (*treenode)[i]->fulsteps = (bitptr)Malloc(words*sizeof(long)); (*treenode)[i]->empsteps = (bitptr)Malloc(words*sizeof(long)); } for (i = spp; i < (nonodes); i++) { q = NULL; for (j = 1; j <= 3; j++) { p = (node2 *)Malloc(sizeof(node2)); p->fulstte1 = (bitptr)Malloc(words*sizeof(long)); p->fulstte0 = (bitptr)Malloc(words*sizeof(long)); p->empstte1 = (bitptr)Malloc(words*sizeof(long)); p->empstte0 = (bitptr)Malloc(words*sizeof(long)); p->fulsteps = (bitptr)Malloc(words*sizeof(long)); p->empsteps = (bitptr)Malloc(words*sizeof(long)); p->next = q; q = p; } p->next->next->next = p; (*treenode)[i] = p; } } /* alloctree2 */ void setuptree(pointptr treenode) { /* initialize tree nodes */ /* used in dollop, dolmove, dolpenny, & move */ long i; node *p; for (i = 1; i <= (nonodes); i++) { treenode[i-1]->back = NULL; treenode[i-1]->tip = (i <= spp); treenode[i-1]->index = i; if (i > spp) { p = treenode[i-1]->next; while (p != treenode[i-1]) { p->back = NULL; p->tip = false; p->index = i; p = p->next; } } } } /* setuptree */ void setuptree2(pointptr2 treenode) { /* initialize tree nodes */ /* used in mix & penny */ long i; node2 *p; for (i = 1; 
i <= (nonodes); i++) { treenode[i-1]->back = NULL; treenode[i-1]->tip = (i <= spp); treenode[i-1]->index = i; if (i > spp) { p = treenode[i-1]->next; while (p != treenode[i-1]) { p->back = NULL; p->tip = false; p->index = i; p = p->next; } } } } /* setuptree2 */ void inputancestors(boolean *anczero0, boolean *ancone0) { /* reads the ancestral states for each character */ /* used in dollop, dolmove, dolpenny, mix, move, & penny */ long i; Char ch; for (i = 0; i < (chars); i++) { anczero0[i] = true; ancone0[i] = true; do { if (eoln(ancfile)) scan_eoln(ancfile); ch = gettc(ancfile); if (ch == '\n') ch = ' '; } while (ch == ' '); if (ch == 'p') ch = 'P'; if (ch == 'b') ch = 'B'; if (strchr("10PB?",ch) != NULL){ anczero0[i] = (ch == '1') ? false : anczero0[i]; ancone0[i] = (ch == '0') ? false : ancone0[i]; } else { printf("BAD ANCESTOR STATE: %c AT CHARACTER %4ld\n", ch, i + 1); exxit(-1); } } scan_eoln(ancfile); } /* inputancestors */ void printancestors(FILE *filename, boolean *anczero, boolean *ancone) { /* print out list of ancestral states */ /* used in dollop, dolmove, dolpenny, mix, move, & penny */ long i; fprintf(filename, " Ancestral states:\n"); for (i = 1; i <= nmlngth + 3; i++) putc(' ', filename); for (i = 1; i <= (chars); i++) { newline(filename, i, 55, nmlngth + 3); if (ancone[i-1] && anczero[i-1]) putc('?', filename); else if (ancone[i-1]) putc('1', filename); else putc('0', filename); if (i % 5 == 0) putc(' ', filename); } fprintf(filename, "\n\n"); } /* printancestors */ void add(node *below, node *newtip, node *newfork, node **root, pointptr treenode) { /* inserts the node newfork and its left descendant, newtip, into the tree. below becomes newfork's right descendant. 
The global variable root is also updated */ /* used in dollop & dolpenny */ if (below != treenode[below->index - 1]) below = treenode[below->index - 1]; if (below->back != NULL) below->back->back = newfork; newfork->back = below->back; below->back = newfork->next->next; newfork->next->next->back = below; newfork->next->back = newtip; newtip->back = newfork->next; if (*root == below) *root = newfork; } /* add */ void add2(node *below, node *newtip, node *newfork, node **root, boolean restoring, boolean wasleft, pointptr treenode) { /* inserts the nodes newfork and its left descendant, newtip, to the tree. below becomes newfork's right descendant */ /* used in move & dolmove */ boolean putleft; node *leftdesc, *rtdesc; if (below != treenode[below->index - 1]) below = treenode[below->index - 1]; if (below->back != NULL) below->back->back = newfork; newfork->back = below->back; putleft = true; if (restoring) putleft = wasleft; if (putleft) { leftdesc = newtip; rtdesc = below; } else { leftdesc = below; rtdesc = newtip; } rtdesc->back = newfork->next->next; newfork->next->next->back = rtdesc; newfork->next->back = leftdesc; leftdesc->back = newfork->next; if (*root == below) *root = newfork; (*root)->back = NULL; } /* add2 */ void add3(node2 *below, node2 *newtip, node2 *newfork, node2 **root, pointptr2 treenode) { /* inserts the nodes newfork and its left descendant, newtip, to the tree. below becomes newfork's right descendant. 
The global variable root is also updated */ /* used in mix & penny */ node2 *p; if (below != treenode[below->index - 1]) below = treenode[below->index - 1]; if (below->back != NULL) below->back->back = newfork; newfork->back = below->back; below->back = newfork->next->next; newfork->next->next->back = below; newfork->next->back = newtip; newtip->back = newfork->next; if (*root == below) *root = newfork; (*root)->back = NULL; p = newfork; do { p->visited = false; p = p->back; if (p != NULL) p = treenode[p->index - 1]; } while (p != NULL); } /* add3 */ void re_move(node **item, node **fork, node **root, pointptr treenode) { /* removes nodes item and its ancestor, fork, from the tree. the new descendant of fork's ancestor is made to be fork's second descendant (other than item). Also returns pointers to the deleted nodes, item and fork. The global variable root is also updated */ /* used in dollop & dolpenny */ node *p, *q; if ((*item)->back == NULL) { *fork = NULL; return; } *fork = treenode[(*item)->back->index - 1]; if (*root == *fork) { if (*item == (*fork)->next->back) *root = (*fork)->next->next->back; else *root = (*fork)->next->back; } p = (*item)->back->next->back; q = (*item)->back->next->next->back; if (p != NULL) p->back = q; if (q != NULL) q->back = p; (*fork)->back = NULL; p = (*fork)->next; while (p != *fork) { p->back = NULL; p = p->next; } (*item)->back = NULL; } /* re_move */ void re_move2(node **item, node **fork, node **root, boolean *wasleft, pointptr treenode) { /* removes nodes item and its ancestor, fork, from the tree. the new descendant of fork's ancestor is made to be fork's second descendant (other than item). 
Also returns pointers to the deleted nodes, item and fork */ /* used in move & dolmove */ node *p, *q; if ((*item)->back == NULL) { *fork = NULL; return; } *fork = treenode[(*item)->back->index - 1]; if (*item == (*fork)->next->back) { if (*root == *fork) *root = (*fork)->next->next->back; (*wasleft) = true; } else { if (*root == *fork) *root = (*fork)->next->back; (*wasleft) = false; } p = (*item)->back->next->back; q = (*item)->back->next->next->back; if (p != NULL) p->back = q; if (q != NULL) q->back = p; (*fork)->back = NULL; p = (*fork)->next; while (p != *fork) { p->back = NULL; p = p->next; } (*item)->back = NULL; } /* re_move2 */ void re_move3(node2 **item, node2 **fork, node2 **root, pointptr2 treenode) { /* removes nodes item and its ancestor, fork, from the tree. the new descendant of fork's ancestor is made to be fork's second descendant (other than item). Also returns pointers to the deleted nodes, item and fork. The global variable *root is also updated */ /* used in mix & penny */ node2 *p, *q; if ((*item)->back == NULL) { *fork = NULL; return; } *fork = treenode[(*item)->back->index - 1]; if (*root == *fork) { if (*item == (*fork)->next->back) *root = (*fork)->next->next->back; else *root = (*fork)->next->back; } p = (*item)->back->next->back; q = (*item)->back->next->next->back; if (p != NULL) p->back = q; if (q != NULL) q->back = p; q = (*fork)->back; (*fork)->back = NULL; p = (*fork)->next; while (p != *fork) { p->back = NULL; p = p->next; } (*item)->back = NULL; if (q != NULL) q = treenode[q->index - 1]; while (q != NULL) { q-> visited = false; q = q->back; if (q != NULL) q = treenode[q->index - 1]; } } /* re_move3 */ void coordinates(node *p, long *tipy, double f, long *fartemp) { /* establishes coordinates of nodes */ /* used in dollop, dolpenny, dolmove, & move */ node *q, *first, *last; if (p->tip) { p->xcoord = 0; p->ycoord = *tipy; p->ymin = *tipy; p->ymax = *tipy; *tipy += down; return; } q = p->next; do { coordinates(q->back, tipy, f, 
fartemp); q = q->next; } while (p != q); first = p->next->back; q = p->next; while (q->next != p) q = q->next; last = q->back; p->xcoord = (last->ymax - first->ymin) * f; p->ycoord = (first->ycoord + last->ycoord) / 2; p->ymin = first->ymin; p->ymax = last->ymax; if (p->xcoord > *fartemp) *fartemp = p->xcoord; } /* coordinates */ void coordinates2(node2 *p, long *tipy) { /* establishes coordinates of nodes */ node2 *q, *first, *last; if (p->tip) { p->xcoord = 0; p->ycoord = *tipy; p->ymin = *tipy; p->ymax = *tipy; (*tipy) += down; return; } q = p->next; do { coordinates2(q->back, tipy); q = q->next; } while (p != q); first = p->next->back; q = p->next; while (q->next != p) q = q->next; last = q->back; p->xcoord = last->ymax - first->ymin; p->ycoord = (first->ycoord + last->ycoord) / 2; p->ymin = first->ymin; p->ymax = last->ymax; } /* coordinates2 */ void treeout(node *p, long nextree, long *col, node *root) { /* write out file with representation of final tree */ /* used in dollop, dolmove, dolpenny, & move */ long i, n; Char c; node *q; if (p->tip) { n = 0; for (i = 1; i <= nmlngth; i++) { if (nayme[p->index - 1][i - 1] != ' ') n = i; } for (i = 0; i < n; i++) { c = nayme[p->index - 1][i]; if (c == ' ') c = '_'; putc(c, outtree); } *col += n; } else { q = p->next; putc('(', outtree); (*col)++; while (q != p) { treeout(q->back, nextree, col, root); q = q->next; if (q == p) break; putc(',', outtree); (*col)++; if (*col > 65) { putc('\n', outtree); *col = 0; } } putc(')', outtree); (*col)++; } if (p != root) return; if (nextree > 2) fprintf(outtree, "[%6.4f];\n", 1.0 / (nextree - 1)); else fprintf(outtree, ";\n"); } /* treeout */ void treeout2(node2 *p, long *col, node2 *root) { /* write out file with representation of final tree */ /* used in mix & penny */ long i, n; Char c; if (p->tip) { n = 0; for (i = 1; i <= nmlngth; i++) { if (nayme[p->index - 1][i - 1] != ' ') n = i; } for (i = 0; i < n; i++) { c = nayme[p->index - 1][i]; if (c == ' ') c = '_'; putc(c, 
outtree); } *col += n; } else { putc('(', outtree); (*col)++; treeout2(p->next->back, col, root); putc(',', outtree); (*col)++; if (*col > 65) { putc('\n', outtree); *col = 0; } treeout2(p->next->next->back, col, root); putc(')', outtree); (*col)++; } if (p != root) return; if (nextree > 2) fprintf(outtree, "[%6.4f];\n", 1.0 / (nextree - 1)); else fprintf(outtree, ";\n"); } /* treeout2 */ void standev(long numtrees, long minwhich, double minsteps, double *nsteps, double **fsteps, longer seed) { /* paired sites tests (KHT or SH) on user-defined trees */ /* used in pars */ long i, j, k; double wt, sumw, sum, sum2, sd; double temp; double **covar, *P, *f, *r; #define SAMPLES 1000 if (numtrees > maxuser) { printf("TOO MANY USER-DEFINED TREES"); printf(" test only performed in the first %ld of them\n", (long)maxuser); } else if (numtrees == 2) { fprintf(outfile, "Kishino-Hasegawa-Templeton test\n\n"); fprintf(outfile, "Tree Steps Diff Steps Its S.D."); fprintf(outfile, " Significantly worse?\n\n"); which = 1; while (which <= numtrees) { fprintf(outfile, "%3ld%10.1f", which, nsteps[which - 1]); if (minwhich == which) fprintf(outfile, " <------ best\n"); else { sumw = 0.0; sum = 0.0; sum2 = 0.0; for (i = 0; i < chars; i++) { if (weight[i] > 0) { wt = weight[i]; sumw += wt; temp = (fsteps[which - 1][i] - fsteps[minwhich - 1][i]) / 10.0; sum += wt*temp; sum2 += wt * temp * temp; } } temp = sum / sumw; sd = sqrt(sumw / (sumw - 1.0) * (sum2 - temp * temp)); fprintf(outfile, "%10.1f%12.4f", (nsteps[which - 1] - minsteps) / 10, sd); if (sum > 1.95996 * sd) fprintf(outfile, " Yes\n"); else fprintf(outfile, " No\n"); } which++; } fprintf(outfile, "\n\n"); } else { /* Shimodaira-Hasegawa test using normal approximation */ if(numtrees > MAXSHIMOTREES){ fprintf(outfile, "Shimodaira-Hasegawa test on first %d of %ld trees\n\n" , MAXSHIMOTREES, numtrees); numtrees = MAXSHIMOTREES; } else { fprintf(outfile, "Shimodaira-Hasegawa test\n\n"); } covar = (double 
**)Malloc(numtrees*sizeof(double *)); for (i = 0; i < numtrees; i++) covar[i] = (double *)Malloc(numtrees*sizeof(double)); sumw = 0.0; for (i = 0; i < chars; i++) sumw += weight[i]; for (i = 0; i < numtrees; i++) { /* compute covariances of trees */ sum = nsteps[i]/(10.0*sumw); for (j = 0; j <=i; j++) { sum2 = nsteps[j]/(10.0*sumw); temp = 0.0; for (k = 0; k < chars; k++) { if (weight[k] > 0) temp = temp + weight[k]*(fsteps[i][k]/10.0-sum) *(fsteps[j][k]/10.0-sum2); } covar[i][j] = temp; if (i != j) covar[j][i] = temp; } } for (i = 0; i < numtrees; i++) { /* in-place Cholesky decomposition of trees x trees covariance matrix */ sum = 0.0; for (j = 0; j <= i-1; j++) sum = sum + covar[i][j] * covar[i][j]; if (covar[i][i]-sum <= 0.0) temp = 0.0; else temp = sqrt(covar[i][i] - sum); covar[i][i] = temp; for (j = i+1; j < numtrees; j++) { sum = 0.0; for (k = 0; k < i; k++) sum = sum + covar[i][k] * covar[j][k]; if (fabs(temp) < 1.0E-12) covar[j][i] = 0.0; else covar[j][i] = (covar[j][i] - sum)/temp; } } f = (double *)Malloc(numtrees*sizeof(double)); /* resampled sums */ P = (double *)Malloc(numtrees*sizeof(double)); /* vector of P's of trees */ r = (double *)Malloc(numtrees*sizeof(double)); /* store Normal variates */ for (i = 0; i < numtrees; i++) P[i] = 0.0; sum2 = nsteps[0]; /* sum2 will be smallest # of steps */ for (i = 1; i < numtrees; i++) if (sum2 > nsteps[i]) sum2 = nsteps[i]; for (i = 1; i <= SAMPLES; i++) { /* loop over resampled trees */ for (j = 0; j < numtrees; j++) /* draw Normal variates */ r[j] = normrand(seed); for (j = 0; j < numtrees; j++) { /* compute vectors */ sum = 0.0; for (k = 0; k <= j; k++) sum += covar[j][k]*r[k]; f[j] = sum; } sum = f[0]; /* start from f[0] so the first tree is included in the minimum */ for (j = 1; j < numtrees; j++) /* get min of vector */ if (f[j] < sum) sum = f[j]; for (j = 0; j < numtrees; j++) /* accumulate P's */ if (nsteps[j]-sum2 <= f[j] - sum) P[j] += 1.0/SAMPLES; } fprintf(outfile, "Tree Steps Diff Steps P value"); fprintf(outfile, " Significantly worse?\n\n"); for (i = 0; i < 
numtrees; i++) { fprintf(outfile, "%3ld%10.1f", i+1, nsteps[i]); if ((minwhich-1) == i) fprintf(outfile, " <------ best\n"); else { fprintf(outfile, " %9.1f %10.3f", nsteps[i]-sum2, P[i]); if (P[i] < 0.05) fprintf(outfile, " Yes\n"); else fprintf(outfile, " No\n"); } } fprintf(outfile, "\n"); free(P); /* free the variables we Malloc'ed */ free(f); free(r); for (i = 0; i < numtrees; i++) free(covar[i]); free(covar); } } /* standev */ void guesstates(Char *guess) { /* write best guesses of ancestral states */ /* used in dollop, dolpenny, mix, & penny */ long i, j; fprintf(outfile, "best guesses of ancestral states:\n"); fprintf(outfile, " "); for (i = 0; i <= 9; i++) fprintf(outfile, "%2ld", i); fprintf(outfile, "\n *--------------------\n"); for (i = 0; i <= (chars / 10); i++) { fprintf(outfile, "%5ld!", i * 10); for (j = 0; j <= 9; j++) { if (i * 10 + j == 0 || i * 10 + j > chars) fprintf(outfile, " "); else fprintf(outfile, " %c", guess[i * 10 + j - 1]); } putc('\n', outfile); } putc('\n', outfile); } /* guesstates */ void freegarbage(gbit **garbage) { /* used in dollop, dolpenny, mix, & penny */ gbit *p; while (*garbage) { p = *garbage; *garbage = (*garbage)->next; free(p->bits_); free(p); } } /* freegarbage */ void disc_gnu(gbit **p, gbit **grbg) { /* this is a do-it-yourself garbage collector for move. Make a new node or pull one off the garbage list */ if (*grbg != NULL) { *p = *grbg; *grbg = (*grbg)->next; } else { *p = (gbit *)Malloc(sizeof(gbit)); (*p)->bits_ = (bitptr)Malloc(words*sizeof(long)); } (*p)->next = NULL; } /* disc_gnu */ void disc_chuck(gbit *p, gbit **grbg) { /* collect garbage on p -- put it on front of garbage list */ p->next = *grbg; *grbg = p; } /* disc_chuck */ /* phylip-3.697/src/disc.h */ /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. 
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ /* disc.h: included in mix, move, penny, dollop, dolmove, dolpenny, & clique */ /* node and pointptr used in Dollop, Dolmove, Dolpenny, Move, & Clique */ typedef node **pointptr; /* node2 and pointptr2 used in Mix & Penny */ typedef struct node2 { /* describes a tip species or an ancestor */ struct node2 *next, *back; long index; boolean tip, bottom,visited;/* present species are tips of tree */ bitptr fulstte1, fulstte0; /* see in PROCEDURE fillin */ bitptr empstte1, empstte0; /* see in PROCEDURE fillin */ bitptr fulsteps,empsteps; long xcoord, ycoord, ymin; /* used by printree */ long ymax; } node2; typedef node2 **pointptr2; typedef struct gbit { bitptr bits_; struct gbit *next; } gbit; typedef struct htrav_vars { node *r; boolean bottom, nonzero; gbit *zerobelow, *onebelow; } htrav_vars; typedef struct htrav_vars2 { node2 *r; boolean bottom, maybe, nonzero; gbit *zerobelow, *onebelow; } htrav_vars2; extern long chars, nonodes, nextree, which; /* nonodes = number of nodes in tree; chars = number of binary characters */ extern steptr weight, extras; extern boolean printdata; #ifndef OLDC /*function prototypes*/ void inputdata(pointptr, boolean, boolean, FILE *); void inputdata2(pointptr2); void alloctree(pointptr *); void alloctree2(pointptr2 *); void setuptree(pointptr); void setuptree2(pointptr2); void inputancestors(boolean *, boolean *); void inputancestorsnew(boolean *, boolean *); void printancestors(FILE *, boolean *, boolean *); void add(node *, node *, node *, node **, pointptr); void add2(node *, node *, node *, node **, boolean, boolean, pointptr); void add3(node2 *, node2 *, node2 *, node2 **, pointptr2); void re_move(node **, node **, node **, pointptr); void re_move2(node **, node **, node **, boolean *, pointptr); void re_move3(node2 **, node2 **, node2 **, pointptr2); void coordinates(node *, long *, double , long *); void coordinates2(node2 *, long *); void treeout(node *, long, long *, node *); void treeout2(node2 *, long *, node2 *); void 
standev(long, long, double, double *, double **, longer); void guesstates(Char *); void freegarbage(gbit **); void disc_gnu(gbit **, gbit **); void disc_chuck(gbit *, gbit **); /*function prototypes*/ #endif /* phylip-3.697/src/discrete.c */ #include "phylip.h" #include "discrete.h" /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ long nonodes, endsite, outgrno, nextree, which; boolean interleaved, printdata, outgropt, treeprint, dotdiff; steptr weight, category, alias, location, ally; sequence y, convtab; void inputdata(long chars) { /* input the names and sequences for each species */ /* used by pars */ long i, j, k, l; long basesread=0, basesnew=0, nsymbol=0, convsymboli=0; Char charstate; boolean allread, done, found; if (printdata) headings(chars, "Sequences", "---------"); basesread = 0; allread = false; while (!(allread)) { /* eat white space -- if the separator line has spaces on it*/ do { charstate = gettc(infile); } while (charstate == ' ' || charstate == '\t'); ungetc(charstate, infile); if (eoln(infile)) scan_eoln(infile); i = 1; while (i <= spp) { if ((interleaved && basesread == 0) || !interleaved) initname(i - 1); j = (interleaved) ? basesread : 0; done = false; while (!done && !eoff(infile)) { if (interleaved) done = true; while (j < chars && !(eoln(infile) || eoff(infile))) { charstate = gettc(infile); if (charstate == '\n' || charstate == '\t') charstate = ' '; if (charstate == ' ') continue; if ((strchr("!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~",charstate)) == NULL){ printf( "\n\nERROR: Bad symbol: %c at position %ld of species %ld\n\n", charstate, j+1, i); exxit(-1); } j++; y[i - 1][j - 1] = charstate; } if (interleaved) continue; if (j < chars) scan_eoln(infile); else if (j == chars) done = true; } if (interleaved && i == 1) basesnew = j; scan_eoln(infile); if ((interleaved && j != basesnew) || (!interleaved && j != chars)) { printf("\n\nERROR: Sequences out of alignment at position %ld\n\n", j); exxit(-1); } i++; } if (interleaved) { basesread = basesnew; allread = (basesread == chars); } else allread = (i > spp); } if (printdata) { for (i = 1; i <= ((chars - 1) / 60 + 1); i++) { for (j = 1; j <= spp; j++) { for (k = 0; k < nmlngth; k++) putc(nayme[j - 1][k], outfile); fprintf(outfile, " "); l = i * 60; if 
(l > chars) l = chars; for (k = (i - 1) * 60 + 1; k <= l; k++) { if (dotdiff && (j > 1 && y[j - 1][k - 1] == y[0][k - 1])) charstate = '.'; else charstate = y[j - 1][k - 1]; putc(charstate, outfile); if (k % 10 == 0 && k % 60 != 0) putc(' ', outfile); } putc('\n', outfile); } putc('\n', outfile); } putc('\n', outfile); } for (i = 1; i <= chars; i++) { nsymbol = 0; for (j = 1; j <= spp; j++) { if ((nsymbol == 0) && (y[j - 1][i - 1] != '?')) { nsymbol = 1; convsymboli = 1; convtab[0][i-1] = y[j-1][i-1]; } else if (y[j - 1][i - 1] != '?'){ found = false; for (k = 1; k <= nsymbol; k++) { if (convtab[k - 1][i - 1] == y[j - 1][i - 1]) { found = true; convsymboli = k; } } if (!found) { nsymbol++; convtab[nsymbol-1][i - 1] = y[j - 1][i - 1]; convsymboli = nsymbol; } } if (nsymbol <= 8) { if (y[j - 1][i - 1] != '?') y[j - 1][i - 1] = (Char)('0' + (convsymboli - 1)); } else { printf( "\n\nERROR: More than maximum of 8 symbols in column %ld\n\n", i); exxit(-1); } } } } /* inputdata */ void alloctree(pointarray *treenode, long nonodes, boolean usertree) { /* allocate treenode dynamically */ /* used in pars */ long i, j; node *p, *q; *treenode = (pointarray)Malloc(nonodes*sizeof(node *)); for (i = 0; i < spp; i++) { (*treenode)[i] = (node *)Malloc(sizeof(node)); (*treenode)[i]->tip = true; (*treenode)[i]->index = i+1; (*treenode)[i]->iter = true; (*treenode)[i]->branchnum = i+1; (*treenode)[i]->initialized = true; } if (!usertree) for (i = spp; i < nonodes; i++) { q = NULL; for (j = 1; j <= 3; j++) { p = (node *)Malloc(sizeof(node)); p->tip = false; p->index = i+1; p->iter = true; p->branchnum = i+1; p->initialized = false; p->next = q; q = p; } p->next->next->next = p; (*treenode)[i] = p; } } /* alloctree */ void setuptree(pointarray treenode, long nonodes, boolean usertree) { /* initialize treenodes */ long i; node *p; for (i = 1; i <= nonodes; i++) { if (i <= spp || !usertree) { treenode[i-1]->back = NULL; treenode[i-1]->tip = (i <= spp); treenode[i-1]->index = i; 
treenode[i-1]->numdesc = 0; treenode[i-1]->iter = true; treenode[i-1]->initialized = true; } } if (!usertree) { for (i = spp + 1; i <= nonodes; i++) { p = treenode[i-1]->next; while (p != treenode[i-1]) { p->back = NULL; p->tip = false; p->index = i; p->numdesc = 0; p->iter = true; p->initialized = false; p = p->next; } } } } /* setuptree */ void alloctip(node *p, long *zeros, unsigned char *zeros2) { /* allocate a tip node */ /* used by pars */ p->numsteps = (steptr)Malloc(endsite*sizeof(long)); p->oldnumsteps = (steptr)Malloc(endsite*sizeof(long)); p->discbase = (discbaseptr)Malloc(endsite*sizeof(unsigned char)); p->olddiscbase = (discbaseptr)Malloc(endsite*sizeof(unsigned char)); memcpy(p->discbase, zeros2, endsite*sizeof(unsigned char)); memcpy(p->numsteps, zeros, endsite*sizeof(long)); memcpy(p->olddiscbase, zeros2, endsite*sizeof(unsigned char)); memcpy(p->oldnumsteps, zeros, endsite*sizeof(long)); } /* alloctip */ void sitesort(long chars, steptr weight) { /* Shell sort keeping sites, weights in same order */ /* used in pars */ long gap, i, j, jj, jg, k, itemp; boolean flip, tied; gap = chars / 2; while (gap > 0) { for (i = gap + 1; i <= chars; i++) { j = i - gap; flip = true; while (j > 0 && flip) { jj = alias[j - 1]; jg = alias[j + gap - 1]; tied = true; k = 1; while (k <= spp && tied) { flip = (y[k - 1][jj - 1] > y[k - 1][jg - 1]); tied = (tied && y[k - 1][jj - 1] == y[k - 1][jg - 1]); k++; } if (!flip) break; itemp = alias[j - 1]; alias[j - 1] = alias[j + gap - 1]; alias[j + gap - 1] = itemp; itemp = weight[j - 1]; weight[j - 1] = weight[j + gap - 1]; weight[j + gap - 1] = itemp; j -= gap; } } gap /= 2; } } /* sitesort */ void sitecombine(long chars) { /* combine sites that have identical patterns */ /* used in pars */ long i, j, k; boolean tied; i = 1; while (i < chars) { j = i + 1; tied = true; while (j <= chars && tied) { k = 1; while (k <= spp && tied) { tied = (tied && y[k - 1][alias[i - 1] - 1] == y[k - 1][alias[j - 1] - 1]); k++; } if (tied) { 
weight[i - 1] += weight[j - 1]; weight[j - 1] = 0; ally[alias[j - 1] - 1] = alias[i - 1]; } j++; } i = j - 1; } } /* sitecombine */ void sitescrunch(long chars) { /* move so one representative of each pattern of sites comes first */ /* used in pars */ long i, j, itemp; boolean done, found; done = false; i = 1; j = 2; while (!done) { if (ally[alias[i - 1] - 1] != alias[i - 1]) { if (j <= i) j = i + 1; if (j <= chars) { do { found = (ally[alias[j - 1] - 1] == alias[j - 1]); j++; } while (!(found || j > chars)); if (found) { j--; itemp = alias[i - 1]; alias[i - 1] = alias[j - 1]; alias[j - 1] = itemp; itemp = weight[i - 1]; weight[i - 1] = weight[j - 1]; weight[j - 1] = itemp; } else done = true; } else done = true; } i++; done = (done || i >= chars); } } /* sitescrunch */ void makevalues(pointarray treenode, long *zeros, unsigned char *zeros2, boolean usertree) { /* set up fractional likelihoods at tips */ /* used by pars */ long i, j; unsigned char ns=0; node *p; setuptree(treenode, nonodes, usertree); for (i = 0; i < spp; i++) alloctip(treenode[i], zeros, zeros2); if (!usertree) { for (i = spp; i < nonodes; i++) { p = treenode[i]; do { allocdiscnontip(p, zeros, zeros2, endsite); p = p->next; } while (p != treenode[i]); } } for (j = 0; j < endsite; j++) { for (i = 0; i < spp; i++) { switch (y[i][alias[j] - 1]) { case '0': ns = 1 << zero; break; case '1': ns = 1 << one; break; case '2': ns = 1 << two; break; case '3': ns = 1 << three; break; case '4': ns = 1 << four; break; case '5': ns = 1 << five; break; case '6': ns = 1 << six; break; case '7': ns = 1 << seven; break; case '?': ns = (1 << zero) | (1 << one) | (1 << two) | (1 << three) | (1 << four) | (1 << five) | (1 << six) | (1 << seven); break; } treenode[i]->discbase[j] = ns; treenode[i]->numsteps[j] = 0; } } } /* makevalues */ void fillin(node *p, node *left, node *rt) { /* sets up for each node in the tree the base sequence at that point and counts the changes. 
*/ long i, j, k, n; node *q; if (!left) { memcpy(p->discbase, rt->discbase, endsite*sizeof(unsigned char)); memcpy(p->numsteps, rt->numsteps, endsite*sizeof(long)); q = rt; } else if (!rt) { memcpy(p->discbase, left->discbase, endsite*sizeof(unsigned char)); memcpy(p->numsteps, left->numsteps, endsite*sizeof(long)); q = left; } else { for (i = 0; i < endsite; i++) { p->discbase[i] = left->discbase[i] & rt->discbase[i]; p->numsteps[i] = left->numsteps[i] + rt->numsteps[i]; if (p->discbase[i] == 0) { p->discbase[i] = left->discbase[i] | rt->discbase[i]; p->numsteps[i] += weight[i]; } } q = rt; } if (left && rt) n = 2; else n = 1; for (i = 0; i < endsite; i++) for (j = (long)zero; j <= (long)seven; j++) p->discnumnuc[i][j] = 0; for (k = 1; k <= n; k++) { if (k == 2) q = left; for (i = 0; i < endsite; i++) { for (j = (long)zero; j <= (long)seven; j++) { if (q->discbase[i] & (1 << j)) p->discnumnuc[i][j]++; } } } } /* fillin */ long getlargest(long *discnumnuc) { /* find the largest in array numnuc */ long i, largest; largest = 0; for (i = (long)zero; i <= (long)seven; i++) if (discnumnuc[i] > largest) largest = discnumnuc[i]; return largest; } /* getlargest */ void multifillin(node *p, node *q, long dnumdesc) { /* sets up for each node in the tree the base sequence at that point and counts the changes according to the changes in q's base */ long i, j, largest, descsteps; unsigned char b; memcpy(p->olddiscbase, p->discbase, endsite*sizeof(unsigned char)); memcpy(p->oldnumsteps, p->numsteps, endsite*sizeof(long)); for (i = 0; i < endsite; i++) { descsteps = 0; for (j = (long)zero; j <= (long)seven; j++) { b = 1 << j; if ((descsteps == 0) && (p->discbase[i] & b)) descsteps = p->numsteps[i] - (p->numdesc - dnumdesc - p->discnumnuc[i][j]) * weight[i]; } if (dnumdesc == -1) descsteps -= q->oldnumsteps[i]; else if (dnumdesc == 0) descsteps += (q->numsteps[i] - q->oldnumsteps[i]); else descsteps += q->numsteps[i]; if (q->olddiscbase[i] != q->discbase[i]) { for (j = (long)zero; 
j <= (long)seven; j++) { b = 1 << j; if ((q->olddiscbase[i] & b) && !(q->discbase[i] & b)) p->discnumnuc[i][j]--; else if (!(q->olddiscbase[i] & b) && (q->discbase[i] & b)) p->discnumnuc[i][j]++; } } largest = getlargest(p->discnumnuc[i]); if (q->olddiscbase[i] != q->discbase[i]) { p->discbase[i] = 0; for (j = (long)zero; j <= (long)seven; j++) { if (p->discnumnuc[i][j] == largest) p->discbase[i] |= (1 << j); } } p->numsteps[i] = (p->numdesc - largest) * weight[i] + descsteps; } } /* multifillin */ void sumnsteps(node *p, node *left, node *rt, long a, long b) { /* sets up for each node in the tree the base sequence at that point and counts the changes. */ long i; unsigned char ns, rs, ls; if (!left) { memcpy(p->numsteps, rt->numsteps, endsite*sizeof(long)); memcpy(p->discbase, rt->discbase, endsite*sizeof(unsigned char)); } else if (!rt) { memcpy(p->numsteps, left->numsteps, endsite*sizeof(long)); memcpy(p->discbase, left->discbase, endsite*sizeof(unsigned char)); } else for (i = a; i < b; i++) { ls = left->discbase[i]; rs = rt->discbase[i]; ns = ls & rs; p->numsteps[i] = left->numsteps[i] + rt->numsteps[i]; if (ns == 0) { ns = ls | rs; p->numsteps[i] += weight[i]; } p->discbase[i] = ns; } } /* sumnsteps */ void sumnsteps2(node *p, node *left, node *rt, long a, long b, long *threshwt) { /* counts the changes at each node. 
*/ long i, steps; unsigned char ns, rs, ls; long term; if (a == 0) p->sumsteps = 0.0; if (!left) memcpy(p->numsteps, rt->numsteps, endsite*sizeof(long)); else if (!rt) memcpy(p->numsteps, left->numsteps, endsite*sizeof(long)); else for (i = a; i < b; i++) { ls = left->discbase[i]; rs = rt->discbase[i]; ns = ls & rs; p->numsteps[i] = left->numsteps[i] + rt->numsteps[i]; if (ns == 0) p->numsteps[i] += weight[i]; } for (i = a; i < b; i++) { steps = p->numsteps[i]; if ((long)steps <= threshwt[i]) term = steps; else term = threshwt[i]; p->sumsteps += (double)term; } } /* sumnsteps2 */ void multisumnsteps(node *p, node *q, long a, long b, long *threshwt) { /* sets up for each node in the tree the base sequence at that point and counts the changes according to the changes in q's base */ long i, j, steps, largest, descsteps; long term; if (a == 0) p->sumsteps = 0.0; for (i = a; i < b; i++) { descsteps = 0; for (j = (long)zero; j <= (long)seven; j++) { if ((descsteps == 0) && (p->discbase[i] & (1 << j))) descsteps = p->numsteps[i] - (p->numdesc - 1 - p->discnumnuc[i][j]) * weight[i]; } descsteps += q->numsteps[i]; largest = 0; for (j = (long)zero; j <= (long)seven; j++) { if (q->discbase[i] & (1 << j)) p->discnumnuc[i][j]++; if (p->discnumnuc[i][j] > largest) largest = p->discnumnuc[i][j]; } steps = ((p->numdesc - largest) * weight[i] + descsteps); if ((long)steps <= threshwt[i]) term = steps; else term = threshwt[i]; p->sumsteps += (double)term; } } /* multisumnsteps */ void multisumnsteps2(node *p) { /* counts the changes at each multi-way node. 
Sums up steps of all descendants */ long i, j, largest; node *q; discbaseptr b; for (i = 0; i < endsite; i++) { p->numsteps[i] = 0; q = p->next; while (q != p) { if (q->back) { p->numsteps[i] += q->back->numsteps[i]; b = q->back->discbase; for (j = (long)zero; j <= (long)seven; j++) if (b[i] & (1 << j)) p->discnumnuc[i][j]++; } q = q->next; } largest = getlargest(p->discnumnuc[i]); p->numsteps[i] += ((p->numdesc - largest) * weight[i]); p->discbase[i] = 0; for (j = (long)zero; j <= (long)seven; j++) { if (p->discnumnuc[i][j] == largest) p->discbase[i] |= (1 << j); } } } /* multisumnsteps2 */ boolean alltips(node *forknode, node *p) { /* returns true if all descendants of forknode except p are tips; false otherwise. */ node *q, *r; boolean tips; tips = true; r = forknode; q = forknode->next; do { if (q->back && q->back != p && !q->back->tip) tips = false; q = q->next; } while (tips && q != r); return tips; } /* alltips */ void gdispose(node *p, node **grbg, pointarray treenode) { /* go through tree throwing away nodes */ node *q, *r; p->back = NULL; if (p->tip) return; treenode[p->index - 1] = NULL; q = p->next; while (q != p) { gdispose(q->back, grbg, treenode); q->back = NULL; r = q; q = q->next; chuck(grbg, r); } chuck(grbg, q); } /* gdispose */ void preorder(node *p, node *r, node *root, node *removing, node *adding, node *changing, long dnumdesc) { /* recompute number of steps in preorder taking both ancestral and descendant steps into account. 
removing points to a node being removed, if any */ node *q, *p1, *p2; if (p && !p->tip && p != adding) { q = p; do { if (p->back != r) { if (p->numdesc > 2) { if (changing) multifillin (p, r, dnumdesc); else multifillin (p, r, 0); } else { p1 = p->next; if (!removing) while (!p1->back) p1 = p1->next; else while (!p1->back || p1->back == removing) p1 = p1->next; p2 = p1->next; if (!removing) while (!p2->back) p2 = p2->next; else while (!p2->back || p2->back == removing) p2 = p2->next; p1 = p1->back; p2 = p2->back; if (p->back == p1) p1 = NULL; else if (p->back == p2) p2 = NULL; memcpy(p->olddiscbase, p->discbase, endsite*sizeof(unsigned char)); memcpy(p->oldnumsteps, p->numsteps, endsite*sizeof(long)); fillin(p, p1, p2); } } p = p->next; } while (p != q); q = p; do { preorder(p->next->back, p->next, root, removing, adding, NULL, 0); p = p->next; } while (p->next != q); } } /* preorder */ void updatenumdesc(node *p, node *root, long n) { /* set p's numdesc to n. If p is the root, numdesc of p's descendants are set to n-1. */ node *q; q = p; if (p == root && n > 0) { p->numdesc = n; n--; q = q->next; } do { q->numdesc = n; q = q->next; } while (q != p); } void add(node *below, node *newtip, node *newfork, node **root, boolean recompute, pointarray treenode, node **grbg, long *zeros, unsigned char *zeros2) { /* inserts the nodes newfork and its left descendant, newtip, to the tree. below becomes newfork's right descendant. 
if newfork is NULL, newtip is added as below's sibling */ /* used in pars */ node *p; if (below != treenode[below->index - 1]) below = treenode[below->index - 1]; if (newfork) { if (below->back != NULL) below->back->back = newfork; newfork->back = below->back; below->back = newfork->next->next; newfork->next->next->back = below; newfork->next->back = newtip; newtip->back = newfork->next; if (*root == below) *root = newfork; updatenumdesc(newfork, *root, 2); } else { gnudisctreenode(grbg, &p, below->index, endsite, zeros, zeros2); p->back = newtip; newtip->back = p; p->next = below->next; below->next = p; updatenumdesc(below, *root, below->numdesc + 1); } if (!newtip->tip) updatenumdesc(newtip, *root, newtip->numdesc); (*root)->back = NULL; if (!recompute) return; if (!newfork) { memcpy(newtip->back->discbase, below->discbase, endsite*sizeof(unsigned char)); memcpy(newtip->back->numsteps, below->numsteps, endsite*sizeof(long)); memcpy(newtip->back->discnumnuc, below->discnumnuc, endsite*sizeof(discnucarray)); if (below != *root) { memcpy(below->back->olddiscbase, zeros2, endsite*sizeof(unsigned char)); memcpy(below->back->oldnumsteps, zeros, endsite*sizeof(long)); multifillin(newtip->back, below->back, 1); } if (!newtip->tip) { memcpy(newtip->back->olddiscbase, zeros2, endsite*sizeof(unsigned char)); memcpy(newtip->back->oldnumsteps, zeros, endsite*sizeof(long)); preorder(newtip, newtip->back, *root, NULL, NULL, below, 1); } memcpy(newtip->olddiscbase, zeros2, endsite*sizeof(unsigned char)); memcpy(newtip->oldnumsteps, zeros, endsite*sizeof(long)); preorder(below, newtip, *root, NULL, newtip, below, 1); if (below != *root) preorder(below->back, below, *root, NULL, NULL, NULL, 0); } else { fillin(newtip->back, newtip->back->next->back, newtip->back->next->next->back); if (!newtip->tip) { memcpy(newtip->back->olddiscbase, zeros2, endsite*sizeof(unsigned char)); memcpy(newtip->back->oldnumsteps, zeros, endsite*sizeof(long)); preorder(newtip, newtip->back, *root, NULL, 
NULL, newfork, 1); } if (newfork != *root) { memcpy(below->back->discbase, newfork->back->discbase, endsite*sizeof(unsigned char)); memcpy(below->back->numsteps, newfork->back->numsteps, endsite*sizeof(long)); preorder(newfork, newtip, *root, NULL, newtip, NULL, 0); } else { fillin(below->back, newtip, NULL); fillin(newfork, newtip, below); memcpy(below->back->olddiscbase, zeros2, endsite*sizeof(unsigned char)); memcpy(below->back->oldnumsteps, zeros, endsite*sizeof(long)); preorder(below, below->back, *root, NULL, NULL, newfork, 1); } if (newfork != *root) { memcpy(newfork->olddiscbase, below->discbase, endsite*sizeof(unsigned char)); memcpy(newfork->oldnumsteps, below->numsteps, endsite*sizeof(long)); preorder(newfork->back, newfork, *root, NULL, NULL, NULL, 0); } } } /* add */ void findbelow(node **below, node *item, node *fork) { /* decide which of fork's binary children is below */ if (fork->next->back == item) *below = fork->next->next->back; else *below = fork->next->back; } /* findbelow */ void re_move(node *item, node **fork, node **root, boolean recompute, pointarray treenode, node **grbg, long *zeros, unsigned char *zeros2) { /* removes nodes item and its ancestor, fork, from the tree. the new descendant of fork's ancestor is made to be fork's second descendant (other than item). Also returns pointers to the deleted nodes, item and fork. 
If item belongs to a node with more than 2 descendants, fork will not be deleted */ /* used in pars */ node *p, *q, *other, *otherback = NULL; if (item->back == NULL) { *fork = NULL; return; } *fork = treenode[item->back->index - 1]; if ((*fork)->numdesc == 2) { updatenumdesc(*fork, *root, 0); findbelow(&other, item, *fork); otherback = other->back; if (*root == *fork) { if (other->tip) *root = NULL; else { *root = other; updatenumdesc(other, *root, other->numdesc); } } p = item->back->next->back; q = item->back->next->next->back; if (p != NULL) p->back = q; if (q != NULL) q->back = p; (*fork)->back = NULL; p = (*fork)->next; while (p != *fork) { p->back = NULL; p = p->next; } } else { updatenumdesc(*fork, *root, (*fork)->numdesc - 1); p = *fork; while (p->next != item->back) p = p->next; p->next = item->back->next; } if (!item->tip) { updatenumdesc(item, item, item->numdesc); if (recompute) { memcpy(item->back->olddiscbase, item->back->discbase, endsite*sizeof(unsigned char)); memcpy(item->back->oldnumsteps, item->back->numsteps, endsite*sizeof(long)); memcpy(item->back->discbase, zeros2, endsite*sizeof(unsigned char)); memcpy(item->back->numsteps, zeros, endsite*sizeof(long)); preorder(item, item->back, *root, item->back, NULL, item, -1); } } if ((*fork)->numdesc >= 2) chuck(grbg, item->back); item->back = NULL; if (!recompute) return; if ((*fork)->numdesc == 0) { memcpy(otherback->olddiscbase, otherback->discbase, endsite*sizeof(unsigned char)); memcpy(otherback->oldnumsteps, otherback->numsteps, endsite*sizeof(long)); if (other == *root) { memcpy(otherback->discbase, zeros2, endsite*sizeof(unsigned char)); memcpy(otherback->numsteps, zeros, endsite*sizeof(long)); } else { memcpy(otherback->discbase, other->back->discbase, endsite*sizeof(unsigned char)); memcpy(otherback->numsteps, other->back->numsteps, endsite*sizeof(long)); } p = other->back; other->back = otherback; if (other == *root) preorder(other, otherback, *root, otherback, NULL, other, -1); else 
preorder(other, otherback, *root, NULL, NULL, NULL, 0); other->back = p; if (other != *root) { memcpy(other->olddiscbase,(*fork)->discbase, endsite*sizeof(unsigned char)); memcpy(other->oldnumsteps,(*fork)->numsteps, endsite*sizeof(long)); preorder(other->back, other, *root, NULL, NULL, NULL, 0); } } else { memcpy(item->olddiscbase, item->discbase, endsite*sizeof(unsigned char)); memcpy(item->oldnumsteps, item->numsteps, endsite*sizeof(long)); memcpy(item->discbase, zeros2, endsite*sizeof(unsigned char)); memcpy(item->numsteps, zeros, endsite*sizeof(long)); preorder(*fork, item, *root, NULL, NULL, *fork, -1); if (*fork != *root) preorder((*fork)->back, *fork, *root, NULL, NULL, NULL, 0); memcpy(item->discbase, item->olddiscbase, endsite*sizeof(unsigned char)); memcpy(item->numsteps, item->oldnumsteps, endsite*sizeof(long)); } } /* re_move */ void postorder(node *p) { /* traverses an n-ary tree, summing up steps at a node's descendants */ /* used in pars */ node *q; if (p->tip) return; q = p->next; while (q != p) { postorder(q->back); q = q->next; } zerodiscnumnuc(p, endsite); if (p->numdesc > 2) multisumnsteps2(p); else fillin(p, p->next->back, p->next->next->back); } /* postorder */ void getnufork(node **nufork, node **grbg, pointarray treenode, long *zeros, unsigned char *zeros2) { /* find a fork not used currently */ long i; i = spp; while (treenode[i] && treenode[i]->numdesc > 0) i++; if (!treenode[i]) gnudisctreenode(grbg, &treenode[i], i, endsite, zeros, zeros2); *nufork = treenode[i]; } /* getnufork */ void reroot(node *outgroup, node *root) { /* reorients tree, putting outgroup in desired position. used if the root is binary. 
*/ /* used in pars */ node *p, *q; if (outgroup->back->index == root->index) return; p = root->next; q = root->next->next; p->back->back = q->back; q->back->back = p->back; p->back = outgroup; q->back = outgroup->back; outgroup->back->back = q; outgroup->back = p; } /* reroot */ void reroot2(node *outgroup, node *root) { /* reorients tree, putting outgroup in desired position. */ /* used in pars */ node *p; p = outgroup->back->next; while (p->next != outgroup->back) p = p->next; root->next = outgroup->back; p->next = root; } /* reroot2 */ void reroot3(node *outgroup,node *root,node *root2,node *lastdesc,node **grbg) { /* reorients tree, putting back outgroup in original position. */ /* used in pars */ node *p; p = root->next; while (p->next != root) p = p->next; chuck(grbg, root); p->next = outgroup->back; root2->next = lastdesc->next; lastdesc->next = root2; } /* reroot3 */ void savetraverse(node *p) { /* sets BOOLEANs that indicate which way is down */ node *q; p->bottom = true; if (p->tip) return; q = p->next; while (q != p) { q->bottom = false; savetraverse(q->back); q = q->next; } } /* savetraverse */ void newindex(long i, node *p) { /* assigns index i to node p */ while (p->index != i) { p->index = i; p = p->next; } } /* newindex */ void flipindexes(long nextnode, pointarray treenode) { /* flips index of nodes between nextnode and last node. 
*/ long last; node *temp; last = nonodes; while (treenode[last - 1]->numdesc == 0) last--; if (last > nextnode) { temp = treenode[nextnode - 1]; treenode[nextnode - 1] = treenode[last - 1]; treenode[last - 1] = temp; newindex(nextnode, treenode[nextnode - 1]); newindex(last, treenode[last - 1]); } } /* flipindexes */ boolean parentinmulti(node *anode) { /* sees if anode's parent has more than 2 children */ node *p; while (!anode->bottom) anode = anode->next; p = anode->back; while (!p->bottom) p = p->next; return (p->numdesc > 2); } /* parentinmulti */ long sibsvisited(node *anode, long *place) { /* computes the number of nodes which are visited earlier than anode among its siblings */ node *p; long nvisited; while (!anode->bottom) anode = anode->next; p = anode->back->next; nvisited = 0; do { if (!p->bottom && place[p->back->index - 1] != 0) nvisited++; p = p->next; } while (p != anode->back); return nvisited; } /* sibsvisited */ long smallest(node *anode, long *place) { /* finds the smallest index of sibling of anode */ node *p; long min; while (!anode->bottom) anode = anode->next; p = anode->back->next; if (p->bottom) p = p->next; min = nonodes; do { if (p->back && place[p->back->index - 1] != 0) { if (p->back->index <= spp) { if (p->back->index < min) min = p->back->index; } else { if (place[p->back->index - 1] < min) min = place[p->back->index - 1]; } } p = p->next; if (p->bottom) p = p->next; } while (p != anode->back); return min; } /* smallest */ void bintomulti(node **root, node **binroot, node **grbg, long *zeros, unsigned char *zeros2) { /* attaches root's left child to its right child and makes the right child new root */ node *left, *right, *newnode, *temp; right = (*root)->next->next->back; left = (*root)->next->back; if (right->tip) { (*root)->next = right->back; (*root)->next->next = left->back; temp = left; left = right; right = temp; right->back->next = *root; } gnudisctreenode(grbg, &newnode, right->index, endsite, zeros, zeros2); newnode->next = 
right->next; newnode->back = left; left->back = newnode; right->next = newnode; (*root)->next->back = (*root)->next->next->back = NULL; *binroot = *root; (*binroot)->numdesc = 0; *root = right; (*root)->numdesc++; (*root)->back = NULL; } /* bintomulti */ void backtobinary(node **root, node *binroot, node **grbg) { /* restores binary root */ node *p; binroot->next->back = (*root)->next->back; (*root)->next->back->back = binroot->next; p = (*root)->next; (*root)->next = p->next; binroot->next->next->back = *root; (*root)->back = binroot->next->next; chuck(grbg, p); (*root)->numdesc--; *root = binroot; (*root)->numdesc = 2; } /* backtobinary */ boolean outgrin(node *root, node *outgrnode) { /* checks if outgroup node is a child of root */ node *p; p = root->next; while (p != root) { if (p->back == outgrnode) return true; p = p->next; } return false; } /* outgrin */ void flipnodes(node *nodea, node *nodeb) { /* swaps nodea and nodeb in the tree by exchanging their back pointers */ node *backa, *backb; backa = nodea->back; backb = nodeb->back; backa->back = nodeb; backb->back = nodea; nodea->back = backb; nodeb->back = backa; } /* flipnodes */ void moveleft(node *root, node *outgrnode, node **flipback) { /* makes the outgroup node the leftmost child of root */ node *p; boolean done; p = root->next; done = false; while (p != root && !done) { if (p->back == outgrnode) { *flipback = p; flipnodes(root->next->back, p->back); done = true; } p = p->next; } } /* moveleft */ void savetree(node *root, long *place, pointarray treenode, node **grbg, long *zeros, unsigned char *zeros2) { /* record in place where each species has to be added to reconstruct this tree */ /* used by pars */ long i, j, nextnode, nvisited; node *p, *q, *r = NULL, *root2, *lastdesc, *outgrnode, *binroot, *flipback; boolean done, newfork; binroot = NULL; lastdesc = NULL; root2 = NULL; flipback = NULL; outgrnode = treenode[outgrno - 1]; if (root->numdesc == 2) bintomulti(&root, &binroot, grbg, zeros, zeros2); if (outgrin(root, outgrnode)) { if (outgrnode != 
root->next->back) moveleft(root, outgrnode, &flipback); } else { root2 = root; lastdesc = root->next; while (lastdesc->next != root) lastdesc = lastdesc->next; lastdesc->next = root->next; gnudisctreenode(grbg, &root, outgrnode->back->index, endsite, zeros, zeros2); root->numdesc = root2->numdesc; reroot2(outgrnode, root); } savetraverse(root); nextnode = spp + 1; for (i = nextnode; i <= nonodes; i++) if (treenode[i - 1]->numdesc == 0) flipindexes(i, treenode); for (i = 0; i < nonodes; i++) place[i] = 0; place[root->index - 1] = 1; for (i = 1; i <= spp; i++) { p = treenode[i - 1]; while (place[p->index - 1] == 0) { place[p->index - 1] = i; while (!p->bottom) p = p->next; r = p; p = p->back; } if (i > 1) { q = treenode[i - 1]; newfork = true; nvisited = sibsvisited(q, place); if (nvisited == 0) { if (parentinmulti(r)) { nvisited = sibsvisited(r, place); if (nvisited == 0) place[i - 1] = place[p->index - 1]; else if (nvisited == 1) place[i - 1] = smallest(r, place); else { place[i - 1] = -smallest(r, place); newfork = false; } } else place[i - 1] = place[p->index - 1]; } else if (nvisited == 1) { place[i - 1] = place[p->index - 1]; } else { place[i - 1] = -smallest(q, place); newfork = false; } if (newfork) { j = place[p->index - 1]; done = false; while (!done) { place[p->index - 1] = nextnode; while (!p->bottom) p = p->next; p = p->back; done = (p == NULL); if (!done) done = (place[p->index - 1] != j); if (done) { nextnode++; } } } } } if (flipback) flipnodes(outgrnode, flipback->back); else { if (root2) { reroot3(outgrnode, root, root2, lastdesc, grbg); root = root2; } } if (binroot) backtobinary(&root, binroot, grbg); } /* savetree */ void addnsave(node *p, node *item, node *nufork, node **root, node **grbg, boolean multf, pointarray treenode, long *place, long *zeros, unsigned char *zeros2) { /* adds item to tree and save it. Then removes item. 
*/ node *dummy; if (!multf) add(p, item, nufork, root, false, treenode, grbg, zeros, zeros2); else add(p, item, NULL, root, false, treenode, grbg, zeros, zeros2); savetree(*root, place, treenode, grbg, zeros, zeros2); if (!multf) re_move(item, &nufork, root, false, treenode, grbg, zeros, zeros2); else re_move(item, &dummy, root, false, treenode, grbg, zeros, zeros2); } /* addnsave */ void addbestever(long *pos, long *nextree, long maxtrees, boolean collapse, long *place, bestelm *bestrees) { /* adds first best tree */ *pos = 1; *nextree = 1; initbestrees(bestrees, maxtrees, true); initbestrees(bestrees, maxtrees, false); addtree(*pos, nextree, collapse, place, bestrees); } /* addbestever */ void addtiedtree(long pos, long *nextree, long maxtrees, boolean collapse, long *place, bestelm *bestrees) { /* add tied tree */ if (*nextree <= maxtrees) addtree(pos, nextree, collapse, place, bestrees); } /* addtiedtree */ void clearcollapse(pointarray treenode) { /* clears collapse status at a node */ long i; node *p; for (i = 0; i < nonodes; i++) { treenode[i]->collapse = undefined; if (!treenode[i]->tip) { p = treenode[i]->next; while (p != treenode[i]) { p->collapse = undefined; p = p->next; } } } } /* clearcollapse */ void clearbottom(pointarray treenode) { /* clears boolean bottom at a node */ long i; node *p; for (i = 0; i < nonodes; i++) { treenode[i]->bottom = false; if (!treenode[i]->tip) { p = treenode[i]->next; while (p != treenode[i]) { p->bottom = false; p = p->next; } } } } /* clearbottom */ void collabranch(node *collapfrom, node *tempfrom, node *tempto) { /* collapse branch from collapfrom */ long i, j, largest, descsteps; boolean done; unsigned char b; for (i = 0; i < endsite; i++) { descsteps = 0; for (j = (long)zero; j <= (long)seven; j++) { b = 1 << j; if ((descsteps == 0) && (collapfrom->discbase[i] & b)) descsteps = tempfrom->oldnumsteps[i] - (collapfrom->numdesc - collapfrom->discnumnuc[i][j]) * weight[i]; } done = false; for (j = (long)zero; j <= 
(long)seven; j++) { b = 1 << j; if (!done && (tempto->discbase[i] & b)) { descsteps += (tempto->numsteps[i] - (tempto->numdesc - collapfrom->numdesc - tempto->discnumnuc[i][j]) * weight[i]); done = true; } } for (j = (long)zero; j <= (long)seven; j++) tempto->discnumnuc[i][j] += collapfrom->discnumnuc[i][j]; largest = getlargest(tempto->discnumnuc[i]); tempto->discbase[i] = 0; for (j = (long)zero; j <= (long)seven; j++) { if (tempto->discnumnuc[i][j] == largest) tempto->discbase[i] |= (1 << j); } tempto->numsteps[i] = (tempto->numdesc - largest) * weight[i] + descsteps; } } /* collabranch */ boolean allcommonbases(node *a, node *b, boolean *allsame) { /* see if bases are common at all sites for nodes a and b */ long i; boolean allcommon; allcommon = true; *allsame = true; for (i = 0; i < endsite; i++) { if ((a->discbase[i] & b->discbase[i]) == 0) allcommon = false; else if (a->discbase[i] != b->discbase[i]) *allsame = false; } return allcommon; } /* allcommonbases */ void findbottom(node *p, node **bottom) { /* find a node with field bottom set at node p */ node *q; if (p->bottom) *bottom = p; else { q = p->next; while(!q->bottom && q != p) q = q->next; *bottom = q; } } /* findbottom */ boolean moresteps(node *a, node *b) { /* see if numsteps of node a exceeds those of node b */ long i; for (i = 0; i < endsite; i++) if (a->numsteps[i] > b->numsteps[i]) return true; return false; } /* moresteps */ boolean passdown(node *desc, node *parent, node *start, node *below, node *item, node *added, node *total, node *tempdsc, node *tempprt, boolean multf) { /* track down to node start to see if an ancestor branch can be collapsed */ node *temp; boolean done, allsame; done = (parent == start); while (!done) { desc = parent; findbottom(parent->back, &parent); if (multf && start == below && parent == below) parent = added; memcpy(tempdsc->discbase, tempprt->discbase, endsite*sizeof(unsigned char)); memcpy(tempdsc->numsteps, tempprt->numsteps, endsite*sizeof(long)); 
memcpy(tempdsc->olddiscbase, desc->discbase, endsite*sizeof(unsigned char)); memcpy(tempdsc->oldnumsteps, desc->numsteps, endsite*sizeof(long)); memcpy(tempprt->discbase, parent->discbase, endsite*sizeof(unsigned char)); memcpy(tempprt->numsteps, parent->numsteps, endsite*sizeof(long)); memcpy(tempprt->discnumnuc, parent->discnumnuc, endsite*sizeof(discnucarray)); tempprt->numdesc = parent->numdesc; multifillin(tempprt, tempdsc, 0); if (!allcommonbases(tempprt, parent, &allsame)) return false; else if (moresteps(tempprt, parent)) return false; else if (allsame) return true; if (parent == added) parent = below; done = (parent == start); if (done && ((start == item) || (!multf && start == below))) { memcpy(tempdsc->discbase, tempprt->discbase, endsite*sizeof(unsigned char)); memcpy(tempdsc->numsteps, tempprt->numsteps, endsite*sizeof(long)); memcpy(tempdsc->olddiscbase, start->discbase, endsite*sizeof(unsigned char)); memcpy(tempdsc->oldnumsteps, start->numsteps, endsite*sizeof(long)); multifillin(added, tempdsc, 0); tempprt = added; } } temp = tempdsc; if (start == below || start == item) fillin(temp, tempprt, below->back); else fillin(temp, tempprt, added); return !moresteps(temp, total); } /* passdown */ boolean trycollapdesc(node *desc, node *parent, node *start, node *below, node *item, node *added, node *total, node *tempdsc, node *tempprt, boolean multf,long *zeros, unsigned char *zeros2) { /* see if branch between nodes desc and parent can be collapsed */ boolean allsame; if (desc->numdesc == 1) return true; if (multf && start == below && parent == below) parent = added; memcpy(tempdsc->discbase, zeros2, endsite*sizeof(unsigned char)); memcpy(tempdsc->numsteps, zeros, endsite*sizeof(long)); memcpy(tempdsc->olddiscbase, desc->discbase, endsite*sizeof(unsigned char)); memcpy(tempdsc->oldnumsteps, desc->numsteps, endsite*sizeof(long)); memcpy(tempprt->discbase, parent->discbase, endsite*sizeof(unsigned char)); memcpy(tempprt->numsteps, parent->numsteps, 
endsite*sizeof(long)); memcpy(tempprt->discnumnuc, parent->discnumnuc, endsite*sizeof(discnucarray)); tempprt->numdesc = parent->numdesc - 1; multifillin(tempprt, tempdsc, -1); tempprt->numdesc += desc->numdesc; collabranch(desc, tempdsc, tempprt); if (!allcommonbases(tempprt, parent, &allsame) || moresteps(tempprt, parent)) { if (parent != added) { desc->collapse = nocollap; parent->collapse = nocollap; } return false; } else if (allsame) { if (parent != added) { desc->collapse = tocollap; parent->collapse = tocollap; } return true; } if (parent == added) parent = below; if ((start == item && parent == item) || (!multf && start == below && parent == below)) { memcpy(tempdsc->discbase, tempprt->discbase, endsite*sizeof(unsigned char)); memcpy(tempdsc->numsteps, tempprt->numsteps, endsite*sizeof(long)); memcpy(tempdsc->olddiscbase, start->discbase, endsite*sizeof(unsigned char)); memcpy(tempdsc->oldnumsteps, start->numsteps, endsite*sizeof(long)); memcpy(tempprt->discbase, added->discbase, endsite*sizeof(unsigned char)); memcpy(tempprt->numsteps, added->numsteps, endsite*sizeof(long)); memcpy(tempprt->discnumnuc, added->discnumnuc, endsite*sizeof(discnucarray)); tempprt->numdesc = added->numdesc; multifillin(tempprt, tempdsc, 0); if (!allcommonbases(tempprt, added, &allsame)) return false; else if (moresteps(tempprt, added)) return false; else if (allsame) return true; } return passdown(desc, parent, start, below, item, added, total, tempdsc, tempprt, multf); } /* trycollapdesc */ void setbottom(node *p) { /* set field bottom at node p */ node *q; p->bottom = true; q = p->next; do { q->bottom = false; q = q->next; } while (q != p); } /* setbottom */ boolean zeroinsubtree(node *subtree, node *start, node *below, node *item, node *added, node *total, node *tempdsc, node *tempprt, boolean multf, node* root, long *zeros, unsigned char *zeros2) { /* sees if subtree contains a zero length branch */ node *p; if (!subtree->tip) { setbottom(subtree); p = subtree->next; do { 
if (p->back && !p->back->tip && !((p->back->collapse == nocollap) && (subtree->collapse == nocollap)) && (subtree->numdesc != 1)) { if ((p->back->collapse == tocollap) && (subtree->collapse == tocollap) && multf && (subtree != below)) return true; /* when root->numdesc == 2 * there is no mandatory step at the root, * instead of checking at the root we check around it * we only need to check p because the first if * statement already gets rid of it for the subtree */ else if ((p->back->index != root->index || root->numdesc > 2) && trycollapdesc(p->back, subtree, start, below, item, added, total, tempdsc, tempprt, multf, zeros, zeros2)) return true; else if ((p->back->index == root->index && root->numdesc == 2) && !(root->next->back->tip) && !(root->next->next->back->tip) && trycollapdesc(root->next->back, root->next->next->back, start, below, item, added, total, tempdsc, tempprt, multf, zeros, zeros2)) return true; } p = p->next; } while (p != subtree); p = subtree->next; do { if (p->back && !p->back->tip) { if (zeroinsubtree(p->back, start, below, item, added, total, tempdsc, tempprt, multf, root, zeros, zeros2)) return true; } p = p->next; } while (p != subtree); } return false; } /* zeroinsubtree */ boolean collapsible(node *item, node *below, node *temp, node *temp1, node *tempdsc, node *tempprt, node *added, node *total, boolean multf, node *root, long *zeros, unsigned char *zeros2, pointarray treenode) { /* sees if any branch can be collapsed */ node *belowbk; boolean allsame; if (multf) { memcpy(tempdsc->discbase, item->discbase, endsite*sizeof(unsigned char)); memcpy(tempdsc->numsteps, item->numsteps, endsite*sizeof(long)); memcpy(tempdsc->olddiscbase, zeros2, endsite*sizeof(unsigned char)); memcpy(tempdsc->oldnumsteps, zeros, endsite*sizeof(long)); memcpy(added->discbase, below->discbase, endsite*sizeof(unsigned char)); memcpy(added->numsteps, below->numsteps, endsite*sizeof(long)); memcpy(added->discnumnuc, below->discnumnuc, endsite*sizeof(discnucarray)); 
added->numdesc = below->numdesc + 1; multifillin(added, tempdsc, 1); } else { fillin(added, item, below); added->numdesc = 2; } fillin(total, added, below->back); clearbottom(treenode); if (below->back) { if (zeroinsubtree(below->back, below->back, below, item, added, total, tempdsc, tempprt, multf, root, zeros, zeros2)) return true; } if (multf) { if (zeroinsubtree(below, below, below, item, added, total, tempdsc, tempprt, multf, root, zeros, zeros2)) return true; } else if (!below->tip) { if (zeroinsubtree(below, below, below, item, added, total, tempdsc, tempprt, multf, root, zeros, zeros2)) return true; } if (!item->tip) { if (zeroinsubtree(item, item, below, item, added, total, tempdsc, tempprt, multf, root, zeros, zeros2)) return true; } if (multf && below->back && !below->back->tip) { memcpy(tempdsc->discbase, zeros2, endsite*sizeof(unsigned char)); memcpy(tempdsc->numsteps, zeros, endsite*sizeof(long)); memcpy(tempdsc->olddiscbase, added->discbase, endsite*sizeof(unsigned char)); memcpy(tempdsc->oldnumsteps, added->numsteps, endsite*sizeof(long)); if (below->back == treenode[below->back->index - 1]) belowbk = below->back->next; else belowbk = treenode[below->back->index - 1]; memcpy(tempprt->discbase, belowbk->discbase, endsite*sizeof(unsigned char)); memcpy(tempprt->numsteps, belowbk->numsteps, endsite*sizeof(long)); memcpy(tempprt->discnumnuc, belowbk->discnumnuc, endsite*sizeof(discnucarray)); tempprt->numdesc = belowbk->numdesc - 1; multifillin(tempprt, tempdsc, -1); tempprt->numdesc += added->numdesc; collabranch(added, tempdsc, tempprt); if (!allcommonbases(tempprt, belowbk, &allsame)) return false; else if (allsame && !moresteps(tempprt, belowbk)) return true; else if (belowbk->back) { fillin(temp, tempprt, belowbk->back); fillin(temp1, belowbk, belowbk->back); return !moresteps(temp, temp1); } } return false; } /* collapsible */ void replaceback(node **oldback, node *item, node *forknode, node **grbg, long *zeros, unsigned char *zeros2) { /* 
replaces back node of item with another */ node *p; p = forknode; while (p->next->back != item) p = p->next; *oldback = p->next; gnudisctreenode(grbg, &p->next, forknode->index, endsite, zeros, zeros2); p->next->next = (*oldback)->next; p->next->back = (*oldback)->back; p->next->back->back = p->next; (*oldback)->next = (*oldback)->back = NULL; } /* replaceback */ void putback(node *oldback, node *item, node *forknode, node **grbg) { /* restores node to back of item */ node *p, *q; p = forknode; while (p->next != item->back) p = p->next; q = p->next; oldback->next = p->next->next; p->next = oldback; oldback->back = item; item->back = oldback; oldback->index = forknode->index; chuck(grbg, q); } /* putback */ void savelocrearr(node *item, node *forknode, node *below, node *tmp, node *tmp1, node *tmp2, node *tmp3, node *tmprm, node *tmpadd, node **root, long maxtrees, long *nextree, boolean multf, boolean bestever, boolean *saved, long *place, bestelm *bestrees, pointarray treenode, node **grbg, long *zeros, unsigned char *zeros2) { /* saves tied or better trees during local rearrangements by removing item from forknode and adding to below */ node *other, *otherback=NULL, *oldfork, *nufork, *oldback; long pos; boolean found, collapse; if (forknode->numdesc == 2) { findbelow(&other, item, forknode); otherback = other->back; oldback = NULL; } else { other = NULL; replaceback(&oldback, item, forknode, grbg, zeros, zeros2); } re_move(item, &oldfork, root, false, treenode, grbg, zeros, zeros2); if (!multf) getnufork(&nufork, grbg, treenode, zeros, zeros2); else nufork = NULL; addnsave(below, item, nufork, root, grbg, multf, treenode, place, zeros, zeros2); pos = 0; findtree(&found, &pos, *nextree, place, bestrees); if (other) { add(other, item, oldfork, root, false, treenode, grbg, zeros, zeros2); if (otherback->back != other) flipnodes(item, other); } else add(forknode, item, NULL, root, false, treenode, grbg, zeros, zeros2); *saved = false; if (found) { if (oldback) 
putback(oldback, item, forknode, grbg); } else { if (oldback) chuck(grbg, oldback); re_move(item, &oldfork, root, true, treenode, grbg, zeros, zeros2); collapse = collapsible(item, below, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, multf, *root, zeros, zeros2, treenode); if (!collapse) { if (bestever) addbestever(&pos, nextree, maxtrees, collapse, place, bestrees); else addtiedtree(pos, nextree, maxtrees, collapse, place, bestrees); } if (other) add(other, item, oldfork, root, true, treenode, grbg, zeros, zeros2); else add(forknode, item, NULL, root, true, treenode, grbg, zeros, zeros2); *saved = !collapse; } } /* savelocrearr */ void clearvisited(pointarray treenode) { /* clears boolean visited at a node */ long i; node *p; for (i = 0; i < nonodes; i++) { treenode[i]->visited = false; if (!treenode[i]->tip) { p = treenode[i]->next; while (p != treenode[i]) { p->visited = false; p = p->next; } } } } /* clearvisited */ void hyprint(long b1,long b2,struct LOC_hyptrav *htrav,pointarray treenode) { /* print out states in sites b1 through b2 at node */ long i, j, k; boolean dot, found; if (htrav->bottom) { if (!outgropt) fprintf(outfile, " "); else fprintf(outfile, "root "); } else fprintf(outfile, "%4ld ", htrav->r->back->index - spp); if (htrav->r->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[htrav->r->index - 1][i], outfile); } else fprintf(outfile, "%4ld ", htrav->r->index - spp); if (htrav->bottom) fprintf(outfile, " "); else if (htrav->nonzero) fprintf(outfile, " yes "); else if (htrav->maybe) fprintf(outfile, " maybe "); else fprintf(outfile, " no "); for (i = b1; i <= b2; i++) { j = location[ally[i - 1] - 1]; htrav->tempset = htrav->r->discbase[j - 1]; htrav->anc = htrav->hypset[j - 1]; if (!htrav->bottom) htrav->anc = treenode[htrav->r->back->index - 1]->discbase[j - 1]; dot = dotdiff && (htrav->tempset == htrav->anc && !htrav->bottom); if (dot) putc('.', outfile); else { found = false; k = (long)zero; do { if (htrav->tempset == (1 << k)) { putc(convtab[k][i - 1], 
outfile); found = true; } k++; } while (!found && k <= (long)seven); if (!found) putc('?', outfile); } if (i % 10 == 0) putc(' ', outfile); } putc('\n', outfile); } /* hyprint */ void gnubase(gbases **p, gbases **garbage, long endsite) { /* this and the following are do-it-yourself garbage collectors. Make a new node or pull one off the garbage list */ if (*garbage != NULL) { *p = *garbage; *garbage = (*garbage)->next; } else { *p = (gbases *)Malloc(sizeof(gbases)); (*p)->discbase = (discbaseptr)Malloc(endsite*sizeof(unsigned char)); } (*p)->next = NULL; } /* gnubase */ void chuckbase(gbases *p, gbases **garbage) { /* collect garbage on p -- put it on front of garbage list */ p->next = *garbage; *garbage = p; } /* chuckbase */ void hyptrav(node *r_, discbaseptr hypset_, long b1, long b2, boolean bottom_, pointarray treenode, gbases **garbage) { /* compute, print out states at one interior node */ struct LOC_hyptrav Vars; long i, j, k; long largest; gbases *ancset; discnucarray *tempnuc; node *p, *q; Vars.bottom = bottom_; Vars.r = r_; Vars.hypset = hypset_; gnubase(&ancset, garbage, endsite); tempnuc = (discnucarray *)Malloc(endsite*sizeof(discnucarray)); Vars.maybe = false; Vars.nonzero = false; if (!Vars.r->tip) zerodiscnumnuc(Vars.r, endsite); for (i = b1 - 1; i < b2; i++) { j = location[ally[i] - 1]; Vars.anc = Vars.hypset[j - 1]; if (!Vars.r->tip) { p = Vars.r->next; for (k = (long)zero; k <= (long)seven; k++) if (Vars.anc & (1 << k)) Vars.r->discnumnuc[j - 1][k]++; do { for (k = (long)zero; k <= (long)seven; k++) if (p->back->discbase[j - 1] & (1 << k)) Vars.r->discnumnuc[j - 1][k]++; p = p->next; } while (p != Vars.r); largest = getlargest(Vars.r->discnumnuc[j - 1]); Vars.tempset = 0; for (k = (long)zero; k <= (long)seven; k++) { if (Vars.r->discnumnuc[j - 1][k] == largest) Vars.tempset |= (1 << k); } Vars.r->discbase[j - 1] = Vars.tempset; } if (!Vars.bottom) Vars.anc = treenode[Vars.r->back->index - 1]->discbase[j - 1]; Vars.nonzero = (Vars.nonzero || 
(Vars.r->discbase[j - 1] & Vars.anc) == 0); Vars.maybe = (Vars.maybe || Vars.r->discbase[j - 1] != Vars.anc); } hyprint(b1, b2, &Vars, treenode); Vars.bottom = false; if (!Vars.r->tip) { memcpy(tempnuc, Vars.r->discnumnuc, endsite*sizeof(discnucarray)); q = Vars.r->next; do { memcpy(Vars.r->discnumnuc, tempnuc, endsite*sizeof(discnucarray)); for (i = b1 - 1; i < b2; i++) { j = location[ally[i] - 1]; for (k = (long)zero; k <= (long)seven; k++) if (q->back->discbase[j - 1] & (1 << k)) Vars.r->discnumnuc[j - 1][k]--; largest = getlargest(Vars.r->discnumnuc[j - 1]); ancset->discbase[j - 1] = 0; for (k = (long)zero; k <= (long)seven; k++) if (Vars.r->discnumnuc[j - 1][k] == largest) ancset->discbase[j - 1] |= (1 << k); if (!Vars.bottom) Vars.anc = ancset->discbase[j - 1]; } hyptrav(q->back, ancset->discbase, b1, b2, Vars.bottom, treenode, garbage); q = q->next; } while (q != Vars.r); } /* release the per-call scratch copy of the counts */ free(tempnuc); chuckbase(ancset, garbage); } /* hyptrav */
void hypstates(long chars, node *root, pointarray treenode, gbases **garbage) { /* fill in and describe states at interior nodes */ /* used in pars */ long i, n; discbaseptr nothing; fprintf(outfile, "\nFrom To Any Steps? State at upper node\n"); fprintf(outfile, " "); if (dotdiff) fprintf(outfile, " ( . means same as in the node below it on tree)\n"); nothing = (discbaseptr)Malloc(endsite*sizeof(unsigned char)); for (i = 0; i < endsite; i++) nothing[i] = 0; for (i = 1; i <= ((chars - 1) / 40 + 1); i++) { putc('\n', outfile); n = i * 40; if (n > chars) n = chars; hyptrav(root, nothing, i * 40 - 39, n, true, treenode, garbage); } free(nothing); } /* hypstates */ void initbranchlen(node *p) { /* sets all branch lengths in the subtree at p to zero */ node *q; p->v = 0.0; if (p->back) p->back->v = 0.0; if (p->tip) return; q = p->next; while (q != p) { initbranchlen(q->back); q = q->next; } q = p->next; while (q != p) { q->v = 0.0; q = q->next; } } /* initbranchlen */ void initmin(node *p, long sitei, boolean internal) { /* initializes cumulative lengths and reconstruction counts at node p for site sitei */ long i; if (internal) { for (i = (long)zero; i <= (long)seven; i++) { p->disccumlengths[i] = 0; p->discnumreconst[i] = 1; } } else { for (i = (long)zero; i <= (long)seven; i++) { if (p->discbase[sitei - 1] & (1 << i)) { p->disccumlengths[i] = 0; p->discnumreconst[i] = 1; } else { p->disccumlengths[i] = -1; p->discnumreconst[i] = 0; } } } } /* initmin */ void initbase(node *p, long sitei) { /* traverse tree to initialize base at internal nodes */ node *q; long i, largest; if (p->tip) return; q = p->next; while (q != p) { if (q->back) { memcpy(q->discnumnuc, p->discnumnuc, endsite*sizeof(discnucarray)); for (i = (long)zero; i <= (long)seven; i++) { if (q->back->discbase[sitei - 1] & (1 << i)) q->discnumnuc[sitei - 1][i]--; } if (p->back) { for (i = (long)zero; i <= (long)seven; i++) { if (p->back->discbase[sitei - 1] & (1 << i)) q->discnumnuc[sitei - 1][i]++; } } largest = getlargest(q->discnumnuc[sitei - 1]); q->discbase[sitei - 1] = 0; for (i = (long)zero; i <= (long)seven; i++) { if (q->discnumnuc[sitei - 1][i] == largest) q->discbase[sitei - 1] |= (1 << i); } } q = q->next; } q = p->next; while (q != p) { initbase(q->back, sitei); q = q->next; } } /* initbase */ void inittreetrav(node *p, long sitei) { /* traverse tree to clear boolean initialized and set up base */ node *q; if (p->tip) { initmin(p, sitei, 
false); p->initialized = true; return; } q = p->next; while (q != p) { inittreetrav(q->back, sitei); q = q->next; } initmin(p, sitei, true); p->initialized = false; q = p->next; while (q != p) { initmin(q, sitei, true); q->initialized = false; q = q->next; } } /* inittreetrav */ void compmin(node *p, node *desc) { /* computes minimum lengths up to p */ long i, j, minn, cost, desclen, descrecon=0, maxx; maxx = 10 * spp; for (i = (long)zero; i <= (long)seven; i++) { minn = maxx; for (j = (long)zero; j <= (long)seven; j++) { if (i == j) cost = 0; else cost = 1; if (desc->disccumlengths[j] == -1) { desclen = maxx; } else { desclen = desc->disccumlengths[j]; } if (minn > cost + desclen) { minn = cost + desclen; descrecon = 0; } if (minn == cost + desclen) { descrecon += desc->discnumreconst[j]; } } p->disccumlengths[i] += minn; p->discnumreconst[i] *= descrecon; } p->initialized = true; } /* compmin */ void minpostorder(node *p, pointarray treenode) { /* traverses an n-ary tree, computing minimum steps at each node */ node *q; if (p->tip) { return; } q = p->next; while (q != p) { if (q->back) minpostorder(q->back, treenode); q = q->next; } if (!p->initialized) { q = p->next; while (q != p) { if (q->back) compmin(p, q->back); q = q->next; } } } /* minpostorder */ void branchlength(node *subtr1, node *subtr2, double *brlen, pointarray treenode) { /* computes a branch length between two subtrees for a given site */ long i, j, minn, cost, nom, denom; node *temp; if (subtr1->tip) { temp = subtr1; subtr1 = subtr2; subtr2 = temp; } if (subtr1->index == outgrno) { temp = subtr1; subtr1 = subtr2; subtr2 = temp; } minpostorder(subtr1, treenode); minpostorder(subtr2, treenode); minn = 10 * spp; nom = 0; denom = 0; for (i = (long)zero; i <= (long)seven; i++) { for (j = (long)zero; j <= (long)seven; j++) { if (i == j) cost = 0; else cost = 1; if (subtr1->disccumlengths[i] != -1 && (subtr2->disccumlengths[j] != -1)) { if (subtr1->disccumlengths[i] + cost + subtr2->disccumlengths[j] < 
minn) { minn = subtr1->disccumlengths[i] + cost + subtr2->disccumlengths[j]; nom = 0; denom = 0; } if (subtr1->disccumlengths[i] + cost + subtr2->disccumlengths[j] == minn) { nom += subtr1->discnumreconst[i] * subtr2->discnumreconst[j] * cost; denom += subtr1->discnumreconst[i] * subtr2->discnumreconst[j]; } } } } *brlen = (double)nom/(double)denom; } /* branchlength */ void printbranchlengths(node *p) { node *q; long i; if (p->tip) return; q = p->next; do { fprintf(outfile, "%6ld ",q->index - spp); if (q->back->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[q->back->index - 1][i], outfile); } else fprintf(outfile, "%6ld ", q->back->index - spp); fprintf(outfile, " %.2f\n",q->v); if (q->back) printbranchlengths(q->back); q = q->next; } while (q != p); } /* printbranchlengths */ void branchlentrav(node *p, node *root, long sitei, long chars, double *brlen, pointarray treenode) { /* traverses the tree computing tree length at each branch */ node *q; if (p->tip) return; if (p->index == outgrno) p = p->back; q = p->next; do { if (q->back) { branchlength(q, q->back, brlen, treenode); q->v += ((weight[sitei - 1] / 10.0) * (*brlen)); q->back->v += ((weight[sitei - 1] / 10.0) * (*brlen)); if (!q->back->tip) branchlentrav(q->back, root, sitei, chars, brlen, treenode); } q = q->next; } while (q != p); } /* branchlentrav */ void treelength(node *root, long chars, pointarray treenode) { /* calls branchlentrav at each site */ long sitei; double trlen; initbranchlen(root); for (sitei = 1; sitei <= endsite; sitei++) { trlen = 0.0; initbase(root, sitei); inittreetrav(root, sitei); branchlentrav(root, root, sitei, chars, &trlen, treenode); } } /* treelength */ void coordinates(node *p, long *tipy, double f, long *fartemp) { /* establishes coordinates of nodes for display without lengths */ node *q, *first, *last, *mid1 = NULL, *mid2 = NULL; long numbranches, numb2; if (p->tip) { p->xcoord = 0; p->ycoord = *tipy; p->ymin = *tipy; p->ymax = *tipy; (*tipy) += down; return; } 
numbranches = 0; q = p->next; do { coordinates(q->back, tipy, f, fartemp); numbranches += 1; q = q->next; } while (p != q); first = p->next->back; q = p->next; while (q->next != p) q = q->next; last = q->back; numb2 = 1; q = p->next; while (q != p) { if (numb2 == (numbranches + 1)/2) mid1 = q->back; if (numb2 == (numbranches/2 + 1)) mid2 = q->back; numb2 += 1; q = q->next; } p->xcoord = (long)((double)(last->ymax - first->ymin) * f); p->ycoord = (long)((mid1->ycoord + mid2->ycoord) / 2); p->ymin = first->ymin; p->ymax = last->ymax; if (p->xcoord > *fartemp) *fartemp = p->xcoord; } /* coordinates */ void drawline(long i, double scale, node *root) { /* draws one row of the tree diagram by moving up tree */ node *p, *q, *r, *first = NULL, *last = NULL; long n, j; boolean extra, done, noplus; p = root; q = root; extra = false; noplus = false; if (i == (long)p->ycoord && p == root) { if (p->index - spp >= 10) fprintf(outfile, " %2ld", p->index - spp); else fprintf(outfile, " %ld", p->index - spp); extra = true; noplus = true; } else fprintf(outfile, " "); do { if (!p->tip) { r = p->next; done = false; do { if (i >= r->back->ymin && i <= r->back->ymax) { q = r->back; done = true; } r = r->next; } while (!(done || r == p)); first = p->next->back; r = p->next; while (r->next != p) r = r->next; last = r->back; } done = (p == q); n = (long)(scale * (p->xcoord - q->xcoord) + 0.5); if (n < 3 && !q->tip) n = 3; if (extra) { n--; extra = false; } if ((long)q->ycoord == i && !done) { if (noplus) { putc('-', outfile); noplus = false; } else putc('+', outfile); if (!q->tip) { for (j = 1; j <= n - 2; j++) putc('-', outfile); if (q->index - spp >= 10) fprintf(outfile, "%2ld", q->index - spp); else fprintf(outfile, "-%ld", q->index - spp); extra = true; noplus = true; } else { for (j = 1; j < n; j++) putc('-', outfile); } } else if (!p->tip) { if ((long)last->ycoord > i && (long)first->ycoord < i && i != (long)p->ycoord) { putc('!', outfile); for (j = 1; j < n; j++) putc(' ', 
outfile); } else { for (j = 1; j <= n; j++) putc(' ', outfile); } noplus = false; } else { for (j = 1; j <= n; j++) putc(' ', outfile); noplus = false; } if (p != q) p = q; } while (!done); if ((long)p->ycoord == i && p->tip) { for (j = 0; j < nmlngth; j++) putc(nayme[p->index - 1][j], outfile); } putc('\n', outfile); } /* drawline */ void printree(node *root, double f) { /* prints out diagram of the tree */ /* used in pars */ long i, tipy, dummy; double scale; putc('\n', outfile); if (!treeprint) return; putc('\n', outfile); tipy = 1; dummy = 0; coordinates(root, &tipy, f, &dummy); scale = 1.5; putc('\n', outfile); for (i = 1; i <= (tipy - down); i++) drawline(i, scale, root); fprintf(outfile, "\n remember:"); if (outgropt) fprintf(outfile, " (although rooted by outgroup)"); fprintf(outfile, " this is an unrooted tree!\n\n"); } /* printree */ void writesteps(long chars, boolean weights, steptr oldweight, node *root) { /* used in pars */ long i, j, k, l; putc('\n', outfile); if (weights) fprintf(outfile, "weighted "); fprintf(outfile, "steps in each site:\n"); fprintf(outfile, " "); for (i = 0; i <= 9; i++) fprintf(outfile, "%4ld", i); fprintf(outfile, "\n *------------------------------------"); fprintf(outfile, "-----\n"); for (i = 0; i <= (chars / 10); i++) { fprintf(outfile, "%5ld", i * 10); putc('|', outfile); for (j = 0; j <= 9; j++) { k = i * 10 + j; if (k == 0 || k > chars) fprintf(outfile, " "); else { l = location[ally[k - 1] - 1]; if (oldweight[k - 1] > 0) fprintf(outfile, "%4ld", oldweight[k - 1] * (root->numsteps[l - 1] / weight[l - 1])); else fprintf(outfile, " 0"); } } putc('\n', outfile); } } /* writesteps */ void treeout(node *p, long nextree, long *col, node *root) { /* write out file with representation of final tree */ /* used in pars */ node *q; long i, n; Char c; if (p->tip) { n = 0; for (i = 1; i <= nmlngth; i++) { if (nayme[p->index - 1][i - 1] != ' ') n = i; } for (i = 0; i < n; i++) { c = nayme[p->index - 1][i]; if (c == ' ') c = '_'; 
putc(c, outtree); } *col += n; } else { putc('(', outtree); (*col)++; q = p->next; while (q != p) { treeout(q->back, nextree, col, root); q = q->next; if (q == p) break; putc(',', outtree); (*col)++; if (*col > 60) { putc('\n', outtree); *col = 0; } } putc(')', outtree); (*col)++; } if (p != root) return; if (nextree > 2) fprintf(outtree, "[%6.4f];\n", 1.0 / (nextree - 1)); else fprintf(outtree, ";\n"); } /* treeout */ void treeout3(node *p, long nextree, long *col, node *root) { /* write out file with representation of final tree */ /* used in dnapars -- writes branch lengths */ node *q; long i, n, w; double x; Char c; if (p->tip) { n = 0; for (i = 1; i <= nmlngth; i++) { if (nayme[p->index - 1][i - 1] != ' ') n = i; } for (i = 0; i < n; i++) { c = nayme[p->index - 1][i]; if (c == ' ') c = '_'; putc(c, outtree); } *col += n; } else { putc('(', outtree); (*col)++; q = p->next; while (q != p) { treeout3(q->back, nextree, col, root); q = q->next; if (q == p) break; putc(',', outtree); (*col)++; if (*col > 60) { putc('\n', outtree); *col = 0; } } putc(')', outtree); (*col)++; } x = p->v; if (x > 0.0) w = (long)(0.43429448222 * log(x)); else if (x == 0.0) w = 0; else w = (long)(0.43429448222 * log(-x)) + 1; if (w < 0) w = 0; if (p != root) { fprintf(outtree, ":%*.2f", (int)(w + 4), x); } if (p != root) return; if (nextree > 2) fprintf(outtree, "[%6.4f];\n", 1.0 / (nextree - 1)); else fprintf(outtree, ";\n"); } /* treeout3 */ void drawline3(long i, double scale, node *start) { /* draws one row of the tree diagram by moving up tree */ /* used in pars */ node *p, *q; long n, j; boolean extra; node *r, *first = NULL, *last = NULL; boolean done; p = start; q = start; extra = false; if (i == (long)p->ycoord) { if (p->index - spp >= 10) fprintf(outfile, " %2ld", p->index - spp); else fprintf(outfile, " %ld", p->index - spp); extra = true; } else fprintf(outfile, " "); do { if (!p->tip) { r = p->next; done = false; do { if (i >= r->back->ymin && i <= r->back->ymax) { q = 
r->back; done = true; } r = r->next; } while (!(done || (r == p))); first = p->next->back; r = p; while (r->next != p) r = r->next; last = r->back; } done = (p->tip || p == q); n = (long)(scale * (q->xcoord - p->xcoord) + 0.5); if (n < 3 && !q->tip) n = 3; if (extra) { n--; extra = false; } if ((long)q->ycoord == i && !done) { if ((long)p->ycoord != (long)q->ycoord) putc('+', outfile); else putc('-', outfile); if (!q->tip) { for (j = 1; j <= n - 2; j++) putc('-', outfile); if (q->index - spp >= 10) fprintf(outfile, "%2ld", q->index - spp); else fprintf(outfile, "-%ld", q->index - spp); extra = true; } else { for (j = 1; j < n; j++) putc('-', outfile); } } else if (!p->tip) { if ((long)last->ycoord > i && (long)first->ycoord < i && (i != (long)p->ycoord || p == start)) { putc('|', outfile); for (j = 1; j < n; j++) putc(' ', outfile); } else { for (j = 1; j <= n; j++) putc(' ', outfile); } } else { for (j = 1; j <= n; j++) putc(' ', outfile); } if (q != p) p = q; } while (!done); if ((long)p->ycoord == i && p->tip) { for (j = 0; j < nmlngth; j++) putc(nayme[p->index-1][j], outfile); } putc('\n', outfile); } /* drawline3 */ void standev(long chars, long numtrees, long minwhich, double minsteps, double *nsteps, long **fsteps, longer seed) { /* do paired sites test (KHT or SH) on user trees */ /* used in pars */ long i, j, k; double wt, sumw, sum, sum2, sd; double temp; double **covar, *P, *f, *r; #define SAMPLES 1000 if (numtrees == 2) { fprintf(outfile, "Kishino-Hasegawa-Templeton test\n\n"); fprintf(outfile, "Tree Steps Diff Steps Its S.D."); fprintf(outfile, " Significantly worse?\n\n"); which = 1; while (which <= numtrees) { fprintf(outfile, "%3ld%10.1f", which, nsteps[which - 1] / 10); if (minwhich == which) fprintf(outfile, " <------ best\n"); else { sumw = 0.0; sum = 0.0; sum2 = 0.0; for (i = 0; i < endsite; i++) { if (weight[i] > 0) { wt = weight[i] / 10.0; sumw += wt; temp = (fsteps[which - 1][i] - fsteps[minwhich - 1][i]) / 10.0; sum += temp; sum2 += temp * 
temp / wt; } } temp = sum / sumw; sd = sqrt(sumw / (sumw - 1.0) * (sum2 - sum * sum / sumw)); fprintf(outfile, "%9.1f %12.4f", (nsteps[which - 1] - minsteps) / 10, sd); if (sum > 1.95996 * sd) fprintf(outfile, " Yes\n"); else fprintf(outfile, " No\n"); } which++; } fprintf(outfile, "\n\n"); } else { /* Shimodaira-Hasegawa test using normal approximation */ if(numtrees > MAXSHIMOTREES){ fprintf(outfile, "Shimodaira-Hasegawa test on first %d of %ld trees\n\n" , MAXSHIMOTREES, numtrees); numtrees = MAXSHIMOTREES; } else { fprintf(outfile, "Shimodaira-Hasegawa test\n\n"); } covar = (double **)Malloc(numtrees*sizeof(double *)); for (i = 0; i < numtrees; i++) covar[i] = (double *)Malloc(numtrees*sizeof(double)); sumw = 0.0; for (i = 0; i < endsite; i++) sumw += weight[i]; for (i = 0; i < numtrees; i++) { /* compute covariances of trees */ sum = nsteps[i]/sumw; for (j = 0; j <=i; j++) { sum2 = nsteps[j]/sumw; temp = 0.0; for (k = 0; k < endsite; k++) { wt = weight[k]/10.0; if (weight[k] > 0) { temp = temp + wt*(fsteps[i][k]/(wt*10.0)-sum) *(fsteps[j][k]/(wt*10.0)-sum2); } } covar[i][j] = temp; if (i != j) covar[j][i] = temp; } } for (i = 0; i < numtrees; i++) { /* in-place Cholesky decomposition of trees x trees covariance matrix */ sum = 0.0; for (j = 0; j <= i-1; j++) sum = sum + covar[i][j] * covar[i][j]; if (covar[i][i] <= sum) temp = 0.0; else temp = sqrt(covar[i][i] - sum); covar[i][i] = temp; for (j = i+1; j < numtrees; j++) { sum = 0.0; for (k = 0; k < i; k++) sum = sum + covar[i][k] * covar[j][k]; if (fabs(temp) < 1.0E-23) covar[j][i] = 0.0; else covar[j][i] = (covar[j][i] - sum)/temp; } } f = (double *)Malloc(numtrees*sizeof(double)); /* resampled sum of steps */ P = (double *)Malloc(numtrees*sizeof(double)); /* vector of P's of trees */ r = (double *)Malloc(numtrees*sizeof(double)); /* store normal variates */ for (i = 0; i < numtrees; i++) P[i] = 0.0; sum2 = nsteps[0]/10.0; /* sum2 will be smallest # of steps */ for (i = 1; i < numtrees; i++) if (sum2 > 
nsteps[i]/10.0) sum2 = nsteps[i]/10.0; for (i = 1; i <= SAMPLES; i++) { /* loop over resampled trees */ for (j = 0; j < numtrees; j++) /* draw normal variates */ r[j] = normrand(seed); for (j = 0; j < numtrees; j++) { /* compute vectors */ sum = 0.0; for (k = 0; k <= j; k++) sum += covar[j][k]*r[k]; f[j] = sum; } sum = f[0]; /* start from the first tree so that f[0] is included in the minimum */ for (j = 1; j < numtrees; j++) /* get min of vector */ if (f[j] < sum) sum = f[j]; for (j = 0; j < numtrees; j++) /* accumulate P's */ if (nsteps[j]/10.0-sum2 <= f[j] - sum) P[j] += 1.0/SAMPLES; } fprintf(outfile, "Tree Steps Diff Steps P value"); fprintf(outfile, " Significantly worse?\n\n"); for (i = 0; i < numtrees; i++) { fprintf(outfile, "%3ld%10.1f", i+1, nsteps[i]/10); if ((minwhich-1) == i) fprintf(outfile, " <------ best\n"); else { fprintf(outfile, " %9.1f %10.3f", nsteps[i]/10.0-sum2, P[i]); if (P[i] < 0.05) fprintf(outfile, " Yes\n"); else fprintf(outfile, " No\n"); } } fprintf(outfile, "\n"); free(P); /* free the variables we Malloc'ed */ free(f); free(r); for (i = 0; i < numtrees; i++) free(covar[i]); free(covar); } } /* standev */ void freetip(node *anode) { /* used in pars */ free(anode->numsteps); free(anode->oldnumsteps); free(anode->discbase); free(anode->olddiscbase); } /* freetip */ void freenontip(node *anode) { /* used in pars */ free(anode->numsteps); free(anode->oldnumsteps); free(anode->discbase); free(anode->olddiscbase); free(anode->discnumnuc); } /* freenontip */ void freenodes(long nonodes, pointarray treenode) { /* used in pars */ long i; node *p; for (i = 0; i < spp; i++) freetip(treenode[i]); for (i = spp; i < nonodes; i++) { if (treenode[i] != NULL) { p = treenode[i]->next; do { freenontip(p); p = p->next; } while (p != treenode[i]); freenontip(p); } } } /* freenodes */ void freenode(node **anode) { /* used in pars */ freenontip(*anode); free(*anode); } /* freenode */ void freetree(long nonodes, pointarray treenode) { /* used in pars */ long i; node *p, *q; for (i = 0; i < spp; i++) free(treenode[i]); for (i = 
spp; i < nonodes; i++) { if (treenode[i] != NULL) { p = treenode[i]->next; do { q = p->next; free(p); p = q; } while (p != treenode[i]); free(p); } } free(treenode); } /* freetree */ void freegarbage(gbases **garbage) { /* used in pars */ gbases *p; while (*garbage) { p = *garbage; *garbage = (*garbage)->next; free(p->discbase); free(p); } } /* freegarbage */ void freegrbg(node **grbg) { /* used in pars */ node *p; while (*grbg) { p = *grbg; *grbg = (*grbg)->next; freenontip(p); free(p); } } /* freegrbg */ void collapsetree(node *p, node *root, node **grbg, pointarray treenode, long *zeros, unsigned char *zeros2) { /* Recurse through tree searching for zero-length branches between */ /* nodes (not to tips). If one exists, collapse the nodes together, */ /* removing the branch. */ node *q, *x1, *y1, *x2, *y2; long i, j, index, index2, numd; if (p->tip) return; q = p->next; do { if (!q->back->tip && q->v == 0.000000) { /* merge the two nodes. */ x1 = y2 = q->next; x2 = y1 = q->back->next; while(x1->next != q) x1 = x1-> next; while(y1->next != q->back) y1 = y1-> next; x1->next = x2; y1->next = y2; index = q->index; index2 = q->back->index; numd = treenode[index-1]->numdesc + q->back->numdesc -1; chuck(grbg, q->back); chuck(grbg, q); q = x2; /* update the indices around the node circle */ do{ if(q->index != index){ q->index = index; } q = q-> next; }while(x2 != q); updatenumdesc(treenode[index-1], root, numd); /* Alter treenode to point to real nodes, and update indices */ /* accordingly. 
*/ j = 0; i=0; for(i = (index2-1); i < nonodes-1 && treenode[i+1]; i++){ treenode[i]=treenode[i+1]; treenode[i+1] = NULL; x1=x2=treenode[i]; do{ x1->index = i+1; x1 = x1 -> next; } while(x1 != x2); } /* Create a new empty fork in the blank spot of treenode */ x1=NULL; for(i=1; i <=3 ; i++){ gnudisctreenode(grbg, &x2, index2, endsite, zeros, zeros2); x2->next = x1; x1 = x2; } x2->next->next->next = x2; treenode[nonodes-1]=x2; if (q->back) collapsetree(q->back, root, grbg, treenode, zeros, zeros2); } else { if (q->back) collapsetree(q->back, root, grbg, treenode, zeros, zeros2); q = q->next; } } while (q != p); } /* collapsetree */ void collapsebestrees(node **root, node **grbg, pointarray treenode, bestelm *bestrees, long *place, long *zeros, unsigned char *zeros2, long chars, boolean recompute, boolean progress) { /* Goes through all best trees, collapsing trees where possible, and */ /* deleting trees that are not unique. */ long i,j, k, pos, nextnode, oldnextree; boolean found; node *dummy; oldnextree = nextree; for(i = 0 ; i < (oldnextree - 1) ; i++){ bestrees[i].collapse = true; } if(progress) printf("Collapsing best trees\n "); k = 0; for(i = 0 ; i < (oldnextree - 1) ; i++){ if(progress){ if(i % (((oldnextree-1) / 72) + 1) == 0) putchar('.'); fflush(stdout); } while(!bestrees[k].collapse) k++; /* Reconstruct tree. 
*/ *root = treenode[0]; add(treenode[0], treenode[1], treenode[spp], root, recompute, treenode, grbg, zeros, zeros2); nextnode = spp + 2; for (j = 3; j <= spp; j++) { if (bestrees[k].btree[j - 1] > 0) add(treenode[bestrees[k].btree[j - 1] - 1], treenode[j - 1], treenode[nextnode++ - 1], root, recompute, treenode, grbg, zeros, zeros2); else add(treenode[treenode[-bestrees[k].btree[j - 1]-1]->back->index-1], treenode[j - 1], NULL, root, recompute, treenode, grbg, zeros, zeros2); } reroot(treenode[outgrno - 1], *root); treelength(*root, chars, treenode); collapsetree(*root, *root, grbg, treenode, zeros, zeros2); savetree(*root, place, treenode, grbg, zeros, zeros2); /* move everything down in the bestree list */ for(j = k ; j < (nextree - 2) ; j++){ memcpy(bestrees[j].btree, bestrees[j + 1].btree, spp * sizeof(long)); bestrees[j].gloreange = bestrees[j + 1].gloreange; bestrees[j + 1].gloreange = false; bestrees[j].locreange = bestrees[j + 1].locreange; bestrees[j + 1].locreange = false; bestrees[j].collapse = bestrees[j + 1].collapse; } pos=0; findtree(&found, &pos, nextree-1, place, bestrees); /* put the new tree at the end of the list if it wasn't found */ nextree--; if(!found) addtree(pos, &nextree, false, place, bestrees); /* Deconstruct the tree */ for (j = 1; j < spp; j++){ re_move(treenode[j], &dummy, root, recompute, treenode, grbg, zeros, zeros2); } } if (progress) { putchar('\n');
#ifdef WIN32
phyFillScreenColor();
#endif
} }

/* phylip-3.697/src/discrete.h */

/* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. 
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* discrete.h: included in pars */ typedef struct gbases { discbaseptr discbase; struct gbases *next; } gbases; struct LOC_hyptrav { boolean bottom; node *r; discbaseptr hypset; boolean maybe, nonzero; unsigned char tempset, anc; } ; extern long nonodes, endsite, outgrno, nextree, which; extern boolean interleaved, printdata, outgropt, treeprint, dotdiff; extern steptr weight, category, alias, location, ally; extern sequence y, convtab; #ifndef OLDC /*function prototypes*/ void inputdata(long); void alloctree(pointarray *, long, boolean); void setuptree(pointarray, long, boolean); void alloctip(node *, long *, unsigned char *); void sitesort(long, steptr); void sitecombine(long); void sitescrunch(long); void makevalues(pointarray, long *, unsigned char *, boolean); void fillin(node *, node *, node *); long getlargest(long *); void multifillin(node *, node *, long); void sumnsteps(node *, node *, node *, long, long); void sumnsteps2(node *, node *, node *, long, long, long *); void multisumnsteps(node *, node *, long, 
long, long *); void multisumnsteps2(node *); void findoutgroup(node *, boolean *); boolean alltips(node *, node *); void gdispose(node *, node **, pointarray); void preorder(node *, node *, node *, node *, node *, node *, long ); void updatenumdesc(node *, node *, long); void add(node *, node *, node *, node **, boolean, pointarray, node **, long *, unsigned char *); void findbelow(node **, node *, node *); void re_move(node *, node **, node **, boolean, pointarray, node **, long *, unsigned char *); void postorder(node *); void getnufork(node **, node **, pointarray, long *, unsigned char *); void reroot(node *, node *); void reroot2(node *, node *); void reroot3(node *, node *, node *, node *, node **); void savetraverse(node *); void newindex(long, node *); void flipindexes(long, pointarray); boolean parentinmulti(node *); long sibsvisited(node *, long *); long smallest(node *, long *); void bintomulti(node **, node **, node **, long *, unsigned char *); void backtobinary(node **, node *, node **); boolean outgrin(node *, node *); void flipnodes(node *, node *); void moveleft(node *, node *, node **); void savetree(node *, long *, pointarray, node **, long *, unsigned char *); void addnsave(node *, node *, node *, node **, node **, boolean multf, pointarray , long *, long *, unsigned char *); void addbestever(long *, long *, long, boolean, long *, bestelm *); void addtiedtree(long, long *, long, boolean, long *, bestelm *); void clearcollapse(pointarray); void clearbottom(pointarray); void collabranch(node *,node *,node *); boolean allcommonbases(node *, node *, boolean *); void findbottom(node *, node **); boolean moresteps(node *, node *); boolean passdown(node *, node *, node *, node *, node *, node *, node *, node *, node *, boolean); boolean trycollapdesc(node *, node *, node *, node *, node *, node *, node *, node *, node *, boolean ,long *, unsigned char *); void setbottom(node *); boolean zeroinsubtree(node *, node *, node *, node *, node *, node *, node 
*, node *, boolean , node *, long *, unsigned char *); boolean collapsible(node *, node *, node *, node *, node *, node *, node *, node *, boolean , node *, long *, unsigned char *, pointarray); void replaceback(node **,node *,node *,node **,long *,unsigned char *); void putback(node *, node *, node *, node **); void savelocrearr(node *, node *, node *, node *, node *, node *, node *, node *, node *, node **, long, long *, boolean, boolean, boolean *, long *, bestelm *, pointarray, node **, long *, unsigned char *); void clearvisited(pointarray); void hyprint(long,long,struct LOC_hyptrav *, pointarray); void gnubase(gbases **, gbases **, long); void chuckbase(gbases *, gbases **); void hyptrav(node *, discbaseptr, long, long, boolean, pointarray, gbases **); void hypstates(long, node *, pointarray, gbases **); void initbranchlen(node *); void initmin(node *, long, boolean); void initbase(node *, long); void inittreetrav(node *, long); void compmin(node *, node *); void minpostorder(node *, pointarray); void branchlength(node *, node *, double *, pointarray); void printbranchlengths(node *); void branchlentrav(node *, node *, long, long, double *, pointarray); void treelength(node *, long, pointarray); void coordinates(node *, long *, double , long *); void drawline(long, double, node *); void printree(node *, double); void writesteps(long, boolean, steptr, node *); void treeout(node *, long, long *, node *); void drawline3(long, double, node *); void standev(long, long, long, double, double *, long **, longer); void freetip(node *); void freenontip(node *); void freenodes(long, pointarray); void freenode(node **); void freetree(long, pointarray); void freegarbage(gbases **); void freegrbg(node **); void treeout3(node *p, long nextree, long *col, node *root); void collapsetree(node *, node *, node **, pointarray, long *, unsigned char *); void collapsebestrees(node **, node **, pointarray, bestelm *, long *, long *, unsigned char *, long, boolean, boolean); 
/*function prototypes*/
#endif

/* phylip-3.697/src/dist.c */
#include "phylip.h"
#include "dist.h"

/* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */

void alloctree(pointptr *treenode, long nonodes) { /* allocate spp tips and (nonodes - spp) forks, each containing three * nodes. Fill in treenode where 0..spp-1 are pointers to tip nodes, and * spp..nonodes-1 are pointers to one node in each fork. 
*/ /* used in fitch, kitsch, neighbor */ long i, j; node *p, *q; *treenode = (pointptr)Malloc(nonodes*sizeof(node *)); for (i = 0; i < spp; i++) (*treenode)[i] = (node *)Malloc(sizeof(node)); for (i = spp; i < nonodes; i++) { q = NULL; for (j = 1; j <= 3; j++) { p = (node *)Malloc(sizeof(node)); p->next = q; q = p; } p->next->next->next = p; (*treenode)[i] = p; } } /* alloctree */ void freetree(pointptr *treenode, long nonodes) { long i; node *p, *q; for (i = 0; i < spp; i++) free((*treenode)[i]); for (i = spp; i < nonodes; i++) { p = (*treenode)[i]; q = p->next; while(q != p) { node * r = q; q = q->next; free(r); } free(p); } free(*treenode); } /* freetree */ void allocd(long nonodes, pointptr treenode) { /* used in fitch & kitsch */ long i, j; node *p; for (i = 0; i < (spp); i++) { treenode[i]->d = (vector)Malloc(nonodes*sizeof(double)); } for (i = spp; i < nonodes; i++) { p = treenode[i]; for (j = 1; j <= 3; j++) { p->d = (vector)Malloc(nonodes*sizeof(double)); p = p->next; } } } void freed(long nonodes, pointptr treenode) { /* used in fitch */ long i, j; node *p; for (i = 0; i < (spp); i++) { free(treenode[i]->d); } for (i = spp; i < nonodes; i++) { p = treenode[i]; for (j = 1; j <= 3; j++) { free(p->d); p = p->next; } } } void allocw(long nonodes, pointptr treenode) { /* used in fitch & kitsch */ long i, j; node *p; for (i = 0; i < (spp); i++) { treenode[i]->w = (vector)Malloc(nonodes*sizeof(double)); } for (i = spp; i < nonodes; i++) { p = treenode[i]; for (j = 1; j <= 3; j++) { p->w = (vector)Malloc(nonodes*sizeof(double)); p = p->next; } } } void freew(long nonodes, pointptr treenode) { /* used in fitch */ long i, j; node *p; for (i = 0; i < (spp); i++) { free(treenode[i]->w); } for (i = spp; i < nonodes; i++) { p = treenode[i]; for (j = 1; j <= 3; j++) { free(p->w); p = p->next; } } } void setuptree(tree *a, long nonodes) { /* initialize a tree */ /* used in fitch, kitsch, & neighbor */ long i=0; node *p; for (i = 1; i <= nonodes; i++) { a->nodep[i - 
1]->back = NULL; a->nodep[i - 1]->tip = (i <= spp); a->nodep[i - 1]->iter = true; a->nodep[i - 1]->index = i; a->nodep[i - 1]->t = 0.0; a->nodep[i - 1]->sametime = false; a->nodep[i - 1]->v = 0.0; if (i > spp) { p = a->nodep[i-1]->next; while (p != a->nodep[i-1]) { p->back = NULL; p->tip = false; p->iter = true; p->index = i; p->t = 0.0; p->sametime = false; p = p->next; } } } a->likelihood = -1.0; a->start = a->nodep[0]; a->root = NULL; } /* setuptree */ void inputdata(boolean replicates, boolean printdata, boolean lower, boolean upper, vector *x, intvector *reps) { /* read in distance matrix */ /* used in fitch & neighbor */ long i=0, j=0, k=0, columns=0; boolean skipit=false, skipother=false; if (replicates) columns = 4; else columns = 6; if (printdata) { fprintf(outfile, "\nName Distances"); if (replicates) fprintf(outfile, " (replicates)"); fprintf(outfile, "\n---- ---------"); if (replicates) fprintf(outfile, "-------------"); fprintf(outfile, "\n\n"); } for (i = 0; i < spp; i++) { x[i][i] = 0.0; scan_eoln(infile); initname(i); for (j = 0; j < spp; j++) { skipit = ((lower && j + 1 >= i + 1) || (upper && j + 1 <= i + 1)); skipother = ((lower && i + 1 >= j + 1) || (upper && i + 1 <= j + 1)); if (!skipit) { if (eoln(infile)) scan_eoln(infile); if (fscanf(infile, "%lf", &x[i][j]) != 1) { printf("The infile is of the wrong type\n"); exxit(-1); } if (replicates) { if (eoln(infile)) scan_eoln(infile); if (fscanf(infile, "%ld", &reps[i][j]) != 1) { printf("The infile is of the wrong type\n"); exxit(-1); } } else reps[i][j] = 1; } if (!skipit && skipother) { x[j][i] = x[i][j]; reps[j][i] = reps[i][j]; } if ((i == j) && (fabs(x[i][j]) > 0.000000001)) { printf("\nERROR: diagonal element of row %ld of distance matrix ", i+1); printf("is not zero.\n"); printf(" Is it a distance matrix?\n\n"); exxit(-1); } if ((j < i) && (fabs(x[i][j]-x[j][i]) > 0.000000001)) { printf("ERROR: distance matrix is not symmetric:\n"); printf(" (%ld,%ld) element and (%ld,%ld) element are 
unequal.\n", i+1, j+1, j+1, i+1); printf(" They are %10.6f and %10.6f, respectively.\n", x[i][j], x[j][i]); printf(" Is it a distance matrix?\n\n"); exxit(-1); } } } scan_eoln(infile); if (!printdata) return; for (i = 0; i < spp; i++) { for (j = 0; j < nmlngth; j++) putc(nayme[i][j], outfile); putc(' ', outfile); for (j = 1; j <= spp; j++) { fprintf(outfile, "%10.5f", x[i][j - 1]); if (replicates) fprintf(outfile, " (%3ld)", reps[i][j - 1]); if (j % columns == 0 && j < spp) { putc('\n', outfile); for (k = 1; k <= nmlngth + 1; k++) putc(' ', outfile); } } putc('\n', outfile); } putc('\n', outfile); } /* inputdata */ void coordinates(node *p, double lengthsum, long *tipy, double *tipmax, node *start, boolean njoin) { /* establishes coordinates of nodes */ node *q, *first, *last; if (p->tip) { p->xcoord = (long)(over * lengthsum + 0.5); p->ycoord = *tipy; p->ymin = *tipy; p->ymax = *tipy; (*tipy) += down; if (lengthsum > *tipmax) *tipmax = lengthsum; return; } q = p->next; do { if (q->back) coordinates(q->back, lengthsum + q->v, tipy,tipmax, start, njoin); q = q->next; } while ((p == start || p != q) && (p != start || p->next != q)); first = p->next->back; q = p; while (q->next != p && q->next->back) /* is this right ? 
*/ q = q->next; last = q->back; p->xcoord = (long)(over * lengthsum + 0.5); if (p == start && p->back) p->ycoord = p->next->next->back->ycoord; else p->ycoord = (first->ycoord + last->ycoord) / 2; p->ymin = first->ymin; p->ymax = last->ymax; } /* coordinates */ void drawline(long i, double scale, node *start, boolean rooted) { /* draws one row of the tree diagram by moving up tree */ node *p, *q; long n=0, j=0; boolean extra=false, trif=false; node *r, *first =NULL, *last =NULL; boolean done=false; p = start; q = start; extra = false; trif = false; if (i == (long)p->ycoord && p == start) { /* display the root */ if (rooted) { if (p->index - spp >= 10) fprintf(outfile, "-"); else fprintf(outfile, "--"); } else { if (p->index - spp >= 10) fprintf(outfile, " "); else fprintf(outfile, " "); } if (p->index - spp >= 10) fprintf(outfile, "%2ld", p->index - spp); else fprintf(outfile, "%ld", p->index - spp); extra = true; trif = true; } else fprintf(outfile, " "); do { if (!p->tip) { /* internal nodes */ r = p->next; /* r->back here is going to the same node. */ do { if (!r->back) { r = r->next; continue; } if (i >= r->back->ymin && i <= r->back->ymax) { q = r->back; break; } r = r->next; } while (!((p != start && r == p) || (p == start && r == p->next))); first = p->next->back; r = p; while (r->next != p) r = r->next; last = r->back; if (!rooted && (p == start)) last = p->back; } /* end internal node case... 
*/ /* draw the line: */ done = (p->tip || p == q); n = (long)(scale * (q->xcoord - p->xcoord) + 0.5); if (!q->tip) { if ((n < 3) && (q->index - spp >= 10)) n = 3; if ((n < 2) && (q->index - spp < 10)) n = 2; } if (extra) { n--; extra = false; } if ((long)q->ycoord == i && !done) { if (p->ycoord != q->ycoord) putc('+', outfile); if (trif) { n++; trif = false; } if (!q->tip) { for (j = 1; j <= n - 2; j++) putc('-', outfile); if (q->index - spp >= 10) fprintf(outfile, "%2ld", q->index - spp); else fprintf(outfile, "-%ld", q->index - spp); extra = true; } else { for (j = 1; j < n; j++) putc('-', outfile); } } else if (!p->tip) { if ((long)last->ycoord > i && (long)first->ycoord < i && i != (long)p->ycoord) { putc('!', outfile); for (j = 1; j < n; j++) putc(' ', outfile); } else { for (j = 1; j <= n; j++) putc(' ', outfile); trif = false; } } if (q != p) p = q; } while (!done); if ((long)p->ycoord == i && p->tip) { for (j = 0; j < nmlngth; j++) putc(nayme[p->index - 1][j], outfile); } putc('\n', outfile); } /* drawline */ void printree(node *start, boolean treeprint, boolean njoin, boolean rooted) { /* prints out diagram of the tree */ /* used in fitch & neighbor */ long i; long tipy; double scale,tipmax; if (!treeprint) return; putc('\n', outfile); tipy = 1; tipmax = 0.0; coordinates(start, 0.0, &tipy, &tipmax, start, njoin); scale = 1.0 / (long)(tipmax + 1.000); for (i = 1; i <= (tipy - down); i++) drawline(i, scale, start, rooted); putc('\n', outfile); } /* printree */ void treeoutr(node *p, long *col, tree *curtree) { /* write out file with representation of final tree. * Rooted case. Used in kitsch and neighbor. 
*/ long i, n, w; Char c; double x; if (p->tip) { n = 0; for (i = 1; i <= nmlngth; i++) { if (nayme[p->index - 1][i - 1] != ' ') n = i; } for (i = 0; i < n; i++) { c = nayme[p->index - 1][i]; if (c == ' ') c = '_'; putc(c, outtree); } (*col) += n; } else { putc('(', outtree); (*col)++; treeoutr(p->next->back,col,curtree); putc(',', outtree); (*col)++; if ((*col) > 55) { putc('\n', outtree); (*col) = 0; } treeoutr(p->next->next->back,col,curtree); putc(')', outtree); (*col)++; } x = p->v; if (x > 0.0) w = (long)(0.43429448222 * log(x)); else if (x == 0.0) w = 0; else w = (long)(0.43429448222 * log(-x)) + 1; if (w < 0) w = 0; if (p == curtree->root) fprintf(outtree, ";\n"); else { fprintf(outtree, ":%*.5f", (int)(w + 7), x); (*col) += w + 8; } } /* treeoutr */ void treeout(node *p, long *col, double m, boolean njoin, node *start) { /* write out file with representation of final tree */ /* used in fitch & neighbor */ long i=0, n=0, w=0; Char c; double x=0.0; if (p->tip) { n = 0; for (i = 1; i <= nmlngth; i++) { if (nayme[p->index - 1][i - 1] != ' ') n = i; } for (i = 0; i < n; i++) { c = nayme[p->index - 1][i]; if (c == ' ') c = '_'; putc(c, outtree); } *col += n; } else { putc('(', outtree); (*col)++; treeout(p->next->back, col, m, njoin, start); putc(',', outtree); (*col)++; if (*col > 55) { putc('\n', outtree); *col = 0; } treeout(p->next->next->back, col, m, njoin, start); if (p == start && njoin) { putc(',', outtree); treeout(p->back, col, m, njoin, start); } putc(')', outtree); (*col)++; } x = p->v; if (x > 0.0) w = (long)(m * log(x)); else if (x == 0.0) w = 0; else w = (long)(m * log(-x)) + 1; if (w < 0) w = 0; if (p == start) fprintf(outtree, ";\n"); else { fprintf(outtree, ":%*.5f", (int) w + 7, x); *col += w + 8; } } /* treeout */

/* phylip-3.697/src/dist.h */

/* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. 
Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ /* dist.h: included in fitch, kitsch, & neighbor */ #define over 60 typedef long *intvector; typedef node **pointptr; #ifndef OLDC /*function prototypes*/ void alloctree(pointptr *, long); void freetree(pointptr *, long); void allocd(long, pointptr); void freed(long, pointptr); void allocw(long, pointptr); void freew(long, pointptr); void setuptree(tree *, long); void inputdata(boolean, boolean, boolean, boolean, vector *, intvector *); void coordinates(node *, double, long *, double *, node *, boolean); void drawline(long, double, node *, boolean); void printree(node *, boolean, boolean, boolean); void treeoutr(node *, long *, tree *); void treeout(node *, long *, double, boolean, node *); /*function prototypes*/ #endif phylip-3.697/src/dnacomp.c0000644004732000473200000010056012406201116015114 0ustar joefelsenst_g #include "phylip.h" #include "seq.h" /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #define maxtrees 100 /* maximum number of tied trees stored */ typedef boolean *boolptr; #ifndef OLDC /* function prototypes */ void getoptions(void); void allocrest(void); void doinit(void); void initdnacompnode(node **, node **, node *, long, long, long *, long *, initops, pointarray, pointarray, Char *, Char *, FILE *); void makeweights(void); void doinput(void); void mincomp(long ); void evaluate(node *); void localsavetree(void); void tryadd(node *, node *, node *); void addpreorder(node *, node *, node *); void tryrearr(node *, boolean *); void repreorder(node *, boolean *); void rearrange(node **); void describe(void); void initboolnames(node *, boolean *); void maketree(void); void freerest(void); void standev3(long, long, long, double, double *, long **, longer); void reallocchars(void); /* function prototypes */ #endif Char infilename[FNMLNGTH], outfilename[FNMLNGTH], intreename[FNMLNGTH], outtreename[FNMLNGTH], weightfilename[FNMLNGTH]; node *root, *p; long chars, col, ith, njumble, jumb, msets; long inseed, inseed0; boolean jumble, usertree, trout, weights, progress, stepbox, ancseq, firstset, mulsets, justwts; steptr oldweight, necsteps; pointarray treenode; /* pointers to all nodes in tree */ long *enterorder; Char basechar[32]="ACMGRSVTWYHKDBNO???????????????"; bestelm *bestrees; boolean dummy; longer seed; gbases *garbage; Char ch; Char progname[20]; long *zeros; /* Local variables for maketree, propagated globally for C version: */ long 
maxwhich; double like, maxsteps, bestyet, bestlike, bstlike2; boolean lastrearr, recompute; double nsteps[maxuser]; long **fsteps; node *there; long *place; boolptr in_tree; baseptr nothing; node *temp, *temp1; node *grbg; void getoptions() { /* interactively set options */ long loopcount, loopcount2; Char ch, ch2; fprintf(outfile, "\nDNA compatibility algorithm, version %s\n\n",VERSION); putchar('\n'); jumble = false; njumble = 1; outgrno = 1; outgropt = false; trout = true; usertree = false; weights = false; justwts = false; printdata = false; dotdiff = true; progress = true; treeprint = true; stepbox = false; ancseq = false; interleaved = true; loopcount = 0; for (;;) { cleerhome(); printf("\nDNA compatibility algorithm, version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" U Search for best tree? %s\n", (usertree ? "No, use user trees in input file" : "Yes")); if (!usertree) { printf(" J Randomize input order of sequences?"); if (jumble) { printf( " Yes (seed =%8ld,%3ld times)\n", inseed0, njumble); } else printf(" No. Use input order\n"); } printf(" O Outgroup root?"); if (outgropt) printf(" Yes, at sequence number%3ld\n", outgrno); else printf(" No, use as outgroup species%3ld\n", outgrno); printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld %s\n", msets, (justwts ? "sets of weights" : "data sets")); else printf(" No\n"); printf(" I Input sequences interleaved? %s\n", (interleaved ? "Yes" : "No, sequential")); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" 1 Print out the data at start of run %s\n", (printdata ? "Yes" : "No")); printf(" 2 Print indications of progress of run %s\n", (progress ? "Yes" : "No")); printf(" 3 Print out tree %s\n", (treeprint ? "Yes" : "No")); printf(" 4 Print steps & compatibility at sites %s\n", (stepbox ? 
"Yes" : "No")); printf(" 5 Print sequences at all nodes of tree %s\n", (ancseq ? "Yes" : "No")); printf(" 6 Write out trees onto tree file? %s\n", (trout ? "Yes" : "No")); if(weights && justwts){ printf( "WARNING: W option and Multiple Weights options are both on. "); printf( "The W menu option is unnecessary and has no additional effect. \n"); } printf("\nAre these settings correct? (type Y or the letter for one to change)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); if (ch == 'Y') break; if (((!usertree) && (strchr("WJOTUMI1234560", ch) != NULL)) || (usertree && ((strchr("WOTUMI1234560", ch) != NULL)))){ switch (ch) { case 'J': jumble = !jumble; if (jumble) initjumble(&inseed, &inseed0, seed, &njumble); else njumble = 1; break; case 'O': outgropt = !outgropt; if (outgropt) initoutgroup(&outgrno, spp); break; case 'U': usertree = !usertree; break; case 'M': mulsets = !mulsets; if (mulsets){ printf("Multiple data sets or multiple weights?"); loopcount2 = 0; do { printf(" (type D or W)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch2); getchar(); if (ch2 == '\n') ch2 = ' '; uppercase(&ch2); countup(&loopcount2, 10); } while ((ch2 != 'W') && (ch2 != 'D')); justwts = (ch2 == 'W'); if (justwts) justweights(&msets); else initdatasets(&msets); if (!jumble) { jumble = true; initjumble(&inseed, &inseed0, seed, &njumble); } } break; case 'I': interleaved = !interleaved; break; case 'W': weights = !weights; break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress = !progress; break; case '3': treeprint = !treeprint; break; case '4': stepbox = !stepbox; break; case '5': ancseq = !ancseq; break; case '6': trout = !trout; break; } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } } /* getoptions */ void reallocchars(void) {/* The amount of chars can change between runs this function reallocates 
all the variables whose size depends on the amount of chars */ long i; for (i = 0; i < spp; i++) { free(y[i]); y[i] = (Char *)Malloc(chars*sizeof(Char)); } free(weight); free(oldweight); free(enterorder); free(necsteps); free(alias); free(ally); free(location); free(in_tree); weight = (steptr)Malloc(chars*sizeof(long)); oldweight = (steptr)Malloc(chars*sizeof(long)); enterorder = (long *)Malloc(spp*sizeof(long)); necsteps = (steptr)Malloc(chars*sizeof(long)); alias = (steptr)Malloc(chars*sizeof(long)); ally = (steptr)Malloc(chars*sizeof(long)); location = (steptr)Malloc(chars*sizeof(long)); in_tree = (boolptr)Malloc(spp*sizeof(boolean)); /* indexed by species, as in allocrest */ } void allocrest() { long i; y = (Char **)Malloc(spp*sizeof(Char *)); for (i = 0; i < spp; i++) y[i] = (Char *)Malloc(chars*sizeof(Char)); bestrees = (bestelm *) Malloc(maxtrees*sizeof(bestelm)); for (i = 1; i <= maxtrees; i++) bestrees[i - 1].btree = (long *)Malloc(nonodes*sizeof(long)); nayme = (naym *)Malloc(spp*sizeof(naym)); weight = (steptr)Malloc(chars*sizeof(long)); oldweight = (steptr)Malloc(chars*sizeof(long)); enterorder = (long *)Malloc(spp*sizeof(long)); necsteps = (steptr)Malloc(chars*sizeof(long)); alias = (steptr)Malloc(chars*sizeof(long)); ally = (steptr)Malloc(chars*sizeof(long)); location = (steptr)Malloc(chars*sizeof(long)); place = (long *)Malloc((2*spp-1)*sizeof(long)); in_tree = (boolptr)Malloc(spp*sizeof(boolean)); } /* allocrest */ void doinit() { /* initializes variables */ inputnumbers(&spp, &chars, &nonodes, 1); getoptions(); if (printdata) fprintf(outfile, "%2ld species, %3ld sites\n", spp, chars); alloctree(&treenode, nonodes, usertree); allocrest(); } /* doinit */ void initdnacompnode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnutreenode(grbg, p, nodei, endsite, 
zeros); treenode[nodei - 1] = *p; break; case nonbottom: gnutreenode(grbg, p, nodei, endsite, zeros); break; case tip: match_names_to_data (str, treenode, p, spp); break; case length: processlength(&valyew, &divisor, ch, &minusread, intree, parens); /* process and discard lengths */ break; default: break; } } /* initdnacompnode */ void makeweights() { /* make up weights vector to avoid duplicate computations */ long i; for (i = 1; i <= chars; i++) { alias[i - 1] = i; oldweight[i - 1] = weight[i - 1]; ally[i - 1] = i; } sitesort(chars, weight); sitecombine(chars); sitescrunch(chars); endsite = 0; for (i = 1; i <= chars; i++) { if (ally[i - 1] == i) endsite++; } for (i = 1; i <= endsite; i++) location[alias[i - 1] - 1] = i; zeros = (long *)Malloc(endsite*sizeof(long)); for (i = 0; i < endsite; i++) zeros[i] = 0; } /* makeweights */ void doinput() { /* reads the input data */ long i; if (justwts) { if (firstset) inputdata(chars); for (i = 0; i < chars; i++) weight[i] = 1; inputweights(chars, weight, &weights); if (justwts) { fprintf(outfile, "\n\nWeights set # %ld:\n\n", ith); if (progress) printf("\nWeights set # %ld:\n\n", ith); } if (printdata) printweights(outfile, 0, chars, weight, "Sites"); } else { if (!firstset){ samenumsp(&chars, ith); reallocchars(); } inputdata(chars); for (i = 0; i < chars; i++) weight[i] = 1; if (weights) { inputweights(chars, weight, &weights); if (printdata) printweights(outfile, 0, chars, weight, "Sites"); } } makeweights(); makevalues(treenode, zeros, usertree); allocnode(&temp, zeros, endsite); allocnode(&temp1, zeros, endsite); } /* doinput */ void mincomp(long n) { /* computes for each site the minimum number of steps necessary to accommodate those species already in the analysis, adding in species n */ long i, j, k, l, m; bases b; long s; boolean allowable, deleted; in_tree[n - 1] = true; for (i = 0; i < endsite; i++) necsteps[i] = 3; for (m = 0; m <= 31; m++) { s = 0; l = -1; k = m; for (b = A; (long)b <= (long)O; b = (bases)((long)b + 
1)) { if ((k & 1) == 1) { s |= 1L << ((long)b); l++; } k /= 2; } for (j = 0; j < endsite; j++) { allowable = true; i = 1; while (allowable && i <= spp) { if (in_tree[i - 1] && treenode[i - 1]->base[j] != 0) { if ((treenode[i - 1]->base[j] & s) == 0) allowable = false; } i++; } if (allowable) { if (l < necsteps[j]) necsteps[j] = l; } } } for (j = 0; j < endsite; j++) { deleted = false; for (i = 0; i < spp; i++) { if (in_tree[i] && treenode[i]->base[j] == 0) deleted = true; } if (deleted) necsteps[j]++; } for (i = 0; i < endsite; i++) necsteps[i] *= weight[i]; } /* mincomp */ void evaluate(node *r) { /* determines the number of steps needed for a tree. this is the minimum number of steps needed to evolve sequences on this tree */ long i, term; double sum; sum = 0.0; for (i = 0; i < endsite; i++) { if (r->numsteps[i] == necsteps[i]) term = weight[i]; else term = 0; sum += term; if (usertree && which <= maxuser) fsteps[which - 1][i] = term; } if (usertree && which <= maxuser) { nsteps[which - 1] = sum; if (which == 1) { maxwhich = 1; maxsteps = sum; } else if (sum > maxsteps) { maxwhich = which; maxsteps = sum; } } like = sum; } /* evaluate */ void localsavetree() { /* record in place where each species has to be added to reconstruct this tree */ long i, j; node *p; boolean done; reroot(treenode[outgrno - 1], root); savetraverse(root); for (i = 0; i < nonodes; i++) place[i] = 0; place[root->index - 1] = 1; for (i = 1; i <= spp; i++) { p = treenode[i - 1]; while (place[p->index - 1] == 0) { place[p->index - 1] = i; while (!p->bottom) p = p->next; p = p->back; } if (i > 1) { place[i - 1] = place[p->index - 1]; j = place[p->index - 1]; done = false; while (!done) { place[p->index - 1] = spp + i - 1; while (!p->bottom) p = p->next; p = p->back; done = (p == NULL); if (!done) done = (place[p->index - 1] != j); } } } } /* localsavetree */ void tryadd(node *p, node *item, node *nufork) { /* temporarily adds one fork and one tip to the tree. 
if the location where they are added yields greater "likelihood" than other locations tested up to that time, then keeps that location as there */ long pos; boolean found; node *rute, *q; if (p == root) fillin(temp, item, p); else { fillin(temp1, item, p); fillin(temp, temp1, p->back); } evaluate(temp); if (lastrearr) { if (like < bestlike) { if (item == nufork->next->next->back) { q = nufork->next; nufork->next = nufork->next->next; nufork->next->next = q; q->next = nufork; } } else if (like >= bstlike2) { recompute = false; add(p, item, nufork, &root, recompute, treenode, &grbg, zeros); rute = root->next->back; localsavetree(); reroot(rute, root); if (like > bstlike2) { bestlike = bstlike2 = like; pos = 1; nextree = 1; addtree(pos, &nextree, dummy, place, bestrees); } else { pos = 0; findtree(&found, &pos, nextree, place, bestrees); if (!found) { if (nextree <= maxtrees) addtree(pos, &nextree, dummy, place, bestrees); } } re_move(item, &nufork, &root, recompute, treenode, &grbg, zeros); recompute = true; } } if (like > bestyet) { bestyet = like; there = p; } } /* tryadd */ void addpreorder(node *p, node *item, node *nufork) { /* traverses a binary tree, calling PROCEDURE tryadd at a node before calling tryadd at its descendants */ if (p == NULL) return; tryadd(p, item, nufork); if (!p->tip) { addpreorder(p->next->back, item, nufork); addpreorder(p->next->next->back, item, nufork); } } /* addpreorder */ void tryrearr(node *p, boolean *success) { /* evaluates one rearrangement of the tree. if the new tree has greater "likelihood" than the old one sets success := TRUE and keeps the new tree. 
otherwise, restores the old tree */ node *frombelow, *whereto, *forknode, *q; double oldlike; if (p->back == NULL) return; forknode = treenode[p->back->index - 1]; if (forknode->back == NULL) return; oldlike = bestyet; if (p->back->next->next == forknode) frombelow = forknode->next->next->back; else frombelow = forknode->next->back; whereto = treenode[forknode->back->index - 1]; if (whereto->next->back == forknode) q = whereto->next->next->back; else q = whereto->next->back; fillin(temp1, frombelow, q); fillin(temp, temp1, p); fillin(temp1, temp, whereto->back); evaluate(temp1); if (like <= oldlike + LIKE_EPSILON) { if (p != forknode->next->next->back) return; q = forknode->next; forknode->next = forknode->next->next; forknode->next->next = q; q->next = forknode; return; } recompute = false; re_move(p, &forknode, &root, recompute, treenode, &grbg, zeros); fillin(whereto, whereto->next->back, whereto->next->next->back); recompute = true; add(whereto, p, forknode, &root, recompute, treenode, &grbg, zeros); *success = true; bestyet = like; } /* tryrearr */ void repreorder(node *p, boolean *success) { /* traverses a binary tree, calling PROCEDURE tryrearr at a node before calling tryrearr at its descendants */ if (p == NULL) return; tryrearr(p,success); if (!p->tip) { repreorder(p->next->back,success); repreorder(p->next->next->back,success); } } /* repreorder */ void rearrange(node **r) { /* traverses the tree (preorder), finding any local rearrangement which decreases the number of steps. 
if traversal succeeds in increasing the tree's "likelihood", PROCEDURE rearrange runs traversal again */ boolean success=true; while (success) { success = false; repreorder(*r,&success); } } /* rearrange */ void describe() { /* prints ancestors, steps and table of numbers of steps in each site and table of compatibilities */ long i, j, k; if (treeprint) { fprintf(outfile, "\ntotal number of compatible sites is "); fprintf(outfile, "%10.1f\n", like); } if (stepbox) { writesteps(chars, weights, oldweight, root); fprintf(outfile, "\n compatibility (Y or N) of each site with this tree:\n\n"); fprintf(outfile, " "); for (i = 0; i <= 9; i++) fprintf(outfile, "%ld", i); fprintf(outfile, "\n *----------\n"); for (i = 0; i <= (chars / 10); i++) { putc(' ', outfile); fprintf(outfile, "%3ld !", i * 10); for (j = 0; j <= 9; j++) { k = i * 10 + j; if (k > 0 && k <= chars) { if (root->numsteps[location[ally[k - 1] - 1] - 1] == necsteps[location[ally[k - 1] - 1] - 1]) { if (oldweight[k - 1] > 0) putc('Y', outfile); else putc('y', outfile); } else { if (oldweight[k - 1] > 0) putc('N', outfile); else putc('n', outfile); } } else putc(' ', outfile); } putc('\n', outfile); } } if (ancseq) { hypstates(chars, root, treenode, &garbage, basechar); putc('\n', outfile); } putc('\n', outfile); if (trout) { col = 0; treeout(root, nextree, &col, root); } } /* describe */ void initboolnames(node *p, boolean *names) { /* sets BOOLEANs that indicate tips */ node *q; if (p->tip) { names[p->index - 1] = true; return; } q = p->next; while (q != p) { initboolnames(q->back, names); q = q->next; } } /* initboolnames */ void standev3(long chars, long numtrees, long maxwhich, double maxsteps, double *nsteps, long **fsteps, longer seed) { /* do paired sites test (KHT or SH) on user-defined trees */ long i, j, k; double wt, sumw, sum, sum2, sd; double temp; double **covar, *P, *f, *r; #define SAMPLES 1000 if (numtrees == 2) { fprintf(outfile, "Kishino-Hasegawa-Templeton test\n\n"); fprintf(outfile, "Tree 
Compatible Difference Its S.D."); fprintf(outfile, " Significantly worse?\n\n"); which = 1; while (which <= numtrees) { fprintf(outfile, "%3ld %11.1f", which, nsteps[which - 1]); if (maxwhich == which) fprintf(outfile, " <------ best\n"); else { sumw = 0.0; sum = 0.0; sum2 = 0.0; for (i = 0; i < chars; i++) { if (weight[i] > 0) { wt = weight[i]; sumw += wt; temp = (fsteps[maxwhich - 1][i] - fsteps[which - 1][i]); sum += temp; sum2 += temp * temp / wt; } } sd = sqrt(sumw / (sumw - 1.0) * (sum2 - sum * sum /sumw)); fprintf(outfile, " %10.1f %11.4f", (maxsteps-nsteps[which - 1]), sd); if (sum > 1.95996 * sd) fprintf(outfile, " Yes\n"); else fprintf(outfile, " No\n"); } which++; } fprintf(outfile, "\n\n"); } else { /* Shimodaira-Hasegawa test using normal approximation */ if(numtrees > MAXSHIMOTREES){ fprintf(outfile, "Shimodaira-Hasegawa test on first %d of %ld trees\n\n" , MAXSHIMOTREES, numtrees); numtrees = MAXSHIMOTREES; } else { fprintf(outfile, "Shimodaira-Hasegawa test\n\n"); } covar = (double **)Malloc(numtrees*sizeof(double *)); sumw = 0.0; for (i = 0; i < chars; i++) sumw += weight[i]; for (i = 0; i < numtrees; i++) covar[i] = (double *)Malloc(numtrees*sizeof(double)); for (i = 0; i < numtrees; i++) { /* compute covariances of trees */ sum = nsteps[i]/sumw; for (j = 0; j <=i; j++) { sum2 = nsteps[j]/sumw; temp = 0.0; for (k = 0; k < chars; k++) { if (weight[k] > 0) { wt = weight[k]; temp = temp + (fsteps[i][k]- wt * sum) * (fsteps[j][k]- wt * sum2) / wt; } } covar[i][j] = temp; if (i != j) covar[j][i] = temp; } } for (i = 0; i < numtrees; i++) { /* in-place Cholesky decomposition of trees x trees covariance matrix */ sum = 0.0; for (j = 0; j <= i-1; j++) sum = sum + covar[i][j] * covar[i][j]; if (covar[i][i] <= sum) /* guard against a negative argument to sqrt */ temp = 0.0; else temp = sqrt(covar[i][i] - sum); covar[i][i] = temp; for (j = i+1; j < numtrees; j++) { sum = 0.0; for (k = 0; k < i; k++) sum = sum + covar[i][k] * covar[j][k]; if (fabs(temp) < 1.0E-12) covar[j][i] = 0.0; else covar[j][i] 
= (covar[j][i] - sum)/temp; } } f = (double *)Malloc(numtrees*sizeof(double)); /* resamples sums */ P = (double *)Malloc(numtrees*sizeof(double)); /* vector of P's of trees */ r = (double *)Malloc(numtrees*sizeof(double)); /* store normal variates */ for (i = 0; i < numtrees; i++) P[i] = 0.0; sum2 = nsteps[0]; /* sum2 will be largest # of compat. sites */ for (i = 1; i < numtrees; i++) if (sum2 < nsteps[i]) sum2 = nsteps[i]; for (i = 1; i <= SAMPLES; i++) { /* loop over resampled trees */ for (j = 0; j < numtrees; j++) /* draw Normal variates */ r[j] = normrand(seed); for (j = 0; j < numtrees; j++) { /* compute vectors */ sum = 0.0; for (k = 0; k <= j; k++) sum += covar[j][k]*r[k]; f[j] = sum; } sum = f[0]; for (j = 1; j < numtrees; j++) /* get max of vector */ if (f[j] > sum) sum = f[j]; for (j = 0; j < numtrees; j++) /* accumulate P's */ if (sum2-nsteps[j] <= sum-f[j]) P[j] += 1.0/SAMPLES; } fprintf(outfile, "Tree Compatible Difference P value"); fprintf(outfile, " Significantly worse?\n\n"); for (i = 0; i < numtrees; i++) { fprintf(outfile, "%3ld %10.1f", i+1, nsteps[i]); if ((maxwhich-1) == i) fprintf(outfile, " <------ best\n"); else { fprintf(outfile, " %10.1f %10.3f", sum2-nsteps[i], P[i]); if (P[i] < 0.05) fprintf(outfile, " Yes\n"); else fprintf(outfile, " No\n"); } } fprintf(outfile, "\n"); free(P); /* free the variables we Malloc'ed */ free(f); free(r); for (i = 0; i < numtrees; i++) free(covar[i]); free(covar); } } /* standev3 */ void maketree() { /* constructs a binary tree from the pointers in treenode. 
adds each node at location which yields highest "likelihood" then rearranges the tree for greatest "likelihood" */ long i, j, numtrees, nextnode; boolean firsttree, goteof, haslengths; double gotlike; node *item, *nufork, *dummy; pointarray nodep; boolean *names; if (!usertree) { recompute = true; for (i = 0; i < spp; i++) in_tree[i] = false; for (i = 1; i <= spp; i++) enterorder[i - 1] = i; if (jumble) randumize(seed, enterorder); root = treenode[enterorder[0] - 1]; add(treenode[enterorder[0] - 1], treenode[enterorder[1] - 1], treenode[spp], &root, recompute, treenode, &grbg, zeros); if (progress) { printf("Adding species:\n"); writename(0, 2, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } in_tree[0] = true; in_tree[1] = true; lastrearr = false; for (i = 3; i <= spp; i++) { mincomp(i); bestyet = -350.0 * spp * chars; item = treenode[enterorder[i - 1] - 1]; nufork = treenode[spp + i - 2]; there = root; addpreorder(root, item, nufork); add(there, item, nufork, &root, recompute, treenode, &grbg, zeros); like = bestyet; rearrange(&root); if (progress) { writename(i - 1, 1, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } lastrearr = (i == spp); if (lastrearr) { if (progress) { printf("\nDoing global rearrangements\n"); printf(" !"); for (j = 1; j <= nonodes; j++) if ( j % (( nonodes / 72 ) + 1 ) == 0 ) putchar('-'); printf("!\n"); #ifdef WIN32 phyFillScreenColor(); #endif } bestlike = bestyet; if (jumb == 1) { bstlike2 = bestlike; nextree = 1; } do { if (progress) printf(" "); gotlike = bestlike; for (j = 0; j < nonodes; j++) { bestyet = -10.0 * spp * chars; item = treenode[j]; there = root; if (item != root) { re_move(item, &nufork, &root, recompute, treenode, &grbg, zeros); there = root; addpreorder(root, item, nufork); add(there, item, nufork, &root, recompute, treenode, &grbg, zeros); } if (progress) { if ( j % (( nonodes / 72 ) + 1 ) == 0 ) putchar('.'); fflush(stdout); } } if (progress) putchar('\n'); } while (bestlike > gotlike); } } if 
(progress) putchar('\n'); for (i = spp - 1; i >= 1; i--) re_move(treenode[i], &dummy, &root, recompute, treenode, &grbg, zeros); if (jumb == njumble) { if (treeprint) { putc('\n', outfile); if (nextree == 2) fprintf(outfile, "One most parsimonious tree found:\n"); else fprintf(outfile, "%6ld trees in all found\n", nextree - 1); } if (nextree > maxtrees + 1) { if (treeprint) fprintf(outfile, "here are the first%4ld of them\n", (long)maxtrees); nextree = maxtrees + 1; } if (treeprint) putc('\n', outfile); recompute = false; for (i = 0; i <= (nextree - 2); i++) { root = treenode[0]; add(treenode[0], treenode[1], treenode[spp], &root, recompute, treenode, &grbg, zeros); for (j = 3; j <= spp; j++) add(treenode[bestrees[i].btree[j - 1] - 1], treenode[j - 1], treenode[spp + j - 2], &root, recompute, treenode, &grbg, zeros); reroot(treenode[outgrno - 1], root); postorder(root); evaluate(root); printree(root, 1.0); describe(); for (j = 1; j < spp; j++) re_move(treenode[j], &dummy, &root, recompute, treenode, &grbg, zeros); } } } else { /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree, INTREE, "input tree file", "rb", progname, intreename); numtrees = countsemic(&intree); if (numtrees > 2) initseed(&inseed, &inseed0, seed); if (treeprint) { fprintf(outfile, "User-defined tree"); if (numtrees > 1) putc('s', outfile); fprintf(outfile, ":\n"); } fsteps = (long **)Malloc(maxuser*sizeof(long *)); for (j = 1; j <= maxuser; j++) fsteps[j - 1] = (long *)Malloc(endsite*sizeof(long)); names = (boolean *)Malloc(spp*sizeof(boolean)); nodep = NULL; maxsteps = 0.0; which = 1; while (which <= numtrees) { firsttree = true; nextnode = 0; haslengths = true; treeread(intree, &root, treenode, &goteof, &firsttree, nodep, &nextnode, &haslengths, &grbg, initdnacompnode,false,nonodes); for (j = 0; j < spp; j++) names[j] = false; initboolnames(root, names); for (j = 0; j < spp; j++) in_tree[j] = names[j]; j = 1; while (!in_tree[j - 1]) j++; mincomp(j); if 
(outgropt) reroot(treenode[outgrno - 1], root); postorder(root); evaluate(root); printree(root, 1.0); describe(); which++; } FClose(intree); putc('\n', outfile); if (numtrees > 1 && chars > 1 ) { standev3(chars, numtrees, maxwhich, maxsteps, nsteps, fsteps, seed); } for (j = 1; j <= maxuser; j++) free(fsteps[j - 1]); free(fsteps); free(names); } if (jumb == njumble) { if (progress) { printf("\nOutput written to file \"%s\"\n", outfilename); if (trout) printf("\nTrees also written onto file \"%s\"\n", outtreename); putchar('\n'); } } } /* maketree */ void freerest() { if (!usertree) { freenode(&temp); freenode(&temp1); } freegrbg(&grbg); if (ancseq) freegarbage(&garbage); free(zeros); freenodes(nonodes, treenode); } /* freerest */ int main(int argc, Char *argv[]) { /* DNA compatibility by uphill search */ /* reads in spp, chars, and the data. Then calls maketree to construct the tree */ #ifdef MAC argc = 1; /* macsetup("Dnacomp",""); */ argv[0]="Dnacomp"; #endif init(argc, argv); openfile(&infile,INFILE,"input file", "r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename); mulsets = false; garbage = NULL; grbg = NULL; ibmpc = IBMCRT; ansi = ANSICRT; msets = 1; firstset = true; doinit(); if (weights || justwts) openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename); if (trout) openfile(&outtree,OUTTREE,"output tree file", "w",argv[0],outtreename); for (ith = 1; ith <= msets; ith++) { doinput(); if (ith == 1) firstset = false; if (msets > 1 && !justwts) { fprintf(outfile, "Data set # %ld:\n\n", ith); if (progress) printf("Data set # %ld:\n\n", ith); } for (jumb = 1; jumb <= njumble; jumb++) maketree(); freerest(); } freetree(nonodes, treenode); FClose(infile); FClose(outfile); FClose(outtree); #ifdef MAC fixmacfile(outfilename); fixmacfile(outtreename); #endif #ifdef WIN32 phyRestoreConsoleAttributes(); #endif exxit(0); return 0; } /* DNA compatibility by uphill search */ 
phylip-3.697/src/dnadist.c0000644004732000473200000011154112406201116015122 0ustar joefelsenst_g#include "phylip.h" #include "seq.h" /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ #define iterationsd 100 /* number of iterates of EM for each distance */ typedef struct valrec { double rat, ratxv, z1, y1, z1zz, z1yy, z1xv; } valrec; Char infilename[FNMLNGTH], outfilename[FNMLNGTH], catfilename[FNMLNGTH], weightfilename[FNMLNGTH]; long sites, categs, weightsum, datasets, ith, rcategs; boolean freqsfrom, jukes, kimura, logdet, gama, invar, similarity, lower, f84, weights, progress, ctgry, mulsets, justwts, firstset, baddists; boolean matrix_flags; /* Matrix output format */ node **nodep; double xi, xv, ttratio, ttratio0, freqa, freqc, freqg, freqt, freqr, freqy, freqar, freqcy, freqgr, freqty, cvi, invarfrac, sumrates, fracchange; steptr oldweight; double rate[maxcategs]; double **d; double sumweightrat; /* these values were propagated */ double *weightrat; /* to global values from */ valrec tbl[maxcategs]; /* function makedists. */ #ifndef OLDC /* function prototypes */ void getoptions(void); void allocrest(void); void reallocsites(void); void doinit(void); void inputcategories(void); void printcategories(void); void inputoptions(void); void dnadist_sitesort(void); void dnadist_sitecombine(void); void dnadist_sitescrunch(void); void makeweights(void); void dnadist_makevalues(void); void dnadist_empiricalfreqs(void); void getinput(void); void inittable(void); double lndet(double (*a)[4]); void makev(long, long, double *); void makedists(void); void writedists(void); /* function prototypes */ #endif void getoptions() { /* interactively set options */ long loopcount, loopcount2; Char ch, ch2; boolean ttr; char *str = NULL; ctgry = false; categs = 1; cvi = 1.0; rcategs = 1; rate[0] = 1.0; freqsfrom = true; gama = false; invar = false; invarfrac = 0.0; jukes = false; justwts = false; kimura = false; logdet = false; f84 = true; lower = false; matrix_flags = MAT_MACHINE; similarity = false; ttratio = 2.0; ttr = false; weights = false; printdata = false; dotdiff = true; progress = true; interleaved = true; loopcount = 0; for (;;) { cleerhome(); 
    printf("\nNucleic acid sequence Distance Matrix program,");
    printf(" version %s\n\n",VERSION);
    printf("Settings for this run:\n");
    printf(" D Distance (F84, Kimura, Jukes-Cantor, LogDet)? %s\n",
           kimura ? "Kimura 2-parameter" :
           jukes ? "Jukes-Cantor" :
           logdet ? "LogDet" :
           similarity ? "Similarity table" : "F84");
    if (kimura || f84 || jukes) {
      printf(" G Gamma distributed rates across sites? ");
      if (gama)
        printf("Yes\n");
      else {
        if (invar)
          printf("Gamma+Invariant\n");
        else
          printf("No\n");
      }
    }
    if (kimura || f84) {
      printf(" T Transition/transversion ratio?");
      if (!ttr)
        printf(" 2.0\n");
      else
        printf("%8.4f\n", ttratio);
    }
    if (!logdet && !similarity && !gama && !invar) {
      printf(" C One category of substitution rates?");
      if (!ctgry || categs == 1)
        printf(" Yes\n");
      else
        printf(" %ld categories\n", categs);
    }
    printf(" W Use weights for sites?");
    if (weights)
      printf(" Yes\n");
    else
      printf(" No\n");
    if (f84)
      printf(" F Use empirical base frequencies? %s\n",
             (freqsfrom ? "Yes" : "No"));
    printf(" L Form of distance matrix? ");
    switch (matrix_flags) {
      case MAT_MACHINE:
        str = "Square";
        break;
      case MAT_LOWERTRI:
        str = "Lower-triangular";
        break;
      case MAT_HUMAN:
        str = "Human-readable";
        break;
      default: /* shouldn't happen */
        assert( 0 );
        str = "(unknown)";
    }
    puts(str);
    printf(" M Analyze multiple data sets?");
    if (mulsets)
      printf(" Yes, %2ld %s\n", datasets,
             (justwts ? "sets of weights" : "data sets"));
    else
      printf(" No\n");
    printf(" I Input sequences interleaved? %s\n",
           (interleaved ? "Yes" : "No, sequential"));
    printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n",
           ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)");
    printf(" 1 Print out the data at start of run %s\n",
           (printdata ? "Yes" : "No"));
    printf(" 2 Print indications of progress of run %s\n",
           (progress ? "Yes" : "No"));
    printf("\n Y to accept these or type the letter for one to change\n");
#ifdef WIN32
    phyFillScreenColor();
#endif
    fflush(stdout);
    scanf("%c%*[^\n]", &ch);
    getchar();
    uppercase(&ch);
    if (ch == 'Y')
      break;
    if ((f84 && (strchr("CFGWLDTMI012",ch) != NULL)) ||
        (kimura && (strchr("CGWLDTMI012",ch) != NULL)) ||
        (jukes && (strchr("CGWLDMI012",ch) != NULL)) ||
        ((logdet || similarity) && (strchr("WLDMI012",ch) != NULL)) ||
        (ctgry && (strchr("CFWLDTMI012",ch) != NULL))) {
      switch (ch) {

      case 'D':
        if (kimura) {
          kimura = false;
          jukes = true;
          freqsfrom = false;
        } else if (f84) {
          f84 = false;
          kimura = true;
          freqsfrom = false;
        } else if (logdet) {
          logdet = false;
          similarity = true;
        } else if (similarity) {
          similarity = false;
          f84 = true;
          freqsfrom = true;
        } else {
          jukes = false;
          logdet = true;
          freqsfrom = false;
        }
        break;

      case 'G':
        if (!(gama || invar))
          gama = true;
        else {
          if (gama) {
            gama = false;
            invar = true;
          } else {
            if (invar)
              invar = false;
          }
        }
        break;

      case 'C':
        ctgry = !ctgry;
        if (ctgry) {
          initcatn(&categs);
          initcategs(categs, rate);
        }
        break;

      case 'F':
        freqsfrom = !freqsfrom;
        if (!freqsfrom)
          initfreqs(&freqa, &freqc, &freqg, &freqt);
        break;

      case 'W':
        weights = !weights;
        break;

      case 'L':
        /* square -> lower-triangular -> human-readable -> square */
        switch ( matrix_flags ) {
          case MAT_HUMAN:
            matrix_flags = MAT_MACHINE;
            break;
          case MAT_LOWERTRI:
            matrix_flags = MAT_HUMAN;
            break;
          case MAT_MACHINE:
            matrix_flags = MAT_LOWERTRI;
            break;
          default:
            assert(0);
            matrix_flags = MAT_HUMAN;
        }
        break;

      case 'T':
        ttr = !ttr;
        if (ttr)
          initratio(&ttratio);
        break;

      case 'M':
        mulsets = !mulsets;
        if (mulsets) {
          printf("Multiple data sets or multiple weights?");
          loopcount2 = 0;
          do {
            printf(" (type D or W)\n");
#ifdef WIN32
            phyFillScreenColor();
#endif
            fflush(stdout);
            scanf("%c%*[^\n]", &ch2);
            uppercase(&ch2);
            getchar();
            countup(&loopcount2, 10);
          } while ((ch2 != 'W') && (ch2 != 'D'));
          justwts = (ch2 == 'W');
          if (justwts)
            justweights(&datasets);
          else
            initdatasets(&datasets);
        }
        break;

      case 'I':
interleaved = !interleaved; break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress = !progress; break; } } else { if (strchr("CFGWLDTMI012",ch) == NULL) printf("Not a possible option!\n"); else printf("That option not allowed with these settings\n"); printf("\nPress Enter or Return key to continue\n"); getchar(); } countup(&loopcount, 100); } /* Prevent similarity matrices from easily being used as input to other * programs */ if (similarity) { if (matrix_flags == MAT_MACHINE) matrix_flags = MAT_HUMAN; else if (matrix_flags == MAT_LOWERTRI) matrix_flags = MAT_LOWER | MAT_HUMAN; } if (gama || invar) { loopcount = 0; do { printf( "\nCoefficient of variation of substitution rate among sites (must be positive)\n"); printf( " In gamma distribution parameters, this is 1/(square root of alpha)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &cvi); getchar(); countup(&loopcount, 10); } while (cvi <= 0.0); cvi = 1.0 / (cvi * cvi); } if (invar) { loopcount = 0; do { printf("Fraction of invariant sites?\n"); fflush(stdout); scanf("%lf%*[^\n]", &invarfrac); getchar(); countup (&loopcount, 10); } while ((invarfrac <= 0.0) || (invarfrac >= 1.0)); } if (!printdata) return; fprintf(outfile, "\nNucleic acid sequence Distance Matrix program,"); fprintf(outfile, " version %s\n\n",VERSION); } /* getoptions */ void allocrest() { long i; y = (Char **)Malloc(spp*sizeof(Char *)); nodep = (node **)Malloc(spp*sizeof(node *)); for (i = 0; i < spp; i++) { y[i] = (Char *)Malloc(sites*sizeof(Char)); nodep[i] = (node *)Malloc(sizeof(node)); } d = (double **)Malloc(spp*sizeof(double *)); for (i = 0; i < spp; i++) d[i] = (double*)Malloc(spp*sizeof(double)); nayme = (naym *)Malloc(spp*sizeof(naym)); category = (steptr)Malloc(sites*sizeof(long)); oldweight = (steptr)Malloc(sites*sizeof(long)); weight = (steptr)Malloc(sites*sizeof(long)); alias = (steptr)Malloc(sites*sizeof(long)); ally = 
         (steptr)Malloc(sites*sizeof(long));
  location = (steptr)Malloc(sites*sizeof(long));
  weightrat = (double *)Malloc(sites*sizeof(double));
}  /* allocrest */


void reallocsites()
{ /* The number of sites can change between runs; this function
     reallocates all the variables whose size depends on the
     number of sites */
  long i;

  for (i = 0; i < spp; i++) {
    free(y[i]);
    y[i] = (Char *)Malloc(sites*sizeof(Char));
  }
  free(category);
  free(oldweight);
  free(weight);
  free(alias);
  free(ally);
  free(location);
  free(weightrat);

  category = (steptr)Malloc(sites*sizeof(long));
  oldweight = (steptr)Malloc(sites*sizeof(long));
  weight = (steptr)Malloc(sites*sizeof(long));
  alias = (steptr)Malloc(sites*sizeof(long));
  ally = (steptr)Malloc(sites*sizeof(long));
  location = (steptr)Malloc(sites*sizeof(long));
  weightrat = (double *)Malloc(sites*sizeof(double));
}  /* reallocsites */


void doinit()
{
  /* initializes variables */
  inputnumbers(&spp, &sites, &nonodes, 1);
  getoptions();
  if (printdata)
    fprintf(outfile, "%2ld species, %3ld sites\n", spp, sites);
  allocrest();
}  /* doinit */


void inputcategories()
{
  /* reads the categories for each site */
  long i;
  Char ch;

  for (i = 1; i < nmlngth; i++)
    gettc(infile);
  for (i = 0; i < sites; i++) {
    do {
      if (eoln(infile))
        scan_eoln(infile);
      ch = gettc(infile);
    } while (ch == ' ');
    category[i] = ch - '0';
  }
  scan_eoln(infile);
}  /* inputcategories */


void printcategories()
{
  /* print out list of categories of sites */
  long i, j;

  fprintf(outfile, "Rate categories\n\n");
  for (i = 1; i <= nmlngth + 3; i++)
    putc(' ', outfile);
  for (i = 1; i <= sites; i++) {
    fprintf(outfile, "%ld", category[i - 1]);
    if (i % 60 == 0) {
      putc('\n', outfile);
      for (j = 1; j <= nmlngth + 3; j++)
        putc(' ', outfile);
    } else if (i % 10 == 0)
      putc(' ', outfile);
  }
  fprintf(outfile, "\n\n");
}  /* printcategories */


void inputoptions()
{
  /* read options information */
  long i;

  if (!firstset && !justwts) {
    samenumsp(&sites, ith);
    reallocsites();
  }
  for (i = 0; i < sites; i++) {
    category[i] = 1;
    oldweight[i]
= 1; } if (justwts || weights) inputweights(sites, oldweight, &weights); if (printdata) putc('\n', outfile); if (jukes && printdata) fprintf(outfile, " Jukes-Cantor Distance\n"); if (kimura && printdata) fprintf(outfile, " Kimura 2-parameter Distance\n"); if (f84 && printdata) fprintf(outfile, " F84 Distance\n"); if (similarity) fprintf(outfile, " \n Table of similarity between sequences\n"); if (firstset && printdata && (kimura || f84)) fprintf(outfile, "\nTransition/transversion ratio = %10.6f\n", ttratio); if (ctgry && categs > 1) { inputcategs(0, sites, category, categs, "DnaDist"); if (printdata) printcategs(outfile, sites, category, "Site categories"); } else if (printdata && (categs > 1)) { fprintf(outfile, "\nSite category Rate of change\n\n"); for (i = 1; i <= categs; i++) fprintf(outfile, "%12ld%13.3f\n", i, rate[i - 1]); putc('\n', outfile); printcategories(); } if ((jukes || kimura || logdet) && freqsfrom) { printf(" WARNING: CANNOT USE EMPIRICAL BASE FREQUENCIES"); printf(" WITH JUKES-CANTOR, KIMURA, JIN/NEI OR LOGDET DISTANCES\n"); exxit(-1); } if (jukes) ttratio = 0.5000001; if (weights && printdata) printweights(outfile, 0, sites, oldweight, "Sites"); } /* inputoptions */ void dnadist_sitesort() { /* Shell sort of sites lexicographically */ long gap, i, j, jj, jg, k, itemp; boolean flip, tied; gap = sites / 2; while (gap > 0) { for (i = gap + 1; i <= sites; i++) { j = i - gap; flip = true; while (j > 0 && flip) { jj = alias[j - 1]; jg = alias[j + gap - 1]; tied = (oldweight[jj - 1] == oldweight[jg - 1]); flip = (oldweight[jj - 1] < oldweight[jg - 1] || (tied && category[jj - 1] > category[jg - 1])); tied = (tied && category[jj - 1] == category[jg - 1]); k = 1; while (k <= spp && tied) { flip = (y[k - 1][jj - 1] > y[k - 1][jg - 1]); tied = (tied && y[k - 1][jj - 1] == y[k - 1][jg - 1]); k++; } if (!flip) break; itemp = alias[j - 1]; alias[j - 1] = alias[j + gap - 1]; alias[j + gap - 1] = itemp; j -= gap; } } gap /= 2; } } /* dnadist_sitesort */ void 
dnadist_sitecombine() { /* combine sites that have identical patterns */ long i, j, k; boolean tied; i = 1; while (i < sites) { j = i + 1; tied = true; while (j <= sites && tied) { tied = (category[alias[i - 1] - 1] == category[alias[j - 1] - 1]); k = 1; while (k <= spp && tied) { tied = (tied && y[k - 1][alias[i - 1] - 1] == y[k - 1][alias[j - 1] - 1]); k++; } if (!tied) break; ally[alias[j - 1] - 1] = alias[i - 1]; j++; } i = j; } } /* dnadist_sitecombine */ void dnadist_sitescrunch() { /* move so one representative of each pattern of sites comes first */ long i, j, itemp; boolean done, found, completed; done = false; i = 1; j = 2; while (!done) { if (ally[alias[i - 1] - 1] != alias[i - 1]) { if (j <= i) j = i + 1; if (j <= sites) { do { found = (ally[alias[j - 1] - 1] == alias[j - 1]); j++; completed = (j > sites); } while (!(found || completed)); if (found) { j--; itemp = alias[i - 1]; alias[i - 1] = alias[j - 1]; alias[j - 1] = itemp; } else done = true; } else done = true; } i++; done = (done || i >= sites); } } /* dnadist_sitescrunch */ void makeweights() { /* make up weights vector to avoid duplicate computations */ long i; for (i = 1; i <= sites; i++) { alias[i - 1] = i; ally[i - 1] = i; location[i - 1] = 0; weight[i - 1] = 0; } dnadist_sitesort(); dnadist_sitecombine(); dnadist_sitescrunch(); endsite = 0; for (i = 1; i <= sites; i++) { if (ally[i - 1] == i) endsite++; } for (i = 1; i <= endsite; i++) location[alias[i - 1] - 1] = i; weightsum = 0; for (i = 0; i < sites; i++) weightsum += oldweight[i]; sumrates = 0.0; for (i = 0; i < sites; i++) sumrates += oldweight[i] * rate[category[i] - 1]; for (i = 0; i < categs; i++) rate[i] *= weightsum / sumrates; for (i = 0; i < sites; i++) if (location[ally[i]-1] > 0) weight[location[ally[i] - 1] - 1] += oldweight[i]; } /* makeweights */ void dnadist_makevalues() { /* set up fractional likelihoods at tips */ long i, j, k; bases b; for (i = 0; i < spp; i++) { nodep[i]->x = 
(phenotype)Malloc(endsite*sizeof(ratelike)); for (j = 0; j < endsite; j++) nodep[i]->x[j] = (ratelike)Malloc(rcategs*sizeof(sitelike)); } for (k = 0; k < endsite; k++) { j = alias[k]; for (i = 0; i < spp; i++) { for (b = A; (long)b <= (long)T; b = (bases)((long)b + 1)) nodep[i]->x[k][0][(long)b - (long)A] = 0.0; switch (y[i][j - 1]) { case 'A': nodep[i]->x[k][0][0] = 1.0; break; case 'C': nodep[i]->x[k][0][(long)C - (long)A] = 1.0; break; case 'G': nodep[i]->x[k][0][(long)G - (long)A] = 1.0; break; case 'T': nodep[i]->x[k][0][(long)T - (long)A] = 1.0; break; case 'U': nodep[i]->x[k][0][(long)T - (long)A] = 1.0; break; case 'M': nodep[i]->x[k][0][0] = 1.0; nodep[i]->x[k][0][(long)C - (long)A] = 1.0; break; case 'R': nodep[i]->x[k][0][0] = 1.0; nodep[i]->x[k][0][(long)G - (long)A] = 1.0; break; case 'W': nodep[i]->x[k][0][0] = 1.0; nodep[i]->x[k][0][(long)T - (long)A] = 1.0; break; case 'S': nodep[i]->x[k][0][(long)C - (long)A] = 1.0; nodep[i]->x[k][0][(long)G - (long)A] = 1.0; break; case 'Y': nodep[i]->x[k][0][(long)C - (long)A] = 1.0; nodep[i]->x[k][0][(long)T - (long)A] = 1.0; break; case 'K': nodep[i]->x[k][0][(long)G - (long)A] = 1.0; nodep[i]->x[k][0][(long)T - (long)A] = 1.0; break; case 'B': nodep[i]->x[k][0][(long)C - (long)A] = 1.0; nodep[i]->x[k][0][(long)G - (long)A] = 1.0; nodep[i]->x[k][0][(long)T - (long)A] = 1.0; break; case 'D': nodep[i]->x[k][0][0] = 1.0; nodep[i]->x[k][0][(long)G - (long)A] = 1.0; nodep[i]->x[k][0][(long)T - (long)A] = 1.0; break; case 'H': nodep[i]->x[k][0][0] = 1.0; nodep[i]->x[k][0][(long)C - (long)A] = 1.0; nodep[i]->x[k][0][(long)T - (long)A] = 1.0; break; case 'V': nodep[i]->x[k][0][0] = 1.0; nodep[i]->x[k][0][(long)C - (long)A] = 1.0; nodep[i]->x[k][0][(long)G - (long)A] = 1.0; break; case 'N': for (b = A; (long)b <= (long)T; b = (bases)((long)b + 1)) nodep[i]->x[k][0][(long)b - (long)A] = 1.0; break; case 'X': for (b = A; (long)b <= (long)T; b = (bases)((long)b + 1)) nodep[i]->x[k][0][(long)b - (long)A] = 1.0; break; case 
'?': for (b = A; (long)b <= (long)T; b = (bases)((long)b + 1)) nodep[i]->x[k][0][(long)b - (long)A] = 1.0; break; case 'O': for (b = A; (long)b <= (long)T; b = (bases)((long)b + 1)) nodep[i]->x[k][0][(long)b - (long)A] = 1.0; break; case '-': for (b = A; (long)b <= (long)T; b = (bases)((long)b + 1)) nodep[i]->x[k][0][(long)b - (long)A] = 1.0; break; } } } } /* dnadist_makevalues */ void dnadist_empiricalfreqs() { /* Get empirical base frequencies from the data */ long i, j, k; double sum, suma, sumc, sumg, sumt, w; freqa = 0.25; freqc = 0.25; freqg = 0.25; freqt = 0.25; for (k = 1; k <= 8; k++) { suma = 0.0; sumc = 0.0; sumg = 0.0; sumt = 0.0; for (i = 0; i < spp; i++) { for (j = 0; j < endsite; j++) { w = weight[j]; sum = freqa * nodep[i]->x[j][0][0]; sum += freqc * nodep[i]->x[j][0][(long)C - (long)A]; sum += freqg * nodep[i]->x[j][0][(long)G - (long)A]; sum += freqt * nodep[i]->x[j][0][(long)T - (long)A]; suma += w * freqa * nodep[i]->x[j][0][0] / sum; sumc += w * freqc * nodep[i]->x[j][0][(long)C - (long)A] / sum; sumg += w * freqg * nodep[i]->x[j][0][(long)G - (long)A] / sum; sumt += w * freqt * nodep[i]->x[j][0][(long)T - (long)A] / sum; } } sum = suma + sumc + sumg + sumt; freqa = suma / sum; freqc = sumc / sum; freqg = sumg / sum; freqt = sumt / sum; } } /* dnadist_empiricalfreqs */ void getinput() { /* reads the input data */ inputoptions(); if ((!freqsfrom) && !logdet && !similarity) { if (kimura || jukes) { freqa = 0.25; freqc = 0.25; freqg = 0.25; freqt = 0.25; } getbasefreqs(freqa, freqc, freqg, freqt, &freqr, &freqy, &freqar, &freqcy, &freqgr, &freqty, &ttratio, &xi, &xv, &fracchange, freqsfrom, printdata); if (freqa < 0.00000001) { freqa = 0.000001; freqc = 0.999999*freqc; freqg = 0.999999*freqg; freqt = 0.999999*freqt; } if (freqc < 0.00000001) { freqa = 0.999999*freqa; freqc = 0.000001; freqg = 0.999999*freqg; freqt = 0.999999*freqt; } if (freqg < 0.00000001) { freqa = 0.999999*freqa; freqc = 0.999999*freqc; freqg = 0.000001; freqt = 
0.999999*freqt; } if (freqt < 0.00000001) { freqa = 0.999999*freqa; freqc = 0.999999*freqc; freqg = 0.999999*freqg; freqt = 0.000001; } } if (!justwts || firstset) inputdata(sites); makeweights(); dnadist_makevalues(); if (freqsfrom) { dnadist_empiricalfreqs(); getbasefreqs(freqa, freqc, freqg, freqt, &freqr, &freqy, &freqar, &freqcy, &freqgr, &freqty, &ttratio, &xi, &xv, &fracchange, freqsfrom, printdata); } } /* getinput */ void inittable() { /* Define a lookup table. Precompute values and store in a table */ long i; for (i = 0; i < categs; i++) { tbl[i].rat = rate[i]; tbl[i].ratxv = rate[i] * xv; } } /* inittable */ double lndet(double (*a)[4]) { long i, j, k; double temp, ld; /*Gauss-Jordan reduction -- invert matrix a in place, overwriting previous contents of a. On exit, matrix a contains the inverse, lndet contains the log of the determinant */ ld = 1.0; for (i = 0; i < 4; i++) { ld *= a[i][i]; temp = 1.0 / a[i][i]; a[i][i] = 1.0; for (j = 0; j < 4; j++) a[i][j] *= temp; for (j = 0; j < 4; j++) { if (j != i) { temp = a[j][i]; a[j][i] = 0.0; for (k = 0; k < 4; k++) a[j][k] -= temp * a[i][k]; } } } if (ld <= 0.0) return(99.0); else return(log(ld)); } /* lndet */ void makev(long m, long n, double *v) { /* compute one distance */ long i, j, k, l, it, num1, num2, idx; long numerator = 0, denominator = 0; double sum, sum1, sum2, sumyr, lz, aa, bb, cc, vv=0, p1, p2, p3, q1, q2, q3, tt, delta = 0, slope, xx1freqa, xx1freqc, xx1freqg, xx1freqt; double *prod, *prod2, *prod3; boolean quick, jukesquick, kimquick, logdetquick, overlap; bases b; node *p, *q; sitelike xx1, xx2; double basetable[4][4]; /* for quick logdet */ double basefreq1[4], basefreq2[4]; p = nodep[m - 1]; q = nodep[n - 1]; /* check for overlap between sequences */ overlap = false; for(i=0 ; i < sites ; i++){ if((strchr("NX?O-",y[m-1][i])==NULL) && (strchr("NX?O-",y[n-1][i])==NULL)){ overlap = true; break; } } if(!overlap){ printf("\nWARNING: NO OVERLAP BETWEEN SEQUENCES %ld AND %ld; -1.0 WAS 
WRITTEN\n", m, n); baddists = true; return; } quick = (!ctgry || categs == 1); if (jukes || kimura || logdet || similarity) { numerator = 0; denominator = 0; for (i = 0; i < endsite; i++) { memcpy(xx1, p->x[i][0], sizeof(sitelike)); memcpy(xx2, q->x[i][0], sizeof(sitelike)); sum = 0.0; sum1 = 0.0; sum2 = 0.0; for (b = A; (long)b <= (long)T; b = (bases)((long)b + 1)) { sum1 += xx1[(long)b - (long)A]; sum2 += xx2[(long)b - (long)A]; sum += xx1[(long)b - (long)A] * xx2[(long)b - (long)A]; } quick = (quick && (sum1 == 1.0 || sum1 == 4.0) && (sum2 == 1.0 || sum2 == 4.0)); if (sum1 == 1.0 && sum2 == 1.0) { numerator += (long)(weight[i] * sum); denominator += weight[i]; } } } jukesquick = ((jukes || similarity) && quick); kimquick = (kimura && quick); logdetquick = (logdet && quick); if (logdet && !quick) { printf(" WARNING: CANNOT CALCULATE LOGDET DISTANCE\n"); printf(" WITH PRESENT PROGRAM IF PARTIALLY AMBIGUOUS NUCLEOTIDES\n"); printf(" -1.0 WAS WRITTEN\n"); baddists = true; } if (jukesquick && jukes && (numerator * 4 <= denominator)) { printf("\nWARNING: INFINITE DISTANCE BETWEEN "); printf(" SPECIES %3ld AND %3ld\n", m, n); printf(" -1.0 WAS WRITTEN\n"); baddists = true; } if (jukesquick && invar && (4 * (((double)numerator / denominator) - invarfrac) <= (1.0 - invarfrac))) { printf("\nWARNING: DIFFERENCE BETWEEN SPECIES %3ld AND %3ld", m, n); printf(" TOO LARGE FOR INVARIABLE SITES\n"); printf(" -1.0 WAS WRITTEN\n"); baddists = true; } if (jukesquick) { if (!gama && !invar) vv = -0.75 * log((4.0*((double)numerator / denominator) - 1.0) / 3.0); else if (!invar) vv = 0.75 * cvi * (exp(-(1/cvi)* log((4.0 * ((double)numerator / denominator) - 1.0) / 3.0)) - 1.0); else vv = 0.75 * cvi * (exp(-(1/cvi)* log((4.0 * ((double)numerator / denominator - invarfrac)/ (1.0-invarfrac) - 1.0) / 3.0)) - 1.0); } if (kimquick) { num1 = 0; num2 = 0; denominator = 0; for (i = 0; i < endsite; i++) { memcpy(xx1, p->x[i][0], sizeof(sitelike)); memcpy(xx2, q->x[i][0], sizeof(sitelike)); sum 
= 0.0; sum1 = 0.0; sum2 = 0.0; for (b = A; (long)b <= (long)T; b = (bases)((long)b + 1)) { sum1 += xx1[(long)b - (long)A]; sum2 += xx2[(long)b - (long)A]; sum += xx1[(long)b - (long)A] * xx2[(long)b - (long)A]; } sumyr = (xx1[0] + xx1[(long)G - (long)A]) * (xx2[0] + xx2[(long)G - (long)A]) + (xx1[(long)C - (long)A] + xx1[(long)T - (long)A]) * (xx2[(long)C - (long)A] + xx2[(long)T - (long)A]); if (sum1 == 1.0 && sum2 == 1.0) { num1 += (long)(weight[i] * sum); num2 += (long)(weight[i] * (sumyr - sum)); denominator += weight[i]; } } tt = ((1.0 - (double)num1 / denominator)-invarfrac)/(1.0-invarfrac); if (tt > 0.0) { delta = 0.1; tt = delta; it = 0; while (fabs(delta) > 0.0000002 && it < iterationsd) { it++; if (!gama) { p1 = exp(-tt); p2 = exp(-xv * tt) - exp(-tt); p3 = 1.0 - exp(-xv * tt); } else { p1 = exp(-cvi * log(1 + tt / cvi)); p2 = exp(-cvi * log(1 + xv * tt / cvi)) - exp(-cvi * log(1 + tt / cvi)); p3 = 1.0 - exp(-cvi * log(1 + xv * tt / cvi)); } q1 = p1 + p2 / 2.0 + p3 / 4.0; q2 = p2 / 2.0 + p3 / 4.0; q3 = p3 / 2.0; q1 = q1 * (1.0-invarfrac) + invarfrac; q2 *= (1.0 - invarfrac); q3 *= (1.0 - invarfrac); if (!gama && !invar) slope = 0.5 * exp(-tt) * (num2 / q2 - num1 / q1) + 0.25 * xv * exp(-xv * tt) * ((denominator - num1 - num2) * 2 / q3 - num2 / q2 - num1 / q1); else slope = 0.5 * (1 / (1 + tt / cvi)) * exp(-cvi * log(1 + tt / cvi)) * (num2 / q2 - num1 / q1) + 0.25 * (xv / (1 + xv * tt / cvi)) * exp(-cvi * log(1 + xv * tt / cvi)) * ((denominator - num1 - num2) * 2 / q3 - num2 / q2 - num1 / q1); slope *= (1.0-invarfrac); if (slope < 0.0) delta = fabs(delta) / -2.0; else delta = fabs(delta); tt += delta; } } if ((delta >= 0.1) && (!similarity)) { printf("\nWARNING: DIFFERENCE BETWEEN SPECIES %3ld AND %3ld", m, n); if (invar) printf(" TOO LARGE FOR INVARIABLE SITES\n"); else printf(" TOO LARGE TO ESTIMATE DISTANCE\n"); printf(" -1.0 WAS WRITTEN\n"); baddists = true; } vv = fracchange * tt; } if (!(jukesquick || kimquick || logdet)) { prod = (double 
*)Malloc(sites*sizeof(double)); prod2 = (double *)Malloc(sites*sizeof(double)); prod3 = (double *)Malloc(sites*sizeof(double)); for (i = 0; i < endsite; i++) { memcpy(xx1, p->x[i][0], sizeof(sitelike)); memcpy(xx2, q->x[i][0], sizeof(sitelike)); xx1freqa = xx1[0] * freqa; xx1freqc = xx1[(long)C - (long)A] * freqc; xx1freqg = xx1[(long)G - (long)A] * freqg; xx1freqt = xx1[(long)T - (long)A] * freqt; sum1 = xx1freqa + xx1freqc + xx1freqg + xx1freqt; sum2 = freqa * xx2[0] + freqc * xx2[(long)C - (long)A] + freqg * xx2[(long)G - (long)A] + freqt * xx2[(long)T - (long)A]; prod[i] = sum1 * sum2; prod2[i] = (xx1freqa + xx1freqg) * (xx2[0] * freqar + xx2[(long)G - (long)A] * freqgr) + (xx1freqc + xx1freqt) * (xx2[(long)C - (long)A] * freqcy + xx2[(long)T - (long)A] * freqty); prod3[i] = xx1freqa * xx2[0] + xx1freqc * xx2[(long)C - (long)A] + xx1freqg * xx2[(long)G - (long)A] + xx1freqt * xx2[(long)T - (long)A]; } tt = 0.1; delta = 0.1; it = 1; while (it < iterationsd && fabs(delta) > 0.0000002) { slope = 0.0; if (tt > 0.0) { lz = -tt; for (i = 0; i < categs; i++) { if (!gama) { tbl[i].z1 = exp(tbl[i].ratxv * lz); tbl[i].z1zz = exp(tbl[i].rat * lz); } else { tbl[i].z1 = exp(-cvi*log(1.0-tbl[i].ratxv * lz/cvi)); tbl[i].z1zz = exp(-cvi*log(1.0-tbl[i].rat * lz/cvi)); } tbl[i].y1 = 1.0 - tbl[i].z1; tbl[i].z1yy = tbl[i].z1 - tbl[i].z1zz; tbl[i].z1xv = tbl[i].z1 * xv; } for (i = 0; i < endsite; i++) { idx = category[alias[i] - 1]; cc = prod[i]; bb = prod2[i]; aa = prod3[i]; if (!gama && !invar) slope += weightrat[i] * (tbl[idx - 1].z1zz * (bb - aa) + tbl[idx - 1].z1xv * (cc - bb)) / (aa * tbl[idx - 1].z1zz + bb * tbl[idx - 1].z1yy + cc * tbl[idx - 1].y1); else slope += (1.0-invarfrac) * weightrat[i] * ( ((tbl[idx-1].rat)/(1.0-tbl[idx-1].rat * lz/cvi)) * tbl[idx - 1].z1zz * (bb - aa) + ((tbl[idx-1].ratxv)/(1.0-tbl[idx-1].ratxv * lz/cvi)) * tbl[idx - 1].z1 * (cc - bb)) / (aa * ((1.0-invarfrac)*tbl[idx - 1].z1zz + invarfrac) + bb * (1.0-invarfrac)*tbl[idx - 1].z1yy + cc * 
(1.0-invarfrac)*tbl[idx - 1].y1); } } if (slope < 0.0) delta = fabs(delta) / -2.0; else delta = fabs(delta); tt += delta; it++; } if ((delta >= 0.1) && (!similarity)) { printf("\nWARNING: DIFFERENCE BETWEEN SPECIES %3ld AND %3ld", m, n); if (invar) printf(" TOO LARGE FOR INVARIABLE SITES\n"); else printf(" TOO LARGE TO ESTIMATE DISTANCE\n"); printf(" -1.0 WAS WRITTEN\n"); baddists = true; } vv = tt * fracchange; free(prod); free(prod2); free(prod3); } if (logdetquick) { /* compute logdet when no ambiguous nucleotides */ for (i = 0; i < 4; i++) { basefreq1[i] = 0.0; basefreq2[i] = 0.0; for (j = 0; j < 4; j++) basetable[i][j] = 0.0; } for (i = 0; i < endsite; i++) { k = 0; while (p->x[i][0][k] == 0.0) k++; basefreq1[k] += weight[i]; l = 0; while (q->x[i][0][l] == 0.0) l++; basefreq2[l] += weight[i]; basetable[k][l] += weight[i]; } vv = lndet(basetable); if (vv == 99.0) { printf("\nNegative or zero determinant for distance between species"); printf(" %ld and %ld\n", m, n); printf(" -1.0 WAS WRITTEN\n"); baddists = true; } vv = -0.25*(vv - 0.5*(log(basefreq1[0])+log(basefreq1[1]) +log(basefreq1[2])+log(basefreq1[3]) +log(basefreq2[0])+log(basefreq2[1]) +log(basefreq2[2])+log(basefreq2[3]))); } if (similarity) { if (denominator < 1.0) { printf("\nWARNING: SPECIES %3ld AND %3ld HAVE NO BASES THAT", m, n); printf(" CAN BE COMPARED\n"); printf(" -1.0 WAS WRITTEN\n"); baddists = true; } vv = (double)numerator / denominator; } *v = vv; } /* makev */ void makedists() { /* compute distance matrix */ long i, j; double v; inittable(); for (i = 0; i < endsite; i++) weightrat[i] = weight[i] * rate[category[alias[i] - 1] - 1]; if (progress) { printf("Distances calculated for species\n"); #ifdef WIN32 phyFillScreenColor(); #endif } for (i = 0; i < spp; i++) if (similarity) d[i][i] = 1.0; else d[i][i] = 0.0; baddists = false; for (i = 1; i < spp; i++) { if (progress) { printf(" "); for (j = 0; j < nmlngth; j++) putchar(nayme[i - 1][j]); printf(" "); } for (j = i + 1; j <= spp; j++) { 
      makev(i, j, &v);
      if (baddists) {
        /* makev can return without storing a value in v (for example
           when the two sequences do not overlap), so check the flag
           before reading v */
        v = -1;
        baddists = false;
      } else
        v = fabs(v);
      d[i - 1][j - 1] = v;
      d[j - 1][i - 1] = v;
      if (progress) {
        putchar('.');
        fflush(stdout);
      }
    }
    if (progress) {
      putchar('\n');
#ifdef WIN32
      phyFillScreenColor();
#endif
    }
  }
  if (progress) {
    printf(" ");
    for (j = 0; j < nmlngth; j++)
      putchar(nayme[spp - 1][j]);
    putchar('\n');
  }
  for (i = 0; i < spp; i++) {
    for (j = 0; j < endsite; j++)
      free(nodep[i]->x[j]);
    free(nodep[i]->x);
  }
}  /* makedists */


void writedists()
{
  /* write out distances */
  char **names;

  names = stringnames_new();
  output_matrix_d(outfile, d, spp, spp, names, names, matrix_flags);
  stringnames_delete(names);

  if (progress)
    printf("\nDistances written to file \"%s\"\n\n", outfilename);
}  /* writedists */


int main(int argc, Char *argv[])
{  /* DNA Distances by Maximum Likelihood */
#ifdef MAC
  argc = 1;                /* macsetup("Dnadist",""); */
  argv[0] = "Dnadist";
#endif
  init(argc, argv);
  openfile(&infile,INFILE,"input file","r",argv[0],infilename);
  openfile(&outfile,OUTFILE,"output file","w",argv[0],outfilename);

  ibmpc = IBMCRT;
  ansi = ANSICRT;
  mulsets = false;
  datasets = 1;
  firstset = true;
  doinit();
  ttratio0 = ttratio;
  if (ctgry)
    openfile(&catfile,CATFILE,"categories file","r",argv[0],catfilename);
  if (weights || justwts)
    openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename);
  for (ith = 1; ith <= datasets; ith++) {
    ttratio = ttratio0;
    getinput();
    if (ith == 1)
      firstset = false;
    if (datasets > 1 && progress)
      printf("Data set # %ld:\n\n",ith);
    makedists();
    writedists();
  }
  FClose(infile);
  FClose(outfile);
#ifdef MAC
  fixmacfile(outfilename);
#endif
  printf("Done.\n\n");
#ifdef WIN32
  phyRestoreConsoleAttributes();
#endif
  return 0;
}  /* DNA Distances by Maximum Likelihood */

/* phylip-3.697/src/dnainvar.c */

#include "phylip.h"
#include "seq.h"

/* version 3.696.
   Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe.
Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ #define maxsp 4 /* maximum number of species -- must be 4 */ typedef enum { xx, yy, zz, ww } simbol; #ifndef OLDC /* function prototypes */ void getoptions(void); void allocrest(void); void doinit(void); void dnainvar_sitecombine(void); void makeweights(void); void doinput(void); void prntpatterns(void); void makesymmetries(void); void prntsymbol(simbol); void prntsymmetries(void); void tabulate(long,long,long,long,double *,double *,double *,double *); void dnainvar_writename(long); void writetree(long, long, long, long); void exacttest(long, long); void invariants(void); void makeinv(void); void reallocsites(void); /* function prototypes */ #endif Char infilename[FNMLNGTH], outfilename[FNMLNGTH], weightfilename[FNMLNGTH]; long sites, msets, ith; boolean weights, progress, prntpat, printinv, mulsets, firstset, justwts; steptr aliasweight; long f[(long)ww - (long)xx + 1][(long)ww - (long)xx + 1] [(long)ww - (long)xx + 1]; /* made global from being local to makeinv */ void getoptions() { /* interactively set options */ long loopcount, loopcount2; boolean done; Char ch, ch2; fprintf(outfile, "\nNucleic acid sequence Invariants "); fprintf(outfile, "method, version %s\n\n",VERSION); putchar('\n'); printdata = false; weights = false; dotdiff = true; progress = true; prntpat = true; printinv = true; interleaved = true; loopcount = 0; do { cleerhome(); printf("\nNucleic acid sequence Invariants "); printf("method, version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld %s\n", msets, (justwts ? 
"sets of weights" : "data sets")); else printf(" No\n"); printf(" I Input sequences interleaved?"); if (interleaved) printf(" Yes\n"); else printf(" No, sequential\n"); printf(" 0 Terminal type (IBM PC, ANSI, none)?"); if (ibmpc) printf(" IBM PC\n"); if (ansi) printf(" ANSI\n"); if (!(ibmpc || ansi)) printf(" (none)\n"); printf(" 1 Print out the data at start of run"); if (printdata) printf(" Yes\n"); else printf(" No\n"); if (printdata) printf(" . Use dot-differencing to display them %s\n", dotdiff ? "Yes" : "No"); printf(" 2 Print indications of progress of run %s\n", (progress ? "Yes" : "No")); printf(" 3 Print out the counts of patterns"); if (prntpat) printf(" Yes\n"); else printf(" No\n"); printf(" 4 Print out the invariants"); if (printinv) printf(" Yes\n"); else printf(" No\n"); if(weights && justwts){ printf( "WARNING: W option and Multiple Weights options are both on. "); printf( "The W menu option is unnecessary and has no additional effect. \n"); } printf("\n Y to accept these or type the letter for one to change\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); done = (ch == 'Y'); if (!done) { if (strchr("WMI01.234",ch) != NULL) { switch (ch) { case 'W': weights = !weights; break; case 'M': mulsets = !mulsets; if (mulsets){ printf("Multiple data sets or multiple weights?"); loopcount2 = 0; do { printf(" (type D or W)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch2); getchar(); if (ch2 == '\n') ch2 = ' '; uppercase(&ch2); countup(&loopcount2, 10); } while ((ch2 != 'W') && (ch2 != 'D')); justwts = (ch2 == 'W'); if (justwts) justweights(&msets); else initdatasets(&msets); } break; case 'I': interleaved = !interleaved; break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '.': dotdiff = !dotdiff; break; case '2': progress = !progress; break; case '3': prntpat = !prntpat; break; case '4': printinv = !printinv; 
break; } } else printf("Not a possible option!\n"); } countup(&loopcount, 100); } while (!done); } /* getoptions */ void reallocsites(void) { long i; for (i=0; i < spp; i++) { free(y[i]); y[i] = (Char *)Malloc(sites*sizeof(Char)); } free(weight); free(alias); free(aliasweight); weight = (steptr)Malloc(sites * sizeof(long)); alias = (steptr)Malloc(sites * sizeof(long)); aliasweight = (steptr)Malloc(sites * sizeof(long)); } void allocrest() { long i; y = (Char **)Malloc(spp*sizeof(Char *)); for (i = 0; i < spp; i++) y[i] = (Char *)Malloc(sites*sizeof(Char)); nayme = (naym *)Malloc(maxsp * sizeof(naym)); weight = (steptr)Malloc(sites * sizeof(long)); alias = (steptr)Malloc(sites * sizeof(long)); aliasweight = (steptr)Malloc(sites * sizeof(long)); } void doinit() { /* initializes variables */ inputnumbers(&spp, &sites, &nonodes, 1); if (spp > maxsp){ printf("TOO MANY SPECIES: only 4 allowed\n"); exxit(-1);} getoptions(); if (printdata) fprintf(outfile, "%2ld species, %3ld sites\n", spp, sites); allocrest(); } /* doinit*/ void dnainvar_sitecombine() { /* combine sites that have identical patterns */ long i, j, k; boolean tied; i = 1; while (i < sites) { j = i + 1; tied = true; while (j <= sites && tied) { k = 1; while (k <= spp && tied) { tied = (tied && y[k - 1][alias[i - 1] - 1] == y[k - 1][alias[j - 1] - 1]); k++; } if (tied && aliasweight[j - 1] > 0) { aliasweight[i - 1] += aliasweight[j - 1]; aliasweight[j - 1] = 0; } j++; } i = j - 1; } } /* dnainvar_sitecombine */ void makeweights() { /* make up weights vector to avoid duplicate computations */ long i; for (i = 1; i <= sites; i++) { alias[i - 1] = i; aliasweight[i - 1] = weight[i - 1]; } sitesort(sites, aliasweight); dnainvar_sitecombine(); sitescrunch2(sites, 1, 2, aliasweight); for (i = 1; i <= sites; i++) { weight[i - 1] = aliasweight[i - 1]; if (weight[i - 1] > 0) endsite = i; } } /* makeweights */ void doinput() { /* reads the input data */ long i; if (justwts) { if (firstset) inputdata(sites); for (i = 0; i 
< sites; i++) weight[i] = 1; inputweights(sites, weight, &weights); if (justwts) { fprintf(outfile, "\n\nWeights set # %ld:\n\n", ith); if (progress) printf("\nWeights set # %ld:\n\n", ith); } if (printdata) printweights(outfile, 0, sites, weight, "Sites"); } else { if (!firstset){ samenumsp(&sites, ith); reallocsites(); } inputdata(sites); for (i = 0; i < sites; i++) weight[i] = 1; if (weights) { inputweights(sites, weight, &weights); if (printdata) printweights(outfile, 0, sites, weight, "Sites"); } } makeweights(); } /* doinput */ void prntpatterns() { /* print out patterns */ long i, j; fprintf(outfile, "\n Pattern"); if (prntpat) fprintf(outfile, " Number of times"); fprintf(outfile, "\n\n"); for (i = 0; i < endsite; i++) { fprintf(outfile, " "); for (j = 0; j < spp; j++) putc(y[j][alias[i] - 1], outfile); if (prntpat) fprintf(outfile, " %8ld", weight[i]); putc('\n', outfile); } putc('\n', outfile); } /* prntpatterns */ void makesymmetries() { /* get frequencies of symmetrized patterns */ long i, j; boolean drop, usedz; Char ch, ch1, zchar; simbol s1, s2, s3; simbol t[maxsp - 1]; for (s1 = xx; (long)s1 <= (long)ww; s1 = (simbol)((long)s1 + 1)) { for (s2 = xx; (long)s2 <= (long)ww; s2 = (simbol)((long)s2 + 1)) { for (s3 = xx; (long)s3 <= (long)ww; s3 = (simbol)((long)s3 + 1)) f[(long)s1 - (long)xx][(long)s2 - (long)xx] [(long)s3 - (long)xx] = 0; } } for (i = 0; i < endsite; i++) { drop = false; for (j = 0; j < spp; j++) { ch = y[j][alias[i] - 1]; drop = (drop || (ch != 'A' && ch != 'C' && ch != 'G' && ch != 'T' && ch != 'U')); } ch1 = y[0][alias[i] - 1]; if (!drop) { usedz = false; zchar = ' '; for (j = 2; j <= spp; j++) { ch = y[j - 1][alias[i] - 1]; if (ch == ch1) t[j - 2] = xx; else if ((ch1 == 'A' && ch == 'G') || (ch1 == 'G' && ch == 'A') || (ch1 == 'C' && (ch == 'T' || ch == 'U')) || ((ch1 == 'T' || ch1 == 'U') && ch == 'C')) t[j - 2] = yy; else if (!usedz) { t[j - 2] = zz; usedz = true; zchar = ch; } else if (usedz && ch == zchar) t[j - 2] = zz; else if 
(usedz && ch != zchar) t[j - 2] = ww; } f[(long)t[0] - (long)xx][(long)t[1] - (long)xx] [(long)t[2] - (long)xx] += weight[i]; } } } /* makesymmetries */ void prntsymbol(simbol s) { /* print 1, 2, 3, 4 as appropriate */ switch (s) { case xx: putc('1', outfile); break; case yy: putc('2', outfile); break; case zz: putc('3', outfile); break; case ww: putc('4', outfile); break; } } /* prntsymbol */ void prntsymmetries() { /* print out symmetrized pattern numbers */ simbol s1, s2, s3; fprintf(outfile, "\nSymmetrized patterns (1, 2 = the two purines "); fprintf(outfile, "and 3, 4 = the two pyrimidines\n"); fprintf(outfile, " or 1, 2 = the two pyrimidines "); fprintf(outfile, "and 3, 4 = the two purines)\n\n"); for (s1 = xx; (long)s1 <= (long)ww; s1 = (simbol)((long)s1 + 1)) { for (s2 = xx; (long)s2 <= (long)ww; s2 = (simbol)((long)s2 + 1)) { for (s3 = xx; (long)s3 <= (long)ww; s3 = (simbol)((long)s3 + 1)) { if (f[(long)s1 - (long)xx][(long)s2 - (long)xx] [(long)s3 - (long)xx] > 0) { fprintf(outfile, " 1"); prntsymbol(s1); prntsymbol(s2); prntsymbol(s3); if (prntpat) fprintf(outfile, " %7ld", f[(long)s1 - (long)xx][(long)s2 - (long)xx] [(long)s3 - (long)xx]); putc('\n', outfile); } } } } } /* prntsymmetries */ void tabulate(long mm, long nn, long pp, long qq, double *mr, double *nr, double *pr, double *qr) { /* make quadratic invariant, table, chi-square */ long total; double k, TEMP; fprintf(outfile, "\n Contingency Table\n\n"); fprintf(outfile, "%7ld%6ld\n", mm, nn); fprintf(outfile, "%7ld%6ld\n\n", pp, qq); *mr = (long)(mm); *nr = (long)(nn); *pr = (long)pp; *qr = (long)qq; total = mm + nn + pp + qq; if (printinv) fprintf(outfile, " Quadratic invariant = %15.1f\n\n", (*nr) * (*pr) - (*mr) * (*qr)); fprintf(outfile, " Chi-square = "); TEMP = (*mr) * (*qr) - (*nr) * (*pr); k = total * (TEMP * TEMP) / (((*mr) + (*nr)) * ((*mr) + (*pr)) * ((*nr) + (*qr)) * ((*pr) + (*qr))); fprintf(outfile, "%10.5f", k); if ((*mr) * (*qr) > (*nr) * (*pr) && k > 2.71) fprintf(outfile, " (P < 
0.05)\n"); else fprintf(outfile, " (not significant)\n"); fprintf(outfile, "\n\n"); } /* tabulate */ void dnainvar_writename(long m) { /* write out a species name */ long i, n; n = nmlngth; while (nayme[m - 1][n - 1] == ' ') n--; if (n == 0) n = 1; for (i = 0; i < n; i++) putc(nayme[m - 1][i], outfile); } /* dnainvar_writename */ void writetree(long i, long j, long k, long l) { /* write out tree topology ((i,j),(k,l)) using names */ fprintf(outfile, "(("); dnainvar_writename(i); putc(',', outfile); dnainvar_writename(j); fprintf(outfile, "),("); dnainvar_writename(k); putc(',', outfile); dnainvar_writename(l); fprintf(outfile, "))\n"); } /* writetree */ void exacttest(long m, long n) { /* exact binomial test that m <= n */ long i; double p, sum; p = 1.0; for (i = 1; i <= m + n; i++) p /= 2.0; sum = p; for (i = 1; i <= n; i++) { p = p * (m + n - i + 1) / i; sum += p; } fprintf(outfile, " %7.4f", sum); if (sum <= 0.05) fprintf(outfile, " yes\n"); else fprintf(outfile, " no\n"); } /* exacttest */ void invariants() { /* compute invariants */ long m, n, p, q; double L1, L2, L3; double mr,nr,pr,qr; fprintf(outfile, "\nTree topologies (unrooted): \n\n"); fprintf(outfile, " I: "); writetree(1, 2, 3, 4); fprintf(outfile, " II: "); writetree(1, 3, 2, 4); fprintf(outfile, " III: "); writetree(1, 4, 2, 3); fprintf(outfile, "\n\nLake's linear invariants\n"); fprintf(outfile, " (these are expected to be zero for the two incorrect tree topologies.\n"); fprintf(outfile, " This is tested by testing the equality of the two parts\n"); fprintf(outfile, " of each expression using a one-sided exact binomial test.\n"); fprintf(outfile, " The null hypothesis is that the first part is no larger than the second.)\n\n"); fprintf(outfile, " Tree "); fprintf(outfile, " Exact test P value Significant?\n\n"); m = f[(long)yy - (long)xx][(long)zz - (long)xx] [(long)ww - (long)xx] + f[0][(long)zz - (long)xx] [(long)zz - (long)xx]; n = f[(long)yy - (long)xx][(long)zz - (long)xx] [(long)zz - 
(long)xx] + f[0][(long)zz - (long)xx] [(long)ww - (long)xx]; fprintf(outfile, " I %5ld - %5ld = %5ld", m, n, m - n); exacttest(m, n); m = f[(long)zz - (long)xx][(long)yy - (long)xx] [(long)ww - (long)xx] + f[(long)zz - (long)xx][0] [(long)zz - (long)xx]; n = f[(long)zz - (long)xx][(long)yy - (long)xx] [(long)zz - (long)xx] + f[(long)zz - (long)xx][0] [(long)ww - (long)xx]; fprintf(outfile, " II %5ld - %5ld = %5ld", m, n, m - n); exacttest(m, n); m = f[(long)zz - (long)xx][(long)ww - (long)xx] [(long)yy - (long)xx] + f[(long)zz - (long)xx] [(long)zz - (long)xx][0]; n = f[(long)zz - (long)xx][(long)zz - (long)xx] [(long)yy - (long)xx] + f[(long)zz - (long)xx] [(long)ww - (long)xx][0]; fprintf(outfile, " III%5ld - %5ld = %5ld", m, n, m - n); exacttest(m, n); fprintf(outfile, "\n\nCavender's quadratic invariants (type L)"); fprintf(outfile, " using purines vs. pyrimidines\n"); fprintf(outfile, " (these are expected to be zero, and thus have a nonsignificant\n"); fprintf(outfile, " chi-square, for the correct tree topology)\n"); fprintf(outfile, "They will be misled if there are substantially\n"); fprintf(outfile, "different evolutionary rates among sites, or\n"); fprintf(outfile, "different purine:pyrimidine ratios from 1:1.\n\n"); fprintf(outfile, " Tree I:\n"); m = f[0][0][0] + f[0][(long)yy - (long)xx] [(long)yy - (long)xx] + f[0][(long)zz - (long)xx] [(long)zz - (long)xx]; n = f[0][0][(long)yy - (long)xx] + f[0][0] [(long)zz - (long)xx] + f[0][(long)yy - (long)xx][0] + f[0] [(long)yy - (long)xx][(long)zz - (long)xx] + f[0] [(long)zz - (long)xx][0] + f[0][(long)zz - (long)xx] [(long)yy - (long)xx] + f[0][(long)zz - (long)xx] [(long)ww - (long)xx]; p = f[(long)yy - (long)xx][0][0] + f[(long)yy - (long)xx] [(long)yy - (long)xx] [(long)yy - (long)xx] + f[(long)yy - (long)xx] [(long)zz - (long)xx] [(long)zz - (long)xx] + f[(long)zz - (long)xx][0] [0] + f[(long)zz - (long)xx][(long)yy - (long)xx] [(long)yy - (long)xx] + f[(long)zz - (long)xx] [(long)zz - (long)xx] 
[(long)zz - (long)xx] + f[(long)zz - (long)xx] [(long)ww - (long)xx][(long)ww - (long)xx]; q = f[(long)yy - (long)xx][0][(long)yy - (long)xx] + f[(long)yy - (long)xx][0][(long)zz - (long)xx] + f[(long)yy - (long)xx][(long)yy - (long)xx][0] + f[(long)yy - (long)xx][(long)yy - (long)xx][(long)zz - (long)xx] + f[(long)yy - (long)xx][(long)zz - (long)xx][0] + f[(long)yy - (long)xx][(long)zz - (long)xx][(long)yy - (long)xx] + f[(long)yy - (long)xx][(long)zz - (long)xx][(long)ww - (long)xx] + f[(long)zz - (long)xx][0][(long)yy - (long)xx] + f[(long)zz - (long)xx][0][(long)zz - (long)xx] + f[(long)zz - (long)xx][0][(long)ww - (long)xx] + f[(long)zz - (long)xx][(long)yy - (long)xx][0] + f[(long)zz - (long)xx][(long)yy - (long)xx][(long)zz - (long)xx] + f[(long)zz - (long)xx][(long)yy - (long)xx][(long)ww - (long)xx] + f[(long)zz - (long)xx][(long)zz - (long)xx][0] + f[(long)zz - (long)xx][(long)zz - (long)xx][(long)yy - (long)xx] + f[(long)zz - (long)xx][(long)zz - (long)xx][(long)ww - (long)xx] + f[(long)zz - (long)xx][(long)ww - (long)xx][0] + f[(long)zz - (long)xx][(long)ww - (long)xx][(long)yy - (long)xx] + f[(long)zz - (long)xx][(long)ww - (long)xx][(long)zz - (long)xx]; nr = n; pr = p; mr = m; qr = q; L1 = nr * pr - mr * qr; tabulate(m, n, p, q, &mr,&nr,&pr,&qr); fprintf(outfile, " Tree II:\n"); m = f[0][0][0] + f[(long)yy - (long)xx][0] [(long)yy - (long)xx] + f[(long)zz - (long)xx][0] [(long)zz - (long)xx]; n = f[0][0][(long)yy - (long)xx] + f[0][0] [(long)zz - (long)xx] + f[(long)yy - (long)xx][0] [0] + f[(long)yy - (long)xx][0] [(long)zz - (long)xx] + f[(long)zz - (long)xx][0] [0] + f[(long)zz - (long)xx][0] [(long)yy - (long)xx] + f[(long)zz - (long)xx][0] [(long)ww - (long)xx]; p = f[0][(long)yy - (long)xx][0] + f[(long)yy - (long)xx] [(long)yy - (long)xx] [(long)yy - (long)xx] + f[(long)zz - (long)xx] [(long)yy - (long)xx][(long)zz - (long)xx] + f[0] [(long)zz - (long)xx][0] + f[(long)yy - (long)xx] [(long)zz - (long)xx] [(long)yy - (long)xx] + f[(long)zz - 
(long)xx] [(long)zz - (long)xx] [(long)zz - (long)xx] + f[(long)zz - (long)xx] [(long)ww - (long)xx][(long)zz - (long)xx]; q = f[0][(long)yy - (long)xx][(long)yy - (long)xx] + f[0] [(long)yy - (long)xx][(long)zz - (long)xx] + f[(long)yy - (long)xx][(long)yy - (long)xx][0] + f[(long)yy - (long)xx][(long)yy - (long)xx][(long)zz - (long)xx] + f[(long)zz - (long)xx][(long)yy - (long)xx][0] + f[(long)zz - (long)xx][(long)yy - (long)xx][(long)yy - (long)xx] + f[(long)zz - (long)xx][(long)yy - (long)xx][(long)ww - (long)xx] + f[0][(long)zz - (long)xx][(long)yy - (long)xx] + f[0] [(long)zz - (long)xx][(long)zz - (long)xx] + f[0] [(long)zz - (long)xx][(long)ww - (long)xx] + f[(long)yy - (long)xx][(long)zz - (long)xx][0] + f[(long)yy - (long)xx][(long)zz - (long)xx][(long)zz - (long)xx] + f[(long)yy - (long)xx][(long)zz - (long)xx][(long)ww - (long)xx] + f[(long)zz - (long)xx][(long)zz - (long)xx][0] + f[(long)zz - (long)xx][(long)zz - (long)xx][(long)yy - (long)xx] + f[(long)zz - (long)xx][(long)zz - (long)xx][(long)ww - (long)xx] + f[(long)zz - (long)xx][(long)ww - (long)xx][0] + f[(long)zz - (long)xx][(long)ww - (long)xx][(long)yy - (long)xx] + f[(long)zz - (long)xx][(long)ww - (long)xx][(long)ww - (long)xx]; nr = n; pr = p; mr = m; qr = q; L2 = nr * pr - mr * qr; tabulate(m, n, p, q, &mr,&nr,&pr,&qr); fprintf(outfile, " Tree III:\n"); m = f[0][0][0] + f[(long)yy - (long)xx][(long)yy - (long)xx] [0] + f[(long)zz - (long)xx][(long)zz - (long)xx][0]; n = f[(long)yy - (long)xx][0][0] + f[(long)zz - (long)xx][0] [0] + f[0][(long)yy - (long)xx][0] + f[(long)zz - (long)xx] [(long)yy - (long)xx][0] + f[0][(long)zz - (long)xx] [0] + f[(long)yy - (long)xx][(long)zz - (long)xx] [0] + f[(long)zz - (long)xx][(long)ww - (long)xx][0]; p = f[0][0][(long)yy - (long)xx] + f[(long)yy - (long)xx] [(long)yy - (long)xx] [(long)yy - (long)xx] + f[(long)zz - (long)xx] [(long)zz - (long)xx][(long)yy - (long)xx] + f[0][0] [(long)zz - (long)xx] + f[(long)yy - (long)xx] [(long)yy - (long)xx] 
[(long)zz - (long)xx] + f[(long)zz - (long)xx] [(long)zz - (long)xx] [(long)zz - (long)xx] + f[(long)zz - (long)xx] [(long)zz - (long)xx][(long)ww - (long)xx]; q = f[(long)yy - (long)xx][0][(long)yy - (long)xx] + f[(long)zz - (long)xx] [0][(long)yy - (long)xx] + f[0][(long)yy - (long)xx][(long)yy - (long)xx] + f[(long)zz - (long)xx][(long)yy - (long)xx][(long)yy - (long)xx] + f[0][(long)zz - (long)xx] [(long)yy - (long)xx] + f[(long)yy - (long)xx][(long)zz - (long)xx] [(long)yy - (long)xx] + f[(long)zz - (long)xx][(long)ww - (long)xx] [(long)yy - (long)xx] + f[(long)yy - (long)xx][0] [(long)zz - (long)xx] + f[(long)zz - (long)xx][0] [(long)zz - (long)xx] + f[0][(long)zz - (long)xx] [(long)ww - (long)xx] + f[0][(long)yy - (long)xx] [(long)zz - (long)xx] + f[(long)zz - (long)xx] [(long)yy - (long)xx] [(long)zz - (long)xx] + f[(long)zz - (long)xx] [(long)yy - (long)xx][(long)ww - (long)xx] + f[0] [(long)zz - (long)xx] [(long)zz - (long)xx] + f[(long)yy - (long)xx] [(long)zz - (long)xx] [(long)zz - (long)xx] + f[(long)zz - (long)xx] [(long)ww - (long)xx] [(long)ww - (long)xx] + f[(long)zz - (long)xx][0] [(long)ww - (long)xx] + f[(long)yy - (long)xx] [(long)zz - (long)xx][(long)ww - (long)xx] + f[(long)zz - (long)xx][(long)ww - (long)xx][(long)zz - (long)xx]; nr = n; pr = p; mr = m; qr = q; L3 = nr * pr - mr * qr; tabulate(m, n, p, q, &mr,&nr,&pr,&qr); fprintf(outfile, "\n\nCavender's quadratic invariants (type K)"); fprintf(outfile, " using purines vs. 
pyrimidines\n"); fprintf(outfile, " (these are expected to be zero for the correct tree topology)\n"); fprintf(outfile, "They will be misled if there are substantially\n"); fprintf(outfile, "different evolutionary rates among sites, or\n"); fprintf(outfile, "different purine:pyrimidine ratios from 1:1.\n"); fprintf(outfile, "No statistical test is done on them here.\n\n"); fprintf(outfile, " Tree I: %15.1f\n", L2 - L3); fprintf(outfile, " Tree II: %15.1f\n", L3 - L1); fprintf(outfile, " Tree III: %15.1f\n\n", L1 - L2); } /* invariants */ void makeinv() { /* print out patterns and compute invariants */ prntpatterns(); makesymmetries(); prntsymmetries(); if (printinv) invariants(); } /* makeinv */ int main(int argc, Char *argv[]) { /* DNA Invariants */ #ifdef MAC argc = 1; /* macsetup("Dnainvar",""); */ argv[0] = "Dnainvar"; #endif init(argc,argv); openfile(&infile,INFILE,"input file", "r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename); ibmpc = IBMCRT; ansi = ANSICRT; mulsets = false; firstset = true; msets = 1; doinit(); if (weights || justwts) openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename); for (ith = 1; ith <= msets; ith++) { doinput(); if (ith == 1) firstset = false; if (msets > 1 && !justwts) { if (progress) printf("\nData set # %ld:\n",ith); fprintf(outfile, "Data set # %ld:\n\n",ith); } makeinv(); } if (progress) { putchar('\n'); printf("Output written to output file \"%s\"\n", outfilename); putchar('\n'); } FClose(outfile); FClose(infile); #ifdef MAC fixmacfile(outfilename); #endif printf("Done.\n\n"); #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } /* DNA Invariants */ phylip-3.697/src/dnaml.c0000644004732000473200000022040612406201116014570 0ustar joefelsenst_g #include "phylip.h" #include "seq.h" /* version 3.696. Written by Joseph Felsenstein, Mark Moehring, Akiko Fuseki, Sean Lamont, Andrew Keeffe, Dan Fineman, and Patrick Colacurcio. 
Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ typedef struct valrec { double rat, ratxi, ratxv, orig_zz, z1, y1, z1zz, z1yy, xiz1, xiy1xv; double *ww, *zz, *wwzz, *vvzz; } valrec; typedef long vall[maxcategs]; typedef double contribarr[maxcategs]; #ifndef OLDC /* function prototypes */ void dnamlcopy(tree *, tree *, long, long); void getoptions(void); void allocrest(void); void doinit(void); void inputoptions(void); void makeweights(void); void getinput(void); void inittable_for_usertree(FILE *); void inittable(void); double evaluate(node *, boolean); void alloc_nvd (long, nuview_data *); void free_nvd (nuview_data *); void nuview(node *); void slopecurv(node *, double, double *, double *, double *); void makenewv(node *); void update(node *); void smooth(node *); void insert_(node *, node *, boolean); void dnaml_re_move(node **, node **); void buildnewtip(long, tree *); void buildsimpletree(tree *); void addtraverse(node *, node *, boolean); void rearrange(node *, node *); void initdnamlnode(node **, node **, node *, long, long, long *, long *, initops, pointarray, pointarray, Char *, Char *, FILE *); void dnaml_coordinates(node *, double, long *, double *); void dnaml_printree(void); void sigma(node *, double *, double *, double *); void describe(node *); void reconstr(node *, long); void rectrav(node *, long, long); void summarize(void); void dnaml_treeout(node *); void inittravtree(node *); void treevaluate(void); void maketree(void); void clean_up(void); void reallocsites(void); void globrearrange(void); void dnaml_unroot_here(node* root, node** nodep, long nonodes); void dnaml_unroot(node* p, node** nodep, long nonodes); void freetable(void); void alloclrsaves(void); void resetlrsaves(void); void freelrsaves(void); /* function prototypes */ #endif /* local rearrangements need to save views. 
created globally so that * * reallocation of the same variable is unnecessary */ node **lrsaves; long oldendsite; double fracchange; long rcategs; boolean haslengths; Char infilename[FNMLNGTH], outfilename[FNMLNGTH], intreename[FNMLNGTH], outtreename[FNMLNGTH], catfilename[FNMLNGTH], weightfilename[FNMLNGTH]; double *rate, *rrate, *probcat; long nonodes2, sites, weightsum, categs, datasets, ith, njumble, jumb; long parens; boolean freqsfrom, global, jumble, weights, trout, usertree, ctgry, rctgry, auto_, hypstate, ttr, progress, mulsets, justwts, firstset, improve, smoothit, polishing, lngths, gama, invar,inserting=false; tree curtree, bestree, bestree2, priortree; node *qwhere, *grbg, *addwhere; double xi, xv, ttratio, ttratio0, freqa, freqc, freqg, freqt, freqr, freqy, freqar, freqcy, freqgr, freqty, cv, alpha, lambda, invarfrac, bestyet; long *enterorder, inseed, inseed0; steptr aliasweight; contribarr *contribution, like, nulike, clai; double **term, **slopeterm, **curveterm; longer seed; Char* progname; char basechar[16]="acmgrsvtwyhkdbn"; /* Local variables for maketree, propagated globally for c version: */ long k, nextsp, numtrees, maxwhich, mx, mx0, mx1, shimotrees; double dummy, maxlogl; boolean succeeded, smoothed; double **l0gf; double *l0gl; valrec ***tbl; Char ch, ch2; long col; vall *mp=NULL; void dnamlcopy(tree *a, tree *b, long nonodes, long categs) { /* copies tree a to tree b*/ /* assumes bifurcation (OK) */ long i, j; node *p, *q; for (i = 0; i < spp; i++) { copynode(a->nodep[i], b->nodep[i], categs); if (a->nodep[i]->back) { if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]; else if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]->next) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next; else b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next->next; } else b->nodep[i]->back = NULL; } for (i = spp; i < nonodes; i++) { p = 
a->nodep[i]; q = b->nodep[i]; for (j = 1; j <= 3; j++) { copynode(p, q, categs); if (p->back) { if (p->back == a->nodep[p->back->index - 1]) q->back = b->nodep[p->back->index - 1]; else if (p->back == a->nodep[p->back->index - 1]->next) q->back = b->nodep[p->back->index - 1]->next; else q->back = b->nodep[p->back->index - 1]->next->next; } else q->back = NULL; p = p->next; q = q->next; } } b->likelihood = a->likelihood; b->start = a->start; /* start used in dnaml only */ b->root = a->root; /* root used in dnamlk only */ } /* dnamlcopy plc*/ void getoptions() { /* interactively set options */ long i, loopcount, loopcount2; Char ch; boolean didchangecat, didchangercat; double probsum; fprintf(outfile, "\nNucleic acid sequence Maximum Likelihood"); fprintf(outfile, " method, version %s\n\n",VERSION); putchar('\n'); ctgry = false; didchangecat = false; rctgry = false; didchangercat = false; categs = 1; rcategs = 1; auto_ = false; freqsfrom = true; gama = false; global = false; hypstate = false; improve = false; invar = false; jumble = false; njumble = 1; lngths = false; lambda = 1.0; outgrno = 1; outgropt = false; trout = true; ttratio = 2.0; ttr = false; usertree = false; weights = false; printdata = false; dotdiff = true; progress = true; treeprint = true; interleaved = true; loopcount = 0; for (;;){ cleerhome(); printf("Nucleic acid sequence Maximum Likelihood"); printf(" method, version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" U Search for best tree? %s\n", (usertree ? "No, use user trees in input file" : "Yes")); if (usertree) { printf(" L Use lengths from user trees? %s\n", (lngths ? "Yes" : "No")); } printf(" T Transition/transversion ratio:%8.4f\n", (ttr ? ttratio : 2.0)); printf(" F Use empirical base frequencies? %s\n", (freqsfrom ? 
"Yes" : "No")); printf(" C One category of sites?"); if (!ctgry || categs == 1) printf(" Yes\n"); else printf(" %ld categories of sites\n", categs); printf(" R Rate variation among sites?"); if (!rctgry) printf(" constant rate\n"); else { if (gama) printf(" Gamma distributed rates\n"); else { if (invar) printf(" Gamma+Invariant sites\n"); else printf(" user-defined HMM of rates\n"); } printf(" A Rates at adjacent sites correlated?"); if (!auto_) printf(" No, they are independent\n"); else printf(" Yes, mean block length =%6.1f\n", 1.0 / lambda); } printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); if (!usertree) { printf(" S Speedier but rougher analysis? %s\n", (improve ? "No, not rough" : "Yes")); printf(" G Global rearrangements? %s\n", (global ? "Yes" : "No")); } if (!usertree) { printf(" J Randomize input order of sequences?"); if (jumble) printf(" Yes (seed =%8ld,%3ld times)\n", inseed0, njumble); else printf(" No. Use input order\n"); } printf(" O Outgroup root? %s%3ld\n", (outgropt ? "Yes, at sequence number" : "No, use as outgroup species"),outgrno); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld %s\n", datasets, (justwts ? "sets of weights" : "data sets")); else printf(" No\n"); printf(" I Input sequences interleaved? %s\n", (interleaved ? "Yes" : "No, sequential")); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", (ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)")); printf(" 1 Print out the data at start of run %s\n", (printdata ? "Yes" : "No")); printf(" 2 Print indications of progress of run %s\n", (progress ? "Yes" : "No")); printf(" 3 Print out tree %s\n", (treeprint ? "Yes" : "No")); printf(" 4 Write out trees onto tree file? %s\n", (trout ? "Yes" : "No")); printf(" 5 Reconstruct hypothetical sequences? %s\n", (hypstate ? 
"Yes" : "No")); printf("\n Y to accept these or type the letter for one to change\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); if (ch == 'Y') break; if (((!usertree) && (strchr("UTFCRAWSGJVOMI012345", ch) != NULL)) || (usertree && ((strchr("ULTFCRAWSVOMI012345", ch) != NULL)))){ switch (ch) { case 'F': freqsfrom = !freqsfrom; if (!freqsfrom) { initfreqs(&freqa, &freqc, &freqg, &freqt); } break; case 'C': ctgry = !ctgry; if (ctgry) { printf("\nSitewise user-assigned categories:\n\n"); initcatn(&categs); if (rate){ free(rate); } rate = (double *) Malloc(categs * sizeof(double)); didchangecat = true; initcategs(categs, rate); } break; case 'R': if (!rctgry) { rctgry = true; gama = true; } else { if (gama) { gama = false; invar = true; } else { if (invar) invar = false; else rctgry = false; } } break; case 'A': auto_ = !auto_; if (auto_) initlambda(&lambda); break; case 'W': weights = !weights; break; case 'S': improve = !improve; break; case 'G': global = !global; break; case 'J': jumble = !jumble; if (jumble) initjumble(&inseed, &inseed0, seed, &njumble); else njumble = 1; break; case 'L': lngths = !lngths; break; case 'O': outgropt = !outgropt; if (outgropt) initoutgroup(&outgrno, spp); break; case 'T': ttr = !ttr; if (ttr) { initratio(&ttratio); } break; case 'U': usertree = !usertree; break; case 'M': mulsets = !mulsets; if (mulsets) { printf("Multiple data sets or multiple weights?"); loopcount2 = 0; do { printf(" (type D or W)\n"); fflush(stdout); scanf("%c%*[^\n]", &ch2); getchar(); if (ch2 == '\n') ch2 = ' '; uppercase(&ch2); countup(&loopcount2, 10); } while ((ch2 != 'W') && (ch2 != 'D')); justwts = (ch2 == 'W'); if (justwts) justweights(&datasets); else initdatasets(&datasets); if (!jumble) { jumble = true; initjumble(&inseed, &inseed0, seed, &njumble); } } break; case 'I': interleaved = !interleaved; break; case '0': initterminal(&ibmpc, &ansi); break; case '1': 
printdata = !printdata; break; case '2': progress = !progress; break; case '3': treeprint = !treeprint; break; case '4': trout = !trout; break; case '5': hypstate = !hypstate; break; } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } if (gama || invar) { loopcount = 0; do { printf( "\nCoefficient of variation of substitution rate among sites" " (must be positive)\n"); printf( " In gamma distribution parameters, this is 1/(square root of alpha)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &cv); getchar(); countup(&loopcount, 10); } while (cv <= 0.0); alpha = 1.0 / (cv * cv); } if (!rctgry) auto_ = false; if (rctgry) { printf("\nRates in HMM"); if (invar) printf(" (including one for invariant sites)"); printf(":\n"); initcatn(&rcategs); if (probcat){ free(probcat); free(rrate); } probcat = (double *) Malloc(rcategs * sizeof(double)); rrate = (double *) Malloc(rcategs * sizeof(double)); didchangercat = true; if (gama) initgammacat(rcategs, alpha, rrate, probcat); else { if (invar) { loopcount = 0; do { printf("Fraction of invariant sites?\n"); fflush(stdout); scanf("%lf%*[^\n]", &invarfrac); getchar(); countup (&loopcount, 10); } while ((invarfrac <= 0.0) || (invarfrac >= 1.0)); initgammacat(rcategs-1, alpha, rrate, probcat); for (i = 0; i < rcategs-1; i++) probcat[i] = probcat[i]*(1.0-invarfrac); probcat[rcategs-1] = invarfrac; rrate[rcategs-1] = 0.0; } else { initcategs(rcategs, rrate); initprobcat(rcategs, &probsum, probcat); } } } if (!didchangercat){ rrate = (double *) Malloc(rcategs*sizeof(double)); probcat = (double *) Malloc(rcategs*sizeof(double)); rrate[0] = 1.0; probcat[0] = 1.0; } if (!didchangecat){ rate = (double *) Malloc(categs*sizeof(double)); rate[0] = 1.0; } } /* getoptions */ void reallocsites(void) { long i; for (i=0; i < spp; i++) { free(y[i]); y[i] = (Char *) Malloc(sites*sizeof(Char)); } free(category); free(weight); free(alias); free(ally); free(location); free(aliasweight); category 
= (long *) Malloc(sites*sizeof(long)); weight = (long *) Malloc(sites*sizeof(long)); alias = (long *) Malloc(sites*sizeof(long)); ally = (long *) Malloc(sites*sizeof(long)); location = (long *) Malloc(sites*sizeof(long)); aliasweight = (long *) Malloc(sites*sizeof(long)); } void allocrest() { long i; y = (Char **) Malloc(spp*sizeof(Char *)); for (i = 0; i < spp; i++) y[i] = (Char *) Malloc(sites*sizeof(Char)); nayme = (naym *) Malloc(spp*sizeof(naym)); enterorder = (long *) Malloc(spp*sizeof(long)); category = (long *) Malloc(sites*sizeof(long)); weight = (long *) Malloc(sites*sizeof(long)); alias = (long *) Malloc(sites*sizeof(long)); ally = (long *) Malloc(sites*sizeof(long)); location = (long *) Malloc(sites*sizeof(long)); aliasweight = (long *) Malloc(sites*sizeof(long)); } /* allocrest */ void doinit() { /* initializes variables */ inputnumbers(&spp, &sites, &nonodes2, 1); getoptions(); if (printdata) fprintf(outfile, "%2ld species, %3ld sites\n", spp, sites); alloctree(&curtree.nodep, nonodes2, usertree); allocrest(); if (usertree) return; alloctree(&bestree.nodep, nonodes2, 0); alloctree(&priortree.nodep, nonodes2, 0); if (njumble <= 1) return; alloctree(&bestree2.nodep, nonodes2, 0); } /* doinit */ void inputoptions() { long i; if (!firstset && !justwts) { samenumsp(&sites, ith); reallocsites(); } for (i = 0; i < sites; i++) category[i] = 1; for (i = 0; i < sites; i++) weight[i] = 1; if (justwts || weights) inputweights(sites, weight, &weights); weightsum = 0; for (i = 0; i < sites; i++) weightsum += weight[i]; if (ctgry && categs > 1) { inputcategs(0, sites, category, categs, "DnaML"); if (printdata) printcategs(outfile, sites, category, "Site categories"); } if (weights && printdata) printweights(outfile, 0, sites, weight, "Sites"); } /* inputoptions */ void makeweights() { /* make up weights vector to avoid duplicate computations */ long i; for (i = 1; i <= sites; i++) { alias[i - 1] = i; ally[i - 1] = i; aliasweight[i - 1] = weight[i - 1]; location[i - 
1] = 0; } sitesort2 (sites, aliasweight); sitecombine2(sites, aliasweight); sitescrunch2(sites, 1, 2, aliasweight); endsite = 0; for (i = 1; i <= sites; i++) { if (ally[i - 1] == i) endsite++; } for (i = 1; i <= endsite; i++) { location[alias[i - 1] - 1] = i; } term = (double **) Malloc( endsite * sizeof(double *)); for (i = 0; i < endsite; i++) term[i] = (double *) Malloc( rcategs * sizeof(double)); slopeterm = (double **) Malloc( endsite * sizeof(double *)); for (i = 0; i < endsite; i++) slopeterm[i] = (double *) Malloc( rcategs * sizeof(double)); curveterm = (double **) Malloc(endsite * sizeof(double *)); for (i = 0; i < endsite; i++) curveterm[i] = (double *) Malloc( rcategs * sizeof(double)); mp = (vall *) Malloc( sites*sizeof(vall)); contribution = (contribarr *) Malloc( endsite*sizeof(contribarr)); } /* makeweights */ void getinput() { /* reads the input data */ inputoptions(); if (!freqsfrom) getbasefreqs(freqa, freqc, freqg, freqt, &freqr, &freqy, &freqar, &freqcy, &freqgr, &freqty, &ttratio, &xi, &xv, &fracchange, freqsfrom, true); if (!justwts || firstset) inputdata(sites); if ( !firstset ) oldendsite = endsite; makeweights(); if ( firstset ) alloclrsaves(); else resetlrsaves(); setuptree2(&curtree); if (!usertree) { setuptree2(&bestree); setuptree2(&priortree); if (njumble > 1) setuptree2(&bestree2); } allocx(nonodes2, rcategs, curtree.nodep, usertree); if (!usertree) { allocx(nonodes2, rcategs, bestree.nodep, 0); allocx(nonodes2, rcategs, priortree.nodep, 0); if (njumble > 1) allocx(nonodes2, rcategs, bestree2.nodep, 0); } makevalues2(rcategs, curtree.nodep, endsite, spp, y, alias); if (freqsfrom) { empiricalfreqs(&freqa, &freqc, &freqg, &freqt, aliasweight, curtree.nodep); getbasefreqs(freqa, freqc, freqg, freqt, &freqr, &freqy, &freqar, &freqcy, &freqgr, &freqty, &ttratio, &xi, &xv, &fracchange, freqsfrom, true); } if (!justwts || firstset) fprintf(outfile, "\nTransition/transversion ratio = %10.6f\n\n", ttratio); } /* getinput */ void 
inittable_for_usertree(FILE *intree) { /* If there's a user tree, then the ww/zz/wwzz/vvzz elements need to be allocated appropriately. */ long num_comma; long i, j; /* First, figure out the largest possible furcation, i.e. the number of commas plus one */ countcomma(&intree, &num_comma); num_comma++; for (i = 0; i < rcategs; i++) { for (j = 0; j < categs; j++) { /* Free the stuff allocated assuming bifurcations */ free (tbl[i][j]->ww); free (tbl[i][j]->zz); free (tbl[i][j]->wwzz); free (tbl[i][j]->vvzz); /* Then allocate for worst-case multifurcations */ tbl[i][j]->ww = (double *) Malloc( num_comma * sizeof (double)); tbl[i][j]->zz = (double *) Malloc( num_comma * sizeof (double)); tbl[i][j]->wwzz = (double *) Malloc( num_comma * sizeof (double)); tbl[i][j]->vvzz = (double *) Malloc( num_comma * sizeof (double)); } } } /* inittable_for_usertree */ void freetable() { long i, j; for (i = 0; i < rcategs; i++) { for (j = 0; j < categs; j++) { free(tbl[i][j]->ww); free(tbl[i][j]->zz); free(tbl[i][j]->wwzz); free(tbl[i][j]->vvzz); } } for (i = 0; i < rcategs; i++) { for (j = 0; j < categs; j++) free(tbl[i][j]); free(tbl[i]); } free(tbl); } void inittable() { /* Define a lookup table. Precompute values and print them out in tables */ long i, j; double sumrates; tbl = (valrec ***) Malloc(rcategs * sizeof(valrec **)); for (i = 0; i < rcategs; i++) { tbl[i] = (valrec **) Malloc(categs*sizeof(valrec *)); for (j = 0; j < categs; j++) tbl[i][j] = (valrec *) Malloc(sizeof(valrec)); } for (i = 0; i < rcategs; i++) { for (j = 0; j < categs; j++) { tbl[i][j]->rat = rrate[i]*rate[j]; tbl[i][j]->ratxi = tbl[i][j]->rat * xi; tbl[i][j]->ratxv = tbl[i][j]->rat * xv; /* Allocate assuming bifurcations, will be changed later if necessary (i.e. 
         there's a user tree) */
      tbl[i][j]->ww   = (double *) Malloc( 2 * sizeof (double));
      tbl[i][j]->zz   = (double *) Malloc( 2 * sizeof (double));
      tbl[i][j]->wwzz = (double *) Malloc( 2 * sizeof (double));
      tbl[i][j]->vvzz = (double *) Malloc( 2 * sizeof (double));
    }
  }
  if (!lngths) {    /* restandardize rates */
    sumrates = 0.0;
    for (i = 0; i < endsite; i++) {
      for (j = 0; j < rcategs; j++)
        sumrates += aliasweight[i] * probcat[j]
          * tbl[j][category[alias[i] - 1] - 1]->rat;
    }
    sumrates /= (double)sites;
    for (i = 0; i < rcategs; i++)
      for (j = 0; j < categs; j++) {
        tbl[i][j]->rat /= sumrates;
        tbl[i][j]->ratxi /= sumrates;
        tbl[i][j]->ratxv /= sumrates;
      }
  }
  if (jumb > 1)
    return;
  if (rcategs > 1) {
    if (gama) {
      fprintf(outfile,"\nDiscrete approximation to gamma distributed rates\n");
      fprintf(outfile,
        " Coefficient of variation of rates = %f (alpha = %f)\n", cv, alpha);
    }
    fprintf(outfile, "\nState in HMM Rate of change Probability\n\n");
    for (i = 0; i < rcategs; i++)
      if (probcat[i] < 0.0001)
        fprintf(outfile, "%9ld%16.3f%20.6f\n", i+1, rrate[i], probcat[i]);
      else if (probcat[i] < 0.001)
        fprintf(outfile, "%9ld%16.3f%19.5f\n", i+1, rrate[i], probcat[i]);
      else if (probcat[i] < 0.01)
        fprintf(outfile, "%9ld%16.3f%18.4f\n", i+1, rrate[i], probcat[i]);
      else
        fprintf(outfile, "%9ld%16.3f%17.3f\n", i+1, rrate[i], probcat[i]);
    putc('\n', outfile);
    if (auto_)
      fprintf(outfile,
        "Expected length of a patch of sites having the same rate = %8.3f\n",
        1/lambda);
    putc('\n', outfile);
  }
  if (categs > 1) {
    fprintf(outfile, "\nSite category Rate of change\n\n");
    for (i = 0; i < categs; i++)
      fprintf(outfile, "%9ld%16.3f\n", i+1, rate[i]);
  }
  /* was (categs >> 1), a right shift that is also true for categs == 1
     only by accident of value; the intended test is categs > 1 */
  if ((rcategs > 1) || (categs > 1))
    fprintf(outfile, "\n\n");
}  /* inittable */


double evaluate(node *p, boolean saveit)
{
  contribarr tterm;
  double sum, sum2, sumc, y, lz, y1, z1zz, z1yy, prod12, prod1, prod2, prod3,
         sumterm, lterm;
  long i, j, k, lai;
  node *q;
  sitelike x1, x2;

  sum = 0.0;
  q = p->back;
  if ( p->initialized == false && p->tip == false)
    nuview(p);
  if ( q->initialized
== false && q->tip == false) nuview(q); y = p->v; lz = -y; for (i = 0; i < rcategs; i++) for (j = 0; j < categs; j++) { tbl[i][j]->orig_zz = exp(tbl[i][j]->ratxi * lz); tbl[i][j]->z1 = exp(tbl[i][j]->ratxv * lz); tbl[i][j]->z1zz = tbl[i][j]->z1 * tbl[i][j]->orig_zz; tbl[i][j]->z1yy = tbl[i][j]->z1 - tbl[i][j]->z1zz; } for (i = 0; i < endsite; i++) { k = category[alias[i]-1] - 1; for (j = 0; j < rcategs; j++) { if (y > 0.0) { y1 = 1.0 - tbl[j][k]->z1; z1zz = tbl[j][k]->z1zz; z1yy = tbl[j][k]->z1yy; } else { y1 = 0.0; z1zz = 1.0; z1yy = 0.0; } memcpy(x1, p->x[i][j], sizeof(sitelike)); prod1 = freqa * x1[0] + freqc * x1[(long)C - (long)A] + freqg * x1[(long)G - (long)A] + freqt * x1[(long)T - (long)A]; memcpy(x2, q->x[i][j], sizeof(sitelike)); prod2 = freqa * x2[0] + freqc * x2[(long)C - (long)A] + freqg * x2[(long)G - (long)A] + freqt * x2[(long)T - (long)A]; prod3 = (x1[0] * freqa + x1[(long)G - (long)A] * freqg) * (x2[0] * freqar + x2[(long)G - (long)A] * freqgr) + (x1[(long)C - (long)A] * freqc + x1[(long)T - (long)A] * freqt) * (x2[(long)C - (long)A] * freqcy + x2[(long)T - (long)A] * freqty); prod12 = freqa * x1[0] * x2[0] + freqc * x1[(long)C - (long)A] * x2[(long)C - (long)A] + freqg * x1[(long)G - (long)A] * x2[(long)G - (long)A] + freqt * x1[(long)T - (long)A] * x2[(long)T - (long)A]; tterm[j] = z1zz * prod12 + z1yy * prod3 + y1 * prod1 * prod2; } sumterm = 0.0; for (j = 0; j < rcategs; j++) sumterm += probcat[j] * tterm[j]; lterm = log(sumterm) + p->underflows[i] + q->underflows[i]; for (j = 0; j < rcategs; j++) clai[j] = tterm[j] / sumterm; memcpy(contribution[i], clai, rcategs*sizeof(double)); if (saveit && !auto_ && usertree && (which <= shimotrees)) l0gf[which - 1][i] = lterm; sum += aliasweight[i] * lterm; } for (j = 0; j < rcategs; j++) like[j] = 1.0; for (i = 0; i < sites; i++) { sumc = 0.0; for (k = 0; k < rcategs; k++) sumc += probcat[k] * like[k]; sumc *= lambda; if ((ally[i] > 0) && (location[ally[i]-1] > 0)) { lai = location[ally[i] - 1]; 
memcpy(clai, contribution[lai - 1], rcategs*sizeof(double)); for (j = 0; j < rcategs; j++) nulike[j] = ((1.0 - lambda) * like[j] + sumc) * clai[j]; } else { for (j = 0; j < rcategs; j++) nulike[j] = ((1.0 - lambda) * like[j] + sumc); } memcpy(like, nulike, rcategs*sizeof(double)); } sum2 = 0.0; for (i = 0; i < rcategs; i++) sum2 += probcat[i] * like[i]; sum += log(sum2); curtree.likelihood = sum; if (!saveit || auto_ || !usertree) return sum; if(which <= shimotrees) l0gl[which - 1] = sum; if (which == 1) { maxwhich = 1; maxlogl = sum; return sum; } if (sum > maxlogl) { maxwhich = which; maxlogl = sum; } return sum; } /* evaluate */ void alloc_nvd (long num_sibs, nuview_data *local_nvd) { /* Allocate blocks of memory appropriate for the number of siblings a given node has */ local_nvd->yy = (double *) Malloc( num_sibs * sizeof (double)); local_nvd->wwzz = (double *) Malloc( num_sibs * sizeof (double)); local_nvd->vvzz = (double *) Malloc( num_sibs * sizeof (double)); local_nvd->vzsumr = (double *) Malloc( num_sibs * sizeof (double)); local_nvd->vzsumy = (double *) Malloc( num_sibs * sizeof (double)); local_nvd->sum = (double *) Malloc( num_sibs * sizeof (double)); local_nvd->sumr = (double *) Malloc( num_sibs * sizeof (double)); local_nvd->sumy = (double *) Malloc( num_sibs * sizeof (double)); local_nvd->xx = (sitelike *) Malloc( num_sibs * sizeof (sitelike)); } /* alloc_nvd */ void free_nvd (nuview_data *local_nvd) { /* The natural complement to the alloc version */ free (local_nvd->yy); free (local_nvd->wwzz); free (local_nvd->vvzz); free (local_nvd->vzsumr); free (local_nvd->vzsumy); free (local_nvd->sum); free (local_nvd->sumr); free (local_nvd->sumy); free (local_nvd->xx); } /* free_nvd */ void nuview(node *p) { long i, j, k, l,num_sibs, sib_index; nuview_data *local_nvd = NULL; node *sib_ptr, *sib_back_ptr; sitelike p_xx; double lw; double correction; double maxx; /* Figure out how many siblings the current node has */ num_sibs = count_sibs (p); /* Recursive 
calls, should be called for all children */ sib_ptr = p; for (i=0 ; i < num_sibs; i++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; if (!sib_back_ptr->tip && !sib_back_ptr->initialized) nuview (sib_back_ptr); } /* Allocate the structure and blocks therein for variables used in this function */ local_nvd = (nuview_data *) Malloc( sizeof (nuview_data)); alloc_nvd (num_sibs, local_nvd); /* Loop 1: makes assignments to tbl based on some combination of what's already in tbl and the children's value of v */ sib_ptr = p; for (sib_index=0; sib_index < num_sibs; sib_index++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; lw = - (sib_back_ptr->v); for (i = 0; i < rcategs; i++) for (j = 0; j < categs; j++) { tbl[i][j]->ww[sib_index] = exp(tbl[i][j]->ratxi * lw); tbl[i][j]->zz[sib_index] = exp(tbl[i][j]->ratxv * lw); tbl[i][j]->wwzz[sib_index] = tbl[i][j]->ww[sib_index] * tbl[i][j]->zz[sib_index]; tbl[i][j]->vvzz[sib_index] = (1.0 - tbl[i][j]->ww[sib_index]) * tbl[i][j]->zz[sib_index]; } } /* Loop 2: */ for (i = 0; i < endsite; i++) { correction = 0; maxx = 0; k = category[alias[i]-1] - 1; for (j = 0; j < rcategs; j++) { /* Loop 2.1 */ sib_ptr = p; for (sib_index=0; sib_index < num_sibs; sib_index++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; if ( j == 0 ) correction += sib_back_ptr->underflows[i]; local_nvd->wwzz[sib_index] = tbl[j][k]->wwzz[sib_index]; local_nvd->vvzz[sib_index] = tbl[j][k]->vvzz[sib_index]; local_nvd->yy[sib_index] = 1.0 - tbl[j][k]->zz[sib_index]; memcpy(local_nvd->xx[sib_index], sib_back_ptr->x[i][j], sizeof(sitelike)); } /* Loop 2.2 */ for (sib_index=0; sib_index < num_sibs; sib_index++) { local_nvd->sum[sib_index] = local_nvd->yy[sib_index] * (freqa * local_nvd->xx[sib_index][(long)A] + freqc * local_nvd->xx[sib_index][(long)C] + freqg * local_nvd->xx[sib_index][(long)G] + freqt * local_nvd->xx[sib_index][(long)T]); local_nvd->sumr[sib_index] = freqar * local_nvd->xx[sib_index][(long)A] + freqgr * 
local_nvd->xx[sib_index][(long)G]; local_nvd->sumy[sib_index] = freqcy * local_nvd->xx[sib_index][(long)C] + freqty * local_nvd->xx[sib_index][(long)T]; local_nvd->vzsumr[sib_index] = local_nvd->vvzz[sib_index] * local_nvd->sumr[sib_index]; local_nvd->vzsumy[sib_index] = local_nvd->vvzz[sib_index] * local_nvd->sumy[sib_index]; } /* Initialize to one, multiply incremental values for every sibling a node has */ p_xx[(long)A] = 1 ; p_xx[(long)C] = 1 ; p_xx[(long)G] = 1 ; p_xx[(long)T] = 1 ; for (sib_index=0; sib_index < num_sibs; sib_index++) { p_xx[(long)A] *= local_nvd->sum[sib_index] + local_nvd->wwzz[sib_index] * local_nvd->xx[sib_index][(long)A] + local_nvd->vzsumr[sib_index]; p_xx[(long)C] *= local_nvd->sum[sib_index] + local_nvd->wwzz[sib_index] * local_nvd->xx[sib_index][(long)C] + local_nvd->vzsumy[sib_index]; p_xx[(long)G] *= local_nvd->sum[sib_index] + local_nvd->wwzz[sib_index] * local_nvd->xx[sib_index][(long)G] + local_nvd->vzsumr[sib_index]; p_xx[(long)T] *= local_nvd->sum[sib_index] + local_nvd->wwzz[sib_index] * local_nvd->xx[sib_index][(long)T] + local_nvd->vzsumy[sib_index]; } for ( l = 0 ; l < ((long)T - (long)A + 1); l++ ) { if ( p_xx[l] > maxx ) maxx = p_xx[l]; } /* And the final point of this whole function: */ memcpy(p->x[i][j], p_xx, sizeof(sitelike)); } p->underflows[i] = 0; if ( maxx < MIN_DOUBLE) fix_x(p,i,maxx,rcategs); p->underflows[i] += correction; } p->initialized = true; free_nvd (local_nvd); free (local_nvd); } /* nuview */ void slopecurv(node *p,double y,double *like,double *slope,double *curve) { /* compute log likelihood, slope and curvature at node p */ long i, j, k, lai; double sum, sumc, sumterm, lterm, sumcs, sumcc, sum2, slope2, curve2, temp; double lz, zz, z1, zzs, z1s, zzc, z1c, aa, bb, cc, prod1, prod2, prod12, prod3; contribarr thelike, nulike, nuslope, nucurve, theslope, thecurve, clai, cslai, cclai; node *q; sitelike x1, x2; q = p->back; sum = 0.0; lz = -y; for (i = 0; i < rcategs; i++) for (j = 0; j < categs; j++) { 
tbl[i][j]->orig_zz = exp(tbl[i][j]->rat * lz); tbl[i][j]->z1 = exp(tbl[i][j]->ratxv * lz); } for (i = 0; i < endsite; i++) { k = category[alias[i]-1] - 1; for (j = 0; j < rcategs; j++) { if (y > 0.0) { zz = tbl[j][k]->orig_zz; z1 = tbl[j][k]->z1; } else { zz = 1.0; z1 = 1.0; } zzs = -tbl[j][k]->rat * zz ; z1s = -tbl[j][k]->ratxv * z1 ; temp = tbl[j][k]->rat; zzc = temp * temp * zz; temp = tbl[j][k]->ratxv; z1c = temp * temp * z1; memcpy(x1, p->x[i][j], sizeof(sitelike)); prod1 = freqa * x1[0] + freqc * x1[(long)C - (long)A] + freqg * x1[(long)G - (long)A] + freqt * x1[(long)T - (long)A]; memcpy(x2, q->x[i][j], sizeof(sitelike)); prod2 = freqa * x2[0] + freqc * x2[(long)C - (long)A] + freqg * x2[(long)G - (long)A] + freqt * x2[(long)T - (long)A]; prod3 = (x1[0] * freqa + x1[(long)G - (long)A] * freqg) * (x2[0] * freqar + x2[(long)G - (long)A] * freqgr) + (x1[(long)C - (long)A] * freqc + x1[(long)T - (long)A] * freqt) * (x2[(long)C - (long)A] * freqcy + x2[(long)T - (long)A] * freqty); prod12 = freqa * x1[0] * x2[0] + freqc * x1[(long)C - (long)A] * x2[(long)C - (long)A] + freqg * x1[(long)G - (long)A] * x2[(long)G - (long)A] + freqt * x1[(long)T - (long)A] * x2[(long)T - (long)A]; aa = prod12 - prod3; bb = prod3 - prod1*prod2; cc = prod1 * prod2; term[i][j] = zz * aa + z1 * bb + cc; slopeterm[i][j] = zzs * aa + z1s * bb; curveterm[i][j] = zzc * aa + z1c * bb; } sumterm = 0.0; for (j = 0; j < rcategs; j++) sumterm += probcat[j] * term[i][j]; lterm = log(sumterm) + p->underflows[i] + q->underflows[i]; for (j = 0; j < rcategs; j++) { term[i][j] = term[i][j] / sumterm; slopeterm[i][j] = slopeterm[i][j] / sumterm; curveterm[i][j] = curveterm[i][j] / sumterm; } sum += aliasweight[i] * lterm; } for (i = 0; i < rcategs; i++) { thelike[i] = 1.0; theslope[i] = 0.0; thecurve[i] = 0.0; } for (i = 0; i < sites; i++) { sumc = 0.0; sumcs = 0.0; sumcc = 0.0; for (k = 0; k < rcategs; k++) { sumc += probcat[k] * thelike[k]; sumcs += probcat[k] * theslope[k]; sumcc += probcat[k] * 
thecurve[k]; } sumc *= lambda; sumcs *= lambda; sumcc *= lambda; if ((ally[i] > 0) && (location[ally[i]-1] > 0)) { lai = location[ally[i] - 1]; memcpy(clai, term[lai - 1], rcategs*sizeof(double)); memcpy(cslai, slopeterm[lai - 1], rcategs*sizeof(double)); memcpy(cclai, curveterm[lai - 1], rcategs*sizeof(double)); if (weight[i] > 1) { for (j = 0; j < rcategs; j++) { if (clai[j] > 0.0) clai[j] = exp(weight[i]*log(clai[j])); else clai[j] = 0.0; if (cslai[j] > 0.0) cslai[j] = exp(weight[i]*log(cslai[j])); else cslai[j] = 0.0; if (cclai[j] > 0.0) cclai[j] = exp(weight[i]*log(cclai[j])); else cclai[j] = 0.0; } } for (j = 0; j < rcategs; j++) { nulike[j] = ((1.0 - lambda) * thelike[j] + sumc) * clai[j]; nuslope[j] = ((1.0 - lambda) * theslope[j] + sumcs) * clai[j] + ((1.0 - lambda) * thelike[j] + sumc) * cslai[j]; nucurve[j] = ((1.0 - lambda) * thecurve[j] + sumcc) * clai[j] + 2.0 * ((1.0 - lambda) * theslope[j] + sumcs) * cslai[j] + ((1.0 - lambda) * thelike[j] + sumc) * cclai[j]; } } else { for (j = 0; j < rcategs; j++) { nulike[j] = ((1.0 - lambda) * thelike[j] + sumc); nuslope[j] = ((1.0 - lambda) * theslope[j] + sumcs); nucurve[j] = ((1.0 - lambda) * thecurve[j] + sumcc); } } memcpy(thelike, nulike, rcategs*sizeof(double)); memcpy(theslope, nuslope, rcategs*sizeof(double)); memcpy(thecurve, nucurve, rcategs*sizeof(double)); } sum2 = 0.0; slope2 = 0.0; curve2 = 0.0; for (i = 0; i < rcategs; i++) { sum2 += probcat[i] * thelike[i]; slope2 += probcat[i] * theslope[i]; curve2 += probcat[i] * thecurve[i]; } sum += log(sum2); (*like) = sum; (*slope) = slope2 / sum2; /* Expressed in terms of *slope to prevent overflow */ (*curve) = curve2 / sum2 - *slope * *slope; } /* slopecurv */ void makenewv(node *p) { /* Newton-Raphson algorithm improvement of a branch length */ long it, ite; double y, yold=0, yorig, like, slope, curve, oldlike=0; boolean done, firsttime, better; node *q; q = p->back; y = p->v; yorig = y; done = false; firsttime = true; it = 1; ite = 0; while ((it < 
iterations) && (ite < 20) && (!done)) { slopecurv (p, y, &like, &slope, &curve); better = false; if (firsttime) { /* if no older value of y to compare with */ yold = y; oldlike = like; firsttime = false; better = true; } else { if (like > oldlike) { /* update the value of yold if it was better */ yold = y; oldlike = like; better = true; it++; } } if (better) { y = y + slope/fabs(curve); /* Newton-Raphson, forced uphill-wards */ if (y < epsilon) y = epsilon; } else { if (fabs(y - yold) < epsilon) ite = 20; y = (y + 19*yold) / 20.0; /* retract 95% of way back */ } ite++; done = fabs(y-yold) < 0.1*epsilon; } smoothed = (fabs(yold-yorig) < epsilon) && (yorig > 1000.0*epsilon); p->v = yold; /* the last one that had better likelihood */ q->v = yold; curtree.likelihood = oldlike; } /* makenewv */ void update(node *p) { long num_sibs, i; node* sib_ptr; if (!p->tip && !p->initialized) nuview(p); if (!p->back->tip && !p->back->initialized) nuview(p->back); if ((!usertree) || (usertree && !lngths) || p->iter) { makenewv(p); if ( smoothit ) { inittrav(p); inittrav(p->back); } else { if (inserting) { num_sibs = count_sibs (p); sib_ptr = p; for (i=0; i < num_sibs; i++) { sib_ptr = sib_ptr->next; sib_ptr->initialized = false; } } } } } /* update */ void smooth(node *p) { long i, num_sibs; node *sib_ptr; smoothed = false; update (p); if (p->tip) return; num_sibs = count_sibs (p); sib_ptr = p; for (i=0; i < num_sibs; i++) { sib_ptr = sib_ptr->next; if (polishing || (smoothit && !smoothed)) { smooth(sib_ptr->back); } } } /* smooth */ void insert_(node *p, node *q, boolean dooinit) { /* Insert q near p */ /* assumes bifurcation (OK) */ long i; node *r; r = p->next->next; hookup(r, q->back); hookup(p->next, q); q->v = 0.5 * q->v; q->back->v = q->v; r->v = q->v; r->back->v = r->v; p->initialized = false; if (dooinit) { inittrav(p); inittrav(p->back); } i = 1; inserting = true; while (i <= smoothings) { smooth (p); if ( !p->tip ) { smooth(p->next); smooth(p->next->next); } i++; } 
  inserting = false;
}  /* insert_ */


void dnaml_re_move(node **p, node **q)
{
  /* remove p and record in q where it was */
  long i;

  /* assumes bifurcation (OK) */
  *q = (*p)->next->back;
  hookup(*q, (*p)->next->next->back);

  (*p)->next->back = NULL;
  (*p)->next->next->back = NULL;
  (*q)->v += (*q)->back->v;
  (*q)->back->v = (*q)->v;

  if ( smoothit ) {
    /* invalidate views around the rejoined branch, then smooth both ends */
    inittrav((*q));
    inittrav((*q)->back);
    for ( i = 0 ; i < smoothings ; i++ ) {
      smooth(*q);
      smooth((*q)->back);
    }
  }
  else
    smooth(*q);
}  /* dnaml_re_move */


void buildnewtip(long m, tree *tr)
{
  node *p;

  p = tr->nodep[nextsp + spp - 3];
  hookup(tr->nodep[m - 1], p);
  p->v = initialv;
  p->back->v = initialv;
}  /* buildnewtip */


void buildsimpletree(tree *tr)
{
  hookup(tr->nodep[enterorder[0] - 1], tr->nodep[enterorder[1] - 1]);
  tr->nodep[enterorder[0] - 1]->v = 0.1;
  tr->nodep[enterorder[0] - 1]->back->v = 0.1;
  tr->nodep[enterorder[1] - 1]->v = 0.1;
  tr->nodep[enterorder[1] - 1]->back->v = 0.1;
  buildnewtip(enterorder[2], tr);
  insert_(tr->nodep[enterorder[2] - 1]->back,
          tr->nodep[enterorder[0] - 1], false);
}  /* buildsimpletree */


void addtraverse(node *p, node *q, boolean contin)
{
  /* try adding p at q, proceed recursively through tree */
  long i, num_sibs;
  double like, vsave = 0;
  node *qback = NULL, *sib_ptr;

  if (!smoothit) {
    vsave = q->v;
    qback = q->back;
  }
  insert_(p, q, smoothit);
  like = evaluate(p, false);
  if (like > bestyet + LIKE_EPSILON || bestyet == UNDEFINED) {
    bestyet = like;
    if (smoothit) {
      dnamlcopy(&curtree, &bestree, nonodes2, rcategs);
      addwhere = q;
    }
    else
      qwhere = q;
    succeeded = true;
  }
  if (smoothit)
    dnamlcopy(&priortree, &curtree, nonodes2, rcategs);
  else {
    hookup (q, qback);
    q->v = vsave;
    q->back->v = vsave;
    curtree.likelihood = bestyet;
  }
  if (!q->tip && contin) {
    num_sibs = count_sibs (q);
    if (q == curtree.start)
      num_sibs++;
    sib_ptr = q;
    for (i=0; i < num_sibs; i++) {
      addtraverse(p, sib_ptr->next->back, contin);
      sib_ptr = sib_ptr->next;
    }
  }
}  /* addtraverse */


void freelrsaves()
{
  long i,j;
  for ( i = 0
; i < NLRSAVES ; i++ ) { for (j = 0; j < oldendsite; j++) free(lrsaves[i]->x[j]); free(lrsaves[i]->x); free(lrsaves[i]->underflows); free(lrsaves[i]); } free(lrsaves); } void resetlrsaves() { freelrsaves(); alloclrsaves(); } void alloclrsaves() { long i,j; lrsaves = Malloc(NLRSAVES * sizeof(node*)); for ( i = 0 ; i < NLRSAVES ; i++ ) { lrsaves[i] = Malloc(sizeof(node)); lrsaves[i]->x = (phenotype)Malloc(endsite*sizeof(ratelike)); lrsaves[i]->underflows = Malloc(endsite * sizeof (double)); for (j = 0; j < endsite; j++) lrsaves[i]->x[j] = (ratelike)Malloc(rcategs*sizeof(sitelike)); } } /* alloclrsaves */ void globrearrange() { /* does global rearrangements */ tree globtree; tree oldtree; int i,j,k,l,num_sibs,num_sibs2; node *where,*sib_ptr,*sib_ptr2; double oldbestyet = curtree.likelihood; int success = false; alloctree(&globtree.nodep,nonodes2,0); alloctree(&oldtree.nodep,nonodes2,0); setuptree2(&globtree); setuptree2(&oldtree); allocx(nonodes2, rcategs, globtree.nodep, 0); allocx(nonodes2, rcategs, oldtree.nodep, 0); dnamlcopy(&curtree,&globtree,nonodes2,rcategs); dnamlcopy(&curtree,&oldtree,nonodes2,rcategs); bestyet = curtree.likelihood; for ( i = spp ; i < nonodes2 ; i++ ) { num_sibs = count_sibs(curtree.nodep[i]); sib_ptr = curtree.nodep[i]; if ( (i - spp) % (( nonodes2 / 72 ) + 1 ) == 0 ) putchar('.'); fflush(stdout); for ( j = 0 ; j <= num_sibs ; j++ ) { dnaml_re_move(&sib_ptr,&where); dnamlcopy(&curtree,&priortree,nonodes2,rcategs); qwhere = where; if (where->tip) { dnamlcopy(&oldtree,&curtree,nonodes2,rcategs); dnamlcopy(&oldtree,&bestree,nonodes2,rcategs); sib_ptr=sib_ptr->next; continue; } else num_sibs2 = count_sibs(where); sib_ptr2 = where; for ( k = 0 ; k < num_sibs2 ; k++ ) { addwhere = NULL; addtraverse(sib_ptr,sib_ptr2->back,true); if ( !smoothit ) { if (succeeded && qwhere != where && qwhere != where->back) { insert_(sib_ptr,qwhere,true); smoothit = true; for (l = 1; l<=smoothings; l++) { smooth (where); smooth (where->back); } smoothit = false; 
success = true; dnamlcopy(&curtree,&globtree,nonodes2,rcategs); dnamlcopy(&priortree,&curtree,nonodes2,rcategs); } } else if ( addwhere && where != addwhere && where->back != addwhere && bestyet > globtree.likelihood) { dnamlcopy(&bestree,&globtree,nonodes2,rcategs); success = true; } sib_ptr2 = sib_ptr2->next; } dnamlcopy(&oldtree,&curtree,nonodes2,rcategs); dnamlcopy(&oldtree,&bestree,nonodes2,rcategs); sib_ptr = sib_ptr->next; } } dnamlcopy(&globtree,&curtree,nonodes2,rcategs); dnamlcopy(&globtree,&bestree,nonodes2,rcategs); if (success && globtree.likelihood > oldbestyet) { succeeded = true; } else { succeeded = false; } bestyet = globtree.likelihood; freex(nonodes2, globtree.nodep); freex(nonodes2, oldtree.nodep); freetree2(globtree.nodep, nonodes2); freetree2(oldtree.nodep, nonodes2); } /* globrearrange */ void rearrange(node *p, node *pp) { /* rearranges the tree locally moving pp around near p */ /* assumes bifurcation (OK) */ long i, num_sibs; node *q, *r, *sib_ptr; node *rnb, *rnnb; if (!p->tip && !p->back->tip) { curtree.likelihood = bestyet; if (p->back->next != pp) r = p->back->next; else r = p->back->next->next; /* assumes bifurcation, that's ok */ if (!smoothit) { rnb = r->next->back; rnnb = r->next->next->back; copynode(r,lrsaves[0],rcategs); copynode(r->next,lrsaves[1],rcategs); copynode(r->next->next,lrsaves[2],rcategs); copynode(p->next,lrsaves[3],rcategs); copynode(p->next->next,lrsaves[4],rcategs); } else dnamlcopy(&curtree, &bestree, nonodes2, rcategs); dnaml_re_move(&r, &q); nuview(p->next); nuview(p->next->next); if (smoothit) dnamlcopy(&curtree, &priortree, nonodes2, rcategs); else qwhere = q; num_sibs = count_sibs (p); sib_ptr = p; for (i=0; i < num_sibs; i++) { sib_ptr = sib_ptr->next; addtraverse(r, sib_ptr->back, false); } if (smoothit) dnamlcopy(&bestree, &curtree, nonodes2, rcategs); else { if (qwhere == q) { hookup(rnb,r->next); hookup(rnnb,r->next->next); copynode(lrsaves[0],r,rcategs); copynode(lrsaves[1],r->next,rcategs); 
copynode(lrsaves[2],r->next->next,rcategs); copynode(lrsaves[3],p->next,rcategs); copynode(lrsaves[4],p->next->next,rcategs); rnb->v = r->next->v; rnnb->v = r->next->next->v; r->back->v = r->v; curtree.likelihood = bestyet; } else { insert_(r, qwhere, true); smoothit = true; for (i = 1; i<=smoothings; i++) { smooth (r); smooth (r->back); } smoothit = false; } } } if (!p->tip) { num_sibs = count_sibs (p); if (p == curtree.start) num_sibs++; sib_ptr = p; for (i=0; i < num_sibs; i++) { sib_ptr = sib_ptr->next; rearrange(sib_ptr->back, p); } } } /* rearrange */ void initdnamlnode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnu(grbg, p); (*p)->index = nodei; (*p)->tip = false; malloc_pheno((*p), endsite, rcategs); nodep[(*p)->index - 1] = (*p); break; case nonbottom: gnu(grbg, p); malloc_pheno(*p, endsite, rcategs); (*p)->index = nodei; break; case tip: match_names_to_data (str, nodep, p, spp); break; case iter: (*p)->initialized = false; (*p)->v = initialv; (*p)->iter = true; if ((*p)->back != NULL){ (*p)->back->iter = true; (*p)->back->v = initialv; (*p)->back->initialized = false; } break; case length: processlength(&valyew, &divisor, ch, &minusread, intree, parens); (*p)->v = valyew / divisor / fracchange; (*p)->iter = false; if ((*p)->back != NULL) { (*p)->back->v = (*p)->v; (*p)->back->iter = false; } break; case hsnolength: haslengths = false; break; default: /* cases hslength, treewt, unittrwt */ break; /* should never occur */ } } /* initdnamlnode */ void dnaml_coordinates(node *p, double lengthsum, long *tipy, double *tipmax) { /* establishes coordinates of nodes */ node *q, *first, *last; double xx; if (p->tip) { p->xcoord = (long)(over * lengthsum + 0.5); p->ycoord = (*tipy); p->ymin = (*tipy); p->ymax = (*tipy); (*tipy) += 
down; if (lengthsum > (*tipmax)) (*tipmax) = lengthsum; return; } q = p->next; do { xx = fracchange * q->v; if (xx > 100.0) xx = 100.0; dnaml_coordinates(q->back, lengthsum + xx, tipy,tipmax); q = q->next; } while ((p == curtree.start || p != q) && (p != curtree.start || p->next != q)); first = p->next->back; q = p; while (q->next != p) q = q->next; last = q->back; p->xcoord = (long)(over * lengthsum + 0.5); if (p == curtree.start) p->ycoord = p->next->next->back->ycoord; else p->ycoord = (first->ycoord + last->ycoord) / 2; p->ymin = first->ymin; p->ymax = last->ymax; } /* dnaml_coordinates */ void dnaml_printree() { /* prints out diagram of the tree2 */ long tipy; double scale, tipmax; long i; if (!treeprint) return; putc('\n', outfile); tipy = 1; tipmax = 0.0; dnaml_coordinates(curtree.start, 0.0, &tipy, &tipmax); scale = 1.0 / (long)(tipmax + 1.000); for (i = 1; i <= (tipy - down); i++) drawline2(i, scale, curtree); putc('\n', outfile); } /* dnaml_printree */ void sigma(node *p, double *sumlr, double *s1, double *s2) { /* compute standard deviation */ double tt, aa, like, slope, curv; slopecurv (p, p->v, &like, &slope, &curv); tt = p->v; p->v = epsilon; p->back->v = epsilon; aa = evaluate(p, false); p->v = tt; p->back->v = tt; (*sumlr) = evaluate(p, false) - aa; if (curv < -epsilon) { (*s1) = p->v + (-slope - sqrt(slope * slope - 3.841 * curv)) / curv; (*s2) = p->v + (-slope + sqrt(slope * slope - 3.841 * curv)) / curv; } else { (*s1) = -1.0; (*s2) = -1.0; } } /* sigma */ void describe(node *p) { /* print out information for one branch */ long i, num_sibs; node *q, *sib_ptr; double sumlr, sigma1, sigma2; if (!p->tip && !p->initialized) nuview(p); if (!p->back->tip && !p->back->initialized) nuview(p->back); q = p->back; if (q->tip) { fprintf(outfile, " "); for (i = 0; i < nmlngth; i++) putc(nayme[q->index-1][i], outfile); fprintf(outfile, " "); } else fprintf(outfile, " %4ld ", q->index - spp); if (p->tip) { for (i = 0; i < nmlngth; i++) 
putc(nayme[p->index-1][i], outfile); } else fprintf(outfile, "%4ld ", p->index - spp); fprintf(outfile, "%15.5f", q->v * fracchange); if (!usertree || (usertree && !lngths) || p->iter) { sigma(q, &sumlr, &sigma1, &sigma2); if (sigma1 <= sigma2) fprintf(outfile, " ( zero, infinity)"); else { fprintf(outfile, " ("); if (sigma2 <= 0.0) fprintf(outfile, " zero"); else fprintf(outfile, "%9.5f", sigma2 * fracchange); fprintf(outfile, ",%12.5f", sigma1 * fracchange); putc(')', outfile); } if (sumlr > 1.9205) fprintf(outfile, " *"); if (sumlr > 2.995) putc('*', outfile); } putc('\n', outfile); if (!p->tip) { num_sibs = count_sibs (p); sib_ptr = p; for (i=0; i < num_sibs; i++) { sib_ptr = sib_ptr->next; describe(sib_ptr->back); } } } /* describe */ void reconstr(node *p, long n) { /* reconstruct and print out base at site n+1 at node p */ long i, j, k, m, first, second, num_sibs; double f, sum, xx[4]; node *q; if ((ally[n] == 0) || (location[ally[n]-1] == 0)) putc('.', outfile); else { j = location[ally[n]-1] - 1; for (i = 0; i < 4; i++) { f = p->x[j][mx-1][i]; num_sibs = count_sibs(p); q = p; for (k = 0; k < num_sibs; k++) { q = q->next; f *= q->x[j][mx-1][i]; } f = sqrt(f); xx[i] = f; } xx[0] *= freqa; xx[1] *= freqc; xx[2] *= freqg; xx[3] *= freqt; sum = xx[0]+xx[1]+xx[2]+xx[3]; for (i = 0; i < 4; i++) xx[i] /= sum; first = 0; for (i = 1; i < 4; i++) if (xx [i] > xx[first]) first = i; if (first == 0) second = 1; else second = 0; for (i = 0; i < 4; i++) if ((i != first) && (xx[i] > xx[second])) second = i; m = 1 << first; if (xx[first] < 0.4999995) m = m + (1 << second); if (xx[first] > 0.95) putc(toupper(basechar[m - 1]), outfile); else putc(basechar[m - 1], outfile); if (rctgry && rcategs > 1) mx = mp[n][mx - 1]; else mx = 1; } } /* reconstr */ void rectrav(node *p, long m, long n) { /* print out segment of reconstructed sequence for one branch */ long i; node *q; putc(' ', outfile); if (p->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[p->index-1][i], outfile); } else 
fprintf(outfile, "%4ld ", p->index - spp); fprintf(outfile, " "); mx = mx0; for (i = m; i <= n; i++) { if ((i % 10 == 0) && (i != m)) putc(' ', outfile); if (p->tip) putc(y[p->index-1][i], outfile); else reconstr(p, i); } putc('\n', outfile); if (!p->tip) { for ( q = p->next; q != p; q = q->next ) rectrav(q->back, m, n); } mx1 = mx; } /* rectrav */ void summarize() { /* print out branch length information and node numbers */ long i, j, mm, num_sibs; double mode, sum; double like[maxcategs], nulike[maxcategs]; double **marginal; node *sib_ptr; if (!treeprint) return; fprintf(outfile, "\nremember: "); if (outgropt) fprintf(outfile, "(although rooted by outgroup) "); fprintf(outfile, "this is an unrooted tree!\n\n"); fprintf(outfile, "Ln Likelihood = %11.5f\n", curtree.likelihood); fprintf(outfile, "\n Between And Length"); if (!(usertree && lngths && haslengths)) fprintf(outfile, " Approx. Confidence Limits"); fprintf(outfile, "\n"); fprintf(outfile, " ------- --- ------"); if (!(usertree && lngths && haslengths)) fprintf(outfile, " ------- ---------- ------"); fprintf(outfile, "\n\n"); for (i = spp; i < nonodes2; i++) { /* So this works with arbitrary multifurcations */ if (curtree.nodep[i]) { num_sibs = count_sibs (curtree.nodep[i]); sib_ptr = curtree.nodep[i]; for (j = 0; j < num_sibs; j++) { sib_ptr->initialized = false; sib_ptr = sib_ptr->next; } } } describe(curtree.start->back); /* So this works with arbitrary multifurcations */ num_sibs = count_sibs (curtree.start); sib_ptr = curtree.start; for (i=0; i < num_sibs; i++) { sib_ptr = sib_ptr->next; describe(sib_ptr->back); } fprintf(outfile, "\n"); if (!(usertree && lngths && haslengths)) { fprintf(outfile, " * = significantly positive, P < 0.05\n"); fprintf(outfile, " ** = significantly positive, P < 0.01\n\n"); } dummy = evaluate(curtree.start, false); if (rctgry && rcategs > 1) { for (i = 0; i < rcategs; i++) like[i] = 1.0; for (i = sites - 1; i >= 0; i--) { sum = 0.0; for (j = 0; j < rcategs; j++) { 
nulike[j] = (1.0 - lambda + lambda * probcat[j]) * like[j]; mp[i][j] = j + 1; for (k = 1; k <= rcategs; k++) { if (k != j + 1) { if (lambda * probcat[k - 1] * like[k - 1] > nulike[j]) { nulike[j] = lambda * probcat[k - 1] * like[k - 1]; mp[i][j] = k; } } } if ((ally[i] > 0) && (location[ally[i]-1] > 0)) nulike[j] *= contribution[location[ally[i] - 1] - 1][j]; sum += nulike[j]; } for (j = 0; j < rcategs; j++) nulike[j] /= sum; memcpy(like, nulike, rcategs * sizeof(double)); } mode = 0.0; mx = 1; for (i = 1; i <= rcategs; i++) { if (probcat[i - 1] * like[i - 1] > mode) { mx = i; mode = probcat[i - 1] * like[i - 1]; } } mx0 = mx; fprintf(outfile, "Combination of categories that contributes the most to the likelihood:\n\n"); for (i = 1; i <= nmlngth + 3; i++) putc(' ', outfile); for (i = 1; i <= sites; i++) { fprintf(outfile, "%ld", mx); if (i % 10 == 0) putc(' ', outfile); if (i % 60 == 0 && i != sites) { putc('\n', outfile); for (j = 1; j <= nmlngth + 3; j++) putc(' ', outfile); } mx = mp[i - 1][mx - 1]; } fprintf(outfile, "\n\n"); marginal = (double **) Malloc( sites*sizeof(double *)); for (i = 0; i < sites; i++) marginal[i] = (double *) Malloc( rcategs*sizeof(double)); for (i = 0; i < rcategs; i++) like[i] = 1.0; for (i = sites - 1; i >= 0; i--) { sum = 0.0; for (j = 0; j < rcategs; j++) { nulike[j] = (1.0 - lambda + lambda * probcat[j]) * like[j]; for (k = 1; k <= rcategs; k++) { if (k != j + 1) nulike[j] += lambda * probcat[k - 1] * like[k - 1]; } if ((ally[i] > 0) && (location[ally[i]-1] > 0)) nulike[j] *= contribution[location[ally[i] - 1] - 1][j]; sum += nulike[j]; } for (j = 0; j < rcategs; j++) { nulike[j] /= sum; marginal[i][j] = nulike[j]; } memcpy(like, nulike, rcategs * sizeof(double)); } for (i = 0; i < rcategs; i++) like[i] = 1.0; for (i = 0; i < sites; i++) { sum = 0.0; for (j = 0; j < rcategs; j++) { nulike[j] = (1.0 - lambda + lambda * probcat[j]) * like[j]; for (k = 1; k <= rcategs; k++) { if (k != j + 1) nulike[j] += lambda * probcat[k - 1] * 
like[k - 1]; } marginal[i][j] *= like[j] * probcat[j]; sum += nulike[j]; } for (j = 0; j < rcategs; j++) nulike[j] /= sum; memcpy(like, nulike, rcategs * sizeof(double)); sum = 0.0; for (j = 0; j < rcategs; j++) sum += marginal[i][j]; for (j = 0; j < rcategs; j++) marginal[i][j] /= sum; } fprintf(outfile, "Most probable category at each site if > 0.95" " probability (\".\" otherwise)\n\n"); for (i = 1; i <= nmlngth + 3; i++) putc(' ', outfile); for (i = 0; i < sites; i++) { mm = 0; sum = 0.0; for (j = 0; j < rcategs; j++) if (marginal[i][j] > sum) { sum = marginal[i][j]; mm = j; } if (sum >= 0.95) fprintf(outfile, "%ld", mm+1); else putc('.', outfile); if ((i+1) % 60 == 0) { if (i != 0) { putc('\n', outfile); for (j = 1; j <= nmlngth + 3; j++) putc(' ', outfile); } } else if ((i+1) % 10 == 0) putc(' ', outfile); } putc('\n', outfile); for (i = 0; i < sites; i++) free(marginal[i]); free(marginal); } putc('\n', outfile); if (hypstate) { fprintf(outfile, "Probable sequences at interior nodes:\n\n"); fprintf(outfile, " node "); for (i = 0; (i < 13) && (i < ((sites + (sites-1)/10 - 39) / 2)); i++) putc(' ', outfile); fprintf(outfile, "Reconstructed sequence (caps if > 0.95)\n\n"); if (!rctgry || (rcategs == 1)) mx0 = 1; for (i = 0; i < sites; i += 60) { k = i + 59; if (k >= sites) k = sites - 1; rectrav(curtree.start, i, k); rectrav(curtree.start->back, i, k); putc('\n', outfile); mx0 = mx1; } } } /* summarize */ void dnaml_treeout(node *p) { /* write out file with representation of final tree2 */ long i, n, w; Char c; double x; node *q; boolean inloop; if (p->tip) { n = 0; for (i = 1; i <= nmlngth; i++) { if (nayme[p->index-1][i - 1] != ' ') n = i; } for (i = 0; i < n; i++) { c = nayme[p->index-1][i]; if (c == ' ') c = '_'; putc(c, outtree); } col += n; } else { putc('(', outtree); col++; inloop = false; q = p->next; do { if (inloop) { putc(',', outtree); col++; if (col > 45) { putc('\n', outtree); col = 0; } } inloop = true; dnaml_treeout(q->back); q = q->next; } 
while ((p == curtree.start || p != q) && (p != curtree.start || p->next != q)); putc(')', outtree); col++; } x = p->v * fracchange; if (x > 0.0) w = (long)(0.43429448222 * log(x)); else if (x == 0.0) w = 0; else w = (long)(0.43429448222 * log(-x)) + 1; if (w < 0) w = 0; if (p == curtree.start) fprintf(outtree, ";\n"); else { fprintf(outtree, ":%*.5f", (int)(w + 7), x); col += w + 8; } } /* dnaml_treeout */ void inittravtree(node *p) { /* traverse tree to set initialized and v to initial values */ node *q; p->initialized = false; p->back->initialized = false; if ( usertree && (!lngths || p->iter) ) { p->v = initialv; p->back->v = initialv; } if ( !p->tip ) { q = p->next; while ( q != p ) { inittravtree(q->back); q = q->next; } } } /* inittravtree */ void treevaluate() { /* evaluate a user tree */ long i; inittravtree(curtree.start); polishing = true; smoothit = true; for (i = 1; i <= smoothings * 4; i++) smooth (curtree.start); dummy = evaluate(curtree.start, true); } /* treevaluate */ void dnaml_unroot(node* root, node** nodep, long nonodes) { node *p,*r,*q; double newl; long i; long numsibs; numsibs = count_sibs(root); if ( numsibs > 2 ) { q = root; r = root; while (!(q->next == root)) q = q->next; q->next = root->next; for(i=0 ; i < endsite ; i++){ free(r->x[i]); r->x[i] = NULL; } free(r->x); r->x = NULL; chuck(&grbg, r); curtree.nodep[spp] = q; } else { /* Bifurcating root - remove entire root fork */ /* Join oldlen on each side of root */ newl = root->next->oldlen + root->next->next->oldlen; root->next->back->oldlen = newl; root->next->next->back->oldlen = newl; /* Join v on each side of root */ newl = root->next->v + root->next->next->v; root->next->back->v = newl; root->next->next->back->v = newl; /* Connect root's children */ root->next->back->back=root->next->next->back; root->next->next->back->back = root->next->back; /* Move nodep entries down one and set indices */ for ( i = spp; i < nonodes-1; i++ ) { p = nodep[i+1]; nodep[i] = p; nodep[i+1] = NULL; if 
( nodep[i] == NULL ) /* This may happen in a multifurcating intree */ break; do { p->index = i+1; p = p->next; } while (p != nodep[i]); } /* Free protx arrays from old root */ for(i=0 ; i < endsite ; i++){ free(root->x[i]); free(root->next->x[i]); free(root->next->next->x[i]); root->x[i] = NULL; root->next->x[i] = NULL; root->next->next->x[i] = NULL; } free(root->x); free(root->next->x); free(root->next->next->x); chuck(&grbg,root->next->next); chuck(&grbg,root->next); chuck(&grbg,root); } } /* dnaml_unroot */ void maketree() { long i, j; boolean dummy_first, goteof; pointarray dummy_treenode=NULL; long nextnode; node *root; inittable(); if (usertree) { /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree,INTREE,"input tree file", "rb",progname,intreename); inittable_for_usertree (intree); numtrees = countsemic(&intree); if(numtrees > MAXSHIMOTREES) shimotrees = MAXSHIMOTREES; else shimotrees = numtrees; if (numtrees > 2) initseed(&inseed, &inseed0, seed); l0gl = (double *) Malloc(shimotrees * sizeof(double)); l0gf = (double **) Malloc(shimotrees * sizeof(double *)); for (i=0; i < shimotrees; ++i) l0gf[i] = (double *) Malloc(endsite * sizeof(double)); if (treeprint) { fprintf(outfile, "User-defined tree"); if (numtrees > 1) putc('s', outfile); fprintf(outfile, ":\n\n"); } which = 1; /* This taken out of tree read, used to be [spp-1], but referring to [0] produces output identical to what the pre-modified dnaml produced. 
*/ while (which <= numtrees) { /* These initializations required each time through the loop since multiple trees require re-initialization */ haslengths = true; nextnode = 0; dummy_first = true; goteof = false; treeread(intree, &root, dummy_treenode, &goteof, &dummy_first, curtree.nodep, &nextnode, &haslengths, &grbg, initdnamlnode, false, nonodes2); dnaml_unroot(root, curtree.nodep, nonodes2); if (goteof && (which <= numtrees)) { /* if we hit the end of the file prematurely */ printf ("\n"); printf ("ERROR: trees missing at end of file.\n"); printf ("\tExpected number of trees:\t\t%ld\n", numtrees); printf ("\tNumber of trees actually in file:\t%ld.\n\n", which - 1); exxit(-1); } curtree.start = curtree.nodep[0]->back; if ( outgropt ) curtree.start = curtree.nodep[outgrno - 1]->back; treevaluate(); dnaml_printree(); summarize(); if (trout) { col = 0; dnaml_treeout(curtree.start); } if(which < numtrees){ freex_notip(nextnode, curtree.nodep); gdispose(curtree.start, &grbg, curtree.nodep); } else nonodes2 = nextnode; which++; } FClose(intree); putc('\n', outfile); if (!auto_ && numtrees > 1 && weightsum > 1 ) standev2(numtrees, maxwhich, 0, endsite-1, maxlogl, l0gl, l0gf, aliasweight, seed); } else { /* If there's no input user tree, */ for (i = 1; i <= spp; i++) enterorder[i - 1] = i; if (jumble) randumize(seed, enterorder); if (progress) { printf("\nAdding species:\n"); writename(0, 3, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } nextsp = 3; polishing = false; smoothit = improve; buildsimpletree(&curtree); curtree.start = curtree.nodep[enterorder[0] - 1]->back; nextsp = 4; while (nextsp <= spp) { buildnewtip(enterorder[nextsp - 1], &curtree); bestyet = UNDEFINED; if (smoothit) dnamlcopy(&curtree, &priortree, nonodes2, rcategs); addtraverse(curtree.nodep[enterorder[nextsp - 1] - 1]->back, curtree.start, true); if (smoothit) dnamlcopy(&bestree, &curtree, nonodes2, rcategs); else { insert_(curtree.nodep[enterorder[nextsp - 1] - 1]->back, qwhere, true); 
smoothit = true; for (i = 1; i<=smoothings; i++) { smooth (curtree.start); smooth (curtree.start->back); } smoothit = false; bestyet = curtree.likelihood; } if (progress) { writename(nextsp - 1, 1, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } if (global && nextsp == spp && progress) { printf("Doing global rearrangements\n"); printf(" !"); for (j = spp ; j < nonodes2 ; j++) if ( (j - spp) % (( nonodes2 / 72 ) + 1 ) == 0 ) putchar('-'); printf("!\n"); } succeeded = true; while (succeeded) { succeeded = false; if (global && nextsp == spp && progress) { printf(" "); fflush(stdout); } if (global && nextsp == spp) globrearrange(); else rearrange(curtree.start, curtree.start->back); if (global && nextsp == spp && progress) putchar('\n'); } nextsp++; } if ( !smoothit) dnamlcopy(&curtree, &bestree, nonodes2, rcategs); if (global && progress) { putchar('\n'); fflush(stdout); } if (njumble > 1) { if (jumb == 1) dnamlcopy(&bestree, &bestree2, nonodes2, rcategs); else if (bestree2.likelihood < bestree.likelihood) dnamlcopy(&bestree, &bestree2, nonodes2, rcategs); } if (jumb == njumble) { if (njumble > 1) dnamlcopy(&bestree2, &curtree, nonodes2, rcategs); curtree.start = curtree.nodep[outgrno - 1]->back; for (i = 0; i < nonodes2; i++) { if (i < spp) curtree.nodep[i]->initialized = false; else { curtree.nodep[i]->initialized = false; curtree.nodep[i]->next->initialized = false; curtree.nodep[i]->next->next->initialized = false; } } treevaluate(); dnaml_printree(); summarize(); if (trout) { col = 0; dnaml_treeout(curtree.start); } } } if (usertree) { free(l0gl); for (i=0; i < shimotrees; i++) free(l0gf[i]); free(l0gf); } freetable(); if (jumb < njumble) return; free(contribution); free(mp); for (i=0; i < endsite; i++) free(term[i]); free(term); for (i=0; i < endsite; i++) free(slopeterm[i]); free(slopeterm); for (i=0; i < endsite; i++) free(curveterm[i]); free(curveterm); freex(nonodes2, curtree.nodep); if (!usertree) { freex(nonodes2, bestree.nodep); freex(nonodes2, 
priortree.nodep); if (njumble > 1) freex(nonodes2, bestree2.nodep); } if (progress) { printf("\nOutput written to file \"%s\"\n", outfilename); if (trout) printf("\nTree also written onto file \"%s\"\n", outtreename); } } /* maketree */ void clean_up() { /* Free and/or close stuff */ long i; free (rrate); free (probcat); free (rate); /* Seems to require freeing every time... */ for (i = 0; i < spp; i++) { free (y[i]); } free (y); free (nayme); free (enterorder); free (category); free (weight); free (alias); free (ally); free (location); free (aliasweight); FClose(infile); FClose(outfile); FClose(outtree); #ifdef MAC fixmacfile(outfilename); fixmacfile(outtreename); #endif } /* clean_up */ int main(int argc, Char *argv[]) { /* DNA Maximum Likelihood */ #ifdef MAC argc = 1; /* macsetup("DnaML",""); */ argv[0] = "DnaML"; #endif init(argc,argv); progname = argv[0]; openfile(&infile,INFILE,"input file","r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file","w",argv[0],outfilename); mulsets = false; datasets = 1; firstset = true; ibmpc = IBMCRT; ansi = ANSICRT; grbg = NULL; doinit(); ttratio0 = ttratio; if (ctgry) openfile(&catfile,CATFILE,"categories file","r",argv[0],catfilename); if (weights || justwts) openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename); if (trout) openfile(&outtree,OUTTREE,"output tree file","w",argv[0],outtreename); if (!usertree) nonodes2--; for (ith = 1; ith <= datasets; ith++) { if (datasets > 1) { fprintf(outfile, "Data set # %ld:\n", ith); printf("\nData set # %ld:\n", ith); } ttratio = ttratio0; getinput(); if (ith == 1) firstset = false; if (usertree) maketree(); else for (jumb = 1; jumb <= njumble; jumb++) maketree(); } clean_up(); printf("\nDone.\n\n"); #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } /* DNA Maximum Likelihood */ phylip-3.697/src/dnamlk.c0000644004732000473200000017522312406201116014751 0ustar joefelsenst_g/* PHYLIP version 3.696. Written by Joseph Felsenstein. 
Copyright (c) 1989-2014, Joseph Felsenstein. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/

#include <float.h>
#include <assert.h>

#include "phylip.h"
#include "seq.h"
#include "mlclock.h"
#include "printree.h"

#define over            60       /* Maximum xcoord of tip nodes */

/* These are redefined from phylip.h */
/* Fractional accuracy to which node tymes are optimized */
#undef epsilon
double epsilon = 1e-3;

/* Number of (failed) passes over the tree before giving up */
#undef smoothings
#define smoothings      4

#undef initialv
#define initialv        0.3

typedef struct valrec {
  double rat, ratxi, ratxv, orig_zz, z1, y1, z1zz, z1yy, xiz1, xiy1xv;
  double *ww, *zz, *wwzz, *vvzz;
} valrec;

struct options {
  boolean auto_;
  boolean ctgry;
  long categs;
  long rcategs;
  boolean freqsfrom;
  boolean gama;
  boolean invar;
  boolean global;
  boolean hypstate;
  boolean jumble;
  long njumble;
  double lambda;
  double lambda1;
  boolean lengthsopt;
  boolean trout;
  double ttratio;
  boolean ttr;
  boolean usertree;
  boolean weights;
  boolean printdata;
  boolean dotdiff;
  boolean progress;
  boolean treeprint;
  boolean interleaved;
};

typedef double contribarr[maxcategs];

valrec ***tbl;

#ifndef OLDC
/* function prototypes */
static void menuconf(void);
static void allocrest(void);
static void doinit(void);
static void inputoptions(void);
static void makeweights(void);
static void getinput(void);
static void inittable_for_usertree (FILE *);
static void inittable(void);
static void alloc_nvd(long, nuview_data *);
static void free_nvd(nuview_data *);
static node *invalid_descendant_view(node *);
static boolean nuview(node *);
static double dnamlk_evaluate(node *);
static boolean update(node *);
static boolean smooth(node *);
static void restoradd(node *, node *, node *, double);
static void dnamlk_add(node *, node *, node *);
static void dnamlk_re_move(node **, node **, boolean);
static void tryadd(node *, node **, node **);
static void addpreorder(node *, node *, node *, boolean, boolean);
static boolean tryrearr(node *);
static boolean repreorder(node *);
static void rearrange(node **);
static void initdnamlnode(node **, node **, node *,
long, long, long *, long *, initops, pointarray, pointarray, Char *, Char *, FILE *); static double tymetrav(node *p); static void reconstr(node *, long); static void rectrav(node *, long, long); static void summarize(FILE *fp); static void dnamlk_treeout(node *); static void init_tymes(node *p, double minlength); static void treevaluate(void); static void maketree(void); static void reallocsites(void); static void freetable(void); #endif /* OLDC */ Char infilename[FNMLNGTH], outfilename[FNMLNGTH], intreename[FNMLNGTH], outtreename[FNMLNGTH], catfilename[FNMLNGTH], weightfilename[FNMLNGTH]; double *rrate; long sites, weightsum, categs, datasets, ith, njumble, jumb, numtrees, shimotrees; /* sites = number of sites in actual sequences numtrees = number of user-defined trees */ long inseed, inseed0, mx, mx0, mx1; boolean freqsfrom, global, global2=0, jumble, trout, usertree, weights, rctgry, ctgry, ttr, auto_, progress, mulsets, firstset, hypstate, smoothit, polishing, justwts, gama, invar; boolean lengthsopt = false; /* Use lengths in user tree option */ boolean lngths = false; /* Actually use lengths (depends on each input tree) */ tree curtree, bestree, bestree2; node *qwhere, *grbg; double *tymes; double xi, xv, ttratio, ttratio0, freqa, freqc, freqg, freqt, freqr, freqy, freqar, freqcy, freqgr, freqty, fracchange, sumrates, cv, alpha, lambda, lambda1, invarfrac; long *enterorder; steptr aliasweight; double *rate; longer seed; double *probcat; contribarr *contribution; char *progname; long rcategs; long **mp; char basechar[16] = "acmgrsvtwyhkdbn"; /* Local variables for maketree, propagated globally for C version: */ long k, maxwhich, col; double like, bestyet, maxlogl; boolean lastsp; boolean smoothed; /* set true before each smoothing run, and set false each time a branch cannot be completely optimized. 
*/ double *l0gl; double expon1i[maxcategs], expon1v[maxcategs], expon2i[maxcategs], expon2v[maxcategs]; node *there; double **l0gf; Char ch, ch2; static void menuconf() { /* interactively set options */ long i, loopcount, loopcount2; Char ch; boolean done; boolean didchangecat, didchangercat; double probsum; fprintf(outfile, "\nNucleic acid sequence\n"); fprintf(outfile, " Maximum Likelihood method with molecular "); fprintf(outfile, "clock, version %s\n\n", VERSION); putchar('\n'); auto_ = false; ctgry = false; didchangecat = false; rctgry = false; didchangercat = false; categs = 1; rcategs = 1; freqsfrom = true; gama = false; invar = false; global = false; hypstate = false; jumble = false; njumble = 1; lambda = 1.0; lambda1 = 0.0; lengthsopt = false; trout = true; ttratio = 2.0; ttr = false; usertree = false; weights = false; printdata = false; dotdiff = true; progress = true; treeprint = true; interleaved = true; loopcount = 0; do { cleerhome(); printf("\nNucleic acid sequence\n"); printf(" Maximum Likelihood method with molecular clock, version %s\n\n", VERSION); printf("Settings for this run:\n"); printf(" U Search for best tree?"); if (usertree) printf(" No, use user trees in input file\n"); else printf(" Yes\n"); if (usertree) { printf(" L Use lengths from user tree?"); if (lengthsopt) printf(" Yes\n"); else printf(" No\n"); } printf(" T Transition/transversion ratio:"); if (!ttr) printf(" 2.0\n"); else printf(" %8.4f\n", ttratio); printf(" F Use empirical base frequencies?"); if (freqsfrom) printf(" Yes\n"); else printf(" No\n"); printf(" C One category of substitution rates?"); if (!ctgry) printf(" Yes\n"); else printf(" %ld categories\n", categs); printf(" R Rate variation among sites?"); if (!rctgry) printf(" constant rate\n"); else { if (gama) printf(" Gamma distributed rates\n"); else { if (invar) printf(" Gamma+Invariant sites\n"); else printf(" user-defined HMM of rates\n"); } printf(" A Rates at adjacent sites correlated?"); if (!auto_) printf(" No, 
they are independent\n"); else printf(" Yes, mean block length =%6.1f\n", 1.0 / lambda); } if (!usertree) { printf(" G Global rearrangements?"); if (global) printf(" Yes\n"); else printf(" No\n"); } printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); if (!usertree) { printf(" J Randomize input order of sequences?"); if (jumble) printf(" Yes (seed = %8ld, %3ld times)\n", inseed0, njumble); else printf(" No. Use input order\n"); } printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld %s\n", datasets, (justwts ? "sets of weights" : "data sets")); else printf(" No\n"); printf(" I Input sequences interleaved?"); if (interleaved) printf(" Yes\n"); else printf(" No, sequential\n"); printf(" 0 Terminal type (IBM PC, ANSI, none)?"); if (ibmpc) printf(" IBM PC\n"); if (ansi) printf(" ANSI\n"); if (!(ibmpc || ansi)) printf(" (none)\n"); printf(" 1 Print out the data at start of run"); if (printdata) printf(" Yes\n"); else printf(" No\n"); printf(" 2 Print indications of progress of run"); if (progress) printf(" Yes\n"); else printf(" No\n"); printf(" 3 Print out tree"); if (treeprint) printf(" Yes\n"); else printf(" No\n"); printf(" 4 Write out trees onto tree file?"); if (trout) printf(" Yes\n"); else printf(" No\n"); printf(" 5 Reconstruct hypothetical sequences? %s\n", (hypstate ? "Yes" : "No")); printf("\nAre these settings correct? 
" "(type Y or the letter for one to change)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); done = (ch == 'Y'); if (!done) { uppercase(&ch); if (((!usertree) && (strchr("JUCRAFWGTMI012345", ch) != NULL)) || (usertree && ((strchr("UCRAFWLTMI012345", ch) != NULL)))){ switch (ch) { case 'C': ctgry = !ctgry; if (ctgry) { printf("\nSitewise user-assigned categories:\n\n"); initcatn(&categs); if (rate){ free(rate); } rate = (double *) Malloc( categs * sizeof(double)); didchangecat = true; initcategs(categs, rate); } break; case 'R': if (!rctgry) { rctgry = true; gama = true; } else { if (gama) { gama = false; invar = true; } else { if (invar) invar = false; else rctgry = false; } } break; case 'A': auto_ = !auto_; if (auto_) { initlambda(&lambda); lambda1 = 1.0 - lambda; } break; case 'F': freqsfrom = !freqsfrom; if (!freqsfrom) initfreqs(&freqa, &freqc, &freqg, &freqt); break; case 'G': global = !global; break; case 'W': weights = !weights; break; case 'J': jumble = !jumble; if (jumble) initjumble(&inseed, &inseed0, seed, &njumble); else njumble = 1; break; case 'L': lengthsopt = !lengthsopt; break; case 'T': ttr = !ttr; if (ttr) initratio(&ttratio); break; case 'U': usertree = !usertree; break; case 'M': mulsets = !mulsets; if (mulsets) { printf("Multiple data sets or multiple weights?"); loopcount2 = 0; do { printf(" (type D or W)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch2); getchar(); if (ch2 == '\n') ch2 = ' '; uppercase(&ch2); countup(&loopcount2, 10); } while ((ch2 != 'W') && (ch2 != 'D')); justwts = (ch2 == 'W'); if (justwts) justweights(&datasets); else initdatasets(&datasets); if (!jumble) { jumble = true; initjumble(&inseed, &inseed0, seed, &njumble); } } break; case 'I': interleaved = !interleaved; break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress = !progress; 
break; case '3': treeprint = !treeprint; break; case '4': trout = !trout; break; case '5': hypstate = !hypstate; break; } } else printf("Not a possible option!\n"); } countup(&loopcount, 100); } while (!done); if (gama || invar) { loopcount = 0; do { printf( "\nCoefficient of variation of substitution rate among sites (must be positive)\n"); printf( " In gamma distribution parameters, this is 1/(square root of alpha)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &cv); getchar(); countup(&loopcount, 10); } while (cv <= 0.0); alpha = 1.0 / (cv * cv); } if (!rctgry) auto_ = false; if (rctgry) { printf("\nRates in HMM"); if (invar) printf(" (including one for invariant sites)"); printf(":\n"); initcatn(&rcategs); if (probcat){ free(probcat); free(rrate); } probcat = (double *) Malloc(rcategs * sizeof(double)); rrate = (double *) Malloc(rcategs * sizeof(double)); didchangercat = true; if (gama) initgammacat(rcategs, alpha, rrate, probcat); else { if (invar) { loopcount = 0; do { printf("Fraction of invariant sites?\n"); fflush(stdout); scanf("%lf%*[^\n]", &invarfrac); getchar(); countup(&loopcount, 10); } while ((invarfrac <= 0.0) || (invarfrac >= 1.0)); initgammacat(rcategs-1, alpha, rrate, probcat); for (i = 0; i < rcategs-1; i++) probcat[i] = probcat[i]*(1.0-invarfrac); probcat[rcategs-1] = invarfrac; rrate[rcategs-1] = 0.0; } else { initcategs(rcategs, rrate); initprobcat(rcategs, &probsum, probcat); } } } if (!didchangercat){ rrate = (double *)Malloc( rcategs*sizeof(double)); probcat = (double *)Malloc( rcategs*sizeof(double)); rrate[0] = 1.0; probcat[0] = 1.0; } if (!didchangecat){ rate = (double *)Malloc( categs*sizeof(double)); rate[0] = 1.0; } } /* menuconf */ static void reallocsites(void) { long i; for (i = 0; i < spp; i++) { free(y[i]); y[i] = (char *)Malloc(sites * sizeof(char)); } free(weight); free(category); free(alias); free(aliasweight); free(ally); free(location); weight = (long *)Malloc(sites*sizeof(long)); 
category = (long *)Malloc(sites*sizeof(long)); alias = (long *)Malloc(sites*sizeof(long)); aliasweight = (long *)Malloc(sites*sizeof(long)); ally = (long *)Malloc(sites*sizeof(long)); location = (long *)Malloc(sites*sizeof(long)); } /* reallocsites */ static void allocrest() { long i; y = (Char **)Malloc(spp*sizeof(Char *)); nayme = (naym *)Malloc(spp*sizeof(naym)); for (i = 0; i < spp; i++) y[i] = (char *)Malloc(sites * sizeof(char)); enterorder = (long *)Malloc(spp*sizeof(long)); weight = (long *)Malloc(sites*sizeof(long)); category = (long *)Malloc(sites*sizeof(long)); alias = (long *)Malloc(sites*sizeof(long)); aliasweight = (long *)Malloc(sites*sizeof(long)); ally = (long *)Malloc(sites*sizeof(long)); location = (long *)Malloc(sites*sizeof(long)); tymes = (double *)Malloc((nonodes - spp) * sizeof(double)); } /* allocrest */ static void doinit() { /* initializes variables */ inputnumbers(&spp, &sites, &nonodes, 1); menuconf(); if (printdata) fprintf(outfile, "%2ld species, %3ld sites\n", spp, sites); alloctree(&curtree.nodep, nonodes, usertree); allocrest(); if (usertree) return; alloctree(&bestree.nodep, nonodes, 0); if (njumble <= 1) return; alloctree(&bestree2.nodep, nonodes, 0); } /* doinit */ static void inputoptions() { long i; if (!firstset && !justwts) { samenumsp(&sites, ith); reallocsites(); } for (i = 0; i < sites; i++) category[i] = 1; for (i = 0; i < sites; i++) weight[i] = 1; if (justwts || weights) inputweights(sites, weight, &weights); weightsum = 0; for (i = 0; i < sites; i++) weightsum += weight[i]; if (ctgry && categs > 1) { inputcategs(0, sites, category, categs, "DnaMLK"); if (printdata) printcategs(outfile, sites, category, "Site categories"); } if (weights && printdata) printweights(outfile, 0, sites, weight, "Sites"); } /* inputoptions */ static void makeweights() { /* make up weights vector to avoid duplicate computations */ long i; for (i = 1; i <= sites; i++) { alias[i - 1] = i; ally[i - 1] = i; aliasweight[i - 1] = weight[i - 1]; 
location[i - 1] = 0; } sitesort2(sites, aliasweight); sitecombine2(sites, aliasweight); sitescrunch2(sites, 1, 2, aliasweight); endsite = 0; for (i = 1; i <= sites; i++) { if (ally[i - 1] == i) endsite++; } for (i = 1; i <= endsite; i++) { location[alias[i - 1] - 1] = i; } contribution = (contribarr *) Malloc( endsite*sizeof(contribarr)); } /* makeweights */ static void getinput() { /* reads the input data */ inputoptions(); if (!freqsfrom) getbasefreqs(freqa, freqc, freqg, freqt, &freqr, &freqy, &freqar, &freqcy, &freqgr, &freqty, &ttratio, &xi, &xv, &fracchange, freqsfrom, true); if (!justwts || firstset) inputdata(sites); makeweights(); setuptree2(&curtree); if (!usertree) { setuptree2(&bestree); if (njumble > 1) setuptree2(&bestree2); } allocx(nonodes, rcategs, curtree.nodep, usertree); if (!usertree) { allocx(nonodes, rcategs, bestree.nodep, 0); if (njumble > 1) allocx(nonodes, rcategs, bestree2.nodep, 0); } makevalues2(rcategs, curtree.nodep, endsite, spp, y, alias); if (freqsfrom) { empiricalfreqs(&freqa, &freqc, &freqg, &freqt, aliasweight, curtree.nodep); getbasefreqs(freqa, freqc, freqg, freqt, &freqr, &freqy, &freqar, &freqcy, &freqgr, &freqty, &ttratio, &xi, &xv, &fracchange, freqsfrom, true); } if (!justwts || firstset) fprintf(outfile, "\nTransition/transversion ratio = %10.6f\n\n", ttratio); } /* getinput */ static void inittable_for_usertree (FILE *intree) { /* If there's a user tree, then the ww/zz/wwzz/vvzz elements need to be allocated appropriately. */ long num_comma; long i, j; /* First, figure out the largest possible furcation, i.e. 
     the number of commas plus one */
  countcomma (&intree, &num_comma);
  num_comma++;

  for (i = 0; i < rcategs; i++) {
    for (j = 0; j < categs; j++) {
      /* Free the stuff allocated assuming bifurcations */
      free (tbl[i][j]->ww);
      free (tbl[i][j]->zz);
      free (tbl[i][j]->wwzz);
      free (tbl[i][j]->vvzz);

      /* Then allocate for worst-case multifurcations */
      tbl[i][j]->ww   = (double *) Malloc( num_comma * sizeof (double));
      tbl[i][j]->zz   = (double *) Malloc( num_comma * sizeof (double));
      tbl[i][j]->wwzz = (double *) Malloc( num_comma * sizeof (double));
      tbl[i][j]->vvzz = (double *) Malloc( num_comma * sizeof (double));
    }
  }
}  /* inittable_for_usertree */


static void freetable()
{
  long i, j;

  for (i = 0; i < rcategs; i++) {
    for (j = 0; j < categs; j++) {
      free(tbl[i][j]->ww);
      free(tbl[i][j]->zz);
      free(tbl[i][j]->wwzz);
      free(tbl[i][j]->vvzz);
    }
  }
  for (i = 0; i < rcategs; i++) {
    for (j = 0; j < categs; j++)
      free(tbl[i][j]);
    free(tbl[i]);
  }
  free(tbl);
}


static void inittable()
{
  /* Define a lookup table. Precompute values and print them out in tables */
  long i, j;
  double sumrates;

  tbl = (valrec ***) Malloc( rcategs * sizeof(valrec **));
  for (i = 0; i < rcategs; i++) {
    tbl[i] = (valrec **) Malloc( categs*sizeof(valrec *));
    for (j = 0; j < categs; j++)
      tbl[i][j] = (valrec *) Malloc( sizeof(valrec));
  }

  for (i = 0; i < rcategs; i++) {
    for (j = 0; j < categs; j++) {
      tbl[i][j]->rat = rrate[i]*rate[j];
      tbl[i][j]->ratxi = tbl[i][j]->rat * xi;
      tbl[i][j]->ratxv = tbl[i][j]->rat * xv;

      /* Allocate assuming bifurcations, will be changed later if
         necessary (i.e.
there's a user tree) */ tbl[i][j]->ww = (double *) Malloc( 2 * sizeof (double)); tbl[i][j]->zz = (double *) Malloc( 2 * sizeof (double)); tbl[i][j]->wwzz = (double *) Malloc( 2 * sizeof (double)); tbl[i][j]->vvzz = (double *) Malloc( 2 * sizeof (double)); } } sumrates = 0.0; for (i = 0; i < endsite; i++) { for (j = 0; j < rcategs; j++) sumrates += aliasweight[i] * probcat[j] * tbl[j][category[alias[i] - 1] - 1]->rat; } sumrates /= (double)sites; for (i = 0; i < rcategs; i++) for (j = 0; j < categs; j++) { tbl[i][j]->rat /= sumrates; tbl[i][j]->ratxi /= sumrates; tbl[i][j]->ratxv /= sumrates; } if(jumb > 1) return; if (gama || invar) { fprintf(outfile, "\nDiscrete approximation to gamma distributed rates\n"); fprintf(outfile, " Coefficient of variation of rates = %f (alpha = %f)\n", cv, alpha); } if (rcategs > 1) { fprintf(outfile, "\nState in HMM Rate of change Probability\n\n"); for (i = 0; i < rcategs; i++) if (probcat[i] < 0.0001) fprintf(outfile, "%9ld%16.3f%20.6f\n", i+1, rrate[i], probcat[i]); else if (probcat[i] < 0.001) fprintf(outfile, "%9ld%16.3f%19.5f\n", i+1, rrate[i], probcat[i]); else if (probcat[i] < 0.01) fprintf(outfile, "%9ld%16.3f%18.4f\n", i+1, rrate[i], probcat[i]); else fprintf(outfile, "%9ld%16.3f%17.3f\n", i+1, rrate[i], probcat[i]); putc('\n', outfile); if (auto_) { fprintf(outfile, "Expected length of a patch of sites having the same rate = %8.3f\n", 1/lambda); putc('\n', outfile); } } if (categs > 1) { fprintf(outfile, "\nSite category Rate of change\n\n"); for (i = 0; i < categs; i++) fprintf(outfile, "%9ld%16.3f\n", i+1, rate[i]); fprintf(outfile, "\n\n"); } } /* inittable */ static void alloc_nvd(long num_sibs, nuview_data *local_nvd) { /* Allocate blocks of memory appropriate for the number of siblings a given node has */ local_nvd->yy = (double *) Malloc( num_sibs * sizeof (double)); local_nvd->wwzz = (double *) Malloc( num_sibs * sizeof (double)); local_nvd->vvzz = (double *) Malloc( num_sibs * sizeof (double)); local_nvd->vzsumr = 
    (double *) Malloc( num_sibs * sizeof (double));
  local_nvd->vzsumy = (double *) Malloc( num_sibs * sizeof (double));
  local_nvd->sum = (double *) Malloc( num_sibs * sizeof (double));
  local_nvd->sumr = (double *) Malloc( num_sibs * sizeof (double));
  local_nvd->sumy = (double *) Malloc( num_sibs * sizeof (double));
  local_nvd->xx = (sitelike *) Malloc( num_sibs * sizeof (sitelike));
}  /* alloc_nvd */


static void free_nvd(nuview_data *local_nvd)
{
  /* The natural complement to the alloc version */
  free (local_nvd->yy);
  free (local_nvd->wwzz);
  free (local_nvd->vvzz);
  free (local_nvd->vzsumr);
  free (local_nvd->vzsumy);
  free (local_nvd->sum);
  free (local_nvd->sumr);
  free (local_nvd->sumy);
  free (local_nvd->xx);
}  /* free_nvd */


static node *invalid_descendant_view(node *p)
{
  /* Useful as an assertion - Traverses all descendants of p looking for
   * uninitialized views. Returns NULL if none exist, otherwise a pointer to
   * the first one found.
   *
   * Note: as written, the traversal never tests any node's initialized
   * flag, so the function always returns NULL. */
  node *q = NULL;
  node *s = NULL;

  if ( p == NULL || p->tip )
    return NULL;

  for ( q = p->next; q != p; q = q->next ) {
    s = invalid_descendant_view(q->back);
    if ( s != NULL )
      return s;
  }

  return s;
}  /* invalid_descendant_view */


static boolean nuview(node *p)
{
  /* Recursively update summary data for subtree rooted at p. Returns true if
   * view has changed. */
  long i, j, k, l, num_sibs = 0, sib_index;
  nuview_data *local_nvd;
  node *q;
  node *sib_ptr, *sib_back_ptr;
  sitelike p_xx;
  double lw;
  double correction;
  double maxx;

  assert(p != NULL);

  if ( p == NULL )
    return false;
  if ( p->tip )
    return false;  /* Tips do not need to be initialized */

  for (q = p->next; q != p; q = q->next) {
    num_sibs++;
    if ( q->back != NULL && !q->tip) {
      if ( nuview(q->back) )
        p->initialized = false;
    }
  }

  if ( p->initialized )
    return false;

  /* At this point, all views downstream should be initialized.
   * If not, we have a problem.
*/ assert( invalid_descendant_view(p) == NULL ); /* Allocate the structure and blocks therein for variables used in this function */ local_nvd = (nuview_data *) Malloc( sizeof (nuview_data)); alloc_nvd (num_sibs, local_nvd); /* Loop 1: makes assignments to tbl based on some combination of what's already in tbl and the children's value of v */ sib_ptr = p; for (sib_index=0; sib_index < num_sibs; sib_index++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; if (sib_back_ptr != NULL) lw = -fabs(p->tyme - sib_back_ptr->tyme); else lw = 0.0; for (i = 0; i < rcategs; i++) for (j = 0; j < categs; j++) { tbl[i][j]->ww[sib_index] = exp(tbl[i][j]->ratxi * lw); tbl[i][j]->zz[sib_index] = exp(tbl[i][j]->ratxv * lw); tbl[i][j]->wwzz[sib_index] = tbl[i][j]->ww[sib_index] * tbl[i][j]->zz[sib_index]; tbl[i][j]->vvzz[sib_index] = (1.0 - tbl[i][j]->ww[sib_index]) * tbl[i][j]->zz[sib_index]; } } /* Loop 2: */ for (i = 0; i < endsite; i++) { correction = 0; maxx = 0; k = category[alias[i]-1] - 1; for (j = 0; j < rcategs; j++) { /* Loop 2.1 */ sib_ptr = p; for (sib_index=0; sib_index < num_sibs; sib_index++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; local_nvd->wwzz[sib_index] = tbl[j][k]->wwzz[sib_index]; local_nvd->vvzz[sib_index] = tbl[j][k]->vvzz[sib_index]; local_nvd->yy[sib_index] = 1.0 - tbl[j][k]->zz[sib_index]; if (sib_back_ptr != NULL) { memcpy(local_nvd->xx[sib_index], sib_back_ptr->x[i][j], sizeof(sitelike)); if ( j == 0) correction += sib_back_ptr->underflows[i]; } else { local_nvd->xx[sib_index][0] = 1.0; local_nvd->xx[sib_index][(long)C - (long)A] = 1.0; local_nvd->xx[sib_index][(long)G - (long)A] = 1.0; local_nvd->xx[sib_index][(long)T - (long)A] = 1.0; } } /* Loop 2.2 */ for (sib_index=0; sib_index < num_sibs; sib_index++) { local_nvd->sum[sib_index] = local_nvd->yy[sib_index] * (freqa * local_nvd->xx[sib_index][(long)A] + freqc * local_nvd->xx[sib_index][(long)C] + freqg * local_nvd->xx[sib_index][(long)G] + freqt * 
local_nvd->xx[sib_index][(long)T]); local_nvd->sumr[sib_index] = freqar * local_nvd->xx[sib_index][(long)A] + freqgr * local_nvd->xx[sib_index][(long)G]; local_nvd->sumy[sib_index] = freqcy * local_nvd->xx[sib_index][(long)C] + freqty * local_nvd->xx[sib_index][(long)T]; local_nvd->vzsumr[sib_index] = local_nvd->vvzz[sib_index] * local_nvd->sumr[sib_index]; local_nvd->vzsumy[sib_index] = local_nvd->vvzz[sib_index] * local_nvd->sumy[sib_index]; } /* Initialize to one, multiply incremental values for every sibling a node has */ p_xx[(long)A] = 1 ; p_xx[(long)C] = 1 ; p_xx[(long)G] = 1 ; p_xx[(long)T] = 1 ; for (sib_index=0; sib_index < num_sibs; sib_index++) { p_xx[(long)A] *= local_nvd->sum[sib_index] + local_nvd->wwzz[sib_index] * local_nvd->xx[sib_index][(long)A] + local_nvd->vzsumr[sib_index]; p_xx[(long)C] *= local_nvd->sum[sib_index] + local_nvd->wwzz[sib_index] * local_nvd->xx[sib_index][(long)C] + local_nvd->vzsumy[sib_index]; p_xx[(long)G] *= local_nvd->sum[sib_index] + local_nvd->wwzz[sib_index] * local_nvd->xx[sib_index][(long)G] + local_nvd->vzsumr[sib_index]; p_xx[(long)T] *= local_nvd->sum[sib_index] + local_nvd->wwzz[sib_index] * local_nvd->xx[sib_index][(long)T] + local_nvd->vzsumy[sib_index]; } for ( l = 0 ; l < ((long)T - (long)A + 1 ) ; l++ ) { if ( p_xx[l] > maxx ) maxx = p_xx[l]; } /* And the final point of this whole function: */ memcpy(p->x[i][j], p_xx, sizeof(sitelike)); } p->underflows[i] = 0; if ( maxx < MIN_DOUBLE) fix_x(p, i, maxx,rcategs); p->underflows[i] += correction; } free_nvd (local_nvd); free (local_nvd); p->initialized = true; return true; } /* nuview */ static double dnamlk_evaluate(node *p) { /* Evaluate and return the log likelihood of the current tree * as seen from the branch from p to p->back. If p is the root node, * the first child branch is used instead. Views are updated as needed. 
*/ contribarr tterm; static contribarr like, nulike, clai; double sum, sum2, sumc=0, y, lz, y1, z1zz, z1yy, prod12, prod1, prod2, prod3, sumterm, lterm; long i, j, k, lai; node *q, *r; double *x1, *x2; /* pointers to sitelike elements in node->x */ sum = 0.0; assert( all_tymes_valid(curtree.root, 0.0, false) ); /* Root node has no branch, so use branch to first child */ if (p == curtree.root) p = p->next; r = p; q = p->back; nuview(r); nuview(q); y = fabs(r->tyme - q->tyme); lz = -y; for (i = 0; i < rcategs; i++) for (j = 0; j < categs; j++) { tbl[i][j]->orig_zz = exp(tbl[i][j]->ratxi * lz); tbl[i][j]->z1 = exp(tbl[i][j]->ratxv * lz); tbl[i][j]->z1zz = tbl[i][j]->z1 * tbl[i][j]->orig_zz; tbl[i][j]->z1yy = tbl[i][j]->z1 - tbl[i][j]->z1zz; } for (i = 0; i < endsite; i++) { k = category[alias[i]-1] - 1; for (j = 0; j < rcategs; j++) { if (y > 0.0) { y1 = 1.0 - tbl[j][k]->z1; z1zz = tbl[j][k]->z1zz; z1yy = tbl[j][k]->z1yy; } else { y1 = 0.0; z1zz = 1.0; z1yy = 0.0; } x1 = r->x[i][j]; prod1 = freqa * x1[0] + freqc * x1[(long)C - (long)A] + freqg * x1[(long)G - (long)A] + freqt * x1[(long)T - (long)A]; x2 = q->x[i][j]; prod2 = freqa * x2[0] + freqc * x2[(long)C - (long)A] + freqg * x2[(long)G - (long)A] + freqt * x2[(long)T - (long)A]; prod3 = (x1[0] * freqa + x1[(long)G - (long)A] * freqg) * (x2[0] * freqar + x2[(long)G - (long)A] * freqgr) + (x1[(long)C - (long)A] * freqc + x1[(long)T - (long)A] * freqt) * (x2[(long)C - (long)A] * freqcy + x2[(long)T - (long)A] * freqty); prod12 = freqa * x1[0] * x2[0] + freqc * x1[(long)C - (long)A] * x2[(long)C - (long)A] + freqg * x1[(long)G - (long)A] * x2[(long)G - (long)A] + freqt * x1[(long)T - (long)A] * x2[(long)T - (long)A]; tterm[j] = z1zz * prod12 + z1yy * prod3 + y1 * prod1 * prod2; } sumterm = 0.0; for (j = 0; j < rcategs; j++) sumterm += probcat[j] * tterm[j]; lterm = log(sumterm) + p->underflows[i] + q->underflows[i]; for (j = 0; j < rcategs; j++) clai[j] = tterm[j] / sumterm; memcpy(contribution[i], clai, 
sizeof(contribarr)); if (!auto_ && usertree && (which <= shimotrees)) l0gf[which - 1][i] = lterm; sum += aliasweight[i] * lterm; } if (auto_) { for (j = 0; j < rcategs; j++) like[j] = 1.0; for (i = 0; i < sites; i++) { sumc = 0.0; for (k = 0; k < rcategs; k++) sumc += probcat[k] * like[k]; sumc *= lambda; if ((ally[i] > 0) && (location[ally[i]-1] > 0)) { lai = location[ally[i] - 1]; memcpy(clai, contribution[lai - 1], sizeof(contribarr)); for (j = 0; j < rcategs; j++) nulike[j] = ((1.0 - lambda) * like[j] + sumc) * clai[j]; } else { for (j = 0; j < rcategs; j++) nulike[j] = ((1.0 - lambda) * like[j] + sumc); } memcpy(like, nulike, sizeof(contribarr)); } sum2 = 0.0; for (i = 0; i < rcategs; i++) sum2 += probcat[i] * like[i]; sum += log(sum2); } curtree.likelihood = sum; if (auto_ || !usertree) return sum; if(which <= shimotrees) l0gl[which - 1] = sum; if (which == 1) { maxwhich = 1; maxlogl = sum; return sum; } if (sum > maxlogl) { maxwhich = which; maxlogl = sum; } return sum; } /* dnamlk_evaluate */ static boolean update(node *p) { /* Conditionally optimize tyme at a node. Return true if successful. */ if (p == NULL) return false; if ( (!usertree) || (usertree && !lngths) ) return makenewv(p); return false; } /* update */ static boolean smooth(node *p) { node *q = NULL; boolean success; if (p == NULL) return false; if (p->tip) return false; /* optimize tyme here */ success = update(p); if (smoothit || polishing) { for (q = p->next; q != p; q = q->next) { /* smooth subtrees */ success = smooth(q->back) || success; /* optimize tyme again after each subtree */ success = update(p) || success; } } return success; } /* smooth */ static void restoradd(node *below, node *newtip, node *newfork, double prevtyme) { /* restore "new" tip and fork to place "below". 
restore tymes */ /* assumes bifurcation */ hookup(newfork, below->back); hookup(newfork->next, below); hookup(newtip, newfork->next->next); curtree.nodep[newfork->index-1] = newfork; setnodetymes(newfork,prevtyme); } /* restoradd */ static void dnamlk_add(node *below, node *newtip, node *newfork) { /* inserts the nodes newfork and its descendant, newtip, into the tree. */ long i; node *p; node *above; double newtyme; boolean success; assert( all_tymes_valid(curtree.root, 0.98*MIN_BRANCH_LENGTH, false) ); assert( floating_fork(newfork) ); assert( newtip->back == NULL ); /* Get parent nodelets */ below = pnode(&curtree, below); newfork = pnode(&curtree, newfork); newtip = pnode(&curtree, newtip); /* Join above node to newfork */ if (below->back == NULL) newfork->back = NULL; else { above = below->back; /* unhookup(below, above); */ hookup(newfork, above); } /* Join below to newfork->next->next */ hookup(below, newfork->next->next); /* Join newtip to newfork->next */ hookup(newfork->next, newtip); /* Move root if inserting there */ if (curtree.root == below) curtree.root = newfork; /* p = child with lowest tyme */ p = newtip->tyme < below->tyme ? 
newtip : below; /* If not at root, set newfork tyme to average below/above */ if (newfork->back != NULL) { if (p->tyme > newfork->back->tyme) newtyme = (p->tyme + newfork->back->tyme) / 2.0; else newtyme = p->tyme - INSERT_MIN_TYME; if (p->tyme - newtyme < MIN_BRANCH_LENGTH) newtyme = p->tyme - MIN_BRANCH_LENGTH; setnodetymes(newfork, newtyme); /* Now move from newfork to root, setting parent tymes older than children * by at least MIN_BRANCH_LENGTH */ p = newfork; while (p != curtree.root) { if (p->back->tyme > p->tyme - MIN_BRANCH_LENGTH) setnodetymes(p->back, p->tyme - MIN_BRANCH_LENGTH); else break; /* get parent node */ p = pnode(&curtree, p->back); } } else { /* root == newfork */ /* make root 2x older */ setnodetymes(newfork, p->tyme - 2*INSERT_MIN_TYME); } assert( all_tymes_valid(curtree.root, 0.98*MIN_BRANCH_LENGTH, false) ); /* Adjust branch lengths throughout */ for ( i = 1; i < smoothings; i++ ) { success = smooth(newfork); success = smooth(newfork->back) || success; if ( !success ) break; } } /* dnamlk_add */ static void dnamlk_re_move(node **item, node **fork, boolean tempadd) { /* removes nodes *item and its parent (returned in *fork), from the tree. the new descendant of fork's ancestor is made to be fork's descendant other than item. 
Item must point to node*, but *fork is not read */ node *p, *q; long i; boolean success; if ((*item)->back == NULL) { *fork = NULL; return; } *item = curtree.nodep[(*item)->index-1]; *fork = curtree.nodep[(*item)->back->index - 1]; if (curtree.root == *fork) { if (*item == (*fork)->next->back) curtree.root = (*fork)->next->next->back; else curtree.root = (*fork)->next->back; } p = (*item)->back->next->back; q = (*item)->back->next->next->back; if (p != NULL) p->back = q; if (q != NULL) q->back = p; (*fork)->back = NULL; p = (*fork)->next; while (p != *fork) { p->back = NULL; p = p->next; } (*item)->back = NULL; inittrav(p); inittrav(q); if (tempadd) return; for ( i = 1; i <= smoothings; i++ ) { success = smooth(q); if ( smoothit ) success = smooth(q->back) || success; if ( !success ) break; } } /* dnamlk_re_move */ static void tryadd(node *p, node **item, node **nufork) { /* temporarily adds one fork and one tip to the tree. if the location where they are added yields greater likelihood than other locations tested up to that time, then keeps that location as there */ if ( !global2 ) save_tymes(&curtree,tymes); dnamlk_add(p, *item, *nufork); like = dnamlk_evaluate(p); if (lastsp) { if (like >= bestree.likelihood || bestree.likelihood == UNDEFINED) { copy_(&curtree, &bestree, nonodes, rcategs); if ( global2 ) /* To be restored in maketree() */ save_tymes(&curtree,tymes); } } if (like > bestyet || bestyet == UNDEFINED) { bestyet = like; there = p; } dnamlk_re_move(item, nufork, true); if ( !global2 ) { restore_tymes(&curtree,tymes); } } /* tryadd */ static void addpreorder(node *p, node *item_, node *nufork_, boolean contin, boolean continagain) { /* Traverse tree, adding item at different locations until we * find a better likelihood. Afterwards, global 'there' will be * set to the best add location, or will be left alone if no * better could be found. 
*/ node *item, *nufork; item = item_; nufork = nufork_; if (p == NULL) return; tryadd(p, &item, &nufork); contin = continagain; if ((!p->tip) && contin) { /* assumes bifurcation (OK) */ addpreorder(p->next->back, item, nufork, contin, continagain); addpreorder(p->next->next->back, item, nufork, contin, continagain); } } /* addpreorder */ static boolean tryrearr(node *p) { /* evaluates one rearrangement of the tree. if the new tree has greater likelihood than the old keeps the new tree and returns true. otherwise, restores the old tree and returns false. */ node *forknode; /* parent fork of p */ node *frombelow; /* other child of forknode */ node *whereto; /* parent fork of forknode */ double oldlike; /* likelihood before rearrangement */ double prevtyme; /* forknode->tyme before rearrange */ double like_delta; /* improvement in likelihood */ boolean wasonleft; /* true if p first child of forknode */ if (p == curtree.root) return false; /* forknode = parent fork of p */ forknode = curtree.nodep[p->back->index - 1]; if (forknode == curtree.root) return false; oldlike = bestyet; prevtyme = forknode->tyme; /* assumes bifurcation (OK) */ /* frombelow = other child of forknode (not p) */ if (forknode->next->back == p) { frombelow = forknode->next->next->back; wasonleft = true; } else { frombelow = forknode->next->back; wasonleft = false; } whereto = curtree.nodep[forknode->back->index - 1]; /* remove forknode and p */ dnamlk_re_move(&p, &forknode, true); /* add p and forknode as parent of whereto */ dnamlk_add(whereto, p, forknode); like = dnamlk_evaluate(p); like_delta = like - oldlike; if ( like_delta < LIKE_EPSILON && oldlike != UNDEFINED) { dnamlk_re_move(&p, &forknode, true); restoradd(frombelow, p, forknode, prevtyme); if (wasonleft && (forknode->next->next->back == p)) { hookup (forknode->next->back, forknode->next->next); hookup (forknode->next, p); } curtree.likelihood = oldlike; /* assumes bifurcation (OK) */ inittrav(forknode); inittrav(forknode->next); 
inittrav(forknode->next->next); return false; } else { bestyet = like; } return true; } /* tryrearr */ static boolean repreorder(node *p) { /* traverses a binary tree, calling function tryrearr at a node before calling tryrearr at its descendants. Returns true the first time rearrangement increases the tree's likelihood. */ if (p == NULL) return false; if ( !tryrearr(p) ) return false; if (p->tip) return true; /* assumes bifurcation */ if ( !repreorder(p->next->back) ) return false; if ( !repreorder(p->next->next->back) ) return false; return true; } /* repreorder */ static void rearrange(node **r) { /* traverses the tree (preorder), finding any local rearrangement which increases the likelihood. if traversal succeeds in increasing the tree's likelihood, function rearrange runs traversal again */ while ( repreorder(*r) ) /* continue */; } /* rearrange */ static void initdnamlnode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* Initializes each node as it is read from user tree by treeread(). * whichinit specifies which type of initialization is to be done. */ boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnu(grbg, p); (*p)->index = nodei; (*p)->tip = false; malloc_pheno((*p), endsite, rcategs); nodep[(*p)->index - 1] = (*p); break; case nonbottom: gnu(grbg, p); malloc_pheno(*p, endsite, rcategs); (*p)->index = nodei; break; case tip: match_names_to_data (str, nodep, p, spp); break; case iter: (*p)->initialized = false; /* Initial branch lengths start at 0.0. 
tymetrav() enforces * MIN_BRANCH_LENGTH */ (*p)->v = 0.0; (*p)->iter = true; if ((*p)->back != NULL) (*p)->back->iter = true; break; case length: processlength(&valyew, &divisor, ch, &minusread, intree, parens); (*p)->v = valyew / divisor / fracchange; (*p)->iter = false; if ((*p)->back != NULL) { (*p)->back->v = (*p)->v; (*p)->back->iter = false; } break; case hslength: break; case hsnolength: if (usertree && lengthsopt && lngths) { printf("Warning: one or more lengths not defined in user tree number %ld.\n", which); printf("DNAMLK will attempt to optimize all branch lengths.\n\n"); lngths = false; } break; case treewt: break; case unittrwt: break; } } /* initdnamlnode */ static double tymetrav(node *p) { /* Recursively convert branch lengths to node tymes. Returns the maximum * branch length p's parent can have, which is p->tyme - max(p->v, * MIN_BRANCH_LENGTH) */ node *q; double xmax; double x; xmax = 0.0; if (!p->tip) { for (q = p->next; q != p; q = q->next) { x = tymetrav(q->back); if (xmax > x) xmax = x; } } else { x = 0.0; } setnodetymes(p,xmax); if (p->v < MIN_BRANCH_LENGTH) return xmax - MIN_BRANCH_LENGTH; else return xmax - p->v; } /* tymetrav */ static void reconstr(node *p, long n) { /* reconstruct and print out base at site n+1 at node p */ long i, j, k, m, first, second, num_sibs; double f, sum, xx[4]; node *q; if ((ally[n] == 0) || (location[ally[n]-1] == 0)) putc('.', outfile); else { j = location[ally[n]-1] - 1; for (i = 0; i < 4; i++) { f = p->x[j][mx-1][i]; num_sibs = count_sibs(p); q = p; for (k = 0; k < num_sibs; k++) { q = q->next; f *= q->x[j][mx-1][i]; } f = sqrt(f); xx[i] = f; } xx[0] *= freqa; xx[1] *= freqc; xx[2] *= freqg; xx[3] *= freqt; sum = xx[0]+xx[1]+xx[2]+xx[3]; for (i = 0; i < 4; i++) xx[i] /= sum; first = 0; for (i = 1; i < 4; i++) if (xx [i] > xx[first]) first = i; if (first == 0) second = 1; else second = 0; for (i = 0; i < 4; i++) if ((i != first) && (xx[i] > xx[second])) second = i; m = 1 << first; if (xx[first] < 0.4999995) 
m = m + (1 << second); if (xx[first] > 0.95) putc(toupper(basechar[m - 1]), outfile); else putc(basechar[m - 1], outfile); if (rctgry && rcategs > 1) mx = mp[n][mx - 1]; else mx = 1; } } /* reconstr */ static void rectrav(node *p, long m, long n) { /* print out segment of reconstructed sequence for one branch */ long num_sibs, i; node *sib_ptr; putc(' ', outfile); if (p->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[p->index-1][i], outfile); } else fprintf(outfile, "%4ld ", p->index - spp); fprintf(outfile, " "); mx = mx0; for (i = m; i <= n; i++) { if ((i % 10 == 0) && (i != m)) putc(' ', outfile); if (p->tip) putc(y[p->index-1][i], outfile); else reconstr(p, i); } putc('\n', outfile); if (!p->tip) { num_sibs = count_sibs(p); sib_ptr = p; for (i = 0; i < num_sibs; i++) { sib_ptr = sib_ptr->next; rectrav(sib_ptr->back, m, n); } } mx1 = mx; } /* rectrav */ static void summarize(FILE *fp) { long i, j, mm; double mode, sum; double like[maxcategs], nulike[maxcategs]; double **marginal; mp = (long **)Malloc(sites * sizeof(long *)); for (i = 0; i <= sites-1; ++i) mp[i] = (long *)Malloc(sizeof(long)*rcategs); fprintf(fp, "\nLn Likelihood = %11.5f\n\n", curtree.likelihood); fprintf(fp, " Ancestor Node Node Height Length\n"); fprintf(fp, " -------- ---- ---- ------ ------\n"); mlk_describe(fp, &curtree, fracchange); putc('\n', fp); if (rctgry && rcategs > 1) { for (i = 0; i < rcategs; i++) like[i] = 1.0; for (i = sites - 1; i >= 0; i--) { sum = 0.0; for (j = 0; j < rcategs; j++) { nulike[j] = (lambda1 + lambda * probcat[j]) * like[j]; mp[i][j] = j + 1; for (k = 1; k <= rcategs; k++) { if (k != j + 1) { if (lambda * probcat[k - 1] * like[k - 1] > nulike[j]) { nulike[j] = lambda * probcat[k - 1] * like[k - 1]; mp[i][j] = k; } } } if ((ally[i] > 0) && (location[ally[i]-1] > 0)) nulike[j] *= contribution[location[ally[i] - 1] - 1][j]; sum += nulike[j]; } for (j = 0; j < rcategs; j++) nulike[j] /= sum; memcpy(like, nulike, rcategs * sizeof(double)); } mode = 0.0; mx = 1; for 
(i = 1; i <= rcategs; i++) { if (probcat[i - 1] * like[i - 1] > mode) { mx = i; mode = probcat[i - 1] * like[i - 1]; } } mx0 = mx; fprintf(fp, "Combination of categories that contributes the most to the likelihood:\n\n"); for (i = 1; i <= nmlngth + 3; i++) putc(' ', fp); for (i = 1; i <= sites; i++) { fprintf(fp, "%ld", mx); if (i % 10 == 0) putc(' ', fp); if (i % 60 == 0 && i != sites) { putc('\n', fp); for (j = 1; j <= nmlngth + 3; j++) putc(' ', fp); } mx = mp[i - 1][mx - 1]; } fprintf(fp, "\n\n"); marginal = (double **) Malloc( sites*sizeof(double *)); for (i = 0; i < sites; i++) marginal[i] = (double *) Malloc( rcategs*sizeof(double)); for (i = 0; i < rcategs; i++) like[i] = 1.0; for (i = sites - 1; i >= 0; i--) { sum = 0.0; for (j = 0; j < rcategs; j++) { nulike[j] = (lambda1 + lambda * probcat[j]) * like[j]; for (k = 1; k <= rcategs; k++) { if (k != j + 1) nulike[j] += lambda * probcat[k - 1] * like[k - 1]; } if ((ally[i] > 0) && (location[ally[i]-1] > 0)) nulike[j] *= contribution[location[ally[i] - 1] - 1][j]; sum += nulike[j]; } for (j = 0; j < rcategs; j++) { nulike[j] /= sum; marginal[i][j] = nulike[j]; } memcpy(like, nulike, rcategs * sizeof(double)); } for (i = 0; i < rcategs; i++) like[i] = 1.0; for (i = 0; i < sites; i++) { sum = 0.0; for (j = 0; j < rcategs; j++) { nulike[j] = (lambda1 + lambda * probcat[j]) * like[j]; for (k = 1; k <= rcategs; k++) { if (k != j + 1) nulike[j] += lambda * probcat[k - 1] * like[k - 1]; } marginal[i][j] *= like[j] * probcat[j]; sum += nulike[j]; } for (j = 0; j < rcategs; j++) nulike[j] /= sum; memcpy(like, nulike, rcategs * sizeof(double)); sum = 0.0; for (j = 0; j < rcategs; j++) sum += marginal[i][j]; for (j = 0; j < rcategs; j++) marginal[i][j] /= sum; } fprintf( fp, "Most probable category at each site if > 0.95 probability " "(\".\" otherwise)\n\n" ); for (i = 1; i <= nmlngth + 3; i++) putc(' ', fp); for (i = 0; i < sites; i++) { mm = 0; sum = 0.0; for (j = 0; j < rcategs; j++) if (marginal[i][j] > sum) { sum = 
marginal[i][j]; mm = j; } if (sum >= 0.95) fprintf(fp, "%ld", mm+1); else putc('.', fp); if ((i+1) % 60 == 0) { if (i != 0) { putc('\n', fp); for (j = 1; j <= nmlngth + 3; j++) putc(' ', fp); } } else if ((i+1) % 10 == 0) putc(' ', fp); } putc('\n', fp); for (i = 0; i < sites; i++) free(marginal[i]); free(marginal); } putc('\n', fp); putc('\n', fp); if (hypstate) { fprintf(fp, "Probable sequences at interior nodes:\n\n"); fprintf(fp, " node "); for (i = 0; (i < 13) && (i < ((sites + (sites-1)/10 - 39) / 2)); i++) putc(' ', fp); fprintf(fp, "Reconstructed sequence (caps if > 0.95)\n\n"); if (!rctgry || (rcategs == 1)) mx0 = 1; for (i = 0; i < sites; i += 60) { k = i + 59; if (k >= sites) k = sites - 1; rectrav(curtree.root, i, k); putc('\n', fp); mx0 = mx1; } } for (i = 0; i < sites; ++i) free(mp[i]); free(mp); } /* summarize */ static void dnamlk_treeout(node *p) { /* write out file with representation of final tree */ node *sib_ptr; long i, n, w, num_sibs; Char c; double x; if (p->tip) { n = 0; for (i = 1; i <= nmlngth; i++) { if (nayme[p->index - 1][i - 1] != ' ') n = i; } for (i = 0; i < n; i++) { c = nayme[p->index - 1][i]; if (c == ' ') c = '_'; putc(c, outtree); } col += n; } else { sib_ptr = p; num_sibs = count_sibs(p); putc('(', outtree); col++; for (i=0; i < (num_sibs - 1); i++) { sib_ptr = sib_ptr->next; dnamlk_treeout(sib_ptr->back); putc(',', outtree); col++; if (col > 55) { putc('\n', outtree); col = 0; } } sib_ptr = sib_ptr->next; dnamlk_treeout(sib_ptr->back); putc(')', outtree); col++; } if (p == curtree.root) { fprintf(outtree, ";\n"); return; } x = fracchange * (p->tyme - curtree.nodep[p->back->index - 1]->tyme); if (x > 0.0) w = (long)(0.4342944822 * log(x)); else if (x == 0.0) w = 0; else w = (long)(0.4342944822 * log(-x)) + 1; if (w < 0) w = 0; fprintf(outtree, ":%*.5f", (int)(w + 7), x); col += w + 8; } /* dnamlk_treeout */ static void init_tymes(node *p, double minlength) { /* Set all node tymes closest to the tips but with no branches 
shorter than
   * minlength */
  long i, num_sibs;
  node *sib_ptr, *sib_back_ptr;

  /* traverse to set up times in subtrees */
  if (p->tip)
    return;

  sib_ptr = p;
  num_sibs = count_sibs(p);
  for (i=0 ; i < num_sibs; i++) {
    sib_ptr = sib_ptr->next;
    sib_back_ptr = sib_ptr->back;
    init_tymes(sib_back_ptr, minlength);
  }

  /* set time at this node */
  setnodetymes(p, min_child_tyme(p) - minlength);
}  /* init_tymes */


static void treevaluate()
{
  /* evaluate likelihood of tree, after iterating branch lengths */
  long i;

  if ( !usertree || (usertree && !lngths) ) {
    polishing = true;
    smoothit = true;
    for (i = 0; i < smoothings; ) {
      if ( !smooth(curtree.root) )
        i++;
    }
  }
  dnamlk_evaluate(curtree.root);
}  /* treevaluate */


static void maketree()
{
  /* constructs a binary tree from the pointers in curtree.nodep,
     adds each node at location which yields highest likelihood
     then rearranges the tree for greatest likelihood */
  long i, j, numtrees;
  node *item, *nufork, *dummy, *q, *root=NULL;
  boolean succeded, dummy_haslengths, dummy_first, goteof;
  long max_nonodes;  /* Maximum number of nodes required to
                      * express all species in a bifurcating tree
                      * */
  long nextnode;
  pointarray dummy_treenode=NULL;
  double oldbest;
  node *tmp;

  inittable();

  if (!usertree) {
    for (i = 1; i <= spp; i++)
      enterorder[i - 1] = i;
    if (jumble)
      randumize(seed, enterorder);
    curtree.root = curtree.nodep[spp];
    curtree.root->back = NULL;
    for (i = 0; i < spp; i++)
      curtree.nodep[i]->back = NULL;
    for (i = spp; i < nonodes; i++) {
      q = curtree.nodep[i];
      q->back = NULL;
      while ((q = q->next) != curtree.nodep[i])
        q->back = NULL;
    }
    polishing = false;
    dnamlk_add(curtree.nodep[enterorder[0] - 1],
               curtree.nodep[enterorder[1] - 1],
               curtree.nodep[spp]);
    if (progress) {
      printf("\nAdding species:\n");
      writename(0, 2, enterorder);
#ifdef WIN32
      phyFillScreenColor();
#endif
    }
    lastsp = false;
    smoothit = false;
    for (i = 3; i <= spp; i++) {
      bestree.likelihood = UNDEFINED;
      bestyet = UNDEFINED;
      there = curtree.root;
      item = curtree.nodep[enterorder[i - 1] - 1];
      nufork = curtree.nodep[spp + i - 2];
      lastsp = (i == spp);
      addpreorder(curtree.root, item, nufork, true, true);
      dnamlk_add(there, item, nufork);
      like = dnamlk_evaluate(curtree.root);
      copy_(&curtree, &bestree, nonodes, rcategs);
      rearrange(&curtree.root);
      if (curtree.likelihood > bestree.likelihood) {
        copy_(&curtree, &bestree, nonodes, rcategs);
      }
      if (progress) {
        writename(i - 1, 1, enterorder);
#ifdef WIN32
        phyFillScreenColor();
#endif
      }
      if (lastsp && global) {
        /* perform global rearrangements */
        if (progress) {
          printf("Doing global rearrangements\n");
          printf(" !");
          for (j = 1; j <= nonodes; j++)
            if ( j % (( nonodes / 72 ) + 1 ) == 0 )
              putchar('-');
          printf("!\n");
        }
        global2 = false;
        do {
          succeded = false;
          if (progress)
            printf(" ");
          /* FIXME: tymes gets clobbered by tryadd() */
          /* save_tymes(&curtree, tymes); */
          for (j = 0; j < nonodes; j++) {
            oldbest = bestree.likelihood;
            bestyet = UNDEFINED;
            item = curtree.nodep[j];
            if (item != curtree.root) {
              nufork = pnode(&curtree, item->back); /* parent fork */
              if (nufork != curtree.root) {
                tmp = nufork->next->back;
                if (tmp == item)
                  tmp = nufork->next->next->back;
                /* can't figure out why we never get here */
              }
              else {
                if (nufork->next->back != item)
                  tmp = nufork->next->back;
                else
                  tmp = nufork->next->next->back;
              }
              /* if we add item at tmp we have done nothing */
              assert( all_tymes_valid(curtree.root, 0.98*MIN_BRANCH_LENGTH, false) );
              dnamlk_re_move(&item, &nufork, false);
              /* there = curtree.root; */
              there = tmp;
              addpreorder(curtree.root, item, nufork, true, true);
              if ( tmp != there && bestree.likelihood > oldbest)
                succeded = true;
              dnamlk_add(there, item, nufork);
              if (global2)
                restore_tymes(&curtree,tymes);
            }
            if (progress) {
              if ( j % (( nonodes / 72 ) + 1 ) == 0 )
                putchar('.');
              fflush(stdout);
            }
          }
          if (progress)
            putchar('\n');
        } while ( succeded );
      }
    }
    if (njumble > 1 && lastsp) {
      for (i = 0; i < spp; i++ )
        dnamlk_re_move(&curtree.nodep[i], &dummy, false);
      if (jumb == 1 || bestree2.likelihood < bestree.likelihood)
        copy_(&bestree,
&bestree2, nonodes, rcategs); } if (jumb == njumble) { if (njumble > 1) copy_(&bestree2, &curtree, nonodes, rcategs); else copy_(&bestree, &curtree, nonodes, rcategs); fprintf(outfile, "\n\n"); treevaluate(); curtree.likelihood = dnamlk_evaluate(curtree.root); if (treeprint) mlk_printree(outfile, &curtree); summarize(outfile); if (trout) { col = 0; dnamlk_treeout(curtree.root); } } } else { /* if ( usertree ) */ /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree, INTREE, "input tree file", "rb", progname, intreename); inittable_for_usertree (intree); numtrees = countsemic(&intree); if(numtrees > MAXSHIMOTREES) shimotrees = MAXSHIMOTREES; else shimotrees = numtrees; if (numtrees > 2) initseed(&inseed, &inseed0, seed); l0gl = (double *)Malloc(shimotrees * sizeof(double)); l0gf = (double **)Malloc(shimotrees * sizeof(double *)); for (i=0; i < shimotrees; ++i) l0gf[i] = (double *)Malloc(endsite * sizeof(double)); if (treeprint) { fprintf(outfile, "User-defined tree"); if (numtrees > 1) putc('s', outfile); fprintf(outfile, ":\n\n"); } fprintf(outfile, "\n\n"); which = 1; max_nonodes = nonodes; while (which <= numtrees) { /* These initializations required each time through the loop since multiple trees require re-initialization */ dummy_haslengths = true; nextnode = 0; dummy_first = true; goteof = false; lngths = lengthsopt; nonodes = max_nonodes; treeread(intree, &root, dummy_treenode, &goteof, &dummy_first, curtree.nodep, &nextnode, &dummy_haslengths, &grbg, initdnamlnode, false, nonodes); if (goteof && (which <= numtrees)) { /* if we hit the end of the file prematurely */ printf ("\n"); printf ("ERROR: trees missing at end of file.\n"); printf ("\tExpected number of trees:\t\t%ld\n", numtrees); printf ("\tNumber of trees actually in file:\t%ld.\n\n", which - 1); exxit(-1); } nonodes = nextnode; root = curtree.nodep[root->index - 1]; curtree.root = root; if (lngths) tymetrav(curtree.root); else init_tymes(curtree.root, initialv); 
      treevaluate();
      if (treeprint)
        mlk_printree(outfile, &curtree);
      summarize(outfile);
      if (trout) {
        col = 0;
        dnamlk_treeout(curtree.root);
      }
      if(which < numtrees){
        freex_notip(nonodes, curtree.nodep);
        gdispose(curtree.root, &grbg, curtree.nodep);
      }
      which++;
    }

    FClose(intree);
    if (!auto_ && numtrees > 1 && weightsum > 1 )
      standev2(numtrees, maxwhich, 0, endsite, maxlogl, l0gl, l0gf,
               aliasweight, seed);
  }

  if (jumb == njumble) {
    if (progress) {
      printf("\nOutput written to file \"%s\"\n", outfilename);
      if (trout)
        printf("\nTree also written onto file \"%s\"\n", outtreename);
    }
    free(contribution);
    freex(nonodes, curtree.nodep);
    if (!usertree) {
      freex(nonodes, bestree.nodep);
      if (njumble > 1)
        freex(nonodes, bestree2.nodep);
    }
  }
  free(root);
  freetable();
}  /* maketree */


/*?? Dnaml has a clean-up function for freeing memory, closing files, etc.
     Put one here too? */

int main(int argc, Char *argv[])
{  /* DNA Maximum Likelihood with molecular clock */

  /* Initialize mlclock.c */
  mlclock_init(&curtree, &dnamlk_evaluate);

#ifdef MAC
  argc = 1;  /* macsetup("Dnamlk", "Dnamlk"); */
  argv[0] = "Dnamlk";
#endif
  init(argc,argv);
  progname = argv[0];
  openfile(&infile, INFILE, "input file", "r", argv[0], infilename);
  openfile(&outfile, OUTFILE, "output file", "w", argv[0], outfilename);

  ibmpc = IBMCRT;
  ansi = ANSICRT;
  datasets = 1;
  mulsets = false;
  firstset = true;
  doinit();

  ttratio0 = ttratio;

  /* Open output tree, categories, and weights files if needed */
  if ( trout )
    openfile(&outtree, OUTTREE, "output tree file", "w", argv[0], outtreename);
  if ( ctgry )
    openfile(&catfile, CATFILE, "categories file", "r", argv[0], catfilename);
  if ( weights || justwts )
    openfile(&weightfile, WEIGHTFILE, "weights file", "r", argv[0],
             weightfilename);

  /* Data set loop */
  for ( ith = 1; ith <= datasets; ith++ ) {
    ttratio = ttratio0;
    if ( datasets > 1 ) {
      fprintf(outfile, "Data set # %ld:\n\n", ith);
      if ( progress )
        printf("\nData set # %ld:\n", ith);
    }
    getinput();

    if ( ith == 1 )
      firstset = false;

    /* Jumble loop
*/ if (usertree) maketree(); else for (jumb = 1; jumb <= njumble; jumb++) maketree(); } /* Close files */ FClose(infile); FClose(outfile); if ( trout ) FClose(outtree); if ( ctgry ) FClose(catfile); if ( weights || justwts ) FClose(weightfile); #ifdef MAC fixmacfile(outfilename); fixmacfile(outtreename); #endif printf("\nDone.\n\n"); #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } /* DNA Maximum Likelihood with molecular clock */ phylip-3.697/src/dnamove.c0000644004732000473200000016531612406201116015136 0ustar joefelsenst_g #include "phylip.h" #include "moves.h" #include "seq.h" /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ #define maxsz 999 /* size of pointer array for the undo trees */ /* this can be large without eating memory */ typedef struct treeset_t { node *root; pointarray nodep; pointarray treenode; long nonodes; boolean waswritten, hasmult, haslengths, nolengths, initialized; } treeset_t; treeset_t treesets[2]; node **treeone, **treetwo; typedef enum { horiz, vert, up, overt, upcorner, midcorner, downcorner, aa, cc, gg, tt, question } chartype; typedef enum { rearr, flipp, reroott, none } rearrtype; typedef struct gbase2 { baseptr2 base2; struct gbase2 *next; } gbase2; typedef enum { arb, use, spec } howtree; typedef enum {beforenode, atnode} movet; movet fromtype; typedef node **pointptr; #ifndef OLDC /* function prototypes */ void dnamove_gnu(gbases **); void dnamove_chuck(gbases *); void getoptions(void); void inputoptions(void); void allocrest(void); void doinput(void); void configure(void); void prefix(chartype); void postfix(chartype); void makechar(chartype); void dnamove_add(node *, node *, node *); void dnamove_re_move(node **, node **); void evaluate(node *); void dnamove_reroot(node *); void firstrav(node *, long); void dnamove_hyptrav(node *, long *, long, boolean *); void grwrite(chartype, long, long *); void dnamove_drawline(long); void dnamove_printree(void); void arbitree(void); void yourtree(void); void initdnamovenode(node **, node **, node *, long, long, long *, long *, initops, pointarray, pointarray, Char *, Char *, FILE *); void buildtree(void); void setorder(void); void mincomp(void); void rearrange(void); void dnamove_nextinc(void); void dnamove_nextchar(void); void dnamove_prevchar(void); void dnamove_show(void); void tryadd(node *, node **, node **, double *); void addpreorder(node *, node *, node *, double *); void try(void); void undo(void); void treewrite(boolean); void clade(void); void flip(long); void changeoutgroup(void); void redisplay(void); void treeconstruct(void); void maketriad(node **, long); void newdnamove_hyptrav(node *, long *, 
long, long, boolean, pointarray); void prepare_node(node *p); void dnamove_copynode(node *fromnode, node *tonode); node *copytrav(node *p); void numdesctrav(node *p); void chucktree(node *p); void copytree(void); void makeweights(void); void add_at(node *below, node *newtip, node *newfork); void add_before(node *atnode, node *newtip); void add_child(node *parent, node *newchild); void newdnamove_hypstates(long chars, node *root, pointarray treenode); void consolidatetree(long index); void fliptrav(node *p, boolean recurse); /* function prototypes */ #endif char infilename[FNMLNGTH],intreename[FNMLNGTH],outtreename[FNMLNGTH], weightfilename[FNMLNGTH]; node *root; long chars, screenlines, col, treelines, leftedge, topedge, vmargin, hscroll, vscroll, scrollinc, screenwidth, farthest, whichtree, othertree; boolean weights, thresh, waswritten; boolean usertree, goteof, firsttree, haslengths; /*treeread variables*/ pointarray nodep; /*treeread variables*/ node *grbg = NULL; /*treeread variables*/ long *zeros; /*treeread variables*/ pointptr treenode; /* pointers to all nodes in tree */ double threshold; double *threshwt; boolean reversed[(long)question - (long)horiz + 1]; boolean graphic[(long)question - (long)horiz + 1]; unsigned char chh[(long)question - (long)horiz + 1]; howtree how; gbases *garbage; char *progname; /* Local variables for treeconstruct, propagated global for C version: */ long dispchar, atwhat, what, fromwhere, towhere, oldoutgrno, compatible; double like, bestyet, gotlike; boolean display, newtree, changed, subtree, written, oldwritten, restoring, wasleft, oldleft, earlytree; steptr necsteps; boolean *in_tree; long sett[31]; steptr numsteps; node *nuroot; rearrtype lastop; Char ch; boolean *names; void maketriad(node **p, long index) { /* Initiate an internal node with stubs for two children */ long i, j; node *q; q = NULL; for (i = 1; i <= 3; i++) { gnu(&grbg, p); (*p)->index = index; (*p)->hasname = false; (*p)->haslength = false; 
(*p)->deleted=false; (*p)->deadend=false; (*p)->onebranch=false; (*p)->onebranchhaslength=false; if(!(*p)->base) (*p)->base = (baseptr)Malloc(chars*sizeof(long)); if(!(*p)->numnuc) (*p)->numnuc = (nucarray *)Malloc(endsite*sizeof(nucarray)); if(!(*p)->numsteps) (*p)->numsteps = (steptr)Malloc(endsite*sizeof(long)); for (j = 0; j < MAXNCH; j++) (*p)->nayme[j] = '\0'; (*p)->next = q; q = *p; } (*p)->next->next->next = *p; q = (*p)->next; while (*p != q) { (*p)->back = NULL; (*p)->tip = false; *p = (*p)->next; } treenode[index - 1] = *p; } /* maketriad */ void prepare_node(node *p) { /* This function allocates the base, numnuc and numsteps arrays for a node. Because a node can change roles between tip, internal and ring member, all nodes need to have these in case they are used. */ p->base = (baseptr)Malloc(chars*sizeof(long)); p->numnuc = (nucarray *)Malloc(endsite*sizeof(nucarray)); p->numsteps = (steptr)Malloc(endsite*sizeof(long)); } /* prepare_node */ void dnamove_gnu(gbases **p) { /* this and the following are do-it-yourself garbage collectors. Make a new node or pull one off the garbage list */ if (garbage != NULL) { *p = garbage; garbage = garbage->next; } else { *p = (gbases *)Malloc(sizeof(gbases)); (*p)->base = (baseptr2)Malloc(chars*sizeof(long)); } (*p)->next = NULL; } /* dnamove_gnu */ void dnamove_chuck(gbases *p) { /* collect garbage on p -- put it on front of garbage list */ p->next = garbage; garbage = p; } /* dnamove_chuck */ void dnamove_copynode(node *fromnode, node *tonode) { /* Copy the contents of a node from fromnode to tonode. 
*/ int i = 0; /* printf("copynode: fromnode = %d, tonode = %d\n", fromnode->index,tonode->index); printf("copynode: fromnode->base = %ld, tonode->base = %ld\n", fromnode->base,tonode->base); */ memcpy(tonode->base, fromnode->base, chars*sizeof(long)); /* printf("copynode: fromnode->numnuc = %ld, tonode->numnuc = %ld\n", fromnode->numnuc,tonode->numnuc); */ if (fromnode->numnuc != NULL) memcpy(tonode->numnuc, fromnode->numnuc, endsite*sizeof(nucarray)); if (fromnode->numsteps != NULL) memcpy(tonode->numsteps, fromnode->numsteps, endsite*sizeof(long)); tonode->numdesc = fromnode->numdesc; tonode->state = fromnode->state; tonode->index = fromnode->index; tonode->tip = fromnode->tip; for (i = 0; i < MAXNCH; i++) tonode->nayme[i] = fromnode->nayme[i]; } /* dnamove_copynode */ node *copytrav(node *p) { /* Traverse the tree from p on down, copying nodes to the other tree */ node *q, *newnode, *newnextnode, *temp; gnu(&grbg, &newnode); if(!newnode->base) newnode->base = (baseptr)Malloc(chars*sizeof(long)); if(!newnode->numnuc) newnode->numnuc = (nucarray *)Malloc(endsite*sizeof(nucarray)); if(!newnode->numsteps) newnode->numsteps = (steptr)Malloc(endsite*sizeof(long)); dnamove_copynode(p,newnode); if (treenode[p->index-1] == p) treesets[othertree].treenode[p->index-1] = newnode; /* if this is a tip, return now */ if (p->tip) return newnode; /* go around the ring, copying as we go */ q = p->next; gnu(&grbg, &newnextnode); if(!newnextnode->base) newnextnode->base = (baseptr)Malloc(chars*sizeof(long)); if(!newnextnode->numnuc) newnextnode->numnuc = (nucarray *)Malloc(endsite*sizeof(nucarray)); if(!newnextnode->numsteps) newnextnode->numsteps = (steptr)Malloc(endsite*sizeof(long)); dnamove_copynode(q, newnextnode); newnode->next = newnextnode; do { newnextnode->back = copytrav(q->back); newnextnode->back->back = newnextnode; q = q->next; if (q == p) newnextnode->next = newnode; else { temp = newnextnode; gnu(&grbg, &newnextnode); if(!newnextnode->base) newnextnode->base = 
(baseptr)Malloc(chars*sizeof(long)); if(!newnextnode->numnuc) newnextnode->numnuc = (nucarray *)Malloc(endsite*sizeof(nucarray)); if(!newnextnode->numsteps) newnextnode->numsteps = (steptr)Malloc(endsite*sizeof(long)); dnamove_copynode(q, newnextnode); temp->next = newnextnode; } } while (q != p); return newnode; } /* copytrav */ void numdesctrav(node *p) { node *q; long childcount = 0; if (p->tip) { p->numdesc = 0; return; } q = p->next; do { numdesctrav(q->back); childcount++; q = q->next; } while (q != p); p->numdesc = childcount; } /* numdesctrav */ void chucktree(node *p) { /* recursively run through a tree and chuck all of its nodes, putting them on the garbage list */ int i, numNodes = 1; node *q, *r; /* base case -- tip */ if(p->tip){ chuck(&grbg, p); return; } /* recursively call chucktree on all descendants */ q = p->next; while(q != p){ chucktree(q->back); numNodes++; q = q->next; } /* now chuck all sub-nodes in the node ring */ for(i=0 ; i < numNodes ; i++){ r = q->next; chuck(&grbg, q); q = r; } } /* chucktree */ void copytree() { /* Make a complete copy of the current tree for undo purposes */ if (whichtree == 1) othertree = 0; else othertree = 1; if(treesets[othertree].root){ chucktree(treesets[othertree].root); } treesets[othertree].root = copytrav(root); treesets[othertree].nonodes = nonodes; treesets[othertree].waswritten = waswritten; treesets[othertree].initialized = true; } /* copytree */ void getoptions() { /* interactively set options */ Char ch; boolean done, gotopt; long loopcount; how = arb; usertree = false; goteof = false; outgrno = 1; outgropt = false; thresh = false; weights = false; interleaved = true; loopcount = 0; do { cleerhome(); printf("\nInteractive DNA parsimony, version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" O Outgroup root?"); if (outgropt) printf(" Yes, at sequence number%3ld\n", outgrno); else printf(" No, use as outgroup species%3ld\n", outgrno); printf(" W Sites weighted? %s\n", (weights ? 
"Yes" : "No")); printf(" T Use Threshold parsimony?"); if (thresh) printf(" Yes, count up to%4.1f per site\n", threshold); else printf(" No, use ordinary parsimony\n"); printf(" I Input sequences interleaved? %s\n", (interleaved ? "Yes" : "No, sequential")); printf(" U Initial tree (arbitrary, user, specify)? %s\n", (how == arb) ? "Arbitrary" : (how == use) ? "User tree from tree file" : "Tree you specify"); printf(" 0 Graphics type (IBM PC, ANSI, none)? %s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" S Width of terminal screen?"); printf("%4ld\n", screenwidth); printf(" L Number of lines on screen?%4ld\n",screenlines); printf("\nAre these settings correct? "); printf("(type Y or the letter for one to change)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); done = (ch == 'Y'); gotopt = (strchr("SOTIU0WL",ch) != NULL) ? true : false; if (gotopt) { switch (ch) { case 'O': outgropt = !outgropt; if (outgropt) initoutgroup(&outgrno, spp); break; case 'T': thresh = !thresh; if (thresh) initthreshold(&threshold); break; case 'I': interleaved = !interleaved; break; case 'W': weights = !weights; break; case 'U': if (how == arb){ how = use; usertree = 1;} else if (how == use){ how = spec; usertree = 0;} else how = arb; break; case '0': initterminal(&ibmpc, &ansi); break; case 'S': screenwidth= readlong("Width of terminal screen (in characters)?\n"); break; case 'L': initnumlines(&screenlines); break; } } if (!(gotopt || done)) printf("Not a possible option!\n"); countup(&loopcount, 100); } while (!done); if (scrollinc < screenwidth / 2.0) hscroll = scrollinc; else hscroll = screenwidth / 2; if (scrollinc < screenlines / 2.0) vscroll = scrollinc; else vscroll = screenlines / 2; } /* getoptions */ void inputoptions() { /* input the information on the options */ long i; for (i = 0; i < (chars); i++) weight[i] = 1; if (weights){ inputweights(chars, weight, &weights); printweights(stdout, 0, chars, 
weight, "Sites"); } if (!thresh) threshold = spp; for (i = 0; i < (chars); i++) threshwt[i] = threshold * weight[i]; } /* inputoptions */ void allocrest() { long i; nayme = (naym *)Malloc(spp*sizeof(naym)); in_tree = (boolean *)Malloc(nonodes*sizeof(boolean)); weight = (steptr)Malloc(chars*sizeof(long)); numsteps = (steptr)Malloc(chars*sizeof(long)); necsteps = (steptr)Malloc(chars*sizeof(long)); threshwt = (double *)Malloc(chars*sizeof(double)); alias = (long *)Malloc(chars*sizeof(long)); /* from dnapars */ ally = (long *)Malloc(chars*sizeof(long)); /* from dnapars */ y = (Char **)Malloc(spp*sizeof(Char *)); /* from dnapars */ for (i = 0; i < spp; i++) /* from dnapars */ y[i] = (Char *)Malloc(chars*sizeof(Char)); /* from dnapars */ location = (long *)Malloc(chars*sizeof(long)); /* from dnapars */ } /* allocrest */ void makeweights() { /* make up weights vector to avoid duplicate computations */ long i; for (i = 1; i <= chars; i++) { alias[i - 1] = i; ally[i - 1] = i; } endsite = 0; for (i = 1; i <= chars; i++) { if (ally[i - 1] == i) endsite++; } for (i = 1; i <= endsite; i++) location[alias[i - 1] - 1] = i; if (!thresh) threshold = spp; zeros = (long *)Malloc(endsite*sizeof(long)); for (i = 0; i < endsite; i++) zeros[i] = 0; } /* makeweights */ void doinput() { /* reads the input data */ inputnumbers(&spp, &chars, &nonodes, 1); printf("%2ld species, %3ld sites\n", spp, chars); getoptions(); printf("\nReading input file ...\n\n"); if (weights) openfile(&weightfile,WEIGHTFILE,"weights file","r",progname,weightfilename); allocrest(); inputoptions(); alloctree(&treenode, nonodes, usertree); setuptree(treenode, nonodes, usertree); inputdata(chars); makeweights(); makevalues(treenode, zeros, usertree); } /* doinput */ void configure() { /* configure to machine -- set up special characters */ chartype a; for (a = horiz; (long)a <= (long)question; a = (chartype)((long)a + 1)) reversed[(long)a] = false; for (a = horiz; (long)a <= (long)question; a = (chartype)((long)a + 
1)) graphic[(long)a] = false; if (ibmpc) { chh[(long)horiz] = 205; graphic[(long)horiz] = true; chh[(long)vert] = 186; graphic[(long)vert] = true; chh[(long)up] = 186; graphic[(long)up] = true; chh[(long)overt] = 205; graphic[(long)overt] = true; chh[(long)upcorner] = 200; graphic[(long)upcorner] = true; chh[(long)midcorner] = 204; graphic[(long)midcorner] = true; chh[(long)downcorner] = 201; graphic[(long)downcorner] = true; chh[(long)aa] = 176; chh[(long)cc] = 178; chh[(long)gg] = 177; chh[(long)tt] = 219; chh[(long)question] = '\001'; return; } if (ansi) { chh[(long)horiz] = ' '; reversed[(long)horiz] = true; chh[(long)vert] = chh[(long)horiz]; reversed[(long)vert] = true; chh[(long)up] = 'x'; graphic[(long)up] = true; chh[(long)overt] = 'q'; graphic[(long)overt] = true; chh[(long)upcorner] = 'm'; graphic[(long)upcorner] = true; chh[(long)midcorner] = 't'; graphic[(long)midcorner] = true; chh[(long)downcorner] = 'l'; graphic[(long)downcorner] = true; chh[(long)aa] = 'a'; reversed[(long)aa] = true; chh[(long)cc] = 'c'; reversed[(long)cc] = true; chh[(long)gg] = 'g'; reversed[(long)gg] = true; chh[(long)tt] = 't'; reversed[(long)tt] = true; chh[(long)question] = '?'; reversed[(long)question] = true; return; } chh[(long)horiz] = '='; chh[(long)vert] = ' '; chh[(long)up] = '!'; chh[(long)upcorner] = '`'; chh[(long)midcorner] = '+'; chh[(long)downcorner] = ','; chh[(long)overt] = '-'; chh[(long)aa] = 'a'; chh[(long)cc] = 'c'; chh[(long)gg] = 'g'; chh[(long)tt] = 't'; chh[(long)question] = '.'; } /* configure */ void prefix(chartype a) { /* give prefix appropriate for this character */ if (reversed[(long)a]) prereverse(ansi); if (graphic[(long)a]) pregraph2(ansi); } /* prefix */ void postfix(chartype a) { /* give postfix appropriate for this character */ if (reversed[(long)a]) postreverse(ansi); if (graphic[(long)a]) postgraph2(ansi); } /* postfix */ void makechar(chartype a) { /* print out a character with appropriate prefix or postfix */ prefix(a); 
putchar(chh[(long)a]); postfix(a); } /* makechar */ void add_at(node *below, node *newtip, node *newfork) { /* inserts the node newfork and its left descendant, newtip, into the tree. below becomes newfork's right descendant */ node *leftdesc, *rtdesc; if (below != treenode[below->index - 1]) below = treenode[below->index - 1]; if (newfork == NULL) { nonodes++; maketriad (&newfork, nonodes); } if (below->back != NULL) { below->back->back = newfork; } newfork->back = below->back; leftdesc = newtip; rtdesc = below; rtdesc->back = newfork->next->next; newfork->next->next->back = rtdesc; newfork->next->back = leftdesc; leftdesc->back = newfork->next; if (root == below) root = newfork; root->back = NULL; } /* add_at */ void add_before(node *atnode, node *newtip) { /* inserts the node newtip together with its ancestral fork into the tree next to the node atnode. */ node *q; if (atnode != treenode[atnode->index - 1]) atnode = treenode[atnode->index - 1]; q = treenode[newtip->index-1]->back; if (q != NULL) { q = treenode[q->index-1]; if (newtip == q->next->next->back) { q->next->back = newtip; newtip->back = q->next; q->next->next->back = NULL; } } if (newtip->back != NULL) { add_at(atnode, newtip, treenode[newtip->back->index-1]); } else { add_at(atnode, newtip, NULL); } } /* add_before */ void add_child(node *parent, node *newchild) { /* adds the node newchild into the tree as the last child of parent */ int i; node *newnode, *q; if (parent != treenode[parent->index - 1]) parent = treenode[parent->index - 1]; gnu(&grbg, &newnode); newnode->tip = false; newnode->deleted=false; newnode->deadend=false; newnode->onebranch=false; newnode->onebranchhaslength=false; for (i = 0; i < MAXNCH; i++) newnode->nayme[i] = '\0'; newnode->index = parent->index; if(!newnode->base) newnode->base = (baseptr)Malloc(chars*sizeof(long)); if(!newnode->numnuc) newnode->numnuc = (nucarray *)Malloc(endsite*sizeof(nucarray)); if(!newnode->numsteps) newnode->numsteps = (steptr)Malloc(endsite*sizeof(long)); q = parent; do { q = 
q->next; } while (q->next != parent); newnode->next = parent; q->next = newnode; newnode->back = newchild; newchild->back = newnode; if (newchild->haslength) { newnode->length = newchild->length; newnode->haslength = true; } else newnode->haslength = false; } /* add_child */ void dnamove_add(node *below, node *newtip, node *newfork) { /* inserts the nodes newfork and its left descendant, newtip, to the tree. below becomes newfork's right descendant */ boolean putleft; node *leftdesc, *rtdesc; if (below != treenode[below->index - 1]) below = treenode[below->index - 1]; if (below->back != NULL) below->back->back = newfork; newfork->back = below->back; putleft = true; if (restoring) putleft = wasleft; if (putleft) { leftdesc = newtip; rtdesc = below; } else { leftdesc = below; rtdesc = newtip; } rtdesc->back = newfork->next->next; newfork->next->next->back = rtdesc; newfork->next->back = leftdesc; leftdesc->back = newfork->next; if (root == below) root = newfork; root->back = NULL; newfork->numdesc = 2; } /* dnamove_add */ void dnamove_re_move(node **item, node **fork) { /* Removes node item from the tree. If item has one sibling, removes its ancestor, fork, from the tree as well and attach item's sib to fork's ancestor. In this case, it returns a pointer to the removed fork node which is still attached to item. 
*/ node *p=NULL, *q; int nodecount; if ((*item)->back == NULL) { *fork = NULL; return; } *fork = treenode[(*item)->back->index - 1]; nodecount = 0; if ((*fork)->next->back == *item) p = *fork; q = (*fork)->next; do { nodecount++; if (q->next->back == *item) p = q; q = q->next; } while (*fork != q); if (nodecount > 2) { fromtype = atnode; p->next = (*item)->back->next; chuck(&grbg, (*item)->back); (*item)->back = NULL; *fork = NULL; } else { /* traditional (binary tree) remove code */ if (*item == (*fork)->next->back) { if (root == *fork) root = (*fork)->next->next->back; } else { if (root == *fork) root = (*fork)->next->back; } fromtype = beforenode; /* stitch nodes together, leaving out item */ p = (*item)->back->next->back; q = (*item)->back->next->next->back; if (p != NULL) p->back = q; if (q != NULL) q->back = p; if (haslengths) { if (p != NULL && q != NULL) { p->length += q->length; q->length = p->length; } else (*item)->length = (*fork)->next->length + (*fork)->next->next->length; } (*fork)->back = NULL; p = (*fork)->next; while (p != *fork) { p->back = NULL; p = p->next; } (*item)->back = NULL; } /* endif nodecount > 2 else */ } /* dnamove_re_move */ void evaluate(node *r) { /* determines the number of steps needed for a tree. 
this is the minimum number of steps needed to evolve sequences on this tree */ long i, steps; double sum; compatible = 0; sum = 0.0; for (i = 0; i < (chars); i++) numsteps[i] = 0; /* set numdesc at each node to reflect current number of descendants */ numdesctrav(root); postorder(r); for (i = 0; i < endsite; i++) { steps = r->numsteps[i]; if (steps <= threshwt[i]) { sum += steps; } else { sum += threshwt[i]; } if (steps <= necsteps[i] && !earlytree) compatible += weight[i]; } like = -sum; } /* evaluate */ void dnamove_reroot(node *outgroup) { /* Reorient tree so that outgroup is by itself on the left of the root */ node *p, *q, *r; long nodecount = 0; double templen; if(outgroup->back->index == root->index) return; q = root->next; do { /* when this loop exits, p points to the internal */ p = q; /* node to the right of root */ nodecount++; q = p->next; } while (q != root); r = p; /* reorient nodep array The nodep array must point to the ring member of each ring that is closest to the root. The while loop changes the ring member pointed to by treenode[] for those nodes that will have their orientation changed by the reroot operation. */ p = outgroup->back; while (p->index != root->index) { q = treenode[p->index - 1]->back; treenode[p->index - 1] = p; p = q; } if (nodecount > 2) treenode[p->index - 1] = p; /* If nodecount > 2, the current node ring to which root is pointing will remain in place and root will point somewhere else. 
*/ /* detach root from old location */ if (nodecount > 2) { r->next = root->next; root->next = NULL; nonodes++; maketriad(&root, nonodes); if (haslengths) { /* root->haslength remains false, or else treeout() will generate a bogus extra length */ root->next->haslength = true; root->next->next->haslength = true; } } else { /* if (nodecount > 2) else */ q = root->next; q->back->back = r->back; r->back->back = q->back; if (haslengths) { r->back->length = r->back->length + q->back->length; q->back->length = r->back->length; } } /* if (nodecount > 2) endif */ /* tie root into new location */ root->next->back = outgroup; root->next->next->back = outgroup->back; outgroup->back->back = root->next->next; outgroup->back = root->next; /* place root equidistant between left child (outgroup) and right child by dividing outgroup's length */ if (haslengths) { templen = outgroup->length / 2.0; outgroup->length = templen; outgroup->back->length = templen; root->next->next->length = templen; root->next->next->back->length = templen; } } /* dnamove_reroot */ void newdnamove_hyptrav(node *r_, long *hypset_, long b1, long b2, boolean bottom_, pointarray treenode) { /* compute, print out states at one interior node */ struct LOC_hyptrav Vars; long i, j, k; long largest; gbases *ancset; nucarray *tempnuc; node *p, *q; Vars.bottom = bottom_; Vars.r = r_; Vars.hypset = hypset_; dnamove_gnu(&ancset); tempnuc = (nucarray *)Malloc(endsite*sizeof(nucarray)); Vars.maybe = false; Vars.nonzero = false; if (!Vars.r->tip) zeronumnuc(Vars.r, endsite); for (i = b1 - 1; i < b2; i++) { j = location[ally[i] - 1]; Vars.anc = Vars.hypset[j - 1]; if (!Vars.r->tip) { p = Vars.r->next; for (k = (long)A; k <= (long)O; k++) if (Vars.anc & (1 << k)) Vars.r->numnuc[j - 1][k]++; do { for (k = (long)A; k <= (long)O; k++) if (p->back->base[j - 1] & (1 << k)) Vars.r->numnuc[j - 1][k]++; p = p->next; } while (p != Vars.r); largest = getlargest(Vars.r->numnuc[j - 1]); Vars.tempset = 0; for (k = (long)A; k <= (long)O; 
k++) { if (Vars.r->numnuc[j - 1][k] == largest) Vars.tempset |= (1 << k); } Vars.r->base[j - 1] = Vars.tempset; } if (!Vars.bottom) Vars.anc = treenode[Vars.r->back->index - 1]->base[j - 1]; Vars.nonzero = (Vars.nonzero || (Vars.r->base[j - 1] & Vars.anc) == 0); Vars.maybe = (Vars.maybe || Vars.r->base[j - 1] != Vars.anc); } j = location[ally[dispchar - 1] - 1]; Vars.tempset = Vars.r->base[j - 1]; Vars.anc = Vars.hypset[j - 1]; if (!Vars.bottom) Vars.anc = treenode[Vars.r->back->index - 1]->base[j - 1]; r_->state = '?'; if (Vars.tempset == (1 << A)) r_->state = 'A'; if (Vars.tempset == (1 << C)) r_->state = 'C'; if (Vars.tempset == (1 << G)) r_->state = 'G'; if (Vars.tempset == (1 << T)) r_->state = 'T'; Vars.bottom = false; if (!Vars.r->tip) { memcpy(tempnuc, Vars.r->numnuc, endsite*sizeof(nucarray)); q = Vars.r->next; do { memcpy(Vars.r->numnuc, tempnuc, endsite*sizeof(nucarray)); for (i = b1 - 1; i < b2; i++) { j = location[ally[i] - 1]; for (k = (long)A; k <= (long)O; k++) if (q->back->base[j - 1] & (1 << k)) Vars.r->numnuc[j - 1][k]--; largest = getlargest(Vars.r->numnuc[j - 1]); ancset->base[j - 1] = 0; for (k = (long)A; k <= (long)O; k++) if (Vars.r->numnuc[j - 1][k] == largest) ancset->base[j - 1] |= (1 << k); if (!Vars.bottom) Vars.anc = ancset->base[j - 1]; } newdnamove_hyptrav(q->back, ancset->base, b1, b2, Vars.bottom, treenode); q = q->next; } while (q != Vars.r); } dnamove_chuck(ancset); } /* newdnamove_hyptrav */ void newdnamove_hypstates(long chars, node *root, pointarray treenode) { /* fill in and describe states at interior nodes */ /* used in dnacomp, dnapars, & dnapenny */ long i, n; baseptr nothing; /* garbage is passed along without usage to newdnamove_hyptrav, which also does not use it. 
*/ nothing = (baseptr)Malloc(endsite*sizeof(long)); for (i = 0; i < endsite; i++) nothing[i] = 0; for (i = 1; i <= ((chars - 1) / 40 + 1); i++) { putc('\n', outfile); n = i * 40; if (n > chars) n = chars; newdnamove_hyptrav(root, nothing, i * 40 - 39, n, true, treenode); } free(nothing); } /* newdnamove_hypstates */ void grwrite(chartype c, long num, long *pos) { long i; prefix(c); for (i = 1; i <= num; i++) { if ((*pos) >= leftedge && (*pos) - leftedge + 1 < screenwidth) putchar(chh[(long)c]); (*pos)++; } postfix(c); } /* grwrite */ void dnamove_drawline(long i) { /* draws one row of the tree diagram by moving up tree */ node *p, *q, *r, *first =NULL, *last =NULL; long n, j, pos; boolean extra, done; Char st; chartype c, d; pos = 1; p = nuroot; q = nuroot; extra = false; if (i == p->ycoord && (p == root || subtree)) { extra = true; c = overt; if (display) { switch (p->state) { case 'A': c = aa; break; case 'C': c = cc; break; case 'G': c = gg; break; case 'T': c = tt; break; case '?': c = question; break; } } if ((subtree)) stwrite("Subtree:", 8, &pos, leftedge, screenwidth); if (p->index >= 100) nnwrite(p->index, 3, &pos, leftedge, screenwidth); else if (p->index >= 10) { grwrite(c, 1, &pos); nnwrite(p->index, 2, &pos, leftedge, screenwidth); } else { grwrite(c, 2, &pos); nnwrite(p->index, 1, &pos, leftedge, screenwidth); } } else { if ((subtree)) stwrite(" ", 10, &pos, leftedge, screenwidth); else stwrite(" ", 2, &pos, leftedge, screenwidth); } do { if (!p->tip) { r = p->next; done = false; do { if (i >= r->back->ymin && i <= r->back->ymax) { q = r->back; done = true; } r = r->next; } while (!(done || r == p)); first = p->next->back; r = p->next; while (r->next != p) r = r->next; last = r->back; } done = (p == q); n = p->xcoord - q->xcoord; if (n < 3 && !q->tip) n = 3; if (extra) { n--; extra = false; } if (q->ycoord == i && !done) { c = overt; if (q == first) d = downcorner; else if (q == last) d = upcorner; else if ((long)q->ycoord == (long)p->ycoord) d = c; 
else d = midcorner; if (display) { switch (q->state) { case 'A': c = aa; break; case 'C': c = cc; break; case 'G': c = gg; break; case 'T': c = tt; break; case '?': c = question; break; } d = c; } if (n > 1) { grwrite(d, 1, &pos); grwrite(c, n - 3, &pos); } if (q->index >= 100) nnwrite(q->index, 3, &pos, leftedge, screenwidth); else if (q->index >= 10) { grwrite(c, 1, &pos); nnwrite(q->index, 2, &pos, leftedge, screenwidth); } else { grwrite(c, 2, &pos); nnwrite(q->index, 1, &pos, leftedge, screenwidth); } extra = true; } else if (!q->tip) { if (last->ycoord > i && first->ycoord < i && i != p->ycoord) { c = up; if (i < p->ycoord) st = p->next->back->state; else st = p->next->next->back->state; if (display) { switch (st) { case 'A': c = aa; break; case 'C': c = cc; break; case 'G': c = gg; break; case 'T': c = tt; break; case '?': c = question; break; } } grwrite(c, 1, &pos); chwrite(' ', n - 1, &pos, leftedge, screenwidth); } else chwrite(' ', n, &pos, leftedge, screenwidth); } else chwrite(' ', n, &pos, leftedge, screenwidth); if (p != q) p = q; } while (!done); if (p->ycoord == i && p->tip) { n = 0; for (j = 1; j <= nmlngth; j++) { if (nayme[p->index - 1][j - 1] != '\0') n = j; } chwrite(':', 1, &pos, leftedge, screenwidth); for (j = 0; j < n; j++) chwrite(nayme[p->index - 1][j], 1, &pos, leftedge, screenwidth); } putchar('\n'); } /* dnamove_drawline */ void dnamove_printree() { /* prints out diagram of the tree */ long tipy; long i, dow; if (!subtree) nuroot = root; if (changed || newtree) evaluate(root); if (display) { outfile = stdout; newdnamove_hypstates(chars, root, treenode); } #ifdef WIN32 if(ibmpc || ansi) phyClearScreen(); else printf("\n"); #else printf((ansi || ibmpc) ? 
"\033[2J\033[H" : "\n"); #endif tipy = 1; dow = down; if (spp * dow > screenlines && !subtree) dow--; printf(" (unrooted)"); if (display) { printf(" "); makechar(aa); printf(":A "); makechar(cc); printf(":C "); makechar(gg); printf(":G "); makechar(tt); printf(":T "); makechar(question); printf(":?"); } else printf(" "); if (!earlytree) { printf("%10.1f Steps", -like); } if (display) printf(" SITE%4ld", dispchar); else printf(" "); if (!earlytree) { printf(" %3ld sites compatible\n", compatible); } printf(" "); if (changed && !earlytree) { if (-like < bestyet) { printf(" BEST YET!"); bestyet = -like; } else if (fabs(-like - bestyet) < 0.000001) printf(" (as good as best)"); else { if (-like < gotlike) printf(" better"); else if (-like > gotlike) printf(" worse!"); } } printf("\n"); farthest = 0; coordinates(nuroot, &tipy, 1.5, &farthest); vmargin = 4; treelines = tipy - dow; if (topedge != 1) { printf("** %ld lines above screen **\n", topedge - 1); vmargin++; } if ((treelines - topedge + 1) > (screenlines - vmargin)) vmargin++; for (i = 1; i <= treelines; i++) { if (i >= topedge && i < topedge + screenlines - vmargin) dnamove_drawline(i); } if ((treelines - topedge + 1) > (screenlines - vmargin)) { printf("** %ld", treelines - (topedge - 1 + screenlines - vmargin)); printf(" lines below screen **\n"); } if (treelines - topedge + vmargin + 1 < screenlines) putchar('\n'); gotlike = -like; changed = false; } /* dnamove_printree */ void arbitree() { long i; root = treenode[0]; dnamove_add(treenode[0], treenode[1], treenode[spp]); for (i = 3; i <= (spp); i++) { dnamove_add(treenode[spp + i - 3], treenode[i - 1], treenode[spp + i - 2]); } } /* arbitree */ void yourtree() { long i, j; boolean ok; root = treenode[0]; dnamove_add(treenode[0], treenode[1], treenode[spp]); i = 2; do { i++; dnamove_printree(); printf("Add species%3ld: ", i); for (j = 0; j < nmlngth; j++) putchar(nayme[i - 1][j]); do { printf("\n at or before which node (type number): "); inpnum(&j, &ok); ok = 
(ok && ((j >= 1 && j < i) || (j > spp && j < spp + i - 1))); if (!ok) printf("Impossible number. Please try again:\n"); } while (!ok); if (j >= i) { /* has user chosen a non-tip? if so, offer choice */ do { printf(" Insert at node (A) or before node (B)? ");
#ifdef WIN32
phyFillScreenColor();
#endif
fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; ch = isupper(ch) ? ch : toupper(ch); } while (ch != 'A' && ch != 'B'); } else ch = 'B'; /* if user has chosen a tip, set Before */ if (j != 0) { if (ch == 'A') { if (!treenode[j - 1]->tip) { add_child(treenode[j - 1], treenode[i - 1]); } } else { printf("dnamove_add(below %ld, newtip %ld, newfork %ld)\n",j-1,i-1,spp+i-2); dnamove_add(treenode[j - 1], treenode[i - 1], treenode[spp + i - 2]); } /* endif (before or at node) */ } } while (i != spp); } /* yourtree */ void initdnamovenode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ /* LM 7/27 I added this function and the commented lines around */ /* treeread() to get the program running, but all 4 move programs*/ /* are improperly integrated into the v4.0 support files. 
As is */ /* endsite = chars and this is a patchwork function */ boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnutreenode(grbg, p, nodei, endsite, zeros); treenode[nodei - 1] = *p; break; case nonbottom: gnutreenode(grbg, p, nodei, endsite, zeros); break; case tip: match_names_to_data (str, treenode, p, spp); break; case length: processlength(&valyew, &divisor, ch, &minusread, intree, parens); /* process lengths and discard */ break; default: /*cases hslength,hsnolength,treewt,unittrwt,iter,*/ break; } } /* initdnamovenode */ void buildtree() { long i, nextnode; node *p; long j; treeone = (node **)Malloc(maxsz*sizeof(node *)); treetwo = (node **)Malloc(maxsz*sizeof(node *)); treesets[othertree].treenode = treetwo; changed = false; newtree = false; switch (how) { case arb: treesets[othertree].treenode = treetwo; arbitree(); break; case use: /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree,intreename,"input tree file", "rb",progname,intreename); names = (boolean *)Malloc(spp*sizeof(boolean)); firsttree = true; nodep = NULL; nextnode = 0; haslengths = 0; for (i = 0; i < endsite; i++) zeros[i] = 0; treesets[whichtree].nodep = nodep; treeread(intree, &root, treenode, &goteof, &firsttree, nodep, &nextnode, &haslengths, &grbg, initdnamovenode,true,nonodes); for (i = spp; i < (nextnode); i++) { p = treenode[i]; for (j = 1; j <= 3; j++) { p->base = (baseptr2)Malloc(chars*sizeof(long)); p = p->next; } } /* debug: see comment at initdnamovenode() */ free(names); FClose(intree); break; case spec: treesets[othertree].treenode = treetwo; yourtree(); break; } if (!outgropt) outgrno = root->next->back->index; if (outgropt) dnamove_reroot(treenode[outgrno - 1]); } /* buildtree */ void setorder() { /* sets in order of number of members */ sett[0] = 1L << ((long)A); sett[1] = 1L << ((long)C); sett[2] = 1L << ((long)G); sett[3] = 1L << ((long)T); sett[4] = 1L << ((long)O); sett[5] = (1L << ((long)A)) | (1L << 
((long)C)); sett[6] = (1L << ((long)A)) | (1L << ((long)G)); sett[7] = (1L << ((long)A)) | (1L << ((long)T)); sett[8] = (1L << ((long)A)) | (1L << ((long)O)); sett[9] = (1L << ((long)C)) | (1L << ((long)G)); sett[10] = (1L << ((long)C)) | (1L << ((long)T)); sett[11] = (1L << ((long)C)) | (1L << ((long)O)); sett[12] = (1L << ((long)G)) | (1L << ((long)T)); sett[13] = (1L << ((long)G)) | (1L << ((long)O)); sett[14] = (1L << ((long)T)) | (1L << ((long)O)); sett[15] = (1L << ((long)A)) | (1L << ((long)C)) | (1L << ((long)G)); sett[16] = (1L << ((long)A)) | (1L << ((long)C)) | (1L << ((long)T)); sett[17] = (1L << ((long)A)) | (1L << ((long)C)) | (1L << ((long)O)); sett[18] = (1L << ((long)A)) | (1L << ((long)G)) | (1L << ((long)T)); sett[19] = (1L << ((long)A)) | (1L << ((long)G)) | (1L << ((long)O)); sett[20] = (1L << ((long)A)) | (1L << ((long)T)) | (1L << ((long)O)); sett[21] = (1L << ((long)C)) | (1L << ((long)G)) | (1L << ((long)T)); sett[22] = (1L << ((long)C)) | (1L << ((long)G)) | (1L << ((long)O)); sett[23] = (1L << ((long)C)) | (1L << ((long)T)) | (1L << ((long)O)); sett[24] = (1L << ((long)G)) | (1L << ((long)T)) | (1L << ((long)O)); sett[25] = (1L << ((long)A)) | (1L << ((long)C)) | (1L << ((long)G)) | (1L << ((long)T)); sett[26] = (1L << ((long)A)) | (1L << ((long)C)) | (1L << ((long)G)) | (1L << ((long)O)); sett[27] = (1L << ((long)A)) | (1L << ((long)C)) | (1L << ((long)T)) | (1L << ((long)O)); sett[28] = (1L << ((long)A)) | (1L << ((long)G)) | (1L << ((long)T)) | (1L << ((long)O)); sett[29] = (1L << ((long)C)) | (1L << ((long)G)) | (1L << ((long)T)) | (1L << ((long)O)); sett[30] = (1L << ((long)A)) | (1L << ((long)C)) | (1L << ((long)G)) | (1L << ((long)T)) | (1L << ((long)O)); } /* setorder */ void mincomp() { /* computes for each site the minimum number of steps necessary to accommodate those species already in the analysis */ long i, j, k; boolean done; for (i = 0; i < (chars); i++) { done = false; j = 0; while (!done) { j++; done = true; k = 1; do { 
if (k < nonodes) done = (done && (treenode[k - 1]->base[i] & sett[j - 1]) != 0); k++; } while (k <= spp && done); } if (j == 31) necsteps[i] = 4; if (j <= 30) necsteps[i] = 3; if (j <= 25) necsteps[i] = 2; if (j <= 15) necsteps[i] = 1; if (j <= 5) necsteps[i] = 0; necsteps[i] *= weight[i]; } } /* mincomp */ void consolidatetree(long index) { node *start, *r, *q; int i; i = 0; start = treenode[index - 1]; q = start->next; while (q != start) { r = q; q = q->next; chuck(&grbg, r); } chuck(&grbg, q); i = index; while (i <= nonodes) { r = treenode[i - 1]; if (!(r->tip)) r->index--; if (!(r->tip)) { q = r->next; do { q->index--; q = q->next; } while (r != q && q != NULL); } treenode[i - 1] = treenode[i]; i++; } nonodes--; } /* consolidatetree */ void rearrange() { long i, j, maxinput; boolean ok1, ok2; node *p, *q; char ch; printf("Remove everything to the right of which node? "); inpnum(&i, &ok1); ok1 = (ok1 && i >= 1 && i <= (spp * 2 - 1) && i != root->index); if (ok1) ok1 = !treenode[i - 1]->deleted; if (ok1) { printf("Add at or before which node? "); inpnum(&j, &ok2); ok2 = (ok2 && j >= 1 && j <= (spp * 2 - 1)); if (ok2) { if (j != root->index) ok2 = !treenode[treenode[j - 1]->back->index - 1]->deleted; } if (ok2) { /*xx This edit says "j must not be i's parent." Is this necessary anymore? */ /* ok2 = (nodep[j - 1] != nodep[nodep[i - 1]->back->index - 1]);*/ p = treenode[j - 1]; /* make sure that j is not a descendent of i */ while (p != root) { ok2 = (ok2 && p != treenode[i - 1]); p = treenode[p->back->index - 1]; } if (ok1 && ok2) { maxinput = 1; do { printf("Insert at node (A) or before node (B)? ");
#ifdef WIN32
phyFillScreenColor();
#endif
fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; ch = isupper(ch) ? 
ch : toupper(ch); maxinput++; if (maxinput == 100) { printf("ERROR: too many tries at choosing option\n"); exxit(-1); } } while (ch != 'A' && ch != 'B'); if (ch == 'A') { if (!(treenode[j - 1]->deleted) && !treenode[j - 1]->tip) { changed = true; copytree(); dnamove_re_move(&treenode[i - 1], &q); add_child(treenode[j - 1], treenode[i - 1]); if (fromtype == beforenode) consolidatetree(q->index); } else ok2 = false; } else { if (j != root->index) { /* can't insert at root */ changed = true; copytree(); dnamove_re_move(&treenode[i - 1], &q); if (q != NULL) { treenode[q->index-1]->next->back = treenode[i-1]; treenode[i-1]->back = treenode[q->index-1]->next; } add_before(treenode[j - 1], treenode[i - 1]); } else ok2 = false; } /* endif (before or at node) */ } /* endif (ok to do move) */ } /* endif (destination node ok) */ } /* endif (from node ok) */ dnamove_printree(); if (!(ok1 && ok2)) printf("Not a possible rearrangement. Try again: \n"); else { written = false; } } /* rearrange */ void dnamove_nextinc() { /* show next incompatible site */ long disp0; boolean done; display = true; disp0 = dispchar; done = false; do { dispchar++; if (dispchar > chars) { dispchar = 1; done = (disp0 == 0); } } while (!(necsteps[dispchar - 1] != numsteps[dispchar - 1] || dispchar == disp0 || done)); dnamove_printree(); } /* dnamove_nextinc */ void dnamove_nextchar() { /* show next site */ display = true; dispchar++; if (dispchar > chars) dispchar = 1; dnamove_printree(); } /* dnamove_nextchar */ void dnamove_prevchar() { /* show previous site */ display = true; dispchar--; if (dispchar < 1) dispchar = chars; dnamove_printree(); } /* dnamove_prevchar */ void dnamove_show() { long i; boolean ok; do { printf("SHOW: (Character number or 0 to see none)? 
"); inpnum(&i, &ok); ok = (ok && (i == 0 || (i >= 1 && i <= chars))); if (ok && i != 0) { display = true; dispchar = i; } if (ok && i == 0) display = false; } while (!ok); dnamove_printree(); } /* dnamove_show */ void tryadd(node *p, node **item, node **nufork, double *place) { /* temporarily adds one fork and one tip to the tree. Records scores in ARRAY place */ dnamove_add(p, *item, *nufork); evaluate(root); place[p->index - 1] = -like; dnamove_re_move(item, nufork); } /* tryadd */ void addpreorder(node *p, node *item_, node *nufork_, double *place) { /* traverses a binary tree, calling PROCEDURE tryadd at a node before calling tryadd at its descendants */ node *item, *nufork, *q; item = item_; nufork = nufork_; if (p == NULL) return; tryadd(p,&item,&nufork,place); if (!p->tip) { q = p->next; do { addpreorder(q->back, item,nufork,place); q = q->next; } while (q != p); } } /* addpreorder */ void try() { /* Remove node, try it in all possible places */ double *place; long i, j, oldcompat, saveparent; double current; node *q, *dummy, *rute; boolean tied, better, ok, madenode; madenode = false; printf("Try other positions for which node? "); inpnum(&i, &ok); if (!(ok && i >= 1 && i <= nonodes && i != root->index)) { printf("Not a possible choice! "); return; } copytree(); printf("WAIT ...\n"); place = (double *)Malloc(nonodes*sizeof(double)); for (j = 0; j < (nonodes); j++) place[j] = -1.0; evaluate(root); current = -like; oldcompat = compatible; what = i; /* q = ring base of i's parent */ q = treenode[treenode[i - 1]->back->index - 1]; saveparent = q->index; /* if i is a left child, fromwhere = index of right sibling (binary) */ /* if i is a right child, fromwhere = index of left sibling (binary) */ if (q->next->back->index == i) fromwhere = q->next->next->back->index; else fromwhere = q->next->back->index; rute = root; /* if root is i's parent ... 
*/ if (q->next->next->next == q) { if (root == treenode[treenode[i - 1]->back->index - 1]) { /* if i is left child then rute becomes right child, and vice-versa */ if (treenode[treenode[i - 1]->back->index - 1]->next->back == treenode[i - 1]) rute = treenode[treenode[i - 1]->back->index - 1]->next->next->back; else rute = treenode[treenode[i - 1]->back->index - 1]->next->back; } } /* Remove i and perhaps its parent node from the tree. If i is part of a multifurcation, *dummy will come back null. If so, make a new internal node to be i's parent as it is inserted in various places around the tree. */ dnamove_re_move(&treenode[i - 1], &dummy); if (dummy == NULL) { madenode = true; nonodes++; maketriad(&dummy, nonodes); } oldleft = wasleft; root = rute; addpreorder(root, treenode[i - 1], dummy, place); wasleft = oldleft; restoring = true; if (madenode) { add_child(treenode[saveparent - 1], treenode[i - 1]); nonodes--; } else dnamove_add(treenode[fromwhere - 1], treenode[what - 1], q); like = -current; compatible = oldcompat; restoring = false; better = false; printf(" BETTER: "); for (j = 1; j <= (nonodes); j++) { if (place[j - 1] < current && place[j - 1] >= 0.0) { printf("%3ld:%6.2f", j, place[j - 1]); better = true; } } if (!better) printf(" NONE"); printf("\n TIED: "); tied = false; for (j = 1; j <= (nonodes); j++) { if (fabs(place[j - 1] - current) < 1.0e-6 && j != fromwhere) { if (j < 10) printf("%2ld", j); else printf("%3ld", j); tied = true; } } if (tied) printf(":%6.2f\n", current); else printf("NONE\n"); changed = true; free(place); } /* try */ void undo() { boolean btemp; /* don't undo to an uninitialized tree */ if (!treesets[othertree].initialized) { dnamove_printree(); printf("Nothing to undo.\n"); return; } treesets[whichtree].root = root; treesets[whichtree].treenode = treenode; treesets[whichtree].nonodes = nonodes; treesets[whichtree].waswritten = waswritten; treesets[whichtree].initialized = true; whichtree = othertree; root = 
treesets[whichtree].root; treenode = treesets[whichtree].treenode; nonodes = treesets[whichtree].nonodes; waswritten = treesets[whichtree].waswritten; if (othertree == 0) othertree = 1; else othertree = 0; changed = true; dnamove_printree(); btemp = oldwritten; oldwritten = written; written = btemp; } /* undo */ void treewrite(boolean done) { /* write out tree to a file */ Char ch; treeoptions(waswritten, &ch, &outtree, outtreename, progname); if (!done) dnamove_printree(); if (waswritten && ch != 'A' && ch != 'R') return; col = 0; treeout(root, 1, &col, root); printf("\nTree written to file \"%s\"\n\n", outtreename); waswritten = true; written = true; FClose(outtree);
#ifdef MAC
fixmacfile(outtreename);
#endif
} /* treewrite */ void clade() { /* pick a subtree and show only that on screen */ long i; boolean ok; printf("Select subtree rooted at which node (0 for whole tree)? "); inpnum(&i, &ok); ok = (ok && ((unsigned)i) <= ((unsigned)nonodes)); if (ok) { subtree = (i > 0); if (subtree) nuroot = treenode[i - 1]; else nuroot = root; } dnamove_printree(); if (!ok) printf("Not possible to use this node. 
"); } /* clade */ void fliptrav(node *p, boolean recurse) { node *q, *temp, *r =NULL, *rprev =NULL, *l, *lprev; boolean lprevflag; int nodecount, loopcount, i; if (p->tip) return; q = p->next; l = q; lprev = p; nodecount = 0; do { nodecount++; if (q->next->next == p) { rprev = q; r = q->next; } q = q->next; } while (p != q); if (nodecount == 1) return; loopcount = nodecount / 2; for (i=0; i<loopcount; i++) { lprev->next = r; rprev->next = l; temp = r->next; r->next = l->next; l->next = temp; if (i < (loopcount - 1)) { lprevflag = false; q = p->next; do { if (q == lprev->next && !lprevflag) { lprev = q; l = q->next; lprevflag = true; } if (q->next == rprev) { rprev = q; r = q->next; } q = q->next; } while (p != q); } } if (recurse) { q = p->next; do { fliptrav(q->back, true); q = q->next; } while (p != q); } } /* fliptrav */ void flip(long atnode) { /* flip at a node left-right */ long i; boolean ok; if (atnode == 0) { printf("Flip branches at which node? "); inpnum(&i, &ok); ok = (ok && i > spp && i <= nonodes); } else { i = atnode; ok = true; } if (ok) { copytree(); fliptrav(treenode[i - 1], true); } if (atnode == 0) dnamove_printree(); if (ok) { written = false; return; } if ((i >= 1 && i <= spp) || (i > spp && i <= nonodes)) printf("Can't flip there. "); else printf("No such node. "); } /* flip */ void changeoutgroup() { long i; boolean ok; oldoutgrno = outgrno; do { printf("Which node should be the new outgroup? "); inpnum(&i, &ok); ok = (ok && i >= 1 && i <= nonodes && i != root->index); if (ok) outgrno = i; } while (!ok); copytree(); dnamove_reroot(treenode[outgrno - 1]); changed = true; lastop = reroott; dnamove_printree(); oldwritten = written; written = false; } /* changeoutgroup */ void redisplay() { boolean done = false; waswritten = false; do { printf("NEXT? (Options: R # + - S . T U W O F H J K L C ? X Q) "); printf("(? 
for Help) ");
#ifdef WIN32
phyFillScreenColor();
#endif
fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); if (strchr("HJKLCFORSTUXQ+#-.W?",ch) != NULL){ switch (ch) { case 'R': rearrange(); break; case '#': dnamove_nextinc(); break; case '+': dnamove_nextchar(); break; case '-': dnamove_prevchar(); break; case 'S': dnamove_show(); break; case '.': dnamove_printree(); break; case 'T': try(); break; case 'U': undo(); break; case 'W': treewrite(done); break; case 'O': changeoutgroup(); break; case 'F': flip(0); break; case 'H': window(left, &leftedge, &topedge, hscroll, vscroll, treelines, screenlines, screenwidth, farthest, subtree); dnamove_printree(); break; case 'J': window(downn, &leftedge, &topedge, hscroll, vscroll, treelines, screenlines, screenwidth, farthest, subtree); dnamove_printree(); break; case 'K': window(upp, &leftedge, &topedge, hscroll, vscroll, treelines, screenlines, screenwidth, farthest, subtree); dnamove_printree(); break; case 'L': window(right, &leftedge, &topedge, hscroll, vscroll, treelines, screenlines, screenwidth, farthest, subtree); dnamove_printree(); break; case 'C': clade(); break; case '?': help("site"); dnamove_printree(); break; case 'X': done = true; break; case 'Q': done = true; break; } } } while (!done); if (written) return; do { printf("Do you want to write out the tree to a file? (Y or N) ");
#ifdef WIN32
phyFillScreenColor();
#endif
fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == 'Y' || ch == 'y') treewrite(done); } while (ch != 'Y' && ch != 'y' && ch != 'N' && ch != 'n'); } /* redisplay */ void treeconstruct() { /* constructs a binary tree from the pointers in treenode. 
*/ int i; restoring = false; subtree = false; display = false; dispchar = 0; earlytree = true; waswritten = false; buildtree(); /* get an accurate value for nonodes by finding out where the nodes really stop */ for (i=0;i MAXNUMTREES) maxtrees = MAXNUMTREES; getchar(); countup(&loopcount2, 10); } while (maxtrees < 1); break; case 'I': interleaved = !interleaved; break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress = !progress; break; case '3': treeprint = !treeprint; break; case '4': stepbox = !stepbox; break; case '5': ancseq = !ancseq; break; case '.': dotdiff = !dotdiff; break; case '6': trout = !trout; break; } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } if (transvp) fprintf(outfile, "Transversion parsimony\n\n"); } /* getoptions */ void allocrest() { long i; y = (Char **)Malloc(spp*sizeof(Char *)); for (i = 0; i < spp; i++) y[i] = (Char *)Malloc(chars*sizeof(Char)); bestrees = (bestelm *)Malloc(maxtrees*sizeof(bestelm)); for (i = 1; i <= maxtrees; i++) bestrees[i - 1].btree = (long *)Malloc(nonodes*sizeof(long)); nayme = (naym *)Malloc(spp*sizeof(naym)); enterorder = (long *)Malloc(spp*sizeof(long)); place = (long *)Malloc(nonodes*sizeof(long)); weight = (long *)Malloc(chars*sizeof(long)); oldweight = (long *)Malloc(chars*sizeof(long)); alias = (long *)Malloc(chars*sizeof(long)); ally = (long *)Malloc(chars*sizeof(long)); location = (long *)Malloc(chars*sizeof(long)); } /* allocrest */ void doinit() { /* initializes variables */ inputnumbers(&spp, &chars, &nonodes, 1); getoptions(); if (printdata) fprintf(outfile, "%2ld species, %3ld sites\n\n", spp, chars); alloctree(&treenode, nonodes, usertree); } /* doinit */ void makeweights() { /* make up weights vector to avoid duplicate computations */ long i; for (i = 1; i <= chars; i++) { alias[i - 1] = i; oldweight[i - 1] = weight[i - 1]; ally[i - 1] = i; } sitesort(chars, weight); sitecombine(chars); sitescrunch(chars); endsite = 
0; for (i = 1; i <= chars; i++) { if (ally[i - 1] == i) endsite++; } for (i = 1; i <= endsite; i++) location[alias[i - 1] - 1] = i; if (!thresh) threshold = spp; threshwt = (long *)Malloc(endsite*sizeof(long)); for (i = 0; i < endsite; i++) { weight[i] *= 10; threshwt[i] = (long)(threshold * weight[i] + 0.5); } zeros = (long *)Malloc(endsite*sizeof(long)); for (i = 0; i < endsite; i++) zeros[i] = 0; } /* makeweights */ void doinput() { /* reads the input data */ long i; if (justwts) { if (firstset) inputdata(chars); for (i = 0; i < chars; i++) weight[i] = 1; inputweights(chars, weight, &weights); if (justwts) { fprintf(outfile, "\n\nWeights set # %ld:\n\n", ith); if (progress) printf("\nWeights set # %ld:\n\n", ith); } if (printdata) printweights(outfile, 0, chars, weight, "Sites"); } else { if (!firstset){ samenumsp(&chars, ith); reallocchars(); } inputdata(chars); for (i = 0; i < chars; i++) weight[i] = 1; if (weights) { inputweights(chars, weight, &weights); if (printdata) printweights(outfile, 0, chars, weight, "Sites"); } } makeweights(); makevalues(treenode, zeros, usertree); if (!usertree) { allocnode(&temp, zeros, endsite); allocnode(&temp1, zeros, endsite); allocnode(&temp2, zeros, endsite); allocnode(&tempsum, zeros, endsite); allocnode(&temprm, zeros, endsite); allocnode(&tempadd, zeros, endsite); allocnode(&tempf, zeros, endsite); allocnode(&tmp, zeros, endsite); allocnode(&tmp1, zeros, endsite); allocnode(&tmp2, zeros, endsite); allocnode(&tmp3, zeros, endsite); allocnode(&tmprm, zeros, endsite); allocnode(&tmpadd, zeros, endsite); } } /* doinput */ void initdnaparsnode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnutreenode(grbg, p, nodei, endsite, zeros); treenode[nodei - 1] = *p; break; case nonbottom: gnutreenode(grbg, 
p, nodei, endsite, zeros); break; case tip: match_names_to_data (str, treenode, p, spp); break; case length: /* if there is a length, read it and discard value */ processlength(&valyew, &divisor, ch, &minusread, intree, parens); break; default: /*cases hslength,hsnolength,treewt,unittrwt,iter,*/ break; } } /* initdnaparsnode */ void evaluate(node *r) { /* determines the number of steps needed for a tree. this is the minimum number of steps needed to evolve sequences on this tree */ long i, steps; long term; double sum; sum = 0.0; for (i = 0; i < endsite; i++) { steps = r->numsteps[i]; if ((long)steps <= threshwt[i]) term = steps; else term = threshwt[i]; sum += (double)term; if (usertree && which <= maxuser) fsteps[which - 1][i] = term; } if (usertree && which <= maxuser) { nsteps[which - 1] = sum; if (which == 1) { minwhich = 1; minsteps = sum; } else if (sum < minsteps) { minwhich = which; minsteps = sum; } } like = -sum; } /* evaluate */ void tryadd(node *p, node *item, node *nufork) { /* temporarily adds one fork and one tip to the tree. 
if the location where they are added yields greater "likelihood" than other locations tested up to that time, then keeps that location as there */ long pos; double belowsum, parentsum; boolean found, collapse, changethere, trysave; if (!p->tip) { memcpy(temp->base, p->base, endsite*sizeof(long)); memcpy(temp->numsteps, p->numsteps, endsite*sizeof(long)); memcpy(temp->numnuc, p->numnuc, endsite*sizeof(nucarray)); temp->numdesc = p->numdesc + 1; if (p->back) { multifillin(temp, tempadd, 1); sumnsteps2(tempsum, temp, p->back, 0, endsite, threshwt); } else { multisumnsteps(temp, tempadd, 0, endsite, threshwt); tempsum->sumsteps = temp->sumsteps; } if (tempsum->sumsteps <= -bestyet) { if (p->back) sumnsteps2(tempsum, temp, p->back, endsite+1, endsite, threshwt); else { multisumnsteps(temp, temp1, endsite+1, endsite, threshwt); tempsum->sumsteps = temp->sumsteps; } } p->sumsteps = tempsum->sumsteps; } if (p == root) sumnsteps2(temp, item, p, 0, endsite, threshwt); else { sumnsteps(temp1, item, p, 0, endsite); sumnsteps2(temp, temp1, p->back, 0, endsite, threshwt); } if (temp->sumsteps <= -bestyet) { if (p == root) sumnsteps2(temp, item, p, endsite+1, endsite, threshwt); else { sumnsteps(temp1, item, p, endsite+1, endsite); sumnsteps2(temp, temp1, p->back, endsite+1, endsite, threshwt); } } belowsum = temp->sumsteps; multf = false; like = -belowsum; if (!p->tip && belowsum >= p->sumsteps) { multf = true; like = -p->sumsteps; } trysave = true; if (!multf && p != root) { parentsum = treenode[p->back->index - 1]->sumsteps; if (belowsum >= parentsum) trysave = false; } if (lastrearr) { changethere = true; if (like >= bstlike2 && trysave) { if (like > bstlike2) found = false; else { addnsave(p, item, nufork, &root, &grbg, multf, treenode, place, zeros); pos = 0; findtree(&found, &pos, nextree, place, bestrees); } if (!found) { collapse = collapsible(item, p, temp, temp1, temp2, tempsum, temprm, tmpadd, multf, root, zeros, treenode); if (!thorough) changethere = !collapse; if 
(thorough || !collapse || like > bstlike2 || (nextree == 1)) { if (like > bstlike2) { addnsave(p, item, nufork, &root, &grbg, multf, treenode, place, zeros); bestlike = bstlike2 = like; addbestever(&pos, &nextree, maxtrees, collapse, place, bestrees); } else addtiedtree(pos, &nextree, maxtrees, collapse, place, bestrees); } } } if (like >= bestyet) { if (like > bstlike2) bstlike2 = like; if (changethere && trysave) { bestyet = like; there = p; mulf = multf; } } } else if ((like > bestyet) || (like >= bestyet && trysave)) { bestyet = like; there = p; mulf = multf; } } /* tryadd */ void addpreorder(node *p, node *item, node *nufork) { /* traverses a n-ary tree, calling function tryadd at a node before calling tryadd at its descendants */ node *q; if (p == NULL) return; tryadd(p, item, nufork); if (!p->tip) { q = p->next; while (q != p) { addpreorder(q->back, item, nufork); q = q->next; } } } /* addpreorder */ void trydescendants(node *item, node *forknode, node *parent, node *parentback, boolean trybelow) { /* tries rearrangements at parent and below parent's descendants */ node *q, *tempblw; boolean bestever=0, belowbetter, multf=0, saved, trysave; double parentsum=0, belowsum; memcpy(temp->base, parent->base, endsite*sizeof(long)); memcpy(temp->numsteps, parent->numsteps, endsite*sizeof(long)); memcpy(temp->numnuc, parent->numnuc, endsite*sizeof(nucarray)); temp->numdesc = parent->numdesc + 1; multifillin(temp, tempadd, 1); sumnsteps2(tempsum, parentback, temp, 0, endsite, threshwt); belowbetter = true; if (lastrearr) { parentsum = tempsum->sumsteps; if (-tempsum->sumsteps >= bstlike2) { belowbetter = false; bestever = false; multf = true; if (-tempsum->sumsteps > bstlike2) bestever = true; savelocrearr(item, forknode, parent, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = parent; mulf = true; } } } else if 
(-tempsum->sumsteps >= like) { there = parent; mulf = true; like = -tempsum->sumsteps; } if (trybelow) { sumnsteps(temp, parent, tempadd, 0, endsite); sumnsteps2(tempsum, temp, parentback, 0, endsite, threshwt); if (lastrearr) { belowsum = tempsum->sumsteps; if (-tempsum->sumsteps >= bstlike2 && belowbetter && (forknode->numdesc > 2 || (forknode->numdesc == 2 && parent->back->index != forknode->index))) { trysave = false; memcpy(temp->base, parentback->base, endsite*sizeof(long)); memcpy(temp->numsteps, parentback->numsteps, endsite*sizeof(long)); memcpy(temp->numnuc, parentback->numnuc, endsite*sizeof(nucarray)); temp->numdesc = parentback->numdesc + 1; multifillin(temp, tempadd, 1); sumnsteps2(tempsum, parent, temp, 0, endsite, threshwt); if (-tempsum->sumsteps < bstlike2) { multf = false; bestever = false; trysave = true; } if (-belowsum > bstlike2) { multf = false; bestever = true; trysave = true; } if (trysave) { if (treenode[parent->index - 1] != parent) tempblw = parent->back; else tempblw = parent; savelocrearr(item, forknode, tempblw, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros); if (saved) { like = bstlike2 = -belowsum; there = tempblw; mulf = false; } } } } else if (-tempsum->sumsteps > like) { like = -tempsum->sumsteps; if (-tempsum->sumsteps > bestyet) { if (treenode[parent->index - 1] != parent) tempblw = parent->back; else tempblw = parent; there = tempblw; mulf = false; } } } q = parent->next; while (q != parent) { if (q->back && q->back != item) { memcpy(temp1->base, q->base, endsite*sizeof(long)); memcpy(temp1->numsteps, q->numsteps, endsite*sizeof(long)); memcpy(temp1->numnuc, q->numnuc, endsite*sizeof(nucarray)); temp1->numdesc = q->numdesc; multifillin(temp1, parentback, 0); if (lastrearr) belowbetter = (-parentsum < bstlike2); if (!q->back->tip) { memcpy(temp->base, q->back->base, endsite*sizeof(long)); memcpy(temp->numsteps, q->back->numsteps, 
endsite*sizeof(long)); memcpy(temp->numnuc, q->back->numnuc, endsite*sizeof(nucarray)); temp->numdesc = q->back->numdesc + 1; multifillin(temp, tempadd, 1); sumnsteps2(tempsum, temp1, temp, 0, endsite, threshwt); if (lastrearr) { if (-tempsum->sumsteps >= bstlike2) { belowbetter = false; bestever = false; multf = true; if (-tempsum->sumsteps > bstlike2) bestever = true; savelocrearr(item, forknode, q->back, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = q->back; mulf = true; } } } else if (-tempsum->sumsteps >= like) { like = -tempsum->sumsteps; there = q->back; mulf = true; } } sumnsteps(temp, q->back, tempadd, 0, endsite); sumnsteps2(tempsum, temp, temp1, 0, endsite, threshwt); if (lastrearr) { if (-tempsum->sumsteps >= bstlike2) { trysave = false; multf = false; if (belowbetter) { bestever = false; trysave = true; } if (-tempsum->sumsteps > bstlike2) { bestever = true; trysave = true; } if (trysave) { if (treenode[q->back->index - 1] != q->back) tempblw = q; else tempblw = q->back; savelocrearr(item, forknode, tempblw, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = tempblw; mulf = false; } } } } else if (-tempsum->sumsteps > like) { like = -tempsum->sumsteps; if (-tempsum->sumsteps > bestyet) { if (treenode[q->back->index - 1] != q->back) tempblw = q; else tempblw = q->back; there = tempblw; mulf = false; } } } q = q->next; } } /* trydescendants */ void trylocal(node *item, node *forknode) { /* rearranges below forknode, below descendants of forknode when there are more than 2 descendants, then unroots the back of forknode and rearranges on its descendants */ node *q; boolean bestever, multf, saved; memcpy(temprm->base, zeros, endsite*sizeof(long)); memcpy(temprm->numsteps, 
zeros, endsite*sizeof(long)); memcpy(temprm->oldbase, item->base, endsite*sizeof(long)); memcpy(temprm->oldnumsteps, item->numsteps, endsite*sizeof(long)); memcpy(tempf->base, forknode->base, endsite*sizeof(long)); memcpy(tempf->numsteps, forknode->numsteps, endsite*sizeof(long)); memcpy(tempf->numnuc, forknode->numnuc, endsite*sizeof(nucarray)); tempf->numdesc = forknode->numdesc - 1; multifillin(tempf, temprm, -1); if (!forknode->back) { sumnsteps2(tempsum, tempf, tempadd, 0, endsite, threshwt); if (lastrearr) { if (-tempsum->sumsteps > bstlike2) { bestever = true; multf = false; savelocrearr(item, forknode, forknode, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = forknode; mulf = false; } } } else if (-tempsum->sumsteps > like) { like = -tempsum->sumsteps; if (-tempsum->sumsteps > bestyet) { there = forknode; mulf = false; } } } else { sumnsteps(temp, tempf, tempadd, 0, endsite); sumnsteps2(tempsum, temp, forknode->back, 0, endsite, threshwt); if (lastrearr) { if (-tempsum->sumsteps > bstlike2) { bestever = true; multf = false; savelocrearr(item, forknode, forknode, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = forknode; mulf = false; } } } else if (-tempsum->sumsteps > like) { like = -tempsum->sumsteps; if (-tempsum->sumsteps > bestyet) { there = forknode; mulf = false; } } trydescendants(item, forknode, forknode->back, tempf, false); } q = forknode->next; while (q != forknode) { if (q->back != item) { memcpy(temp2->base, q->base, endsite*sizeof(long)); memcpy(temp2->numsteps, q->numsteps, endsite*sizeof(long)); memcpy(temp2->numnuc, q->numnuc, endsite*sizeof(nucarray)); temp2->numdesc = q->numdesc - 1; multifillin(temp2, temprm, -1); if (!q->back->tip) { trydescendants(item, 
forknode, q->back, temp2, true); } else { sumnsteps(temp1, q->back, tempadd, 0, endsite); sumnsteps2(tempsum, temp1, temp2, 0, endsite, threshwt); if (lastrearr) { if (-tempsum->sumsteps > bstlike2) { multf = false; bestever = true; savelocrearr(item, forknode, q->back, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = q->back; mulf = false; } } } else if ((-tempsum->sumsteps) > like) { like = -tempsum->sumsteps; if (-tempsum->sumsteps > bestyet) { there = q->back; mulf = false; } } } } q = q->next; } } /* trylocal */ void trylocal2(node *item, node *forknode, node *other) { /* rearranges below forknode, below descendants of forknode when there are more than 2 descendants, then unroots the back of forknode and rearranges on its descendants. Used if forknode has binary descendants */ node *q; boolean bestever=0, multf, saved, belowbetter, trysave; memcpy(tempf->base, other->base, endsite*sizeof(long)); memcpy(tempf->numsteps, other->numsteps, endsite*sizeof(long)); memcpy(tempf->oldbase, forknode->base, endsite*sizeof(long)); memcpy(tempf->oldnumsteps, forknode->numsteps, endsite*sizeof(long)); tempf->numdesc = other->numdesc; if (forknode->back) trydescendants(item, forknode, forknode->back, tempf, false); if (!other->tip) { memcpy(temp->base, other->base, endsite*sizeof(long)); memcpy(temp->numsteps, other->numsteps, endsite*sizeof(long)); memcpy(temp->numnuc, other->numnuc, endsite*sizeof(nucarray)); temp->numdesc = other->numdesc + 1; multifillin(temp, tempadd, 1); if (forknode->back) sumnsteps2(tempsum, forknode->back, temp, 0, endsite, threshwt); else sumnsteps2(tempsum, NULL, temp, 0, endsite, threshwt); belowbetter = true; if (lastrearr) { if (-tempsum->sumsteps >= bstlike2) { belowbetter = false; bestever = false; multf = true; if (-tempsum->sumsteps > bstlike2) bestever = true; savelocrearr(item, forknode, other, tmp, 
tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = other; mulf = true; } } } else if (-tempsum->sumsteps >= like) { there = other; mulf = true; like = -tempsum->sumsteps; } if (forknode->back) { memcpy(temprm->base, forknode->back->base, endsite*sizeof(long)); memcpy(temprm->numsteps, forknode->back->numsteps, endsite*sizeof(long)); } else { memcpy(temprm->base, zeros, endsite*sizeof(long)); memcpy(temprm->numsteps, zeros, endsite*sizeof(long)); } memcpy(temprm->oldbase, other->back->base, endsite*sizeof(long)); memcpy(temprm->oldnumsteps, other->back->numsteps, endsite*sizeof(long)); q = other->next; while (q != other) { memcpy(temp2->base, q->base, endsite*sizeof(long)); memcpy(temp2->numsteps, q->numsteps, endsite*sizeof(long)); memcpy(temp2->numnuc, q->numnuc, endsite*sizeof(nucarray)); if (forknode->back) { temp2->numdesc = q->numdesc; multifillin(temp2, temprm, 0); } else { temp2->numdesc = q->numdesc - 1; multifillin(temp2, temprm, -1); } if (!q->back->tip) trydescendants(item, forknode, q->back, temp2, true); else { sumnsteps(temp1, q->back, tempadd, 0, endsite); sumnsteps2(tempsum, temp1, temp2, 0, endsite, threshwt); if (lastrearr) { if (-tempsum->sumsteps >= bstlike2) { trysave = false; multf = false; if (belowbetter) { bestever = false; trysave = true; } if (-tempsum->sumsteps > bstlike2) { bestever = true; trysave = true; } if (trysave) { savelocrearr(item, forknode, q->back, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = q->back; mulf = false; } } } } else if (-tempsum->sumsteps > like) { like = -tempsum->sumsteps; if (-tempsum->sumsteps > bestyet) { there = q->back; mulf = false; } } } q = q->next; } } } /* trylocal2 */ void tryrearr(node *p, boolean *success) { /* evaluates 
one rearrangement of the tree.  If the new tree has greater
     "likelihood" than the old one, sets success = TRUE and keeps the
     new tree; otherwise, restores the old tree */
  node *forknode, *newfork, *other, *oldthere;
  double oldlike;
  boolean oldmulf;

  if (p->back == NULL)
    return;
  forknode = treenode[p->back->index - 1];
  if (!forknode->back && forknode->numdesc <= 2 && alltips(forknode, p))
    return;
  oldlike = bestyet;
  like = -10.0 * spp * chars;
  memcpy(tempadd->base, p->base, endsite*sizeof(long));
  memcpy(tempadd->numsteps, p->numsteps, endsite*sizeof(long));
  memcpy(tempadd->oldbase, zeros, endsite*sizeof(long));
  memcpy(tempadd->oldnumsteps, zeros, endsite*sizeof(long));
  if (forknode->numdesc > 2) {
    oldthere = there = forknode;
    oldmulf = mulf = true;
    trylocal(p, forknode);
  } else {
    findbelow(&other, p, forknode);
    oldthere = there = other;
    oldmulf = mulf = false;
    trylocal2(p, forknode, other);
  }
  if ((like <= oldlike) || (there == oldthere && mulf == oldmulf))
    return;
  recompute = true;
  re_move(p, &forknode, &root, recompute, treenode, &grbg, zeros);
  if (mulf)
    add(there, p, NULL, &root, recompute, treenode, &grbg, zeros);
  else {
    if (forknode->numdesc > 0)
      getnufork(&newfork, &grbg, treenode, zeros);
    else
      newfork = forknode;
    add(there, p, newfork, &root, recompute, treenode, &grbg, zeros);
  }
  if (like > oldlike + LIKE_EPSILON) {
    *success = true;
    bestyet = like;
  }
}  /* tryrearr */


void repreorder(node *p, boolean *success)
{
  /* traverses a binary tree, calling tryrearr at a node before
     calling it at that node's descendants */
  node *q, *this;

  if (p == NULL)
    return;
  if (!p->visited) {
    tryrearr(p, success);
    p->visited = true;
  }
  if (!p->tip) {
    q = p;
    while (q->next != p) {
      this = q->next->back;
      repreorder(q->next->back,success);
      if (q->next->back == this)
        q = q->next;
    }
  }
}  /* repreorder */


void rearrange(node **r)
{
  /* traverses the tree (preorder), finding any local
     rearrangement which decreases the number of steps.
if traversal succeeds in increasing the tree's "likelihood", PROCEDURE rearrange runs traversal again */ boolean success=true; while (success) { success = false; clearvisited(treenode); repreorder(*r, &success); } } /* rearrange */ void describe() { /* prints ancestors, steps and table of numbers of steps in each site */ if (treeprint) { fprintf(outfile, "\nrequires a total of %10.3f\n", like / -10.0); fprintf(outfile, "\n between and length\n"); fprintf(outfile, " ------- --- ------\n"); printbranchlengths(root); } if (stepbox) writesteps(chars, weights, oldweight, root); if (ancseq) { hypstates(chars, root, treenode, &garbage, basechar); putc('\n', outfile); } putc('\n', outfile); if (trout) { col = 0; treeout3(root, nextree, &col, root); } } /* describe */ void dnapars_coordinates(node *p, double lengthsum, long *tipy, double *tipmax) { /* establishes coordinates of nodes */ node *q, *first, *last; double xx; if (p == NULL) return; if (p->tip) { p->xcoord = (long)(over * lengthsum + 0.5); p->ycoord = (*tipy); p->ymin = (*tipy); p->ymax = (*tipy); (*tipy) += down; if (lengthsum > (*tipmax)) (*tipmax) = lengthsum; return; } q = p->next; do { xx = q->v; if (xx > 100.0) xx = 100.0; dnapars_coordinates(q->back, lengthsum + xx, tipy,tipmax); q = q->next; } while (p != q); first = p->next->back; q = p; while (q->next != p) q = q->next; last = q->back; p->xcoord = (long)(over * lengthsum + 0.5); if ((p == root) || count_sibs(p) > 2) p->ycoord = p->next->next->back->ycoord; else p->ycoord = (first->ycoord + last->ycoord) / 2; p->ymin = first->ymin; p->ymax = last->ymax; } /* dnapars_coordinates */ void dnapars_printree() { /* prints out diagram of the tree2 */ long tipy; double scale, tipmax; long i; if (!treeprint) return; putc('\n', outfile); tipy = 1; tipmax = 0.0; dnapars_coordinates(root, 0.0, &tipy, &tipmax); scale = 1.0 / (long)(tipmax + 1.000); for (i = 1; i <= (tipy - down); i++) drawline3(i, scale, root); putc('\n', outfile); } /* dnapars_printree */ void 
globrearrange() { /* does global rearrangements */ long j; double gotlike; boolean frommulti; node *item, *nufork; recompute = true; do { printf(" "); gotlike = bestlike = bstlike2; /* order matters here ! */ for (j = 0; j < nonodes; j++) { bestyet = -10.0 * spp * chars; if (j < spp) item = treenode[enterorder[j] -1]; else item = treenode[j]; if ((item != root) && ((j < spp) || ((j >= spp) && (item->numdesc > 0))) && !((item->back->index == root->index) && (root->numdesc == 2) && alltips(root, item))) { re_move(item, &nufork, &root, recompute, treenode, &grbg, zeros); frommulti = (nufork->numdesc > 0); clearcollapse(treenode); there = root; memcpy(tempadd->base, item->base, endsite*sizeof(long)); memcpy(tempadd->numsteps, item->numsteps, endsite*sizeof(long)); memcpy(tempadd->oldbase, zeros, endsite*sizeof(long)); memcpy(tempadd->oldnumsteps, zeros, endsite*sizeof(long)); if (frommulti){ oldnufork = nufork; getnufork(&nufork, &grbg, treenode, zeros); } addpreorder(root, item, nufork); if (frommulti) oldnufork = NULL; if (!mulf) add(there, item, nufork, &root, recompute, treenode, &grbg, zeros); else add(there, item, NULL, &root, recompute, treenode, &grbg, zeros); } if (progress) { if (j % ((nonodes / 72) + 1) == 0) putchar('.'); fflush(stdout); } } if (progress) { putchar('\n'); #ifdef WIN32 phyFillScreenColor(); #endif } } while (bestlike > gotlike); } /* globrearrange */ void load_tree(long treei) { /* restores a tree from bestrees */ long j, nextnode; boolean recompute = false; node *dummy; for (j = spp - 1; j >= 1; j--) re_move(treenode[j], &dummy, &root, recompute, treenode, &grbg, zeros); root = treenode[0]; recompute = true; add(treenode[0], treenode[1], treenode[spp], &root, recompute, treenode, &grbg, zeros); nextnode = spp + 2; for (j = 3; j <= spp; j++) { if (bestrees[treei].btree[j - 1] > 0) add(treenode[bestrees[treei].btree[j - 1] - 1], treenode[j - 1], treenode[nextnode++ - 1], &root, recompute, treenode, &grbg, zeros); else 
add(treenode[treenode[-bestrees[treei].btree[j-1]-1]->back->index-1], treenode[j - 1], NULL, &root, recompute, treenode, &grbg, zeros); } } void grandrearr() { /* calls global rearrangement on best trees */ long treei; boolean done; done = false; do { treei = findunrearranged(bestrees, nextree, true); if (treei < 0) done = true; else bestrees[treei].gloreange = true; if (!done) { load_tree(treei); globrearrange(); done = rearrfirst; } } while (!done); } /* grandrearr */ void maketree() { /* constructs a binary tree from the pointers in treenode. adds each node at location which yields highest "likelihood" then rearranges the tree for greatest "likelihood" */ long i, j, numtrees, nextnode; boolean done, firsttree, goteof, haslengths; node *item, *nufork, *dummy; pointarray nodep; numtrees = 0; if (!usertree) { for (i = 1; i <= spp; i++) enterorder[i - 1] = i; if (jumble) randumize(seed, enterorder); recompute = true; root = treenode[enterorder[0] - 1]; add(treenode[enterorder[0] - 1], treenode[enterorder[1] - 1], treenode[spp], &root, recompute, treenode, &grbg, zeros); if (progress) { printf("Adding species:\n"); writename(0, 2, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } lastrearr = false; oldnufork = NULL; for (i = 3; i <= spp; i++) { bestyet = -10.0 * spp * chars; item = treenode[enterorder[i - 1] - 1]; getnufork(&nufork, &grbg, treenode, zeros); there = root; memcpy(tempadd->base, item->base, endsite*sizeof(long)); memcpy(tempadd->numsteps, item->numsteps, endsite*sizeof(long)); memcpy(tempadd->oldbase, zeros, endsite*sizeof(long)); memcpy(tempadd->oldnumsteps, zeros, endsite*sizeof(long)); addpreorder(root, item, nufork); if (!mulf) add(there, item, nufork, &root, recompute, treenode, &grbg, zeros); else add(there, item, NULL, &root, recompute, treenode, &grbg, zeros); like = bestyet; rearrange(&root); if (progress) { writename(i - 1, 1, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } lastrearr = (i == spp); if (lastrearr) { bestlike = 
bestyet; if (jumb == 1) { bstlike2 = bestlike; nextree = 1; initbestrees(bestrees, maxtrees, true); initbestrees(bestrees, maxtrees, false); } if (progress) { printf("\nDoing global rearrangements"); if (rearrfirst) printf(" on the first of the trees tied for best\n"); else printf(" on all trees tied for best\n"); printf(" !"); for (j = 0; j < nonodes; j++) if (j % ((nonodes / 72) + 1) == 0) putchar('-'); printf("!\n"); #ifdef WIN32 phyFillScreenColor(); #endif } globrearrange(); } } done = false; while (!done && findunrearranged(bestrees, nextree, true) >= 0) { grandrearr(); done = rearrfirst; } if (progress) putchar('\n'); recompute = false; for (i = spp - 1; i >= 1; i--) re_move(treenode[i], &dummy, &root, recompute, treenode, &grbg, zeros); if (jumb == njumble) { collapsebestrees(&root, &grbg, treenode, bestrees, place, zeros, chars, recompute, progress); if (treeprint) { putc('\n', outfile); if (nextree == 2) fprintf(outfile, "One most parsimonious tree found:\n"); else fprintf(outfile, "%6ld trees in all found\n", nextree - 1); } if (nextree > maxtrees + 1) { if (treeprint) fprintf(outfile, "here are the first %4ld of them\n", (long)maxtrees); nextree = maxtrees + 1; } if (treeprint) putc('\n', outfile); for (i = 0; i <= (nextree - 2); i++) { root = treenode[0]; add(treenode[0], treenode[1], treenode[spp], &root, recompute, treenode, &grbg, zeros); nextnode = spp + 2; for (j = 3; j <= spp; j++) { if (bestrees[i].btree[j - 1] > 0) add(treenode[bestrees[i].btree[j - 1] - 1], treenode[j - 1], treenode[nextnode++ - 1], &root, recompute, treenode, &grbg, zeros); else add(treenode[treenode[-bestrees[i].btree[j - 1]-1]->back->index-1], treenode[j - 1], NULL, &root, recompute, treenode, &grbg, zeros); } reroot(treenode[outgrno - 1], root); postorder(root); evaluate(root); treelength(root, chars, treenode); dnapars_printree(); describe(); for (j = 1; j < spp; j++) re_move(treenode[j], &dummy, &root, recompute, treenode, &grbg, zeros); } } } else { /* Open in binary: 
ftell() is broken for UNIX line-endings under WIN32 */
    openfile(&intree,INTREE,"input tree", "rb",progname,intreename);
    numtrees = countsemic(&intree);
    if (numtrees > 2)
      initseed(&inseed, &inseed0, seed);
    if (numtrees > MAXNUMTREES) {
      printf("\nERROR: number of input trees is read incorrectly from %s\n",
             intreename);
      exxit(-1);
    }
    if (treeprint) {
      fprintf(outfile, "User-defined tree");
      if (numtrees > 1)
        putc('s', outfile);
      fprintf(outfile, ":\n");
    }
    fsteps = (long **)Malloc(maxuser*sizeof(long *));
    for (j = 1; j <= maxuser; j++)
      fsteps[j - 1] = (long *)Malloc(endsite*sizeof(long));
    if (trout)
      fprintf(outtree, "%ld\n", numtrees);
    nodep = NULL;
    which = 1;
    while (which <= numtrees) {
      firsttree = true;
      nextnode = 0;
      haslengths = true;
      treeread(intree, &root, treenode, &goteof, &firsttree, nodep,
               &nextnode, &haslengths, &grbg, initdnaparsnode,false,nonodes);
      if (treeprint)
        fprintf(outfile, "\n\n");
      if (outgropt)
        reroot(treenode[outgrno - 1], root);
      postorder(root);
      evaluate(root);
      treelength(root, chars, treenode);
      dnapars_printree();
      describe();
      if (which < numtrees)
        gdispose(root, &grbg, treenode);
      which++;
    }
    FClose(intree);
    putc('\n', outfile);
    if (numtrees > 1 && chars > 1 )
      standev(chars, numtrees, minwhich, minsteps, nsteps, fsteps, seed);
    for (j = 1; j <= maxuser; j++)
      free(fsteps[j - 1]);
    free(fsteps);
  }
  if (jumb == njumble) {
    if (progress) {
      printf("\nOutput written to file \"%s\"\n", outfilename);
      if (trout) {
        printf("\nTree");
        if ((usertree && numtrees > 1) || (!usertree && nextree != 2))
          printf("s");
        printf(" also written onto file \"%s\"\n", outtreename);
      }
    }
  }
}  /* maketree */


void reallocchars()
{
  /* The number of chars can change between runs; this function
     reallocates all the variables whose size depends on the number
     of chars */
  long i;

  for (i=0; i < spp; i++){
    free(y[i]);
    y[i] = (Char *)Malloc(chars*sizeof(Char));
  }
  free(weight);
  free(oldweight);
  free(alias);
  free(ally);
  free(location);
  weight = (long *)Malloc(chars*sizeof(long));
  oldweight = (long
*)Malloc(chars*sizeof(long)); alias = (long *)Malloc(chars*sizeof(long)); ally = (long *)Malloc(chars*sizeof(long)); location = (long *)Malloc(chars*sizeof(long)); } void freerest() { /* free variables that are allocated each data set */ long i; if (!usertree) { freenode(&temp); freenode(&temp1); freenode(&temp2); freenode(&tempsum); freenode(&temprm); freenode(&tempadd); freenode(&tempf); freenode(&tmp); freenode(&tmp1); freenode(&tmp2); freenode(&tmp3); freenode(&tmprm); freenode(&tmpadd); } for (i = 0; i < spp; i++) free(y[i]); free(y); for (i = 1; i <= maxtrees; i++) free(bestrees[i - 1].btree); free(bestrees); free(nayme); free(enterorder); free(place); free(weight); free(oldweight); free(alias); free(ally); free(location); freegrbg(&grbg); if (ancseq) freegarbage(&garbage); free(threshwt); free(zeros); freenodes(nonodes, treenode); } /* freerest */ int main(int argc, Char *argv[]) { /* DNA parsimony by uphill search */ /* reads in spp, chars, and the data. Then calls maketree to construct the tree */ #ifdef MAC argc = 1; /* macsetup("Dnapars",""); */ argv[0] = "Dnapars"; #endif init(argc, argv); progname = argv[0]; openfile(&infile,INFILE,"input file", "r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename); ibmpc = IBMCRT; ansi = ANSICRT; msets = 1; firstset = true; garbage = NULL; grbg = NULL; doinit(); if (weights || justwts) openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename); if (trout) openfile(&outtree,OUTTREE,"output tree file", "w",argv[0],outtreename); for (ith = 1; ith <= msets; ith++) { if (!(justwts && !firstset)) allocrest(); if (msets > 1 && !justwts) { fprintf(outfile, "\nData set # %ld:\n\n", ith); if (progress) printf("\nData set # %ld:\n\n", ith); } doinput(); if (ith == 1) firstset = false; for (jumb = 1; jumb <= njumble; jumb++) maketree(); if (!justwts) freerest(); } freetree(nonodes, treenode); FClose(infile); FClose(outfile); if (weights || justwts) FClose(weightfile); if 
(trout)
    FClose(outtree);
  if (usertree)
    FClose(intree);
#ifdef MAC
  fixmacfile(outfilename);
  fixmacfile(outtreename);
#endif
  if (progress)
    printf("\nDone.\n\n");
#ifdef WIN32
  phyRestoreConsoleAttributes();
#endif
  return 0;
}  /* DNA parsimony by uphill search */

/* phylip-3.697/src/dnapenny.c */

#include "phylip.h"
#include "seq.h"

/* version 3.696.
   Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe.

   Copyright (c) 1993-2014, Joseph Felsenstein
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions are met:

   1. Redistributions of source code must retain the above copyright notice,
      this list of conditions and the following disclaimer.

   2. Redistributions in binary form must reproduce the above copyright notice,
      this list of conditions and the following disclaimer in the documentation
      and/or other materials provided with the distribution.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
   AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
   SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE.
*/

#define maxtrees        1000  /* maximum number of trees to be printed out */
#define often           1000  /* how often to notify how many trees examined */
#define many            10000 /* how many multiples of howoften before stop */

typedef node **pointptr;
typedef long *treenumbers;
typedef double *valptr;
typedef long *placeptr;

#ifndef OLDC
/* function prototypes */
void   getoptions(void);
void   allocrest(void);
void   doinit(void);
void   makeweights(void);
void   doinput(void);
void   supplement(node *);
void   evaluate(node *);
void   addtraverse(node *, node *, node *, long *, long *, valptr, placeptr);
void   addit(long );
void   dnapenny_reroot(node *);
void   describe(void);
void   maketree(void);
void   reallocchars(void);
/* function prototypes */
#endif

Char infilename[FNMLNGTH],outfilename[FNMLNGTH],outtreename[FNMLNGTH],
     weightfilename[FNMLNGTH];
node *root, *p;
long *zeros=NULL;
long chars, howmanny, howoften, col, msets, ith;
boolean weights, thresh, simple, trout, progress, stepbox, ancseq,
        mulsets, firstset, justwts;
double threshold;
steptr oldweight;
pointptr treenode;   /* pointers to all nodes in tree */
double fracdone, fracinc;
boolean *added;
gbases *garbage;
node **grbg;
Char basechar[]="ACMGRSVTWYHKDBNO???????????????";

/* Variables for maketree, propagated globally for C version: */
long examined, mults;
boolean firsttime, done, recompute;
double like, bestyet;
treenumbers *bestorders, *bestrees;
treenumbers current, order;
long *threshwt;
baseptr nothing;
node *temp, *temp1;
long suppno[] =
  { 0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,4};

long suppset[] =           /* this was previously a function. */
{                          /* in C, it doesn't need to be.
*/ 1 << ((long)A), 1 << ((long)C), 1 << ((long)G), 1 << ((long)T), 1 << ((long)O), (1 << ((long)A)) | (1 << ((long)C)), (1 << ((long)A)) | (1 << ((long)G)), (1 << ((long)A)) | (1 << ((long)T)), (1 << ((long)A)) | (1 << ((long)O)), (1 << ((long)C)) | (1 << ((long)G)), (1 << ((long)C)) | (1 << ((long)T)), (1 << ((long)C)) | (1 << ((long)O)), (1 << ((long)G)) | (1 << ((long)T)), (1 << ((long)G)) | (1 << ((long)O)), (1 << ((long)T)) | (1 << ((long)O)), (1 << ((long)A)) | (1 << ((long)C)) | (1 << ((long)G)), (1 << ((long)A)) | (1 << ((long)C)) | (1 << ((long)T)), (1 << ((long)A)) | (1 << ((long)C)) | (1 << ((long)O)), (1 << ((long)A)) | (1 << ((long)G)) | (1 << ((long)T)), (1 << ((long)A)) | (1 << ((long)G)) | (1 << ((long)O)), (1 << ((long)A)) | (1 << ((long)T)) | (1 << ((long)O)), (1 << ((long)C)) | (1 << ((long)G)) | (1 << ((long)T)), (1 << ((long)C)) | (1 << ((long)G)) | (1 << ((long)O)), (1 << ((long)C)) | (1 << ((long)T)) | (1 << ((long)O)), (1 << ((long)G)) | (1 << ((long)T)) | (1 << ((long)O)), (1 << ((long)A))|(1 << ((long)C))|(1 << ((long)G))|(1 << ((long)T)), (1 << ((long)A))|(1 << ((long)C))|(1 << ((long)G))|(1 << ((long)O)), (1 << ((long)A))|(1 << ((long)C))|(1 << ((long)T))|(1 << ((long)O)), (1 << ((long)A))|(1 << ((long)G))|(1 << ((long)T))|(1 << ((long)O)), (1 << ((long)C))|(1 << ((long)G))|(1 << ((long)T)) | (1 << ((long)O)), (1 << ((long)A))|(1 << ((long)C))|(1 << ((long)G)) | (1 << ((long)T)) | (1 << ((long)O))}; void getoptions() { /* interactively set options */ long loopcount, loopcount2; Char ch, ch2; fprintf(outfile, "\nPenny algorithm for DNA, version %s\n",VERSION); fprintf(outfile, " branch-and-bound to find all"); fprintf(outfile, " most parsimonious trees\n\n"); howoften = often; howmanny = many; outgrno = 1; outgropt = false; simple = true; thresh = false; threshold = spp; trout = true; weights = false; justwts = false; printdata = false; dotdiff = true; progress = true; treeprint = true; stepbox = false; ancseq = false; interleaved = true; 
loopcount = 0; for (;;) { cleerhome(); printf("\nPenny algorithm for DNA, version %s\n",VERSION); printf(" branch-and-bound to find all most parsimonious trees\n\n"); printf("Settings for this run:\n"); printf(" H How many groups of %4ld trees:%6ld\n", howoften, howmanny); printf(" F How often to report, in trees: %4ld\n", howoften); printf(" S Branch and bound is simple? %s\n", (simple ? "Yes" : "No. reconsiders order of species")); printf(" O Outgroup root? %s%3ld\n", (outgropt ? "Yes, at sequence number" : "No, use as outgroup species"),outgrno); printf(" T Use Threshold parsimony?"); if (thresh) printf(" Yes, count steps up to%4.1f per site\n", threshold); else printf(" No, use ordinary parsimony\n"); printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld %s\n", msets, (justwts ? "sets of weights" : "data sets")); else printf(" No\n"); printf(" I Input sequences interleaved? %s\n", (interleaved ? "Yes" : "No, sequential")); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", (ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)")); printf(" 1 Print out the data at start of run %s\n", (printdata ? "Yes" : "No")); printf(" 2 Print indications of progress of run %s\n", (progress ? "Yes" : "No")); printf(" 3 Print out tree %s\n", (treeprint ? "Yes" : "No")); printf(" 4 Print out steps in each site %s\n", (stepbox ? "Yes" : "No" )); printf(" 5 Print sequences at all nodes of tree %s\n", (ancseq ? "Yes" : "No")); printf(" 6 Write out trees onto tree file? %s\n", (trout ? "Yes" : "No")); if(weights && justwts){ printf("WARNING: W option and Multiple Weights options are both on. "); printf( "The W menu option is unnecessary and has no additional effect. \n"); } printf("\nAre these settings correct? 
(type Y or the letter for one to change)\n");
#ifdef WIN32
    phyFillScreenColor();
#endif
    fflush(stdout);
    scanf("%c%*[^\n]", &ch);
    getchar();
    if (ch == '\n')
      ch = ' ';
    uppercase(&ch);
    if (ch == 'Y')
      break;
    if ((strchr("WHMSOFTI1234560",ch)) != NULL){
      switch (ch) {

      case 'H':
        inithowmany(&howmanny, howoften);
        break;

      case 'W':
        weights = !weights;
        break;

      case 'M':
        mulsets = !mulsets;
        if (mulsets){
          printf("Multiple data sets or multiple weights?");
          loopcount2 = 0;
          do {
            printf(" (type D or W)\n");
#ifdef WIN32
            phyFillScreenColor();
#endif
            fflush(stdout);
            scanf("%c%*[^\n]", &ch2);
            getchar();
            if (ch2 == '\n')
              ch2 = ' ';
            uppercase(&ch2);
            countup(&loopcount2, 10);
          } while ((ch2 != 'W') && (ch2 != 'D'));
          justwts = (ch2 == 'W');
          if (justwts)
            justweights(&msets);
          else
            initdatasets(&msets);
        }
        break;

      case 'F':
        inithowoften(&howoften);
        break;

      case 'S':
        simple = !simple;
        break;

      case 'O':
        outgropt = !outgropt;
        if (outgropt)
          initoutgroup(&outgrno, spp);
        else
          outgrno = 1;
        break;

      case 'T':
        thresh = !thresh;
        if (thresh)
          initthreshold(&threshold);
        break;

      case 'I':
        interleaved = !interleaved;
        break;

      case '0':
        initterminal(&ibmpc, &ansi);
        break;

      case '1':
        printdata = !printdata;
        break;

      case '2':
        progress = !progress;
        break;

      case '3':
        treeprint = !treeprint;
        break;

      case '4':
        stepbox = !stepbox;
        break;

      case '5':
        ancseq = !ancseq;
        break;

      case '6':
        trout = !trout;
        break;
      }
    } else
      printf("Not a possible option!\n");
    countup(&loopcount, 100);
  }
}  /* getoptions */


void allocrest()
{
  long i;

  y = (Char **)Malloc(spp*sizeof(Char *));
  for (i = 0; i < spp; i++)
    y[i] = (Char *)Malloc(chars*sizeof(Char));
  weight = (long *)Malloc(chars*sizeof(long));
  oldweight = (long *)Malloc(chars*sizeof(long));
  alias = (steptr)Malloc(chars*sizeof(long));
  ally = (steptr)Malloc(chars*sizeof(long));
  location = (steptr)Malloc(chars*sizeof(long));
  nayme = (naym *)Malloc(spp*sizeof(naym));
  bestorders = (treenumbers *)Malloc(maxtrees*sizeof(treenumbers));
  for (i = 1; i <= maxtrees; i++)
    bestorders[i - 1] =
(treenumbers)Malloc(spp*sizeof(long));
  bestrees = (treenumbers *)Malloc(maxtrees*sizeof(treenumbers));
  for (i = 1; i <= maxtrees; i++)
    bestrees[i - 1] = (treenumbers)Malloc(spp*sizeof(long));
  current = (treenumbers)Malloc(spp*sizeof(long));
  order = (treenumbers)Malloc(spp*sizeof(long));
  added = (boolean *)Malloc(nonodes*sizeof(boolean));
}  /* allocrest */


void reallocchars(void)
{
  /* The number of chars can change between runs; this function
     reallocates all the variables whose size depends on the number
     of chars */
  long i;

  for (i = 0; i < spp; i++) {
    free(y[i]);
    y[i] = (Char *)Malloc(chars*sizeof(Char));
  }
  free(weight);
  free(oldweight);
  free(alias);
  free(ally);
  free(location);
  weight = (long *)Malloc(chars*sizeof(long));
  oldweight = (long *)Malloc(chars*sizeof(long));
  alias = (steptr)Malloc(chars*sizeof(long));
  ally = (steptr)Malloc(chars*sizeof(long));
  location = (steptr)Malloc(chars*sizeof(long));
}  /* reallocchars */


void doinit()
{
  /* initializes variables */
  inputnumbers(&spp, &chars, &nonodes, 1);
  getoptions();
  if (printdata)
    fprintf(outfile, "%2ld species, %3ld sites\n", spp, chars);
  alloctree(&treenode, nonodes, false);
  allocrest();
}  /* doinit */


void makeweights()
{
  /* make up weights vector to avoid duplicate computations */
  long i;

  for (i = 1; i <= chars; i++) {
    alias[i - 1] = i;
    oldweight[i - 1] = weight[i - 1];
    ally[i - 1] = i;
  }
  sitesort(chars, weight);
  sitecombine(chars);
  sitescrunch(chars);
  endsite = 0;
  for (i = 1; i <= chars; i++) {
    if (ally[i - 1] == i)
      endsite++;
  }
  for (i = 1; i <= endsite; i++)
    location[alias[i - 1] - 1] = i;
  if (!thresh)
    threshold = spp;
  threshwt = (long *)Malloc(endsite*sizeof(long));
  for (i = 0; i < endsite; i++) {
    weight[i] *= 10;
    threshwt[i] = (long)(threshold * weight[i] + 0.5);
  }
  if ( zeros != NULL )
    free(zeros);
  zeros = (long *)Malloc(endsite*sizeof(long));   /* in makeweights() */
  for (i = 0; i < endsite; i++)
    zeros[i] = 0;
}  /* makeweights */


void doinput()
{
  /* reads the input data */
  long i;

  if (justwts) {
    if (firstset)
inputdata(chars);
    for (i = 0; i < chars; i++)
      weight[i] = 1;
    inputweights(chars, weight, &weights);
    if (justwts) {
      fprintf(outfile, "\n\nWeights set # %ld:\n\n", ith);
      if (progress)
        printf("\nWeights set # %ld:\n\n", ith);
    }
    if (printdata)
      printweights(outfile, 0, chars, weight, "Sites");
  } else {
    if (!firstset){
      samenumsp(&chars, ith);
      reallocchars();
    }
    inputdata(chars);
    for (i = 0; i < chars; i++)
      weight[i] = 1;
    if (weights) {
      inputweights(chars, weight, &weights);
      if (printdata)
        printweights(outfile, 0, chars, weight, "Sites");
    }
  }
  makeweights();
  makevalues(treenode, zeros, false);
  alloctemp(&temp, zeros, endsite);
  alloctemp(&temp1, zeros, endsite);
}  /* doinput */


void supplement(node *r)
{
  /* determine the minimum number of additional steps which will
     be added when the rest of the species are put in the tree */
  long i, j, k, has, sum;
  /* both are used as bit sets of base states, not as true/false flags */
  boolean addedmayhave, nonaddedhave;

  for (i = 0; i < endsite; i++) {
    nonaddedhave = 0;
    addedmayhave = 0;
    for (k = 0; k < spp; k++) {
      has = treenode[k]->base[i];
      if (has != 31) {
        if (added[k])
          addedmayhave |= has;
        else {
          if ((has == 1) || (has == 2) || (has == 4)
              || (has == 8) || (has == 16))
            nonaddedhave |= has;
        }
      }
    }
    sum = 0;
    j = 1;
    for (k = 1; k <= 5; k++) {
      if ((j & nonaddedhave) != 0)
        if ((j & addedmayhave) == 0)
          sum++;
      j += j;
    }
    r->numsteps[i] += sum * weight[i];
  }
}  /* supplement */


void evaluate(node *r)
{
  /* determines the number of steps needed for a tree.
this is the minimum number of steps needed to evolve sequences on this tree */ long i, steps; double sum; sum = 0.0; supplement(r); for (i = 0; i < endsite; i++) { steps = r->numsteps[i]; if ((long)steps <= threshwt[i]) sum += steps; else sum += threshwt[i]; } if (examined == 0 && mults == 0) bestyet = -1.0; like = sum; } /* evaluate */ void addtraverse(node *p, node *item, node *fork, long *m, long *n, valptr valyew, placeptr place) { /* traverse all places to add item */ if (done) return; if (*m <= 2 || (p != root && p != root->next->back)) { if (p == root) fillin(temp, item, p); else { fillin(temp1, item, p); fillin(temp, temp1, p->back); } (*n)++; evaluate(temp); examined++; if (examined == howoften) { examined = 0; mults++; if (mults == howmanny) done = true; if (progress) { printf("%7ld", mults); if (bestyet >= 0) printf("%16.1f", bestyet / 10.0); else printf(" - "); printf("%17ld%20.2f\n", nextree - 1, fracdone * 100); #ifdef WIN32 phyFillScreenColor(); #endif } } valyew[(*n) - 1] = like; place[(*n) - 1] = p->index; } if (!p->tip) { addtraverse(p->next->back, item, fork, m,n,valyew,place); addtraverse(p->next->next->back, item, fork,m,n,valyew,place); } } /* addtraverse */ void addit(long m) { /* adds the species one by one, recursively */ long n; valptr valyew; placeptr place; long i, j, n1, besttoadd=0; valptr bestval; placeptr bestplace; double oldfrac, oldfdone, sum, bestsum; valyew = (valptr)Malloc(nonodes*sizeof(double)); bestval = (valptr)Malloc(nonodes*sizeof(double)); place = (placeptr)Malloc(nonodes*sizeof(long)); bestplace = (placeptr)Malloc(nonodes*sizeof(long)); if (simple && !firsttime) { n = 0; added[order[m - 1] - 1] = true; addtraverse(root, treenode[order[m - 1] - 1], treenode[spp + m - 2], &m,&n,valyew,place); besttoadd = order[m - 1]; memcpy(bestplace, place, nonodes*sizeof(long)); memcpy(bestval, valyew, nonodes*sizeof(double)); } else { bestsum = -1.0; for (i = 1; i <= spp; i++) { if (!added[i - 1]) { n = 0; added[i - 1] = true; 
addtraverse(root, treenode[i - 1], treenode[spp + m - 2], &m,&n,valyew,place); added[i - 1] = false; sum = 0.0; for (j = 0; j < n; j++) sum += valyew[j]; if (sum > bestsum) { bestsum = sum; besttoadd = i; memcpy(bestplace, place, nonodes*sizeof(long)); memcpy(bestval, valyew, nonodes*sizeof(double)); } } } } order[m - 1] = besttoadd; memcpy(place, bestplace, nonodes*sizeof(long)); memcpy(valyew, bestval, nonodes*sizeof(double)); shellsort(valyew, place, n); oldfrac = fracinc; oldfdone = fracdone; n1 = 0; for (i = 0; i < n; i++) { if (valyew[i] <= bestyet || bestyet < 0.0) n1++; } if (n1 > 0) fracinc /= n1; for (i = 0; i < n; i++) { if (valyew[i] <= bestyet || bestyet < 0.0) { current[m - 1] = place[i]; recompute = (m < spp); add(treenode[place[i] - 1], treenode[besttoadd - 1], treenode[spp + m - 2], &root, recompute, treenode, grbg, zeros); added[besttoadd - 1] = true; if (m < spp) addit(m + 1); else { if (valyew[i] < bestyet || bestyet < 0.0) { nextree = 1; bestyet = valyew[i]; } if (nextree <= maxtrees) { memcpy(bestorders[nextree - 1], order, spp*sizeof(long)); memcpy(bestrees[nextree - 1], current, spp*sizeof(long)); } nextree++; firsttime = false; } recompute = (m < spp); re_move(treenode[besttoadd - 1], &treenode[spp + m - 2], &root, recompute, treenode, grbg, zeros); added[besttoadd - 1] = false; } fracdone += fracinc; } fracinc = oldfrac; fracdone = oldfdone; free(valyew); free(bestval); free(place); free(bestplace); } /* addit */ void dnapenny_reroot(node *outgroup) { /* reorients tree, putting outgroup in desired position. 
*/ node *p, *q, *newbottom, *oldbottom; if (outgroup->back->index == root->index) return; newbottom = outgroup->back; p = treenode[newbottom->index - 1]->back; while (p->index != root->index) { oldbottom = treenode[p->index - 1]; treenode[p->index - 1] = p; p = oldbottom->back; } p = root->next; q = root->next->next; p->back->back = q->back; q->back->back = p->back; p->back = outgroup; q->back = outgroup->back; outgroup->back->back = root->next->next; outgroup->back = root->next; treenode[newbottom->index - 1] = newbottom; } /* dnapenny_reroot */ void describe() { /* prints ancestors, steps and table of numbers of steps in each site */ if (stepbox) writesteps(chars, weights, oldweight, root); if (ancseq) { hypstates(chars, root, treenode, &garbage, basechar); putc('\n', outfile); } putc('\n', outfile); if (trout) { col = 0; treeout(root, nextree, &col, root); } } /* describe */ void maketree() { /* tree construction recursively by branch and bound */ long i, j, k; node *dummy; if (progress) { printf("\nHow many\n"); printf("trees looked Approximate\n"); printf("at so far Length of How many percentage\n"); printf("(multiples shortest tree trees this short searched\n"); printf("of %4ld): found so far found so far so far\n", howoften); printf("---------- ------------ ------------ ------------\n"); } #ifdef WIN32 phyFillScreenColor(); #endif done = false; mults = 0; examined = 0; nextree = 1; root = treenode[0]; firsttime = true; for (i = 0; i < spp; i++) added[i] = false; added[0] = true; order[0] = 1; k = 2; fracdone = 0.0; fracinc = 1.0; bestyet = -1.0; recompute = true; addit(k); if (done) { if (progress) { printf("Search broken off! Not guaranteed to\n"); printf(" have found the most parsimonious trees.\n"); } if (treeprint) { fprintf(outfile, "Search broken off! 
Not guaranteed to\n"); fprintf(outfile, " have found the most parsimonious\n"); fprintf(outfile, " trees, but here is what we found:\n"); } } if (treeprint) { fprintf(outfile, "\nrequires a total of %18.3f\n\n", bestyet / 10.0); if (nextree == 2) fprintf(outfile, "One most parsimonious tree found:\n"); else fprintf(outfile, "%6ld trees in all found\n", nextree - 1); } if (nextree > maxtrees + 1) { if (treeprint) fprintf(outfile, "here are the first%4ld of them\n", (long)maxtrees); nextree = maxtrees + 1; } if (treeprint) putc('\n', outfile); for (i = 0; i < spp; i++) added[i] = true; for (i = 0; i <= nextree - 2; i++) { root = treenode[0]; for (j = k; j <= spp; j++) add(treenode[bestrees[i][j - 1] - 1], treenode[bestorders[i][j - 1] - 1], treenode[spp + j - 2], &root, recompute, treenode, grbg, zeros); dnapenny_reroot(treenode[outgrno - 1]); postorder(root); evaluate(root); printree(root, 1.0); describe(); for (j = k - 1; j < spp; j++) re_move(treenode[bestorders[i][j] - 1], &dummy, &root, recompute, treenode, grbg, zeros); } if (progress) { printf("\nOutput written to file \"%s\"\n\n", outfilename); if (trout) printf("Trees also written onto file \"%s\"\n\n", outtreename); } freetemp(&temp); freetemp(&temp1); if (ancseq) freegarbage(&garbage); } /* maketree */ int main(int argc, Char *argv[]) { /* Penny's branch-and-bound method for DNA sequences */ #ifdef MAC argc = 1; /* macsetup("Dnapenny",""); */ argv[0] = "Dnapenny"; #endif init(argc, argv); /* Reads in the number of species, number of characters, options and data. 
Then finds all most parsimonious trees */ openfile(&infile,INFILE,"input file", "r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename); ibmpc = IBMCRT; ansi = ANSICRT; mulsets = false; garbage = NULL; msets = 1; firstset = true; doinit(); if (weights || justwts) openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename); if (trout) openfile(&outtree,OUTTREE,"output tree file", "w",argv[0],outtreename); for (ith = 1; ith <= msets; ith++) { doinput(); if (ith == 1) firstset = false; if (msets > 1 && !justwts) { fprintf(outfile, "\nData set # %ld:\n",ith); if (progress) printf("\nData set # %ld:\n",ith); } maketree(); free(threshwt); freenodes(nonodes,treenode); } FClose(infile); FClose(outfile); FClose(outtree); #ifdef MAC fixmacfile(outfilename); fixmacfile(outtreename); #endif #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } /* Penny's branch-and-bound method for DNA sequences */ phylip-3.697/src/dollo.c0000644004732000473200000002726712407046276014637 0ustar joefelsenst_g#include "phylip.h" #include "disc.h" #include "dollo.h" /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ void correct(node *p, long fullset, boolean dollo, bitptr zeroanc, pointptr treenode) { /* get final states for intermediate nodes */ long i; long z0, z1, s0, s1, temp; if (p->tip) return; for (i = 0; i < (words); i++) { if (p->back == NULL) { s0 = zeroanc[i]; s1 = fullset & (~zeroanc[i]); } else { s0 = treenode[p->back->index - 1]->statezero[i]; s1 = treenode[p->back->index - 1]->stateone[i]; } z0 = (s0 & p->statezero[i]) | (p->next->back->statezero[i] & p->next->next->back->statezero[i]); z1 = (s1 & p->stateone[i]) | (p->next->back->stateone[i] & p->next->next->back->stateone[i]); if (dollo) { temp = z0 & (~(zeroanc[i] & z1)); z1 &= ~(fullset & (~zeroanc[i]) & z0); z0 = temp; } temp = fullset & (~z0) & (~z1); p->statezero[i] = z0 | (temp & s0 & (~s1)); p->stateone[i] = z1 | (temp & s1 & (~s0)); } } /* correct */ void fillin(node *p) { /* Sets up for each node in the tree two statesets. stateone and statezero are the sets of character states that must be 1 or must be 0, respectively, in a most parsimonious reconstruction, based on the information at or above this node. Note that this state assignment may change based on information further down the tree. If a character is in both sets it is in state "P". If in neither, it is "?". 
*/ long i; for (i = 0; i < words; i++) { p->stateone[i] = p->next->back->stateone[i] | p->next->next->back->stateone[i]; p->statezero[i] = p->next->back->statezero[i] | p->next->next->back->statezero[i]; } } /* fillin */ void postorder(node *p) { /* traverses a binary tree, calling PROCEDURE fillin at a node's descendants before calling fillin at the node */ /* used in dollop, dolmove, & move */ if (p->tip) return; postorder(p->next->back); postorder(p->next->next->back); fillin(p); } /* postorder */ void count(long *stps, bitptr zeroanc, steptr numszero, steptr numsone) { /* counts the number of steps in a branch of the tree. The program spends much of its time in this PROCEDURE */ /* used in dolpenny & move */ long i, j, l; j = 1; l = 0; for (i = 0; i < (chars); i++) { l++; if (l > bits) { l = 1; j++; } if (((1L << l) & stps[j - 1]) != 0) { if (((1L << l) & zeroanc[j - 1]) != 0) numszero[i] += weight[i]; else numsone[i] += weight[i]; } } } /* count */ void filltrav(node *r) { /* traverse to fill in interior node states */ if (r->tip) return; filltrav(r->next->back); filltrav(r->next->next->back); fillin(r); } /* filltrav */ void hyprint(struct htrav_vars *Hyptrav, boolean *unknown, bitptr dohyp, Char *guess) { /* print out states at node */ long i, j, k; char l; boolean dot, a0, a1, s0, s1; if (Hyptrav->bottom) fprintf(outfile, "root "); else fprintf(outfile, "%3ld ", Hyptrav->r->back->index - spp); if (Hyptrav->r->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[Hyptrav->r->index - 1][i], outfile); } else fprintf(outfile, "%4ld ", Hyptrav->r->index - spp); if (Hyptrav->nonzero) fprintf(outfile, " yes "); else if (*unknown) fprintf(outfile, " ? 
"); else fprintf(outfile, " no "); for (j = 1; j <= (chars); j++) { newline(outfile, j, 40, nmlngth + 17); k = (j - 1) / bits + 1; l = (j - 1) % bits + 1; dot = (((1L << l) & dohyp[k - 1]) == 0 && guess[j - 1] == '?'); s0 = (((1L << l) & Hyptrav->r->statezero[k - 1]) != 0); s1 = (((1L << l) & Hyptrav->r->stateone[k - 1]) != 0); a0 = (((1L << l) & Hyptrav->zerobelow->bits_[k - 1]) != 0); a1 = (((1L << l) & Hyptrav->onebelow->bits_[k - 1]) != 0); dot = (dot || (a1 == s1 && a0 == s0)); if (dot) putc('.', outfile); else { if (s0) { if (s1) putc('P', outfile); else putc('0', outfile); } else if (s1) putc('1', outfile); else putc('?', outfile); } if (j % 5 == 0) putc(' ', outfile); } putc('\n', outfile); } /* hyprint */ void hyptrav(node *r_, boolean *unknown, bitptr dohyp, long fullset, boolean dollo, Char *guess, pointptr treenode, gbit *garbage, bitptr zeroanc, bitptr oneanc) { /* compute, print out states at one interior node */ struct htrav_vars HypVars; long i; HypVars.r = r_; disc_gnu(&HypVars.zerobelow, &garbage); disc_gnu(&HypVars.onebelow, &garbage); if (!HypVars.r->tip) correct(HypVars.r, fullset, dollo, zeroanc, treenode); HypVars.bottom = (HypVars.r->back == NULL); HypVars.nonzero = false; if (HypVars.bottom) { memcpy(HypVars.zerobelow->bits_, zeroanc, words*sizeof(long)); memcpy(HypVars.onebelow->bits_, oneanc, words*sizeof(long)); } else { memcpy(HypVars.zerobelow->bits_, treenode[HypVars.r->back->index - 1]->statezero, words*sizeof(long)); memcpy(HypVars.onebelow->bits_, treenode[HypVars.r->back->index - 1]->stateone, words*sizeof(long)); } for (i = 0; i < (words); i++) HypVars.nonzero = (HypVars.nonzero || ((HypVars.r->stateone[i] & HypVars.zerobelow->bits_[i]) | (HypVars.r->statezero[i] & HypVars.onebelow->bits_[i])) != 0); hyprint(&HypVars,unknown,dohyp, guess); if (!HypVars.r->tip) { hyptrav(HypVars.r->next->back, unknown,dohyp, fullset, dollo, guess, treenode, garbage, zeroanc, oneanc); hyptrav(HypVars.r->next->next->back, unknown,dohyp, fullset, 
dollo, guess, treenode, garbage, zeroanc, oneanc); } disc_chuck(HypVars.zerobelow, &garbage); disc_chuck(HypVars.onebelow, &garbage); } /* hyptrav */ void hypstates(long fullset, boolean dollo, Char *guess, pointptr treenode, node *root, gbit *garbage, bitptr zeroanc, bitptr oneanc) { /* fill in and describe states at interior nodes */ /* used in dollop & dolpenny */ boolean unknown = false; bitptr dohyp; long i, j, k; for (i = 0; i < (words); i++) { zeroanc[i] = 0; oneanc[i] = 0; } for (i = 0; i < (chars); i++) { j = i / bits + 1; k = i % bits + 1; if (guess[i] == '0') zeroanc[j - 1] = ((long)zeroanc[j - 1]) | (1L << k); if (guess[i] == '1') oneanc[j - 1] = ((long)oneanc[j - 1]) | (1L << k); unknown = (unknown || guess[i] == '?'); } dohyp = (bitptr)Malloc(words*sizeof(long)); for (i = 0; i < words; i++) dohyp[i] = zeroanc[i] | oneanc[i]; filltrav(root); fprintf(outfile, "From To Any Steps?"); fprintf(outfile, " State at upper node\n"); fprintf(outfile, " "); fprintf(outfile, " ( . means same as in the node below it on tree)\n\n"); hyptrav(root, &unknown,dohyp, fullset, dollo, guess, treenode, garbage, zeroanc, oneanc); free(dohyp); } /* hypstates */ void drawline(long i, double scale, node *root) { /* draws one row of the tree diagram by moving up tree */ node *p, *q, *r, *first =NULL, *last =NULL; long n, j; boolean extra, done; p = root; q = root; extra = false; if (i == (long)p->ycoord && p == root) { if (p->index - spp >= 10) fprintf(outfile, "-%2ld", p->index - spp); else fprintf(outfile, "--%ld", p->index - spp); extra = true; } else fprintf(outfile, " "); do { if (!p->tip) { r = p->next; done = false; do { if (i >= r->back->ymin && i <= r->back->ymax) { q = r->back; done = true; } r = r->next; } while (!(done || r == p)); first = p->next->back; r = p->next; while (r->next != p) r = r->next; last = r->back; } done = (p == q); n = (long)(scale * (p->xcoord - q->xcoord) + 0.5); if (n < 3 && !q->tip) n = 3; if (extra) { n--; extra = false; } if ((long)q->ycoord 
== i && !done) { putc('+', outfile); if (!q->tip) { for (j = 1; j <= n - 2; j++) putc('-', outfile); if (q->index - spp >= 10) fprintf(outfile, "%2ld", q->index - spp); else fprintf(outfile, "-%ld", q->index - spp); extra = true; } else { for (j = 1; j < n; j++) putc('-', outfile); } } else if (!p->tip) { if ((long)last->ycoord > i && (long)first->ycoord < i && i != (long)p->ycoord) { putc('!', outfile); for (j = 1; j < n; j++) putc(' ', outfile); } else { for (j = 1; j <= n; j++) putc(' ', outfile); } } else { for (j = 1; j <= n; j++) putc(' ', outfile); } if (p != q) p = q; } while (!done); if ((long)p->ycoord == i && p->tip) { for (j = 0; j < nmlngth; j++) putc(nayme[p->index - 1][j], outfile); } putc('\n', outfile); } /* drawline */ void printree(double f, boolean treeprint, node *root) { /* prints out diagram of the tree */ /* used in dollop & dolpenny */ long i, tipy, dummy; double scale; putc('\n', outfile); if (!treeprint) return; putc('\n', outfile); tipy = 1; dummy = 0; coordinates(root, &tipy, f, &dummy); scale = 1.5; putc('\n', outfile); for (i = 1; i <= (tipy - down); i++) drawline(i, scale, root); putc('\n', outfile); } /* printree */ void writesteps(boolean weights, boolean dollo, steptr numsteps) { /* write number of steps */ /* used in dollop & dolpenny */ long i, j, k; if (weights) fprintf(outfile, "weighted"); if (dollo) fprintf(outfile, " reversions "); else fprintf(outfile, " polymorphisms "); fprintf(outfile, "in each character:\n"); fprintf(outfile, " "); for (i = 0; i <= 9; i++) fprintf(outfile, "%4ld", i); fprintf(outfile, "\n *-----------------------------------------\n"); for (i = 0; i <= (chars / 10); i++) { fprintf(outfile, "%5ld", i * 10); putc('!', outfile); for (j = 0; j <= 9; j++) { k = i * 10 + j; if (k == 0 || k > chars) fprintf(outfile, " "); else fprintf(outfile, "%4ld", numsteps[k - 1] + extras[k - 1]); } putc('\n', outfile); } putc('\n', outfile); } /* writesteps */ 
phylip-3.697/src/dollo.h0000644004732000473200000000422712407046356014632 0ustar joefelsenst_g /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ /* dollo.h: included in dollop, dolmove & dolpenny */ #ifndef OLDC /* function prototypes */ void correct(node *, long, boolean, bitptr, pointptr); void fillin(node *); void postorder(node *); void count(long *, bitptr, steptr, steptr); void filltrav(node *); void hyprint(struct htrav_vars *, boolean *, bitptr, Char *); void hyptrav(node *, boolean *, bitptr, long, boolean, Char *, pointptr, gbit *, bitptr, bitptr); void hypstates(long, boolean, Char *, pointptr, node *, gbit *, bitptr, bitptr); void drawline(long, double, node *); void printree(double, boolean, node *); void writesteps(boolean, boolean, steptr); /* function prototypes */ #endif phylip-3.697/src/dollop.c0000644004732000473200000007125412406201116014773 0ustar joefelsenst_g #include "phylip.h" #include "disc.h" #include "dollo.h" /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #define maxtrees 100 /* maximum number of tied trees stored */ #ifndef OLDC /* function prototypes */ void getoptions(void); void allocrest(void); void doinit(void); void inputoptions(void); void doinput(void); void dollop_count(node *, steptr, steptr); void preorder(node *, steptr, steptr, long, boolean, long, bitptr, pointptr); void evaluate(node *); void savetree(void); void dollop_addtree(long *); void tryadd(node *, node **, node **); void addpreorder(node *, node *, node *); void tryrearr(node *, node **, boolean *); void repreorder(node *, node **, boolean *); void rearrange(node **); void describe(void); void initdollopnode(node **, node **, node *, long, long, long *, long *, initops, pointarray, pointarray, Char *, Char *, FILE *); void maketree(void); void reallocchars(void); /* function prototypes */ #endif Char infilename[FNMLNGTH], outfilename[FNMLNGTH], intreename[FNMLNGTH], outtreename[FNMLNGTH], weightfilename[FNMLNGTH], ancfilename[FNMLNGTH]; node *root; long col, msets, ith, j, l, njumble, jumb; long inseed, inseed0; boolean jumble, usertree, weights, thresh, ancvar, questions, dollo, trout, progress, treeprint, stepbox, ancseq, mulsets, firstset, justwts; boolean *ancone, *anczero, *ancone0, *anczero0; pointptr treenode; /* pointers to all nodes in tree */ double threshold; double *threshwt; longer seed; long *enterorder; double **fsteps; steptr numsteps; bestelm *bestrees; Char *guess; gbit *garbage; char *progname; /* Variables for 
treeread */ boolean goteof, firsttree, haslengths, phirst; pointarray nodep; node *grbg; long *zeros; /* Local variables for maketree, propagated globally for C version: */ long minwhich; double like, bestyet, bestlike, bstlike2, minsteps; boolean lastrearr; double nsteps[maxuser]; node *there; long fullset; long shimotrees; bitptr zeroanc, oneanc; long *place; Char ch; boolean *names; steptr numsone, numszero; bitptr steps; void getoptions() { /* interactively set options */ long loopcount, loopcount2; Char ch, ch2; fprintf(outfile,"\nDollo and polymorphism parsimony algorithm,"); fprintf(outfile," version %s\n\n",VERSION); putchar('\n'); ancvar = false; dollo = true; jumble = false; njumble = 1; thresh = false; threshold = spp; trout = true; usertree = false; goteof = false; weights = false; justwts = false; printdata = false; progress = true; treeprint = true; stepbox = false; ancseq = false; loopcount = 0; for (;;) { cleerhome(); printf("\nDollo and polymorphism parsimony algorithm, version %s\n\n", VERSION); printf("Settings for this run:\n"); printf(" U Search for best tree? %s\n", (usertree ? "No, use user trees in input file" : "Yes")); printf(" P Parsimony method? %s\n", dollo ? "Dollo" : "Polymorphism"); if (!usertree) { printf(" J Randomize input order of species?"); if (jumble) printf(" Yes (seed =%8ld,%3ld times)\n", inseed0, njumble); else printf(" No. Use input order\n"); } printf(" T Use Threshold parsimony?"); if (thresh) printf(" Yes, count steps up to%4.1f per char.\n", threshold); else printf(" No, use ordinary parsimony\n"); printf(" A Use ancestral states in input file? %s\n", ancvar ? "Yes" : "No"); printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld %s\n", msets, (justwts ? "sets of weights" : "data sets")); else printf(" No\n"); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", ibmpc ? "IBM PC" : ansi ? 
"ANSI" : "(none)"); printf(" 1 Print out the data at start of run %s\n", printdata ? "Yes" : "No"); printf(" 2 Print indications of progress of run %s\n", progress ? "Yes" : "No"); printf(" 3 Print out tree %s\n", treeprint ? "Yes" : "No"); printf(" 4 Print out steps in each character %s\n", stepbox ? "Yes" : "No"); printf(" 5 Print states at all nodes of tree %s\n", ancseq ? "Yes" : "No"); printf(" 6 Write out trees onto tree file? %s\n", trout ? "Yes" : "No"); if(weights && justwts){ printf( "WARNING: W option and Multiple Weights options are both on. "); printf( "The W menu option is unnecessary and has no additional effect. \n"); } printf("\nAre these settings correct? "); printf("(type Y or the letter for one to change)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); if (ch == 'Y') break; if (((!usertree) && (strchr("WAPJTUM1234560", ch) != NULL)) || (usertree && ((strchr("WAPTUM1234560", ch) != NULL)))){ switch (ch) { case 'A': ancvar = !ancvar; break; case 'P': dollo = !dollo; break; case 'J': jumble = !jumble; if (jumble) initjumble(&inseed, &inseed0, seed, &njumble); else njumble = 1; break; case 'W': weights = !weights; break; case 'T': thresh = !thresh; if (thresh) initthreshold(&threshold); break; case 'U': usertree = !usertree; break; case 'M': mulsets = !mulsets; if (mulsets){ printf("Multiple data sets or multiple weights?"); loopcount2 = 0; do { printf(" (type D or W)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch2); getchar(); if (ch2 == '\n') ch2 = ' '; uppercase(&ch2); countup(&loopcount2, 10); } while ((ch2 != 'W') && (ch2 != 'D')); justwts = (ch2 == 'W'); if (justwts) justweights(&msets); else initdatasets(&msets); if (!jumble) { jumble = true; initjumble(&inseed, &inseed0, seed, &njumble); } } break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress = !progress; break; case '3': 
treeprint = !treeprint; break; case '4': stepbox = !stepbox; break; case '5': ancseq = !ancseq; break; case '6': trout = !trout; break; } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } } /* getoptions */ void reallocchars() { long i; free(extras); free(weight); free(threshwt); free(numsteps); free(ancone); free(anczero); free(ancone0); free(anczero0); free(numsone); free(numszero); free(guess); if (usertree) { for (i = 1; i <= maxuser; i++){ /* free each row, not the array of row pointers */ free(fsteps[i - 1]); fsteps[i - 1] = (double *)Malloc(chars*sizeof(double)); } } extras = (steptr)Malloc(chars*sizeof(long)); weight = (steptr)Malloc(chars*sizeof(long)); threshwt = (double *)Malloc(chars*sizeof(double)); numsteps = (steptr)Malloc(chars*sizeof(long)); ancone = (boolean *)Malloc(chars*sizeof(boolean)); anczero = (boolean *)Malloc(chars*sizeof(boolean)); ancone0 = (boolean *)Malloc(chars*sizeof(boolean)); anczero0 = (boolean *)Malloc(chars*sizeof(boolean)); numsone = (steptr)Malloc(chars*sizeof(long)); numszero = (steptr)Malloc(chars*sizeof(long)); guess = (Char *)Malloc(chars*sizeof(Char)); } void allocrest() { long i; extras = (steptr)Malloc(chars*sizeof(long)); weight = (steptr)Malloc(chars*sizeof(long)); threshwt = (double *)Malloc(chars*sizeof(double)); if (usertree) { fsteps = (double **)Malloc(maxuser*sizeof(double *)); for (i = 1; i <= maxuser; i++) fsteps[i - 1] = (double *)Malloc(chars*sizeof(double)); } bestrees = (bestelm *) Malloc(maxtrees*sizeof(bestelm)); for (i = 1; i <= maxtrees; i++) bestrees[i - 1].btree = (long *)Malloc(nonodes*sizeof(long)); numsteps = (steptr)Malloc(chars*sizeof(long)); nayme = (naym *)Malloc(spp*sizeof(naym)); enterorder = (long *)Malloc(spp*sizeof(long)); place = (long *)Malloc(nonodes*sizeof(long)); ancone = (boolean *)Malloc(chars*sizeof(boolean)); anczero = (boolean *)Malloc(chars*sizeof(boolean)); ancone0 = (boolean *)Malloc(chars*sizeof(boolean)); anczero0 = (boolean *)Malloc(chars*sizeof(boolean)); numsone = (steptr)Malloc(chars*sizeof(long)); 
numszero = (steptr)Malloc(chars*sizeof(long)); guess = (Char *)Malloc(chars*sizeof(Char)); zeroanc = (bitptr)Malloc(words*sizeof(long)); oneanc = (bitptr)Malloc(words*sizeof(long)); steps = (bitptr)Malloc(words*sizeof(long)); } /* allocrest */ void doinit() { /* initializes variables */ inputnumbers(&spp, &chars, &nonodes, 1); words = chars / bits + 1; getoptions(); alloctree(&treenode); setuptree(treenode); allocrest(); } /* doinit */ void inputoptions() { /* input the information on the options */ long i; if(justwts){ if(firstset){ scan_eoln(infile); if (ancvar) { inputancestors(anczero0, ancone0); } } for (i = 0; i < (chars); i++) weight[i] = 1; inputweights(chars, weight, &weights); } else { if (!firstset) { samenumsp(&chars, ith); reallocchars(); } scan_eoln(infile); for (i = 0; i < (chars); i++) weight[i] = 1; if (ancvar) inputancestors(anczero0, ancone0); if (weights) inputweights(chars, weight, &weights); } if ((weights || justwts) && printdata) printweights(outfile, 0, chars, weight, "Characters"); for (i = 0; i < (chars); i++) { if (!ancvar) { anczero[i] = true; ancone[i] = false; } else { anczero[i] = anczero0[i]; ancone[i] = ancone0[i]; } } if (ancvar && printdata) printancestors(outfile, anczero, ancone); questions = false; for (i = 0; i < (chars); i++) { questions = (questions || (ancone[i] && anczero[i])); threshwt[i] = threshold * weight[i]; } } /* inputoptions */ void doinput() { /* reads the input data */ inputoptions(); if(!justwts || firstset) inputdata(treenode, dollo, printdata, outfile); } /* doinput */ void dollop_count(node *p, steptr numsone, steptr numszero) { /* counts the number of steps in a fork of the tree. 
The program spends much of its time in this PROCEDURE */ long i, j, l; if (dollo) { for (i = 0; i < (words); i++) steps[i] = (treenode[p->back->index - 1]->stateone[i] & p->statezero[i] & zeroanc[i]) | (treenode[p->back->index - 1]->statezero[i] & p->stateone[i] & fullset & (~zeroanc[i])); } else { for (i = 0; i < (words); i++) steps[i] = treenode[p->back->index - 1]->stateone[i] & treenode[p->back->index - 1]->statezero[i] & p->stateone[i] & p->statezero[i]; } j = 1; l = 0; for (i = 0; i < (chars); i++) { l++; if (l > bits) { l = 1; j++; } assert(j <= words); /* check the word index before steps[j - 1] is read */ if (((1L << l) & steps[j - 1]) != 0) { if (((1L << l) & zeroanc[j - 1]) != 0) numszero[i] += weight[i]; else numsone[i] += weight[i]; } } } /* dollop_count */ void preorder(node *p, steptr numsone, steptr numszero, long words, boolean dollo, long fullset, bitptr zeroanc, pointptr treenode) { /* go back up tree setting up and counting interior node states */ if (!p->tip) { correct(p, fullset, dollo, zeroanc, treenode); preorder(p->next->back, numsone,numszero, words, dollo, fullset, zeroanc, treenode); preorder(p->next->next->back, numsone,numszero, words, dollo, fullset, zeroanc, treenode); } if (p->back != NULL) dollop_count(p, numsone,numszero); } /* preorder */ void evaluate(node *r) { /* Determines the number of losses or polymorphisms needed for a tree. 
This is the minimum number needed to evolve chars on this tree */ long i, stepnum, smaller; double sum, term; sum = 0.0; for (i = 0; i < (chars); i++) { numszero[i] = 0; numsone[i] = 0; } for (i = 0; i < (words); i++) zeroanc[i] = fullset; postorder(r); preorder(r, numsone, numszero, words, dollo, fullset, zeroanc, treenode); for (i = 0; i < (words); i++) zeroanc[i] = 0; postorder(r); preorder(r, numsone, numszero, words, dollo, fullset, zeroanc, treenode); for (i = 0; i < (chars); i++) { smaller = spp * weight[i]; numsteps[i] = smaller; if (anczero[i]) { numsteps[i] = numszero[i]; smaller = numszero[i]; } if (ancone[i] && numsone[i] < smaller) numsteps[i] = numsone[i]; stepnum = numsteps[i] + extras[i]; if (stepnum <= threshwt[i]) term = stepnum; else term = threshwt[i]; sum += term; if (usertree && which <= maxuser) fsteps[which - 1][i] = term; guess[i] = '?'; if (!ancone[i] || (anczero[i] && numszero[i] < numsone[i])) guess[i] = '0'; else if (!anczero[i] || (ancone[i] && numsone[i] < numszero[i])) guess[i] = '1'; } if (usertree && which <= maxuser) { nsteps[which - 1] = sum; if (which == 1) { minwhich = 1; minsteps = sum; } else if (sum < minsteps) { minwhich = which; minsteps = sum; } } like = -sum; } /* evaluate */ void savetree() { /* record in place where each species has to be added to reconstruct this tree */ long i, j; node *p; boolean done; for (i = 0; i < (nonodes); i++) place[i] = 0; place[root->index - 1] = 1; for (i = 1; i <= (spp); i++) { p = treenode[i - 1]; while (place[p->index - 1] == 0) { place[p->index - 1] = i; p = p->back; if (p != NULL) p = treenode[p->index - 1]; } if (i > 1) { place[i - 1] = place[p->index - 1]; j = place[p->index - 1]; done = false; while (!done) { place[p->index - 1] = spp + i - 1; p = treenode[p->index - 1]; p = p->back; done = (p == NULL); if (!done) done = (place[p->index - 1] != j); } } } } /* savetree */ void dollop_addtree(long *pos) { /*puts tree from ARRAY place in its proper position in ARRAY bestrees */ long 
i; for (i =nextree - 1; i >= (*pos); i--) { memcpy(bestrees[i].btree, bestrees[i - 1].btree, spp*sizeof(long)); bestrees[i].gloreange = bestrees[i - 1].gloreange; bestrees[i].locreange = bestrees[i - 1].locreange; bestrees[i].collapse = bestrees[i - 1].collapse; } for (i = 0; i < (spp); i++) bestrees[(*pos) - 1].btree[i] = place[i]; nextree++; } /* dollop_addtree */ void tryadd(node *p, node **item, node **nufork) { /* temporarily adds one fork and one tip to the tree. if the location where they are added yields greater "likelihood" than other locations tested up to that time, then keeps that location as there */ long pos; boolean found; add(p, *item, *nufork, &root, treenode); evaluate(root); if (lastrearr) { if (like >= bstlike2) { savetree(); if (like > bstlike2) { bestlike = bstlike2 = like; pos = 1; nextree = 1; dollop_addtree(&pos); } else { pos = 0; findtree(&found, &pos, nextree, place, bestrees); /* findtree calls for a bestelm* but is getting */ /* a long**, LM */ if (!found) { if (nextree <= maxtrees) dollop_addtree(&pos); } } } } if (like > bestyet) { bestyet = like; there = p; } re_move(item, nufork, &root, treenode); } /* tryadd */ void addpreorder(node *p, node *item_, node *nufork_) { /* traverses a binary tree, calling PROCEDURE tryadd at a node before calling tryadd at its descendants */ node *item= item_; node *nufork = nufork_; if (p == NULL) return; tryadd(p, &item,&nufork); if (!p->tip) { addpreorder(p->next->back, item, nufork); addpreorder(p->next->next->back, item, nufork); } } /* addpreorder */ void tryrearr(node *p, node **r, boolean *success) { /* evaluates one rearrangement of the tree. if the new tree has greater "likelihood" than the old one sets success := TRUE and keeps the new tree. 
otherwise, restores the old tree */ node *frombelow, *whereto, *forknode; double oldlike; if (p->back == NULL) return; forknode = treenode[p->back->index - 1]; if (forknode->back == NULL) return; oldlike = bestyet; if (p->back->next->next == forknode) frombelow = forknode->next->next->back; else frombelow = forknode->next->back; whereto = forknode->back; re_move(&p, &forknode, &root, treenode); add(whereto, p, forknode, &root, treenode); evaluate(*r); if (oldlike - like < LIKE_EPSILON) { re_move(&p, &forknode, &root, treenode); add(frombelow, p, forknode, &root, treenode); } else { (*success) = true; bestyet = like; } } /* tryrearr */ void repreorder(node *p, node **r, boolean *success) { /* traverses a binary tree, calling PROCEDURE tryrearr at a node before calling tryrearr at its descendants */ if (p == NULL) return; tryrearr(p, r,success); if (!p->tip) { repreorder(p->next->back, r,success); repreorder(p->next->next->back, r,success); } } /* repreorder */ void rearrange(node **r_) { /* traverses the tree (preorder), finding any local rearrangement which decreases the number of steps. 
if traversal succeeds in increasing the tree's "likelihood", PROCEDURE rearrange runs traversal again */ node **r = r_; boolean success = true; while (success) { success = false; repreorder(*r, r,&success); } } /* rearrange */ void describe() { /* prints ancestors, steps and table of numbers of steps in each character */ if (treeprint) fprintf(outfile, "\nrequires a total of %10.3f\n", -like); if (stepbox) { putc('\n', outfile); writesteps(weights, dollo, numsteps); } if (questions) guesstates(guess); if (ancseq) { hypstates(fullset, dollo, guess, treenode, root, garbage, zeroanc, oneanc); putc('\n', outfile); } putc('\n', outfile); if (trout) { col = 0; treeout(root, nextree, &col, root); } } /* describe */ void initdollopnode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ /* LM 7/27 I added this function and the commented lines around */ /* treeread() to get the program running, but all 4 move programs*/ /* are improperly integrated into the v4.0 support files. As is */ /* this is a patchwork function */ boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnutreenode(grbg, p, nodei, chars, zeros); treenode[nodei - 1] = *p; break; case nonbottom: gnutreenode(grbg, p, nodei, chars, zeros); break; case tip: match_names_to_data (str, treenode, p, spp); break; case length: /* if there is a length, read it and discard value */ processlength(&valyew, &divisor, ch, &minusread, intree, parens); break; default: /*cases hslength,hsnolength,treewt,unittrwt,iter,*/ break; } } /* initdollopnode */ void maketree() { /* constructs a binary tree from the pointers in treenode. 
adds each node at location which yields highest "likelihood" then rearranges the tree for greatest "likelihood" */ long i, j, numtrees, nextnode; double gotlike; node *item, *nufork, *dummy, *p; fullset = (1L << (bits + 1)) - (1L << 1); if (!usertree) { for (i = 1; i <= (spp); i++) enterorder[i - 1] = i; if (jumble) randumize(seed, enterorder); root = treenode[enterorder[0] - 1]; add(treenode[enterorder[0] - 1], treenode[enterorder[1] - 1], treenode[spp], &root, treenode); if (progress) { printf("Adding species:\n"); writename(0, 2, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } lastrearr = false; for (i = 3; i <= (spp); i++) { bestyet = -350.0 * spp * chars; item = treenode[enterorder[i - 1] - 1]; nufork = treenode[spp + i - 2]; addpreorder(root, item, nufork); add(there, item, nufork, &root, treenode); like = bestyet; rearrange(&root); if (progress) { writename(i - 1, 1, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } lastrearr = (i == spp); if (lastrearr) { if (progress) { printf("\nDoing global rearrangements\n"); printf(" !"); for (j = 1; j <= (nonodes); j++) if ( j % (( nonodes / 72 ) + 1 ) == 0 ) putchar('-'); printf("!\n"); #ifdef WIN32 phyFillScreenColor(); #endif } bestlike = bestyet; if (jumb == 1) { bstlike2 = bestlike; nextree = 1; } do { if (progress) printf(" "); gotlike = bestlike; for (j = 0; j < (nonodes); j++) { bestyet = - 350.0 * spp * chars; item = treenode[j]; if (item != root) { nufork = treenode[j]->back; re_move(&item, &nufork, &root, treenode); there = root; addpreorder(root, item, nufork); add(there, item, nufork, &root, treenode); } if (progress) { if ( j % (( nonodes / 72 ) + 1 ) == 0 ) putchar('.'); fflush(stdout); } } if (progress) { putchar('\n'); #ifdef WIN32 phyFillScreenColor(); #endif } } while (bestlike > gotlike); } } if (progress) putchar('\n'); for (i = spp - 1; i >= 1; i--) re_move(&treenode[i], &dummy, &root, treenode); if (jumb == njumble) { if (treeprint) { putc('\n', outfile); if (nextree == 2) 
fprintf(outfile, "One most parsimonious tree found:\n"); else fprintf(outfile, "%6ld trees in all found\n", nextree - 1); } if (nextree > maxtrees + 1) { if (treeprint) fprintf(outfile, "here are the first%4ld of them\n", (long)maxtrees); nextree = maxtrees + 1; } if (treeprint) putc('\n', outfile); for (i = 0; i <= (nextree - 2); i++) { root = treenode[0]; add(treenode[0], treenode[1], treenode[spp], &root, treenode); for (j = 3; j <= spp; j++) { add(treenode[bestrees[i].btree[j - 1] - 1], treenode[j - 1], treenode[spp + j - 2], &root, treenode);} evaluate(root); printree(1.0, treeprint, root); describe(); for (j = 1; j < (spp); j++) re_move(&treenode[j], &dummy, &root, treenode); } } } else { /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree,INTREE,"input tree file", "rb",progname,intreename); numtrees = countsemic(&intree); if (numtrees > 2){ initseed(&inseed, &inseed0, seed); printf("\n"); } if (treeprint) { fprintf(outfile, "User-defined tree"); if (numtrees > 1) putc('s', outfile); fprintf(outfile, ":\n"); } names = (boolean *)Malloc(spp*sizeof(boolean)); which = 1; firsttree = true; /**/ nodep = NULL; /**/ nextnode = 0; /**/ haslengths = 0; /**/ phirst = 0; /**/ zeros = (long *)Malloc(chars*sizeof(long)); /**/ for (i = 0; i < chars; i++) /**/ zeros[i] = 0; /**/ while (which <= numtrees) { treeread(intree, &root, treenode, &goteof, &firsttree, nodep, &nextnode, &haslengths, &grbg, initdollopnode,false,nonodes); for (i = spp; i < (nonodes); i++) { p = treenode[i]; for (j = 1; j <= 3; j++) { p->stateone = (bitptr)Malloc(words*sizeof(long)); p->statezero = (bitptr)Malloc(words*sizeof(long)); p = p->next; } } /* debug: see comment at initdollopnode() */ if (treeprint) fprintf(outfile, "\n\n"); evaluate(root); printree(1.0, treeprint, root); describe(); which++; } FClose(intree); fprintf(outfile, "\n\n"); if (numtrees > 1 && chars > 1) standev(numtrees, minwhich, minsteps, nsteps, fsteps, seed); free(names); } if (jumb == 
njumble) { if (progress) { printf("Output written to file \"%s\"\n\n", outfilename); if (trout) printf("Trees also written onto file \"%s\"\n\n", outtreename); } if (ancseq) freegarbage(&garbage); } } /* maketree */ int main(int argc, Char *argv[]) { /* Dollo or polymorphism parsimony by uphill search */ #ifdef MAC argc = 1; /* macsetup("Dollop",""); */ argv[0] = "Dollop"; #endif init(argc, argv); /* reads in spp, chars, and the data. Then calls maketree to construct the tree */ progname = argv[0]; openfile(&infile,INFILE,"input file", "r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename); ibmpc = IBMCRT; ansi = ANSICRT; garbage = NULL; mulsets = false; msets = 1; firstset = true; bits = 8*sizeof(long) - 1; doinit(); if (weights || justwts) openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename); if (trout) openfile(&outtree,OUTTREE,"output tree file", "w",argv[0],outtreename); if(ancvar) openfile(&ancfile,ANCFILE,"ancestors file", "r",argv[0],ancfilename); if (dollo) fprintf(outfile, "Dollo"); else fprintf(outfile, "Polymorphism"); fprintf(outfile, " parsimony method\n\n"); if (printdata && justwts) fprintf(outfile, "%2ld species, %3ld characters\n\n", spp, chars); for (ith = 1; ith <= (msets); ith++) { if (msets > 1 && !justwts) { fprintf(outfile, "Data set # %ld:\n\n",ith); if (progress) printf("\nData set # %ld:\n",ith); } if (justwts){ fprintf(outfile, "Weights set # %ld:\n\n", ith); if (progress) printf("\nWeights set # %ld:\n\n", ith); } if (printdata && !justwts) fprintf(outfile, "%2ld species, %3ld characters\n\n", spp, chars); doinput(); if (ith == 1) firstset = false; for (jumb = 1; jumb <= njumble; jumb++) maketree(); } /* this would be an appropriate place to deallocate memory, including these items: free(steps); */ FClose(infile); FClose(outfile); FClose(outtree); #ifdef MAC fixmacfile(outfilename); fixmacfile(outtreename); #endif #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } /* 
Dollo or polymorphism parsimony by uphill search */

/* phylip-3.697/src/dolmove.c */

#include "phylip.h"
#include "moves.h"
#include "disc.h"
#include "dollo.h"

/* version 3.696.
   Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe.

   Copyright (c) 1993-2014, Joseph Felsenstein
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions are met:

   1. Redistributions of source code must retain the above copyright notice,
      this list of conditions and the following disclaimer.

   2. Redistributions in binary form must reproduce the above copyright notice,
      this list of conditions and the following disclaimer in the documentation
      and/or other materials provided with the distribution.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
   AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
   SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE.
*/ #define overr 4 #define which 1 typedef enum { horiz, vert, up, overt, upcorner, downcorner, onne, zerro, question, polym } chartype; typedef enum { rearr, flipp, reroott, none } rearrtype; typedef enum { arb, use, spec } howtree; #ifndef OLDC /* function prototypes */ void getoptions(void); void inputoptions(void); void allocrest(void); void doinput(void); void configure(void); void prefix(chartype); void postfix(chartype); void makechar(chartype); void dolmove_correct(node *); void dolmove_count(node *); void preorder(node *); void evaluate(node *); void reroot(node *); void dolmove_hyptrav(node *); void dolmove_hypstates(void); void grwrite(chartype, long, long *); void dolmove_drawline(long); void dolmove_printree(void); void arbitree(void); void yourtree(void); void initdolmovenode(node **, node **, node *, long, long, long *, long *, initops, pointarray, pointarray, Char *, Char *, FILE *); void buildtree(void); void rearrange(void); void tryadd(node *, node **, node **, double *); void addpreorder(node *, node *, node *, double *); void try(void); void undo(void); void treewrite(boolean); void clade(void); void flip(void); void changeoutgroup(void); void redisplay(void); void treeconstruct(void); /* function prototypes */ #endif Char infilename[FNMLNGTH],intreename[FNMLNGTH],outtreename[FNMLNGTH], ancfilename[FNMLNGTH], factfilename[FNMLNGTH], weightfilename[FNMLNGTH]; node *root; long outgrno, col, screenlines, screenwidth, scrollinc,treelines, leftedge,topedge,vmargin,hscroll,vscroll,farthest; /* outgrno indicates outgroup */ boolean weights, thresh, ancvar, questions, dollo, factors, waswritten; boolean *ancone, *anczero, *ancone0, *anczero0; Char *factor; pointptr treenode; /* pointers to all nodes in tree */ double threshold; double *threshwt; unsigned char cha[10]; boolean reversed[10]; boolean graphic[10]; howtree how; char *progname; char ch; /* Variables for treeread */ boolean usertree, goteof, firsttree, haslengths; pointarray nodep; node 
*grbg; long *zeros; /* Local variables for treeconstruct, propagated globally for c version: */ long dispchar, dispword, dispbit, atwhat, what, fromwhere, towhere, oldoutgrno, compatible; double like, bestyet, gotlike; Char *guess; boolean display, newtree, changed, subtree, written, oldwritten, restoring, wasleft, oldleft, earlytree; boolean *in_tree; steptr numsteps; long fullset; bitptr zeroanc, oneanc; node *nuroot; rearrtype lastop; steptr numsone, numszero; boolean *names; void getoptions() { /* interactively set options */ long loopcount; Char ch; boolean done, gotopt; char input[100]; how = arb; usertree = false; goteof = false; thresh = false; threshold = spp; weights = false; ancvar = false; factors = false; dollo = true; loopcount = 0; do { cleerhome(); printf("\nInteractive Dollo or polymorphism parsimony,"); printf(" version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" P Parsimony method?"); printf(" %s\n",(dollo ? "Dollo" : "Polymorphism")); printf(" A Use ancestral states? %s\n", ancvar ? "Yes" : "No"); printf(" F Use factors information? %s\n", factors ? "Yes" : "No"); printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); printf(" T Use Threshold parsimony?"); if (thresh) printf(" Yes, count steps up to%4.1f\n", threshold); else printf(" No, use ordinary parsimony\n"); /* the A (ancestral states) option is listed once, above */ printf(" U Initial tree (arbitrary, user, specify)?"); printf(" %s\n",(how == arb) ? "Arbitrary" : (how == use) ? "User tree from tree file" : "Tree you specify"); printf(" 0 Graphics type (IBM PC, ANSI, none)? %s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" L Number of lines on screen?%4ld\n",screenlines); printf(" S Width of terminal screen?%4ld\n",screenwidth); printf( "\n\nAre these settings correct? 
(type Y or the letter for one to change)\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); ch = input[0]; uppercase(&ch); done = (ch == 'Y'); gotopt = (strchr("SFTPULA0W",ch) != NULL) ? true : false; if (gotopt) { switch (ch) { case 'A': ancvar = !ancvar; break; case 'F': factors = !factors; break; case 'W': weights = !weights; break; case 'P': dollo = !dollo; break; case 'T': thresh = !thresh; if (thresh) initthreshold(&threshold); break; case 'U': if (how == arb) how = use; else if (how == use) how = spec; else how = arb; break; case '0': initterminal(&ibmpc, &ansi); break; case 'L': initnumlines(&screenlines); break; case 'S': screenwidth = readlong("Width of terminal screen (in characters)?\n"); } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } while (!done); if (scrollinc < screenwidth / 2.0) hscroll = scrollinc; else hscroll = screenwidth / 2; if (scrollinc < screenlines / 2.0) vscroll = scrollinc; else vscroll = screenlines / 2; } /* getoptions */ void inputoptions() { /* input the information on the options */ long i; scan_eoln(infile); for (i = 0; i < (chars); i++) weight[i] = 1; if (ancvar) inputancestors(anczero0, ancone0); if (factors) inputfactors(chars, factor, &factors); if (weights) inputweights(chars, weight, &weights); putchar('\n'); if (weights) printweights(stdout, 0, chars, weight, "Characters"); if (factors) printfactors(stdout, chars, factor, ""); for (i = 0; i < (chars); i++) { if (!ancvar) { anczero[i] = true; ancone[i] = false; } else { anczero[i] = anczero0[i]; ancone[i] = ancone0[i]; } } if (ancvar) printancestors(stdout, anczero, ancone); if (!thresh) threshold = spp; questions = false; for (i = 0; i < (chars); i++) { questions = (questions || (ancone[i] && anczero[i])); threshwt[i] = threshold * weight[i]; } } /* inputoptions */ void allocrest() { nayme = (naym *)Malloc(spp*sizeof(naym)); in_tree = (boolean *)Malloc(nonodes*sizeof(boolean)); extras = (steptr)Malloc(chars*sizeof(long)); weight = 
(steptr)Malloc(chars*sizeof(long)); numszero = (steptr)Malloc(chars*sizeof(long)); numsone = (steptr)Malloc(chars*sizeof(long)); threshwt = (double *)Malloc(chars*sizeof(double)); factor = (Char *)Malloc(chars*sizeof(Char)); ancone = (boolean *)Malloc(chars*sizeof(boolean)); anczero = (boolean *)Malloc(chars*sizeof(boolean)); ancone0 = (boolean *)Malloc(chars*sizeof(boolean)); anczero0 = (boolean *)Malloc(chars*sizeof(boolean)); zeroanc = (bitptr)Malloc(words*sizeof(long)); oneanc = (bitptr)Malloc(words*sizeof(long)); } /* allocrest */ void doinput() { /* reads the input data */ inputnumbers(&spp, &chars, &nonodes, 1); words = chars / bits + 1; printf("%2ld species, %3ld characters\n", spp, chars); printf("\nReading input file ...\n\n"); getoptions(); if (weights) openfile(&weightfile,WEIGHTFILE,"weights file","r",progname,weightfilename); if(ancvar) openfile(&ancfile,ANCFILE,"ancestors file", "r",progname,ancfilename); if(factors) openfile(&factfile,FACTFILE,"factors file", "r",progname,factfilename); alloctree(&treenode); setuptree(treenode); allocrest(); inputoptions(); inputdata(treenode, dollo, false, stdout); } /* doinput */ void configure() { /* configure to machine -- set up special characters */ chartype a; for (a = horiz; (long)a <= (long)polym; a = (chartype)((long)a + 1)) reversed[(long)a] = false; for (a = horiz; (long)a <= (long)polym; a = (chartype)((long)a + 1)) graphic[(long)a] = false; if (ibmpc) { cha[(long)horiz] = 205; graphic[(long)horiz] = true; cha[(long)vert] = 186; graphic[(long)vert] = true; cha[(long)up] = 186; graphic[(long)up] = true; cha[(long)overt] = 205; graphic[(long)overt] = true; cha[(long)onne] = 219; reversed[(long)onne] = true; cha[(long)zerro] = 176; graphic[(long)zerro] = true; cha[(long)question] = 178; /* or try CHR(177) */ cha[(long)polym] = '\001'; reversed[(long)polym] = true; cha[(long)upcorner] = 200; graphic[(long)upcorner] = true; cha[(long)downcorner] = 201; graphic[(long)downcorner] = true; 
graphic[(long)question] = true; return; } if (ansi) { cha[(long)onne] = ' '; reversed[(long)onne] = true; cha[(long)horiz] = cha[(long)onne]; reversed[(long)horiz] = true; cha[(long)vert] = cha[(long)onne]; reversed[(long)vert] = true; cha[(long)up] = 'x'; graphic[(long)up] = true; cha[(long)overt] = 'q'; graphic[(long)overt] = true; cha[(long)zerro] = 'a'; graphic[(long)zerro] = true; reversed[(long)zerro] = true; cha[(long)question] = '?'; reversed[(long)question] = true; cha[(long)polym] = '%'; reversed[(long)polym] = true; cha[(long)upcorner] = 'm'; graphic[(long)upcorner] = true; cha[(long)downcorner] = 'l'; graphic[(long)downcorner] = true; return; } cha[(long)horiz] = '='; cha[(long)vert] = ' '; cha[(long)up] = '!'; cha[(long)overt] = '-'; cha[(long)onne] = '*'; cha[(long)zerro] = '='; cha[(long)question] = '.'; cha[(long)polym] = '%'; cha[(long)upcorner] = '`'; cha[(long)downcorner] = ','; } /* configure */ void prefix(chartype a) { /* give prefix appropriate for this character */ if (reversed[(long)a]) prereverse(ansi); if (graphic[(long)a]) pregraph(ansi); } /* prefix */ void postfix(chartype a) { /* give postfix appropriate for this character */ if (reversed[(long)a]) postreverse(ansi); if (graphic[(long)a]) postgraph(ansi); } /* postfix */ void makechar(chartype a) { /* print out a character with appropriate prefix or postfix */ prefix(a); putchar(cha[(long)a]); postfix(a); } /* makechar */ void dolmove_correct(node *p) { /* get final states for intermediate nodes */ long i; long z0, z1, s0, s1, temp; if (p->tip) return; for (i = 0; i < (words); i++) { if (p->back == NULL) { s0 = zeroanc[i]; s1 = oneanc[i]; } else { s0 = treenode[p->back->index - 1]->statezero[i]; s1 = treenode[p->back->index - 1]->stateone[i]; } z0 = (s0 & p->statezero[i]) | (p->next->back->statezero[i] & p->next->next->back->statezero[i]); z1 = (s1 & p->stateone[i]) | (p->next->back->stateone[i] & p->next->next->back->stateone[i]); if (dollo) { temp = z0 & 
(~(zeroanc[i] & z1)); z1 &= ~(oneanc[i] & z0); z0 = temp; } temp = fullset & (~z0) & (~z1); p->statezero[i] = z0 | (temp & s0 & (~s1)); p->stateone[i] = z1 | (temp & s1 & (~s0)); } } /* dolmove_correct */ void dolmove_count(node *p) { /* counts the number of steps in a fork of the tree. The program spends much of its time in this PROCEDURE */ long i, j, l; bitptr steps; steps = (bitptr)Malloc(words*sizeof(long)); if (dollo) { for (i = 0; i < (words); i++) steps[i] = (treenode[p->back->index - 1]->stateone[i] & p->statezero[i] & zeroanc[i]) | (treenode[p->back->index - 1]->statezero[i] & p->stateone[i] & oneanc[i]); } else { for (i = 0; i < (words); i++) steps[i] = treenode[p->back->index - 1]->stateone[i] & treenode[p->back->index - 1]->statezero[i] & p->stateone[i] & p->statezero[i]; } j = 1; l = 0; for (i = 0; i < (chars); i++) { l++; if (l > bits) { l = 1; j++; } if (((1L << l) & steps[j - 1]) != 0) { if (((1L << l) & zeroanc[j - 1]) != 0) numszero[i] += weight[i]; else numsone[i] += weight[i]; } } free(steps); } /* dolmove_count */ void preorder(node *p) { /* go back up tree setting up and counting interior node states */ if (!p->tip) { dolmove_correct(p); preorder(p->next->back); preorder(p->next->next->back); } if (p->back != NULL) dolmove_count(p); } /* preorder */ void evaluate(node *r) { /* Determines the number of losses or polymorphisms needed for a tree. 
This is the minimum number needed to evolve chars on this tree */ long i, stepnum, smaller; double sum; boolean nextcompat, thiscompat, done; sum = 0.0; for (i = 0; i < (chars); i++) { numszero[i] = 0; numsone[i] = 0; } for (i = 0; i < (words); i++) { zeroanc[i] = fullset; oneanc[i] = 0; } compatible = 0; nextcompat = true; postorder(r); preorder(r); for (i = 0; i < (words); i++) { zeroanc[i] = 0; oneanc[i] = fullset; } postorder(r); preorder(r); for (i = 0; i < (chars); i++) { smaller = spp * weight[i]; numsteps[i] = smaller; if (anczero[i]) { numsteps[i] = numszero[i]; smaller = numszero[i]; } if (ancone[i] && numsone[i] < smaller) numsteps[i] = numsone[i]; stepnum = numsteps[i] + extras[i]; if (stepnum <= threshwt[i]) sum += stepnum; else sum += threshwt[i]; thiscompat = (stepnum <= weight[i]); if (factors) { done = (i + 1 == chars); if (!done) done = (factor[i + 1] != factor[i]); nextcompat = (nextcompat && thiscompat); if (done) { if (nextcompat) compatible += weight[i]; nextcompat = true; } } else if (thiscompat) compatible += weight[i]; guess[i] = '?'; if (!ancone[i] || (anczero[i] && numszero[i] < numsone[i])) guess[i] = '0'; else if (!anczero[i] || (ancone[i] && numsone[i] < numszero[i])) guess[i] = '1'; } like = -sum; } /* evaluate */ void reroot(node *outgroup) { /* reorients tree, putting outgroup in desired position. 
*/ node *p, *q, *newbottom, *oldbottom; boolean onleft; if (outgroup->back->index == root->index) return; newbottom = outgroup->back; p = treenode[newbottom->index - 1]->back; while (p->index != root->index) { oldbottom = treenode[p->index - 1]; treenode[p->index - 1] = p; p = oldbottom->back; } onleft = (p == root->next); if (restoring) if (!onleft && wasleft){ p = root->next->next; q = root->next; } else { p = root->next; q = root->next->next; } else { if (onleft) oldoutgrno = root->next->next->back->index; else oldoutgrno = root->next->back->index; wasleft = onleft; p = root->next; q = root->next->next; } p->back->back = q->back; q->back->back = p->back; p->back = outgroup; q->back = outgroup->back; if (restoring) { if (!onleft && wasleft) { outgroup->back->back = root->next; outgroup->back = root->next->next; } else { outgroup->back->back = root->next->next; outgroup->back = root->next; } } else { outgroup->back->back = root->next->next; outgroup->back = root->next; } treenode[newbottom->index - 1] = newbottom; } /* reroot */ void dolmove_hyptrav(node *r) { /* compute states at interior nodes for one character */ if (!r->tip) dolmove_correct(r); if (((1L << dispbit) & r->stateone[dispword - 1]) != 0) { if (((1L << dispbit) & r->statezero[dispword - 1]) != 0) { if (dollo) r->state = '?'; else r->state = 'P'; } else r->state = '1'; } else { if (((1L << dispbit) & r->statezero[dispword - 1]) != 0) r->state = '0'; else r->state = '?'; } if (!r->tip) { dolmove_hyptrav(r->next->back); dolmove_hyptrav(r->next->next->back); } } /* dolmove_hyptrav */ void dolmove_hypstates() { /* fill in and describe states at interior nodes */ long i, j, k; for (i = 0; i < (words); i++) { zeroanc[i] = 0; oneanc[i] = 0; } for (i = 0; i < (chars); i++) { j = i / bits + 1; k = i % bits + 1; if (guess[i] == '0') zeroanc[j - 1] = ((long)zeroanc[j - 1]) | (1L << k); if (guess[i] == '1') oneanc[j - 1] = ((long)oneanc[j - 1]) | (1L << k); } filltrav(root); dolmove_hyptrav(root); } /* 
dolmove_hypstates */ void grwrite(chartype c, long num, long *pos) { int i; prefix(c); for (i = 1; i <= num; i++) { if ((*pos) >= leftedge && (*pos) - leftedge + 1 < screenwidth) putchar(cha[(long)c]); (*pos)++; } postfix(c); } /* grwrite */ void dolmove_drawline(long i) { /* draws one row of the tree diagram by moving up tree */ node *p, *q, *r, *first =NULL, *last =NULL; long n, j, pos; boolean extra, done; Char s, cc; chartype c, d; pos = 1; p = nuroot; q = nuroot; extra = false; if (i == (long)p->ycoord && (p == root || subtree)) { c = overt; if (display) { if (p == root) cc = guess[dispchar - 1]; else cc = p->state; switch (cc) { case '1': c = onne; break; case '0': c = zerro; break; case '?': c = question; break; case 'P': c = polym; break; } } if ((subtree)) stwrite("Subtree:", 8, &pos, leftedge, screenwidth); if (p->index >= 100) nnwrite(p->index, 3, &pos, leftedge, screenwidth); else if (p->index >= 10) { grwrite(c, 1, &pos); nnwrite(p->index, 2, &pos, leftedge, screenwidth); } else { grwrite(c, 2, &pos); nnwrite(p->index, 1, &pos, leftedge, screenwidth); } extra = true; } else { if (subtree) stwrite(" ", 10, &pos, leftedge, screenwidth); else stwrite(" ", 2, &pos, leftedge, screenwidth); } do { if (!p->tip) { r = p->next; done = false; do { if (i >= r->back->ymin && i <= r->back->ymax) { q = r->back; done = true; } r = r->next; } while (!(done || r == p)); first = p->next->back; r = p->next; while (r->next != p) r = r->next; last = r->back; } done = (p == q); n = (long)p->xcoord - (long)q->xcoord; if (n < 3 && !q->tip) n = 3; if (extra) { n--; extra = false; } if ((long)q->ycoord == i && !done) { if ((long)q->ycoord > (long)p->ycoord) d = upcorner; else d = downcorner; c = overt; s = q->state; if (s == 'P' && p->state != 'P') s = p->state; if (display) { switch (s) { case '1': c = onne; break; case '0': c = zerro; break; case '?': c = question; break; case 'P': c = polym; break; } d = c; } if (n > 1) { grwrite(d, 1, &pos); grwrite(c, n - 3, &pos); } if 
(q->index >= 100) nnwrite(q->index, 3, &pos, leftedge, screenwidth); else if (q->index >= 10) { grwrite(c, 1, &pos); nnwrite(q->index, 2, &pos, leftedge, screenwidth); } else { grwrite(c, 2, &pos); nnwrite(q->index, 1, &pos, leftedge, screenwidth); } extra = true; } else if (!q->tip) { if ((long)last->ycoord > i && (long)first->ycoord < i && i != (long)p->ycoord) { c = up; if (i < (long)p->ycoord) s = p->next->back->state; else s = p->next->next->back->state; if (s == 'P' && p->state != 'P') s = p->state; if (display) { switch (s) { case '1': c = onne; break; case '0': c = zerro; break; case '?': c = question; break; case 'P': c = polym; break; } } grwrite(c, 1, &pos); chwrite(' ', n - 1, &pos, leftedge, screenwidth); } else chwrite(' ', n, &pos, leftedge, screenwidth); } else chwrite(' ', n, &pos, leftedge, screenwidth); if (p != q) p = q; } while (!done); if ((long)p->ycoord == i && p->tip) { n = 0; for (j = 1; j <= nmlngth; j++) { if (nayme[p->index - 1][j - 1] != '\0') n = j; } chwrite(':', 1, &pos, leftedge, screenwidth); for (j = 0; j < n; j++) chwrite(nayme[p->index - 1][j], 1, &pos, leftedge, screenwidth); } putchar('\n'); } /* dolmove_drawline */ void dolmove_printree() { /* prints out diagram of the tree */ long tipy; long i, dow; if (!subtree) nuroot = root; if (changed || newtree) evaluate(root); if (display) dolmove_hypstates(); #ifdef WIN32 if(ibmpc || ansi){ phyClearScreen(); } else { printf("\n"); } #else if (ansi || ibmpc) printf("\033[2J\033[H"); else putchar('\n'); #endif tipy = 1; dow = down; if (spp * dow > screenlines && !subtree) dow--; printf("(unrooted)"); if (display) { printf(" "); makechar(onne); printf(":1 "); makechar(question); printf(":? "); makechar(zerro); printf(":0 "); makechar(polym); printf(":0/1"); } else printf(" "); if (!earlytree) { printf("%10.1f Steps", -like); } if (display) printf(" SITE%4ld", dispchar); else printf(" "); if (!earlytree) { printf(" %3ld chars compatible\n", compatible); } printf("%-20s",dollo ? 
"Dollo" : "Polymorphism"); if (changed && !earlytree) { if (-like < bestyet) { printf(" BEST YET!"); bestyet = -like; } else if (fabs(-like - bestyet) < 0.000001) printf(" (as good as best)"); else { if (-like < gotlike) printf(" better"); else if (-like > gotlike) printf(" worse!"); } } printf("\n"); farthest = 0; coordinates(nuroot, &tipy, 1.5, &farthest); vmargin = 5; treelines = tipy - dow; if (topedge != 1){ printf("** %ld lines above screen **\n", topedge - 1); vmargin++;} if ((treelines - topedge + 1) > (screenlines - vmargin)) vmargin++; for (i = 1; i <= treelines; i++) { if (i >= topedge && i < topedge + screenlines - vmargin) dolmove_drawline(i); } if ((treelines - topedge + 1) > (screenlines - vmargin)) printf("** %ld lines below screen **\n", treelines - (topedge - 1 + screenlines - vmargin)); if (treelines - topedge + vmargin + 1 < screenlines) putchar('\n'); gotlike = -like; changed = false; } /* dolmove_printree */ void arbitree() { long i; root = treenode[0]; add2(treenode[0], treenode[1], treenode[spp], &root, restoring, wasleft, treenode); for (i = 3; i <= (spp); i++) add2(treenode[spp + i - 3], treenode[i - 1], treenode[spp + i - 2], &root, restoring, wasleft, treenode); for (i = 0; i < (nonodes); i++) in_tree[i] = true; } /* arbitree */ void yourtree() { long i, j; boolean ok; root = treenode[0]; add2(treenode[0], treenode[1], treenode[spp], &root, restoring, wasleft, treenode); i = 2; do { i++; dolmove_printree(); printf("Add species%3ld: ", i); for (j = 0; j < nmlngth; j++) putchar(nayme[i - 1][j]); do { printf("\nbefore node (type number): "); inpnum(&j, &ok); ok = (ok && ((j >= 1 && j < i) || (j > spp && j < spp + i - 1))); if (!ok) printf("Impossible number. 
Please try again:\n"); } while (!ok); add2(treenode[j - 1], treenode[i - 1], treenode[spp + i - 2], &root, restoring, wasleft, treenode); } while (i != spp); for (i = 0; i < (nonodes); i++) in_tree[i] = true; } /* yourtree */ void initdolmovenode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ /* LM 7/27 I added this function and the commented lines around */ /* treeread() to get the program running, but all 4 move programs*/ /* are improperly integrated into the v4.0 support files. As is */ /* this is a patchwork function */ boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnutreenode(grbg, p, nodei, chars, zeros); treenode[nodei - 1] = *p; break; case nonbottom: gnutreenode(grbg, p, nodei, chars, zeros); break; case tip: match_names_to_data (str, treenode, p, spp); break; case length: processlength(&valyew, &divisor, ch, &minusread, intree, parens); /* process lengths and discard */ default: /*cases hslength,hsnolength,treewt,unittrwt,iter,*/ break; /*length should never occur */ } } /* initdolmovenode */ void buildtree() { long i, j, nextnode; node *p; changed = false; newtree = false; switch (how) { case arb: arbitree(); break; case use: /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree,INTREE,"input tree file", "rb",progname,intreename); names = (boolean *)Malloc(spp*sizeof(boolean)); firsttree = true; /**/ nodep = NULL; /**/ nextnode = 0; /**/ haslengths = 0; /**/ zeros = (long *)Malloc(chars*sizeof(long)); /**/ for (i = 0; i < chars; i++) /**/ zeros[i] = 0; /**/ treeread(intree, &root, treenode, &goteof, &firsttree, nodep, &nextnode, &haslengths, &grbg, initdolmovenode,false,nonodes); for (i = spp; i < (nonodes); i++) { p = treenode[i]; for (j = 1; j <= 3; j++) { p->stateone = (bitptr)Malloc(words*sizeof(long)); p->statezero = 
(bitptr)Malloc(words*sizeof(long)); p = p->next; } } /* debug: see comment at initdolmovenode() */ /*treeread(which, ch, &root, treenode, names);*/ for (i = 0; i < (spp); i++) in_tree[i] = names[i]; free(names); FClose(intree); break; case spec: yourtree(); break; } outgrno = root->next->back->index; if (in_tree[outgrno - 1]) reroot(treenode[outgrno - 1]); } /* buildtree */ void rearrange() { long i, j; boolean ok1, ok2; node *p, *q; printf("Remove everything to the right of which node? "); inpnum(&i, &ok1); ok1 = (ok1 && i >= 1 && i < spp * 2 && i != root->index); if (ok1) { printf("Add before which node? "); inpnum(&j, &ok2); ok2 = (ok2 && j >= 1 && j < spp * 2); if (ok2) { ok2 = (treenode[j - 1] != treenode[treenode[i - 1]->back->index - 1]); p = treenode[j - 1]; while (p != root) { ok2 = (ok2 && p != treenode[i - 1]); p = treenode[p->back->index - 1]; } if (ok1 && ok2) { what = i; q = treenode[treenode[i - 1]->back->index - 1]; if (q->next->back->index == i) fromwhere = q->next->next->back->index; else fromwhere = q->next->back->index; towhere = j; re_move2(&treenode[i - 1], &q, &root, &wasleft, treenode); add2(treenode[j - 1], treenode[i - 1], q, &root, restoring, wasleft, treenode); } lastop = rearr; } } changed = (ok1 && ok2); dolmove_printree(); if (!(ok1 && ok2)) printf("Not a possible rearrangement. Try again: \n"); else { oldwritten = written; written = false; } } /* rearrange */ void tryadd(node *p, node **item, node **nufork, double *place) { /* temporarily adds one fork and one tip to the tree. 
Records scores in ARRAY place */ add2(p, *item, *nufork, &root, restoring, wasleft, treenode); evaluate(root); place[p->index - 1] = -like; re_move2(item, nufork, &root, &wasleft, treenode); } /* tryadd */ void addpreorder(node *p, node *item_, node *nufork_, double *place) { /* traverses a binary tree, calling PROCEDURE tryadd at a node before calling tryadd at its descendants */ node *item, *nufork; item = item_; nufork = nufork_; if (p == NULL) return; tryadd(p, &item,&nufork,place); if (!p->tip) { addpreorder(p->next->back, item,nufork,place); addpreorder(p->next->next->back,item,nufork,place); } } /* addpreorder */ void try() { /* Remove node, try it in all possible places */ double *place; long i, j, oldcompat; double current; node *q, *dummy, *rute; boolean tied, better, ok; printf("Try other positions for which node? "); inpnum(&i, &ok); if (!(ok && i >= 1 && i <= nonodes && i != root->index)) { printf("Not a possible choice! "); return; } printf("WAIT ...\n"); place = (double *)Malloc(nonodes*sizeof(double)); for (j = 0; j < (nonodes); j++) place[j] = -1.0; evaluate(root); current = -like; oldcompat = compatible; what = i; q = treenode[treenode[i - 1]->back->index - 1]; if (q->next->back->index == i) fromwhere = q->next->next->back->index; else fromwhere = q->next->back->index; rute = root; if (root->index == treenode[i - 1]->back->index) { if (treenode[treenode[i - 1]->back->index - 1]->next->back == treenode[i - 1]) rute = treenode[treenode[i - 1]->back->index - 1]->next->next->back; else rute = treenode[treenode[i - 1]->back->index - 1]->next->back; } re_move2(&treenode[i - 1], &dummy, &root, &wasleft, treenode); oldleft = wasleft; root = rute; addpreorder(root, treenode[i - 1], dummy, place); wasleft = oldleft; restoring = true; add2(treenode[fromwhere - 1], treenode[what - 1], dummy, &root, restoring, wasleft, treenode); like = -current; compatible = oldcompat; restoring = false; better = false; printf(" BETTER: "); for (j = 1; j <= (nonodes); j++) { 
if (place[j - 1] < current && place[j - 1] >= 0.0) { printf("%3ld:%6.2f", j, place[j - 1]); better = true; } } if (!better) printf(" NONE"); printf("\n TIED: "); tied = false; for (j = 1; j <= (nonodes); j++) { if (fabs(place[j - 1] - current) < 1.0e-6 && j != fromwhere) { if (j < 10) printf("%2ld", j); else printf("%3ld", j); tied = true; } } if (tied) printf(":%6.2f\n", current); else printf("NONE\n"); changed = true; free(place); } /* try */ void undo() { /* restore to tree before last rearrangement */ long temp; boolean btemp; node *q; switch (lastop) { case rearr: restoring = true; oldleft = wasleft; re_move2(&treenode[what - 1], &q, &root, &wasleft, treenode); btemp = wasleft; wasleft = oldleft; add2(treenode[fromwhere - 1], treenode[what - 1], q, &root, restoring, wasleft, treenode); wasleft = btemp; restoring = false; temp = fromwhere; fromwhere = towhere; towhere = temp; changed = true; break; case flipp: q = treenode[atwhat - 1]->next->back; treenode[atwhat - 1]->next->back = treenode[atwhat - 1]->next->next->back; treenode[atwhat - 1]->next->next->back = q; treenode[atwhat - 1]->next->back->back = treenode[atwhat - 1]->next; treenode[atwhat - 1]->next->next->back->back = treenode[atwhat - 1]->next->next; break; case reroott: restoring = true; temp = oldoutgrno; oldoutgrno = outgrno; outgrno = temp; reroot(treenode[outgrno - 1]); restoring = false; break; case none: /* blank case */ break; } dolmove_printree(); if (lastop == none) { printf("No operation to undo! 
"); return; } btemp = oldwritten; oldwritten = written; written = btemp; } /* undo */ void treewrite(boolean done) { /* write out tree to a file */ Char ch; treeoptions(waswritten, &ch, &outtree, outtreename, progname); if (!done) dolmove_printree(); if (waswritten && ch == 'N') return; col = 0; treeout(root, 1, &col, root); printf("\nTree written to file \"%s\"\n\n", outtreename); waswritten = true; written = true; FClose(outtree); #ifdef MAC fixmacfile(outtreename); #endif } /* treewrite */ void clade() { /* pick a subtree and show only that on screen */ long i; boolean ok; printf("Select subtree rooted at which node (0 for whole tree)? "); inpnum(&i, &ok); ok = (ok && ((unsigned)i) <= ((unsigned)nonodes)); if (ok) { subtree = (i > 0); if (subtree) nuroot = treenode[i - 1]; else nuroot = root; } dolmove_printree(); if (!ok) printf("Not possible to use this node. "); } /* clade */ void flip() { /* flip at a node left-right */ long i; boolean ok; node *p; printf("Flip branches at which node? "); inpnum(&i, &ok); ok = (ok && i > spp && i <= nonodes); if (ok) { p = treenode[i - 1]->next->back; treenode[i - 1]->next->back = treenode[i - 1]->next->next->back; treenode[i - 1]->next->next->back = p; treenode[i - 1]->next->back->back = treenode[i - 1]->next; treenode[i - 1]->next->next->back->back = treenode[i - 1]->next->next; atwhat = i; lastop = flipp; } dolmove_printree(); if (ok) { oldwritten = written; written = false; return; } if (i >= 1 && i <= spp) printf("Can't flip there. "); else printf("No such node. "); } /* flip */ void changeoutgroup() { long i; boolean ok; oldoutgrno = outgrno; do { printf("Which node should be the new outgroup? 
"); inpnum(&i, &ok); ok = (ok && in_tree[i - 1] && i >= 1 && i <= nonodes && i != root->index); if (ok) outgrno = i; } while (!ok); if (in_tree[outgrno - 1]) reroot(treenode[outgrno - 1]); changed = true; lastop = reroott; dolmove_printree(); oldwritten = written; written = false; } /* changeoutgroup */ void redisplay() { boolean done; char input[100]; done = false; waswritten = false; do { printf("NEXT? (Options: R # + - S . T U W O F H J K L C ? X Q) "); printf("(? for Help) "); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); ch = input[0]; uppercase(&ch); if (strchr("RSWH#.O?+TFX-UCQHJKL",ch) != NULL){ switch (ch) { case 'R': rearrange(); break; case '#': nextinc(&dispchar, &dispword, &dispbit, chars, bits, &display, numsteps, weight); dolmove_printree(); break; case '+': nextchar(&dispchar, &dispword, &dispbit, chars, bits, &display); dolmove_printree(); break; case '-': prevchar(&dispchar, &dispword, &dispbit, chars, bits, &display); dolmove_printree(); break; case 'S': show(&dispchar, &dispword, &dispbit, chars, bits, &display); dolmove_printree(); break; case '.': dolmove_printree(); break; case 'T': try(); break; case 'U': undo(); break; case 'W': treewrite(done); break; case 'O': changeoutgroup(); break; case 'F': flip(); break; case 'H': window(left, &leftedge, &topedge, hscroll, vscroll, treelines, screenlines, screenwidth, farthest, subtree); dolmove_printree(); break; case 'J': window(downn, &leftedge, &topedge, hscroll, vscroll, treelines, screenlines, screenwidth, farthest, subtree); dolmove_printree(); break; case 'K': window(upp, &leftedge, &topedge, hscroll, vscroll, treelines, screenlines, screenwidth, farthest, subtree); dolmove_printree(); break; case 'L': window(right, &leftedge, &topedge, hscroll, vscroll, treelines, screenlines, screenwidth, farthest, subtree); dolmove_printree(); break; case 'C': clade(); break; case '?': help("character"); dolmove_printree(); break; case 'X': done = true; break; case 'Q': done = true; break; } 
    }
  } while (!done);
  if (!written) {
    do {
      printf("Do you want to write out the tree to a file? (Y or N) ");
#ifdef WIN32
      phyFillScreenColor();
#endif
      getstryng(input);
      ch = input[0];
    } while (ch != 'Y' && ch != 'y' && ch != 'N' && ch != 'n');
  }
  if (ch == 'Y' || ch == 'y')
    treewrite(done);
}  /* redisplay */


void treeconstruct()
{
  /* constructs a binary tree from the pointers in treenode. */

  restoring = false;
  subtree = false;
  display = false;
  dispchar = 0;
  fullset = (1L << (bits + 1)) - (1L << 1);
  guess = (Char *)Malloc(chars*sizeof(Char));
  numsteps = (steptr)Malloc(chars*sizeof(long));
  earlytree = true;
  buildtree();
  waswritten = false;
  printf("\nComputing steps needed for compatibility in sites ...\n\n");
  newtree = true;
  earlytree = false;
  dolmove_printree();
  bestyet = -like;
  gotlike = -like;
  lastop = none;
  newtree = false;
  written = false;
  lastop = none;
  redisplay();
}  /* treeconstruct */


int main(int argc, Char *argv[])
{ /* Interactive Dollo/polymorphism parsimony */
  /* reads in spp, chars, and the data. Then calls treeconstruct to
     construct the tree and query the user */
#ifdef MAC
  argc = 1; /* macsetup("Dolmove",""); */
  argv[0] = "Dolmove";
#endif
  init(argc, argv);
  progname = argv[0];
  strcpy(infilename,INFILE);
  strcpy(outtreename,OUTTREE);
  strcpy(intreename,INTREE);

  openfile(&infile,infilename,"input file", "r",argv[0],infilename);

  screenlines = 24;
  scrollinc = 20;
  screenwidth = 80;
  topedge = 1;
  leftedge = 1;
  ibmpc = IBMCRT;
  ansi = ANSICRT;
  root = NULL;
  bits = 8*sizeof(long) - 1;
  doinput();
  configure();
  treeconstruct();
  if (waswritten) {
    FClose(outtree);
#ifdef MAC
    fixmacfile(outtreename);
#endif
  }
  FClose(infile);
#ifdef WIN32
  phyRestoreConsoleAttributes();
#endif
  return 0;
}  /* Interactive Dollo/polymorphism parsimony */

/* phylip-3.697/src/dolpenny.c */

#include "phylip.h"
#include "disc.h"
#include "dollo.h"

/* version 3.696.
Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/

#define maxtrees        1000  /* maximum number of trees to be printed out   */
#define often           100   /* how often to notify how many trees examined */
#define many            1000  /* how many multiples of howoften before stop  */

typedef long *treenumbers;
typedef double *valptr;
typedef long *placeptr;

#ifndef OLDC
/* function prototypes */
void   getoptions(void);
void   allocrest(void);
void   doinit(void);
void   inputoptions(void);
void   doinput(void);
void   preorder(node *);
void   evaluate(node *);
void   addtraverse(node *, node *, node *, placeptr, valptr, long *);
void   addit(long);
void   describe(void);
void   maketree(void);
/* function prototypes */
#endif

Char infilename[FNMLNGTH], outfilename[FNMLNGTH], outtreename[FNMLNGTH],
     weightfilename[FNMLNGTH], ancfilename[FNMLNGTH];
node *root;
long howmanny, howoften, col, msets, ith;
boolean weights, thresh, ancvar, questions, dollo, simple, trout,
        progress, treeprint, stepbox, ancseq, mulsets, firstset, justwts;
boolean *ancone, *anczero, *ancone0, *anczero0;
pointptr treenode;   /* pointers to all nodes in tree */
double fracdone, fracinc;
double threshold;
double *threshwt;
boolean *added;
Char *guess;
steptr numsteps, numsone, numszero;
gbit *garbage;
long **bestorders, **bestrees;

/* Local variables for maketree, propagated globally for C version: */
long examined, mults;
boolean firsttime, done;
double like, bestyet;
treenumbers current, order;
long fullset;
bitptr zeroanc, oneanc;
bitptr stps;


void getoptions()
{
  /* interactively set options */
  long loopcount, loopcount2;
  Char ch, ch2;
  boolean done;

  fprintf(outfile, "\nPenny algorithm for Dollo or polymorphism");
  fprintf(outfile, " parsimony, version %s\n",VERSION);
  fprintf(outfile, " branch-and-bound to find all");
  fprintf(outfile, " most parsimonious trees\n\n");
  howoften = often;
  howmanny = many;
  simple = true;
  thresh = false;
  threshold = spp;
  trout = true;
  weights = false;
  justwts = false;
  ancvar = false;
  dollo = true;
  printdata = false;
  progress = true;
  treeprint = true;
  stepbox =
    false;
  ancseq = false;
  loopcount = 0;
  do {
    cleerhome();
    printf("\nPenny algorithm for Dollo or polymorphism parsimony,");
    printf(" version %s\n",VERSION);
    printf(" branch-and-bound to find all most parsimonious trees\n\n");
    printf("Settings for this run:\n");
    printf(" P Parsimony method? %s\n",
           (dollo ? "Dollo" : "Polymorphism"));
    printf(" H How many groups of %4ld trees:%6ld\n",howoften,howmanny);
    printf(" F How often to report, in trees:%5ld\n",howoften);
    printf(" S Branch and bound is simple? %s\n",
           (simple ? "Yes" : "No. Reconsiders order of species"));
    printf(" T Use Threshold parsimony?");
    if (thresh)
      printf(" Yes, count steps up to%4.1f per char.\n", threshold);
    else
      printf(" No, use ordinary parsimony\n");
    printf(" A Use ancestral states? %s\n",
           (ancvar ? "Yes" : "No"));
    printf(" W Sites weighted? %s\n",
           (weights ? "Yes" : "No"));
    printf(" M Analyze multiple data sets?");
    if (mulsets)
      printf(" Yes, %2ld %s\n", msets,
             (justwts ? "sets of weights" : "data sets"));
    else
      printf(" No\n");
    printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n",
           (ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"));
    printf(" 1 Print out the data at start of run %s\n",
           (printdata ? "Yes" : "No"));
    printf(" 2 Print indications of progress of run %s\n",
           (progress ? "Yes" : "No"));
    printf(" 3 Print out tree %s\n",
           (treeprint ? "Yes": "No"));
    printf(" 4 Print out steps in each character %s\n",
           (stepbox ? "Yes" : "No"));
    printf(" 5 Print states at all nodes of tree %s\n",
           (ancseq ? "Yes" : "No"));
    printf(" 6 Write out trees onto tree file? %s\n",
           (trout ? "Yes" : "No"));
    if(weights && justwts){
      printf(
        "WARNING: W option and Multiple Weights options are both on. ");
      printf(
        "The W menu option is unnecessary and has no additional effect. \n");
    }
    printf("\nAre these settings correct? 
(type Y or the letter for one to change)\n");
#ifdef WIN32
    phyFillScreenColor();
#endif
    fflush(stdout);
    scanf("%c%*[^\n]", &ch);
    getchar();
    uppercase(&ch);
    done = (ch == 'Y');
    if (!done) {
      if (strchr("WHMSTAPF1234560",ch) != NULL){
        switch (ch) {

        case 'W':
          weights = !weights;
          break;

        case 'H':
          inithowmany(&howmanny, howoften);
          break;

        case 'F':
          inithowoften(&howoften);
          break;

        case 'A':
          ancvar = !ancvar;
          break;

        case 'P':
          dollo = !dollo;
          break;

        case 'S':
          simple = !simple;
          break;

        case 'T':
          thresh = !thresh;
          if (thresh)
            initthreshold(&threshold);
          break;

        case 'M':
          mulsets = !mulsets;
          if (mulsets){
            printf("Multiple data sets or multiple weights?");
            loopcount2 = 0;
            do {
              printf(" (type D or W)\n");
#ifdef WIN32
              phyFillScreenColor();
#endif
              fflush(stdout);
              scanf("%c%*[^\n]", &ch2);
              getchar();
              if (ch2 == '\n')
                ch2 = ' ';
              uppercase(&ch2);
              countup(&loopcount2, 10);
            } while ((ch2 != 'W') && (ch2 != 'D'));
            justwts = (ch2 == 'W');
            if (justwts)
              justweights(&msets);
            else
              initdatasets(&msets);
          }
          break;

        case '0':
          initterminal(&ibmpc, &ansi);
          break;

        case '1':
          printdata = !printdata;
          break;

        case '2':
          progress = !progress;
          break;

        case '3':
          treeprint = !treeprint;
          break;

        case '4':
          stepbox = !stepbox;
          break;

        case '5':
          ancseq = !ancseq;
          break;

        case '6':
          trout = !trout;
          break;
        }
      } else
        printf("Not a possible option!\n");
    }
    countup(&loopcount, 100);
  } while (!done);
}  /* getoptions */


void allocrest()
{
  long i;

  extras = (long *)Malloc(chars*sizeof(long));
  weight = (long *)Malloc(chars*sizeof(long));
  threshwt = (double *)Malloc(chars*sizeof(double));
  guess = (Char *)Malloc(chars*sizeof(Char));
  numsteps = (long *)Malloc(chars*sizeof(long));
  numszero = (long *)Malloc(chars*sizeof(long));
  numsone = (long *)Malloc(chars*sizeof(long));
  bestorders = (long **)Malloc(maxtrees*sizeof(long *));
  bestrees = (long **)Malloc(maxtrees*sizeof(long *));
  for (i = 1; i <= maxtrees; i++) {
    bestorders[i - 1] = (long *)Malloc(spp*sizeof(long));
    bestrees[i - 1] = (long *)Malloc(spp*sizeof(long));
  }
  current =
(treenumbers)Malloc(spp*sizeof(long)); order = (treenumbers)Malloc(spp*sizeof(long)); nayme = (naym *)Malloc(spp*sizeof(naym)); added = (boolean *)Malloc(nonodes*sizeof(boolean)); ancone = (boolean *)Malloc(chars*sizeof(boolean)); anczero = (boolean *)Malloc(chars*sizeof(boolean)); ancone0 = (boolean *)Malloc(chars*sizeof(boolean)); anczero0 = (boolean *)Malloc(chars*sizeof(boolean)); zeroanc = (bitptr)Malloc(words*sizeof(long)); oneanc = (bitptr)Malloc(words*sizeof(long)); } /* allocrest */ void doinit() { /* initializes variables */ inputnumbers(&spp, &chars, &nonodes, 1); words = chars / bits + 1; getoptions(); if (printdata) fprintf(outfile, "%2ld species, %3ld characters\n", spp, chars); alloctree(&treenode); setuptree(treenode); allocrest(); } /* doinit */ void inputoptions() { /* input the information on the options */ long i; if(justwts){ if(firstset){ scan_eoln(infile); if (ancvar) { inputancestors(anczero0, ancone0); } } for (i = 0; i < (chars); i++) weight[i] = 1; inputweights(chars, weight, &weights); } else { if (!firstset) samenumsp(&chars, ith); scan_eoln(infile); for (i = 0; i < (chars); i++) weight[i] = 1; if (ancvar) inputancestors(anczero0, ancone0); if (weights) inputweights(chars, weight, &weights); } for (i = 0; i < (chars); i++) { if (!ancvar) { anczero[i] = true; ancone[i] = false; } else { anczero[i] = anczero0[i]; ancone[i] = ancone0[i]; } } questions = false; if (!thresh) threshold = spp; for (i = 0; i < (chars); i++) { questions = (questions || (ancone[i] && anczero[i])); threshwt[i] = threshold * weight[i]; } } /* inputoptions */ void doinput() { /* reads the input data */ inputoptions(); if(!justwts || firstset) inputdata(treenode, dollo, printdata, outfile); } /* doinput */ void preorder(node *p) { /* go back up tree setting up and counting interior node states */ long i; if (!p->tip) { correct(p, fullset, dollo, zeroanc, treenode); preorder(p->next->back); preorder(p->next->next->back); } if (p->back == NULL) return; if (dollo) { for 
(i = 0; i < (words); i++) stps[i] = (treenode[p->back->index - 1]->stateone[i] & p->statezero[i] & zeroanc[i]) | (treenode[p->back->index - 1]->statezero[i] & p->stateone[i] & fullset & (~zeroanc[i])); } else { for (i = 0; i < (words); i++) stps[i] = treenode[p->back->index - 1]->stateone[i] & treenode[p->back->index - 1]->statezero[i] & p->stateone[i] & p->statezero[i]; } count(stps, zeroanc, numszero, numsone); } /* preorder */ void evaluate(node *r) { /* Determines the number of losses or polymorphisms needed for a tree. This is the minimum number needed to evolve chars on this tree */ long i, stepnum, smaller; double sum; sum = 0.0; for (i = 0; i < (chars); i++) { numszero[i] = 0; numsone[i] = 0; } for (i = 0; i < (words); i++) zeroanc[i] = fullset; postorder(r); preorder(r); for (i = 0; i < (words); i++) zeroanc[i] = 0; postorder(r); preorder(r); for (i = 0; i < (chars); i++) { smaller = spp * weight[i]; numsteps[i] = smaller; if (anczero[i]) { numsteps[i] = numszero[i]; smaller = numszero[i]; } if (ancone[i] && numsone[i] < smaller) numsteps[i] = numsone[i]; stepnum = numsteps[i] + extras[i]; if (stepnum <= threshwt[i]) sum += stepnum; else sum += threshwt[i]; guess[i] = '?'; if (!ancone[i] || (anczero[i] && numszero[i] < numsone[i])) guess[i] = '0'; else if (!anczero[i] || (ancone[i] && numsone[i] < numszero[i])) guess[i] = '1'; } if (examined == 0 && mults == 0) bestyet = -1.0; like = sum; } /* evaluate */ void addtraverse(node *a, node *b, node *c, placeptr place, valptr valyew, long *n) { /* traverse all places to add b */ if (done) return; add(a, b, c, &root, treenode); (*n)++; evaluate(root); examined++; if (examined == howoften) { examined = 0; mults++; if (mults == howmanny) done = true; if (progress) { printf("%6ld",mults); if (bestyet >= 0) printf("%18.5f", bestyet); else printf(" - "); printf("%17ld%20.2f\n", nextree - 1, fracdone * 100); #ifdef WIN32 phyFillScreenColor(); #endif } } valyew[*n - 1] = like; place[*n - 1] = a->index; re_move(&b, &c, 
&root, treenode); if (!a->tip) { addtraverse(a->next->back, b, c, place,valyew,n); addtraverse(a->next->next->back, b, c, place,valyew,n); } } /* addtraverse */ void addit(long m) { /* adds the species one by one, recursively */ long n; valptr valyew; placeptr place; long i, j, n1, besttoadd = 0; valptr bestval; placeptr bestplace; double oldfrac, oldfdone, sum, bestsum; valyew = (valptr)Malloc(nonodes*sizeof(double)); bestval = (valptr)Malloc(nonodes*sizeof(double)); place = (placeptr)Malloc(nonodes*sizeof(long)); bestplace = (placeptr)Malloc(nonodes*sizeof(long)); if (simple && !firsttime) { n = 0; added[order[m - 1] - 1] = true; addtraverse(root, treenode[order[m - 1] - 1], treenode[spp + m - 2], place, valyew, &n); besttoadd = order[m - 1]; memcpy(bestplace, place, nonodes*sizeof(long)); memcpy(bestval, valyew, nonodes*sizeof(double)); } else { bestsum = -1.0; for (i = 1; i <= (spp); i++) { if (!added[i - 1]) { n = 0; added[i - 1] = true; addtraverse(root, treenode[i - 1], treenode[spp + m - 2], place, valyew, &n); added[i - 1] = false; sum = 0.0; for (j = 0; j < n; j++) sum += valyew[j]; if (sum > bestsum) { bestsum = sum; besttoadd = i; memcpy(bestplace, place, nonodes*sizeof(long)); memcpy(bestval, valyew, nonodes*sizeof(double)); } } } } order[m - 1] = besttoadd; memcpy(place, bestplace, nonodes*sizeof(long)); memcpy(valyew, bestval, nonodes*sizeof(double)); shellsort(valyew, place, n); oldfrac = fracinc; oldfdone = fracdone; n1 = 0; for (i = 0; i < (n); i++) { if (valyew[i] <= bestyet || bestyet < 0.0) n1++; } if (n1 > 0) fracinc /= n1; for (i = 0; i < n; i++) { if (valyew[i] <=bestyet ||bestyet < 0.0) { current[m - 1] = place[i]; add(treenode[place[i] - 1], treenode[besttoadd - 1], treenode[spp + m - 2], &root, treenode); added[besttoadd - 1] = true; if (m < spp) addit(m + 1); else { if (valyew[i] < bestyet || bestyet < 0.0) { nextree = 1; bestyet = valyew[i]; } if (nextree <= maxtrees) { memcpy(bestorders[nextree - 1], order, spp*sizeof(long)); 
memcpy(bestrees[nextree - 1], current, spp*sizeof(long)); } nextree++; firsttime = false; } re_move(&treenode[besttoadd - 1], &treenode[spp + m - 2], &root, treenode); added[besttoadd - 1] = false; } fracdone += fracinc; } fracinc = oldfrac; fracdone = oldfdone; free(valyew); free(bestval); free(place); free(bestplace); } /* addit */ void describe() { /* prints ancestors, steps and table of numbers of steps in each character */ if (stepbox) { putc('\n', outfile); writesteps(weights, dollo, numsteps); } if (questions) guesstates(guess); if (ancseq) { hypstates(fullset, dollo, guess, treenode, root, garbage, zeroanc, oneanc); putc('\n', outfile); } putc('\n', outfile); if (trout) { col = 0; treeout(root, nextree, &col, root); } } /* describe */ void maketree() { /* tree construction recursively by branch and bound */ long i, j, k; node *dummy; fullset = (1L << (bits + 1)) - (1L << 1); if (progress) { printf("\nHow many\n"); printf("trees looked Approximate\n"); printf("at so far Length of How many percentage\n"); printf("(multiples shortest tree trees this long searched\n"); printf("of %4ld): found so far found so far so far\n", howoften); printf("---------- ------------ ------------ ------------\n"); #ifdef WIN32 phyFillScreenColor(); #endif } done = false; mults = 0; examined = 0; nextree = 1; root = treenode[0]; firsttime = true; for (i = 0; i < (spp); i++) added[i] = false; added[0] = true; order[0] = 1; k = 2; fracdone = 0.0; fracinc = 1.0; bestyet = -1.0; stps = (bitptr)Malloc(words*sizeof(long)); addit(k); if (done) { if (progress) { printf("Search broken off! Not guaranteed to\n"); printf(" have found the most parsimonious trees.\n"); } if (treeprint) { fprintf(outfile, "Search broken off! 
Not guaranteed to\n"); fprintf(outfile, " have found the most parsimonious\n"); fprintf(outfile, " trees, but here is what we found:\n"); } } if (treeprint) { fprintf(outfile, "\nrequires a total of %18.3f\n\n", bestyet); if (nextree == 2) fprintf(outfile, "One most parsimonious tree found:\n"); else fprintf(outfile, "%5ld trees in all found\n", nextree - 1); } if (nextree > maxtrees + 1) { if (treeprint) fprintf(outfile, "here are the first%4ld of them\n", (long)maxtrees); nextree = maxtrees + 1; } if (treeprint) putc('\n', outfile); for (i = 0; i < (spp); i++) added[i] = true; for (i = 0; i <= (nextree - 2); i++) { for (j = k; j <= (spp); j++) add(treenode[bestrees[i][j - 1] - 1], treenode[bestorders[i][j - 1] - 1], treenode[spp + j - 2], &root, treenode); evaluate(root); printree(1.0, treeprint, root); describe(); for (j = k - 1; j < (spp); j++) re_move(&treenode[bestorders[i][j] - 1], &dummy, &root, treenode); } if (progress) { printf("\nOutput written to file \"%s\"\n\n", outfilename); if (trout) printf("Trees also written onto file \"%s\"\n\n", outtreename); } free(stps); if (ancseq) freegarbage(&garbage); } /* maketree */ int main(int argc, Char *argv[]) { /* branch-and-bound method for Dollo, polymorphism parsimony */ /* Reads in the number of species, number of characters, options and data. 
     Then finds all most parsimonious trees */
#ifdef MAC
  argc = 1; /* macsetup("Dolpenny",""); */
  argv[0] = "Dolpenny";
#endif
  init(argc, argv);
  openfile(&infile,INFILE,"input file", "r",argv[0],infilename);
  openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename);
  ibmpc = IBMCRT;
  ansi = ANSICRT;
  garbage = NULL;
  mulsets = false;
  msets = 1;
  firstset = true;
  bits = 8*sizeof(long) - 1;
  doinit();
  if (weights || justwts)
    openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename);
  if (trout)
    openfile(&outtree,OUTTREE,"output tree file", "w",argv[0],outtreename);
  if(ancvar)
    openfile(&ancfile,ANCFILE,"ancestors file", "r",argv[0],ancfilename);
  fprintf(outfile,"%s parsimony method\n\n",dollo ? "Dollo" : "Polymorphism");
  for (ith = 1; ith <= msets; ith++) {
    doinput();
    if (msets > 1 && !justwts) {
      fprintf(outfile, "Data set # %ld:\n\n",ith);
      if (progress)
        printf("\nData set # %ld:\n",ith);
    }
    if (justwts){
      fprintf(outfile, "Weights set # %ld:\n\n", ith);
      if (progress)
        printf("\nWeights set # %ld:\n\n", ith);
    }
    if (printdata){
      if (weights || justwts)
        printweights(outfile, 0, chars, weight, "Characters");
      if (ancvar)
        printancestors(outfile, anczero, ancone);
    }
    if (ith == 1)
      firstset = false;
    maketree();
  }
  FClose(infile);
  FClose(outfile);
  FClose(outtree);
#ifdef MAC
  fixmacfile(outfilename);
  fixmacfile(outtreename);
#endif
#ifdef WIN32
  phyRestoreConsoleAttributes();
#endif
  return 0;
}  /* branch-and-bound method for Dollo, polymorphism parsimony */

/* phylip-3.697/src/draw.c */

#ifdef WIN32
#include
#endif

#include "phylip.h"
#include "draw.h"

#ifdef QUICKC
struct videoconfig myscreen;
void setupgraphics();
#endif

long winheight;
long winwidth;

extern winactiontype winaction;

colortype colors[7] = {
  {"White ",1.0,1.0,1.0},
  {"Red ",1.0,0.3,0.3},
  {"Orange ",1.0,0.6,0.6},
  {"Yellow ",1.0,0.9,0.4},
  {"Green ",0.3,0.8,0.3},
  {"Blue ",0.5,0.5,1.0},
  {"Violet ",0.6,0.4,0.8},
};

vrmllighttype vrmllights[3] = {
{1.0, -100.0, 100.0, 100.0}, {0.5, 100.0, -100.0, -100.0}, {0.3, 0.0, -100.0, 100.0}, }; char fontname[LARGE_BUF_LENGTH]; /* format of matrix: capheight, length[32],length[33],..length[256]*/ byte *full_pic ; int increment = 0 ; int total_bytes = 0 ; short unknown_metric[256]; static short helvetica_metric[] = { 718, 278,278,355,556,556,889,667,222,333,333,389,584,278,333,278,278,556,556,556, 556,556,556,556,556,556,556,278,278,584,584,584,556,1015,667,667,722,722,667, 611,778,722,278,500,667,556,833,722,778,667,778,722,667,611,722,667,944,667, 667,611,278,278,278,469,556,222,556,556,500,556,556,278,556,556,222,222,500, 222,833,556,556,556,556,333,500,278,556,500,722,500,500,500,334,260,334,584, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,333,556, 556,167,556,556,556,556,191,333,556,333,333,500,500,0,556,556,556,278,0,537, 350,222,333,333,556,1000,1000,0,611,0,333,333,333,333,333,333,333,333,0,333, 333,0,333,333,333,1000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1000,0,370,0,0,0,0,556, 778,1000,365,0,0,0,0,0,889,0,0,0,278,0,0,222,611,944,611,0,0,0}; static short helveticabold_metric[] = {718, /* height */ 278,333,474,556,556,889,722,278,333,333,389,584,278,333,278,278,556,556,556, 556,556,556,556,556,556,556,333,333,584,584,584,611,975,722,722,722,722,667, 611,778,722,278,556,722,611,833,722,778,667,778,722,667,611,722,667,944,667, 667,611,333,278,333,584,556,278,556,611,556,611,556,333,611,611,278,278,556, 278,889,611,611,611,611,389,556,333,611,556,778,556,556,500,389,280,389,584, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,333,556, 556,167,556,556,556,556,238,500,556,333,333,611,611,0,556,556,556,278,0,556, 350,278,500,500,556,1000,1000,0,611,0,333,333,333,333,333,333,333,333,0,333, 333,0,333,333,333,1000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1000,0,370,0,0,0,0,611, 778,1000,365,0,0,0,0,0,889,0,0,0,278,0,0,278,611,944,611,0,0,0}; static short timesroman_metric[] = {662, 
250,333,408,500,500,833,778,333,333,333,500,564,250,333,250,278,500,500,500, 500,500,500,500,500,500,500,278,278,564,564,564,444,921,722,667,667,722,611, 556,722,722,333,389,722,611,889,722,722,556,722,667,556,611,722,722,944,722, 722,611,333,278,333,469,500,333,444,500,444,500,444,333,500,500,278,278,500, 278,778,500,500,500,500,333,389,278,500,500,722,500,500,444,480,200,480,541, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,333,500, 500,167,500,500,500,500,180,444,500,333,333,556,556,0,500,500,500,250,0,453, 350,333,444,444,500,1000,1000,0,444,0,333,333,333,333,333,333,333,333,0,333, 333,0,333,333,333,1000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,889,0,276,0,0,0,0,611, 722,889,310,0,0,0,0,0,667,0,0,0,278,0,0,278,500,722,500,0,0,0}; static short timesitalic_metric[] = {660, /* height */ 250,333,420,500,500,833,778,333,333,333,500,675,250,333,250,278,500,500,500, 500,500,500,500,500,500,500,333,333,675,675,675,500,920,611,611,667,722,611, 611,722,722,333,444,667,556,833,667,722,611,722,611,500,556,722,611,833,611, 556,556,389,278,389,422,500,333,500,500,444,500,444,278,500,500,278,278,444, 278,722,500,500,500,500,389,389,278,500,444,667,444,444,389,400,275,400,541, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,389,500, 500,167,500,500,500,500,214,556,500,333,333,500,500,0,500,500,500,250,0,523, 350,333,556,556,500,889,1000,0,500,0,333,333,333,333,333,333,333,333,0,333, 333,0,333,333,333,889,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,889,0,276,0,0,0,0,556, 722,944,310,0,0,0,0,0,667,0,0,0,278,0,0,278,500,667,500,0,0,0}; static short timesbold_metric[] = {681, /* height */ 250,333,555,500,500,1000,833,333,333,333,500,570,250,333,250,278,500,500,500, 500,500,500,500,500,500,500,333,333,570,570,570,500,930,722,667,722,722,667, 611,778,778,389,500,778,667,944,722,778,611,778,722,556,667,722,722,1000,722, 722,667,333,278,333,581,500,333,500,556,444,556,444,333,500,556,278,333,556, 
278,833,556,500,556,556,444,389,333,556,500,722,500,500,444,394,220,394,520,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,333,500,500, 167,500,500,500,500,278,500,500,333,333,556,556,0,500,500,500,250,0,540,350, 333,500,500,500,1000,1000,0,500,0,333,333,333,333,333,333,333,333,0,333,333, 0,333,333,333,1000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1000,0,300,0,0,0,0,667,778, 1000,330,0,0,0,0,0,722,0,0,0,278,0,0,278,500,722,556,0,0,0}; static short timesbolditalic_metric[] = {662, /* height */ 250,389,555,500,500,833,778,333,333,333,500,570,250,333,250,278,500,500,500, 500,500,500,500,500,500,500,333,333,570,570,570,500,832,667,667,667,722,667, 667,722,778,389,500,667,611,889,722,722,611,722,667,556,611,722,667,889,667, 611,611,333,278,333,570,500,333,500,500,444,500,444,333,500,556,278,278,500, 278,778,556,500,500,500,389,389,278,556,444,667,500,444,389,348,220,348,570, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,389,500, 500,167,500,500,500,500,278,500,500,333,333,556,556,0,500,500,500,250,0,500, 350,333,500,500,500,1000,1000,0,500,0,333,333,333,333,333,333,333,333,0,333, 333,0,333,333,333,1000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,944,0,266,0,0,0,0,611 ,722,944,300,0,0,0,0,0,722,0,0,0,278,0,0,278,500,722,500,0,0,0}; // WARNING: If you modify the following "figfont" array, particularly the order, the // Hershey font approximations in the Java interface of Drawgram and Drawtree may break static const char *figfonts[] = {"Times-Roman","Times-Italic","Times-Bold","Times-BoldItalic", "AvantGarde-Book","AvantGarde-BookOblique","AvantGarde-Demi","AvantGarde-DemiOblique", "Bookman-Light","Bookman-LightItalic","Bookman-Demi","Bookman-DemiItalic", "Courier","Courier-Italic","Courier-Bold","Courier-BoldItalic", "Helvetica","Helvetica-Oblique","Helvetica-Bold","Helvetica-BoldOblique", "Helvetica-Narrow","Helvetica-Narrow-Oblique","Helvetica-Narrow-Bold","Helvetica-Narrow-BoldOblique", 
"NewCenturySchlbk-Roman","NewCenturySchlbk-Italic","NewCenturySchlbk-Bold","NewCenturySchlbk-BoldItalic", "Palatino-Roman","Palatino-Italic","Palatino-Bold","Palatino-BoldItalic", "Symbol","ZapfChancery-MediumItalic","ZapfDingbats"}; double oldx, oldy; boolean didloadmetric; long nmoves,oldpictint,pagecount; double labelline,linewidth,oldxhigh,oldxlow,oldyhigh,oldylow, vrmllinewidth, raylinewidth,treeline,oldxsize,oldysize,oldxunitspercm, oldyunitspercm,oldxcorner,oldycorner,oldxmargin,oldymargin, oldhpmargin,oldvpmargin,clipx0,clipx1,clipy0,clipy1,userxsize,userysize; long rootmatrix[51][51]; long HiMode,GraphDriver,GraphMode,LoMode,bytewrite; /* externals should move to .h file later. */ extern long strpbottom,strptop,strpwide,strpdeep,strpdiv,hpresolution; extern boolean dotmatrix,empty,preview,previewing,pictbold,pictitalic, pictshadow,pictoutline; extern double expand,xcorner,xnow,xsize,xscale,xunitspercm, ycorner,ynow,ysize,yscale,yunitspercm,labelrotation, labelheight,xmargin,ymargin,pagex,pagey,paperx,papery, hpmargin,vpmargin; extern long filesize; extern growth grows; extern enum {yes,no} penchange,oldpenchange; extern FILE *plotfile; extern plottertype plotter,oldplotter; extern striptype stripe; extern char resopts; pentype lastpen; extern char pltfilename[FNMLNGTH]; extern char progname[FNMLNGTH]; #define NO_PLANE 666 /* To make POVRay happy */ #ifndef OLDC /* function prototypes */ int pointinrect(double, double, double, double, double, double); int rectintersects(double, double, double, double, double, double, double, double); long upbyte(long); long lobyte(long); void pictoutint(FILE *, long); Local long SFactor(void); long DigitsInt(long); Local boolean IsColumnEmpty(striparray *, long, long); void Skip(long Amount); Local long FirstBlack(striparray *, long, long); Local long FirstWhite(striparray *, long, long); Local boolean IsBlankStrip(striparray *mystripe, long deep); void striprint(long, long); long showvrmlparms(long vrmltreecolor, long 
vrmlnamecolor, long vrmlskycolornear, long vrmlskycolorfar, long vrmlgroundcolornear); void getvrmlparms(long *vrmltreecolor, long *vrmlnamecolor, long *vrmlskycolornear,long *vrmlskycolorfar, long *vrmlgroundcolornear,long *vrmlgroundcolorfar, long numtochange); #ifdef QUICKC void setupgraphics(void); #endif long showrayparms(long, long, long, long, long, long); void getrayparms(long *, long *, long *, long *, long *,long *, long); int readafmfile(char *, short *); void metricforfont(char *, short *); void plotchar(long *, struct LOC_plottext *); void swap_charptr(char **, char **); void plotpb(void); char *findXfont(char *, double, double *, int *); int macfontid(char *); void makebox(char *, double *, double *, double *, long); /* function prototypes */ #endif int pointinrect(double x,double y,double x0,double y0,double x1,double y1) { double tmp; if (x0 > x1) tmp = x0, x0 = x1, x1 = tmp; if (y0 > y1) tmp = y0, y0 = y1, y1 = tmp; return ((x >= x0 && x <= x1) && (y >= y0 && y <= y1)); } /* pointinrect */ int rectintersects(double xmin1,double ymin1,double xmax1,double ymax1, double xmin2,double ymin2,double xmax2,double ymax2) { double temp; /* check if any of the corners of either square are contained within the * * other one. 
This catches MOST cases, the last one (two) is two thin * * bands crossing each other (like a '+' ) */ if (xmin1 > xmax1){ temp = xmin1; xmin1 = xmax1; xmax1 = temp;} if (xmin2 > xmax2){ temp = xmin2; xmin2 = xmax2; xmax2 = temp;} if (ymin1 > ymax1){ temp = ymin1; ymin1 = ymax1; ymax1 = temp;} if (ymin2 > ymax2){ temp = ymin2; ymin2 = ymax2; ymax2 = temp;} return (pointinrect(xmin1,ymin1,xmin2,ymin2,xmax2,ymax2) || pointinrect(xmax1,ymin1,xmin2,ymin2,xmax2,ymax2) || pointinrect(xmin1,ymax1,xmin2,ymin2,xmax2,ymax2) || pointinrect(xmax1,ymax1,xmin2,ymin2,xmax2,ymax2) || pointinrect(xmin2,ymin2,xmin1,ymin1,xmax1,ymax1) || pointinrect(xmax2,ymin2,xmin1,ymin1,xmax1,ymax1) || pointinrect(xmin2,ymax2,xmin1,ymin1,xmax1,ymax1) || pointinrect(xmax2,ymax2,xmin1,ymin1,xmax1,ymax1) || (xmin1 >= xmin2 && xmax1 <= xmax2 && ymin2 >= ymin1 && ymax2 <= ymax1) || (xmin2 >= xmin1 && xmax2 <= xmax1 && ymin1 >= ymin2 && ymax1 <= ymax2)); } /* rectintersects */ void clearit() { long i; if (ansi || ibmpc) #ifdef WIN32 phyClearScreen(); #else printf("\033[2J\033[H"); #endif else { for (i = 1; i <= 24; i++) putchar('\n'); } #ifdef WIN32 phyClearScreen(); #endif } /* clearit */ boolean isfigfont(char *fontname) { int i; if (strcmp(fontname,"Hershey") == 0) return 1; for (i=0;i<34;++i) if (strcmp(fontname,figfonts[i]) == 0) break; return (i < 34); } /* isfigfont */ int figfontid(char *fontname) { int i; for (i=0;i<34;++i) if (strcmp(fontname,figfonts[i]) == 0) return i; return -1; } /* figfontid */ const char *figfontname(int id) { return figfonts[id]; } /* figfontname */ void pout(long n) { #ifdef MAC fprintf(plotfile, "%*ld", (int)((long)(0.434295 * log((double)n) + 0.0001)), n); #endif #ifndef MAC fprintf(plotfile, "%*ld", (int)((long)(0.434295 * log((double)n) + 0.0001)), n); #endif } /* pout */ long upbyte(long num) { /* get upper nibble of byte */ long Result = 0, i, j, bytenum, nibcount; boolean done; bytenum = 0; done = false; nibcount = 0; i = num / 16; i /= 16; j = 1; while (!done) 
  {
    bytenum += (i & 15) * j;
    nibcount++;
    if (nibcount == 2) {
      Result = bytenum;
      done = true;
    } else {
      j *= 16;
      i /= 16;
    }
  }
  return Result;
}  /* upbyte */


long lobyte(long num)
{
  /* get the low-order byte of num, assembled from its two low nibbles */
  long Result = 0, i, j, bytenum, nibcount;
  boolean done;

  bytenum = 0;
  done = false;
  nibcount = 0;
  i = num;
  j = 1;
  while (!done) {
    bytenum += (i & 15) * j;
    nibcount++;
    if (nibcount == 2) {
      Result = bytenum;
      done = true;
    } else {
      j *= 16;
      i /= 16;
    }
  }
  return Result;
}  /* lobyte */


void pictoutint(FILE *file, long pictint)
{
  /* write pictint as two bytes, high byte first */
  char picthi, pictlo;

  picthi = (char)(pictint / 256);
  pictlo = (char)(pictint % 256);
  fprintf(file, "%c%c", picthi, pictlo);
}

/**********************************************************************
 * fixes:
 * - update to adobe 3.0
 * - drop the PaperSize, deprecated in 2.1
 * - drop the PageBoundingBox, the BoundingBox
 *   suffices for all pages since the pages are
 *   uniform anyway.
 * - hardwiring the BoundingBox makes no sense
 *   given that the user can specify a physical
 *   page size and number of pages across and down.
 *
 * variables inherited from drawgram.c, populated
 * in getparms().
 *
 * user input:
 *   paperx, papery => physical paper wd/ht (cm)
 *   xmargin, ymargin => physical paper margin (cm)
 *   m, n => pages high/wide
 *   hpmargin, vpmargin => horiz/vert overlap of pages
 *
 * calculated:
 *
 *   pagex = ( (double)n * ( paperx - hpmargin) + hpmargin );
 *   pagey = ( (double)m * ( papery - vpmargin) + vpmargin );
 *
 * the bounding box is based on physical paper size,
 * so it is ( paperx - 2 * xmargin ) in units by
 *
 * the bounding box for all pages is the paper size minus
 * physical margin in PS units.
with a paper size in cm:
 *
 *   unit = 72 units / inch * inch / 2.54 cm
 *        = (double)( 72.0 / 2.54 );
 *
 * the bounding box is paper size - margins, so the lower
 * left corner is at
 *   ( unit * xmargin ), ( unit * ymargin ),
 * upper right is at
 *   ( unit * ( paperx - xmargin ) ), ( unit * ( papery - ymargin ) )
 *
 * this leaves the bounding box at:
 *   %%BoundingBox: llx lly urx ury
 *
 *
 *   %%DocumentMedia: Page
 * leave out weight, color, type, leaves:
 *   %%DocumentMedia Page 0 ( ) ( )
 *
 *********************************************************************/

void postscript_header( void )
{
  static double unit = (double)( 72.0 / 2.54 );

  int count_x = pagex/paperx;
  int count_y = pagey/papery;
  int pages = count_x * count_y;

  int dm_x = (int)( unit * pagex );
  int dm_y = (int)( unit * pagey );

  int bb_ll_x = (int)( unit * xmargin );
  int bb_ll_y = (int)( unit * ymargin );
  int bb_ur_x = (int)( dm_x - bb_ll_x ); /* re-cycle the margins for */
  int bb_ur_y = (int)( dm_y - bb_ll_y ); /* upper-right dimensions */

  fprintf( plotfile, "%s\n", "%!PS-Adobe-3.0" );
  fprintf( plotfile, "%s\n", "%Test postscript" );
  fprintf( plotfile, "%s\n", "%%Title: Phylip Tree Output" );
  fprintf( plotfile, "%s\n", "%%Creator: Phylip Drawgram" );
  fprintf( plotfile, "%s %d %d\n","%%Pages:", pages, 1 );
  fprintf( plotfile, "%s\n", "%%DocumentFonts: (atend)" );
  fprintf( plotfile, "%s\n", "%%Orientation: Portrait" );

  fprintf
  (
    plotfile,
    /*
     * comment page name number width height
     */
    "%s %s %d %d 0 ( ) ( )\n",
    "%%DocumentMedia:", "Page", dm_x, dm_y
  );

  fprintf
  (
    /*
     * comment lower left, upper right.
     */
    plotfile,
    "%s %d %d %d %d\n",
    "%%BoundingBox:", bb_ll_x, bb_ll_y, bb_ur_x, bb_ur_y
  );

  fprintf( plotfile, "%s\n", "%%EndComments" );
  fprintf( plotfile, "%s\n", "/l {newpath moveto lineto stroke} def" );
  fprintf( plotfile, "%s\n", "%%EndProlog" );
  fprintf( plotfile, "%s\n", "%%Page: 1 1" );

  fprintf
  (
    /*
     * set the media spec's.
*/ plotfile, "<< /PageSize [ %d %d ] >> setpagedevice", dm_x, dm_y ); fprintf( plotfile, "%s\n", " 1 setlinecap \n 1 setlinejoin" ); fprintf ( plotfile, "%8.2f %s\n", treeline, "setlinewidth newpath" ); } void initplotter ( long ntips, char *fontname ) { long i,j, hres, vres; Char picthi, pictlo; long pictint; int padded_width, byte_width; unsigned int dummy1, dummy2; double viewangle; treeline = 0.18 * labelheight * yscale * expand; labelline = 0.06 * labelheight * yscale * expand; linewidth = treeline; if (dotmatrix ) { for (i = 0; i <= 50; i++) { /* for fast circle calculations */ for (j = 0; j <= 50; j++){ rootmatrix[i][j] = (long)floor(sqrt((double)(i * i + j * j)) + 0.5);} } } switch (plotter) { case tek: oldxhigh = -1.0; oldxlow = -1.0; oldyhigh = -1.0; oldylow = -1.0; nmoves = 0; /* DLS/JMH -- See function PLOT */ fprintf(plotfile, "%c\f", escape); break; case hp: fprintf(plotfile, "IN;SP1;VS10.0;\n"); break; case ray: fprintf(plotfile, "report verbose\n"); fprintf(plotfile, "screen %f %f\n", xsize, ysize); if (ysize >= xsize) { viewangle = 2 * atan(ysize / (2 * 1.21 * xsize)) * 180 / pi; fprintf(plotfile, "fov 45 %3.1f\n", viewangle); fprintf(plotfile, "light 1 point 0 %6.2f %6.2f\n", -xsize * 1.8, xsize * 1.5); fprintf(plotfile, "eyep %6.2f %6.2f %6.2f\n", xsize * 0.5, -xsize * 1.21, ysize * 0.55); } else { viewangle = 2 * atan(xsize / (2 * 1.21 * ysize)) * 180 / pi; fprintf(plotfile, "fov %3.1f 45\n", viewangle); fprintf(plotfile, "light 1 point 0 %6.2f %6.2f\n", -ysize * 1.8, ysize * 1.5); fprintf(plotfile, "eyep %6.2f %6.2f %6.2f\n", xsize * 0.5, -ysize * 1.21, ysize * 0.55); } fprintf(plotfile, "lookp %6.2f 0 %6.2f\n", xsize * 0.5, ysize * 0.5); fprintf(plotfile, "/* %.10s */\n", colors[treecolor - 1].name); fprintf(plotfile, "surface treecolor diffuse %5.2f%5.2f%5.2f specular 1 1 1 specpow 30\n", colors[treecolor - 1].red, colors[treecolor - 1].green, colors[treecolor - 1].blue); fprintf(plotfile, "/* %.10s */\n", colors[namecolor - 1].name); 
fprintf(plotfile, "surface namecolor diffuse %5.2f%5.2f%5.2f specular 1 1 1 specpow 30\n", colors[namecolor - 1].red, colors[namecolor - 1].green, colors[namecolor - 1].blue); fprintf(plotfile, "/* %.10s */\n", colors[backcolor - 1].name); fprintf(plotfile, "surface backcolor diffuse %5.2f%5.2f%5.2f\n\n", colors[backcolor - 1].red, colors[backcolor - 1].green, colors[backcolor - 1].blue); treeline = 0.27 * labelheight * yscale * expand; linewidth = treeline; raylinewidth = treeline; if (grows == vertical) fprintf(plotfile, "plane backcolor 0 0 %2.4f 0 0 1\n", ymargin); else fprintf(plotfile, "plane backcolor 0 0 %2.4f 0 0 1\n", ymargin - ysize / (ntips - 1)); fprintf(plotfile, "\nname tree\n"); fprintf(plotfile, "grid 22 22 22\n"); break; case pov: fprintf(plotfile, "// Declare the colors\n\n"); fprintf(plotfile, "#declare C_Tree = color rgb<%6.2f, %6.2f, %6.2f>;\n", colors[treecolor-1].red, colors[treecolor-1].green, colors[treecolor-1].blue); fprintf(plotfile, "#declare C_Name = color rgb<%6.2f, %6.2f, %6.2f>;\n\n", colors[namecolor-1].red, colors[namecolor-1].green, colors[namecolor-1].blue); fprintf(plotfile, "// Declare the textures\n\n"); fprintf(plotfile, "#declare %s = texture { pigment { C_Tree }\n", TREE_TEXTURE); fprintf(plotfile, "\t\tfinish { phong 1 phong_size 100 }};\n"); fprintf(plotfile, "#declare %s = texture { pigment { C_Name }\n", NAME_TEXTURE); fprintf(plotfile, "\t\tfinish { phong 1 phong_size 100 }};\n"); fprintf(plotfile, "\n#global_settings { assumed_gamma 2.2 }\n\n"); fprintf(plotfile, "light_source { <0, %6.2f, %6.2f> color <1,1,1> }\n\n", xsize * 1.8, xsize * 1.5); /* The camera location */ fprintf(plotfile, "camera {\n"); if (ysize >= xsize) { fprintf(plotfile, "\tlocation <%6.2f, %6.2f, %6.2f>\n", -xsize * 0.5, -xsize * 1.21, ysize * 0.55); } else { fprintf(plotfile, "\tlocation <%6.2f, %6.2f, %6.2f>\n", -xsize * 0.5, -ysize * 1.21, ysize * 0.55); } fprintf(plotfile, "\tlook_at <%6.2f, 0, %6.2f>\n", -xsize * 0.5, ysize * 0.5); /* 
Handily, we can rotate since the rayshade paradigm ain't exactly congruent to the povray paradigm */ fprintf(plotfile, "\trotate z*180\n"); fprintf(plotfile, "}\n\n"); fprintf(plotfile, "#background { color rgb <%6.2f, %6.2f, %6.2f> }\n\n", colors[backcolor-1].red, colors[backcolor-1].green, colors[backcolor-1].blue); if (bottomcolor != NO_PLANE) { /* The user wants a plane on the bottom... */ if (grows == vertical) fprintf(plotfile, "plane { z, %2.4f\n", 0.0 /*ymargin*/); else fprintf(plotfile, "plane { z, %2.4f\n", ymargin - ysize / (ntips - 1)); fprintf(plotfile, "\tpigment {color rgb <%6.2f, %6.2f, %6.2f> }}\n\n", colors[bottomcolor-1].red, colors[bottomcolor-1].green, colors[bottomcolor-1].blue); } treeline = 0.27 * labelheight * yscale * expand; linewidth = treeline; raylinewidth = treeline; fprintf(plotfile, "\n// First, the tree\n\n"); break; case vrml: vrmllinewidth = treeline; break; case pict: plotfile = freopen(pltfilename,"wb",plotfile); for (i=0;i<512;++i) putc('\000',plotfile); pictoutint(plotfile,1000); /* size...replaced later with seek */ pictoutint(plotfile,1); /* bbx0 */ pictoutint(plotfile,1); /* bby0 */ pictoutint(plotfile,612); /* bbx1 */ pictoutint(plotfile,792); /* bby1 */ fprintf(plotfile,"%c%c",0x11,0x01); /* version "1" (B&W) pict */ fprintf(plotfile,"%c%c%c",0xa0,0x00,0x82); fprintf(plotfile,"%c",1); /* clip rect */ pictoutint(plotfile,10); /* region size, bytes. */ pictoutint(plotfile,1); /* clip x0 */ pictoutint(plotfile,1); /* clip y0 */ pictoutint(plotfile,612); /* clip x1 */ pictoutint(plotfile,792); /* clip y1 */ bytewrite+=543; oldpictint = 0; pictint = (long)(linewidth + 0.5); if (pictint == 0) pictint = 1; picthi = (Char)(pictint / 256); pictlo = (Char)(pictint % 256); fprintf(plotfile, "\007%c%c%c%c", picthi, pictlo, picthi, pictlo); /* Set pen size for drawing tree. 
*/ break; case bmp: write_bmp_header(plotfile, (int)(xsize*xunitspercm), (int)(ysize*yunitspercm)); byte_width = (int) ceil (xsize / 8.0); padded_width = ((byte_width + 3) / 4) * 4 ; //printf("xsize: %f byte_width: %i padded_width: %i ysize: %f\n", xsize, byte_width, padded_width, ysize); //fflush(stdout); full_pic = (byte *) Malloc ((padded_width *2) * (int) ysize) ; break ; case xbm: /* what a completely verbose data representation format! */ fprintf(plotfile, "#define drawgram_width %5ld\n", (long)(xunitspercm * xsize)); fprintf(plotfile, "#define drawgram_height %5ld\n", (long)(yunitspercm * ysize)); fprintf(plotfile, "static char drawgram_bits[] = {\n"); /*filesize := 53; */ break; case lw: /* write conforming postscript */ postscript_header(); break; case idraw: fprintf(plotfile, "%%I Idraw 9 Grid 8 \n\n"); fprintf(plotfile,"%%%%Page: 1 1\n\n"); fprintf(plotfile,"Begin\n"); fprintf(plotfile,"%%I b u\n"); fprintf(plotfile,"%%I cfg u\n"); fprintf(plotfile,"%%I cbg u\n"); fprintf(plotfile,"%%I f u\n"); fprintf(plotfile,"%%I p u\n"); fprintf(plotfile,"%%I t\n"); fprintf(plotfile,"[ 0.679245 0 0 0.679245 0 0 ] concat\n"); fprintf(plotfile,"/originalCTM matrix currentmatrix def\n\n"); break; case ibm: #ifdef TURBOC initgraph(&GraphDriver,&HiMode,""); #endif #ifdef QUICKC setupgraphics(); #endif break; case mac: #ifdef MAC gfxmode(); pictint=(long)(linewidth + 0.5); #endif break; case houston: break; case decregis: oldx = (double) 300; oldy = (double) 1; nmoves = 0; fprintf(plotfile, "%c[2J%cPpW(I3);S(A[0,0][799,479]);S(I(W))S(E);S(C0);W(I(D))\n", escape,escape); break; case epson: plotfile = freopen(pltfilename,"wb",plotfile); fprintf(plotfile, "\0333\030"); break; case oki: plotfile = freopen(pltfilename,"wb",plotfile); fprintf(plotfile, "\033%%9\020"); break; case citoh: plotfile = freopen(pltfilename,"wb",plotfile); fprintf(plotfile, "\033T16"); break; case toshiba: /* reopen in binary since we always need \n\r on the file */ /* and dos in text mode puts it, but 
unix does not */ //printf("in initplotter\n"); //fflush(stdout); plotfile = freopen(pltfilename,"wb",plotfile); fprintf(plotfile, "\033\032I\n\r\n\r"); fprintf(plotfile, "\033L06\n\r"); break; case pcl: plotfile = freopen(pltfilename,"wb",plotfile); if (hpresolution == 150 || hpresolution == 300) fprintf(plotfile, "\033*t%3ldR", hpresolution); else if (hpresolution == 75) fprintf(plotfile, "\033*t75R"); break; case pcx: plotfile = freopen(pltfilename,"wb",plotfile); fprintf(plotfile,"\012\003\001\001%c%c%c%c",0,0,0,0); /* Manufacturer version (1 byte) version (1 byte), encoding (1 byte), bits per pixel (1 byte), xmin (2 bytes) ymin (2 bytes), Version */ hres = strpwide; vres = (long)floor(yunitspercm * ysize + 0.5); fprintf(plotfile, "%c%c", (unsigned char)lobyte(hres - 1), (unsigned char)upbyte(hres - 1)); /* Xmax */ fprintf(plotfile, "%c%c", (unsigned char)lobyte(vres - 1), (unsigned char)upbyte(vres - 1)); /* Ymax */ fprintf(plotfile, "%c%c", (unsigned char)lobyte(hres), (unsigned char)upbyte(hres)); /* Horizontal resolution */ fprintf(plotfile, "%c%c", (unsigned char)lobyte(vres), (unsigned char)upbyte(vres)); /* Vertical resolution */ for (i = 1; i <= 48; i++) /* Color Map */ putc('\000', plotfile); putc('\000', plotfile); putc('\001', plotfile); /* Num Planes */ putc(hres / 8, plotfile); /* Bytes per line */ putc('\000',plotfile); for (i = 1; i <= 60; i++) /* Filler */ putc('\000',plotfile); break; case fig: fprintf(plotfile, "#FIG 2.0\n"); fprintf(plotfile, "80 2\n"); break; case gif: case other: break; default: /* case vrml not handled */ break; /* initialization code for a new plotter goes here */ } } /* initplotter */ void finishplotter() { int padded_width, byte_width; /* For bmp code */ switch (plotter) { case tek: putc('\n', plotfile); plot(penup, 1.0, 1.0); break; case hp: plot(penup, 1.0, 1.0); fprintf(plotfile, "SP;\n"); break; case ray: fprintf(plotfile,"end\n\nobject treecolor tree\n"); fprintf(plotfile,"object namecolor species_names\n"); break; 
case pov: break; case pict: fprintf(plotfile,"%c%c%c%c%c",0xa0,0x00,0x82,0xff,0x00); bytewrite+=5; fseek(plotfile,512L,SEEK_SET); pictoutint(plotfile,bytewrite); break; case lw: fprintf(plotfile, "stroke showpage \n\n"); fprintf(plotfile,"%%%%PageTrailer\n"); fprintf(plotfile,"%%%%PageFonts: %s\n", (strcmp(fontname,"Hershey") == 0) ? "" : fontname); fprintf(plotfile,"%%%%Trailer\n"); fprintf(plotfile,"%%%%DocumentFonts: %s\n", (strcmp(fontname,"Hershey") == 0) ? "" : fontname); break; case idraw: fprintf(plotfile, "\nEnd %%I eop\n\n"); fprintf(plotfile, "showpage\n\n"); fprintf(plotfile, "%%%%Trailer\n\n"); fprintf(plotfile, "end\n"); break; case ibm: #ifdef TURBOC getchar(); restorecrtmode(); #endif #ifdef QUICKC getchar(); _clearscreen(_GCLEARSCREEN); _setvideomode(_DEFAULTMODE); #endif break; case mac: #ifdef MAC plotter=oldplotter; eventloop(); #endif break; case houston: break; case decregis: plot(penup, 1.0, 1.0); fprintf(plotfile, "%c\\", escape); break; case epson: fprintf(plotfile, "\0333$"); break; case oki: /* blank case */ break; case citoh: fprintf(plotfile, "\033A"); break; case toshiba: //printf("in finishplotter\n"); //fflush(stdout); fprintf(plotfile, "\033\032I\n\r"); break; case pcl: fprintf(plotfile, "\033*rB"); /* Exit graphics mode */ putc('\f', plotfile); /* just to make sure? 
*/
    break;

  case pcx:
    /* blank case */
    break;

  case bmp:
    byte_width = (int) ceil (xsize / 8.0);
    padded_width = ((byte_width + 3) / 4) * 4 ;
    //printf("calling turn_rows padded_width: %i ysize: %f\n", padded_width, ysize);
    //fflush(stdout);
    turn_rows (full_pic, padded_width, (int) ysize);
    //printf("calling write_full_pic total_bytes: %i\n", total_bytes);
    //fflush(stdout);
    write_full_pic(full_pic, total_bytes);
    //printf("write_full_pic done\n");
    //fflush(stdout);
    increment = 0 ;
    total_bytes = 0 ;
    free (full_pic) ;
    break;

  case xbm:
    fprintf(plotfile, "}\n");
    break;

  case fig:
    /* blank case */
    break;

  case gif:
  case other:
    break;

  default:      /* case vrml not handled */
    break;
    /* termination code for a new plotter goes here */
  }
}  /* finishplotter */


Local long SFactor()
{
  /* the dot-skip is resolution-independent. */
  /* this makes all the point-skip instructions skip the same # of dots. */
  long Result = 0;

  if (hpresolution == 150)
    Result = 2;
  if (hpresolution == 300)
    Result = 1;
  if (hpresolution == 75)
    return 4;
  return Result;
}  /* SFactor */


long DigitsInt(long x)
{
  /* number of decimal digits in x, capped at 3 */
  if (x < 10)
    return 1;
  else if (x >= 10 && x < 100)
    return 2;
  else
    return 3;
}  /* DigitsInt */


Local boolean IsColumnEmpty(striparray *mystripe, long pos, long deep)
{
  long j;
  boolean ok;

  ok = true;
  j = 1;
  while (ok && j <= deep) {
    ok = (ok && mystripe[j - 1][pos - 1] == null);
    j++;
  }
  return ok;
}  /* IsColumnEmpty */


void Skip(long Amount)
{
  /* assume we're not in gfx mode.
*/
  fprintf(plotfile, "\033&f1S");   /* Pop the graphics cursor */
  /* the MAC and non-MAC branches were identical, so one call suffices */
  fprintf(plotfile, "\033*p+%*ldX",
          (int)DigitsInt(Amount * SFactor()), Amount * SFactor());
  fprintf(plotfile, "\033&f0S");   /* Push the cursor to new location */
  filesize += 15 + DigitsInt(Amount * SFactor());
}  /* Skip */


Local long FirstBlack(striparray *mystripe, long startpos, long deep)
{
  /* returns, given a strip and a position, next x with some y's nonzero */
  long i;
  boolean columnempty;

  i = startpos;
  columnempty = true;
  while (columnempty && i < strpwide / 8) {
    columnempty = (columnempty && IsColumnEmpty(mystripe, i,deep));
    if (columnempty)
      i++;
  }
  return i;
}  /* FirstBlack */


Local long FirstWhite(striparray *mystripe, long startpos, long deep)
{
  /* returns, given a strip and a position, the next x with all y's zero */
  long i;
  boolean columnempty;

  i = startpos;
  columnempty = false;
  while (!columnempty && i < strpwide / 8) {
    columnempty = IsColumnEmpty(mystripe, i,deep);
    if (!columnempty)
      i++;
  }
  return i;
}  /* FirstWhite */


Local boolean IsBlankStrip(striparray *mystripe, long deep)
{
  long i, j;
  boolean ok;

  ok = true;
  i = 1;
  while (ok && i <= strpwide / 8) {
    for (j = 0; j < (deep); j++)
      ok = (ok && mystripe[j][i - 1] == '\0');
    i++;
  }
  return ok;
}  /* IsBlankStrip */


void striprint(long div, long deep)
{
  //printf("in striprint, div: %li deep: %li\n",div,deep);
  // fflush(stdout);
  long i, j, t, x, theend, width;
  unsigned char counter;
  boolean done;

  done = false;
  width = strpwide;
  if (plotter != pcx && plotter != pcl && plotter != bmp && plotter != xbm) {
    /* trim trailing empty columns from the strip */
    while (!done) {
      for (i = 0; i < div; i++)
        done = done || (stripe[i] && (stripe[i][width - 1] != null));
      if (!done)
        width--;
      done = (done || width == 0);
    }
  }
  switch (plotter) {

  case epson:
    if (!empty) {
      /* cast after the arithmetic: "(char) width / 256" casts first
         and therefore always yields 0 for the high byte */
      fprintf(plotfile, "\033L%c%c",
              (char)(width & 255), (char)(width / 256));
      for (i = 0; i < width; i++)
        putc(stripe[0][i],
             plotfile);
      filesize += width + 4;
    }
    putc('\n', plotfile);
    putc('\r', plotfile);
    break;

  case oki:
    if (!empty) {
      /* cast after the arithmetic: "(char) width / 128" casts first
         and loses the high part of the width */
      fprintf(plotfile, "\033%%1%c%c",
              (char)(width / 128), (char)(width & 127));
      for (i = 0; i < width; i++)
        putc(stripe[0][i], plotfile);
      filesize += width + 5;
    }
    putc('\n', plotfile);
    putc('\r', plotfile);
    break;

  case citoh:
    if (!empty) {
      fprintf(plotfile, "\033S%04ld",width);
      for (i = 0; i < width; i++)
        putc(stripe[0][i], plotfile);
      filesize += width + 6;
    }
    putc('\n', plotfile);
    putc('\r', plotfile);
    break;

  case toshiba:
    //printf("in striprint div: %li deep: %li width: %li\n", div, deep, width);
    //fflush(stdout);
    if (!empty) {
      for (i = 0; i < width; i++) {
        for (j = 0; j <= 3; j++)
          stripe[j][i] += 64;
      }
      fprintf(plotfile, "\033;%04ld",width);
      for (i = 0; i < width; i++)
        fprintf(plotfile, "%c%c%c%c",
                stripe[0][i], stripe[1][i], stripe[2][i], stripe[3][i]);
      filesize += width * 4 + 6;
    }
    putc('\n', plotfile);
    putc('\r', plotfile);
    break;

  case pcx:
    width = strpwide / 8;
    for (j = 0; j < div; j++) {
      t = 1;
      while (1) {
        i = 0; /* i == RLE count ????
*/ while ((stripe[j][t + i - 1]) == (stripe[j][t + i]) && t + i < width && i < 63) i++; if (i > 0) { counter = 192; counter += i; putc(counter, plotfile); putc(255 - stripe[j][t - 1], plotfile); t += i; filesize += 2; } else { if (255 - (stripe[j][t - 1] & 255) >= 192) { putc(193, plotfile); filesize++; } putc(255 - stripe[j][t - 1], plotfile); t++; filesize++; } if (t >width) break; } } break; case pcl: //printf("in pcl case\n"); //fflush(stdout); width = strpwide / 8; if (IsBlankStrip(stripe,deep)) { #ifdef MAC fprintf(plotfile, "\033&f1S\033*p0X\033*p+%*ldY\033&f0S", (int)DigitsInt(deep * SFactor()), deep * SFactor()); #endif #ifndef MAC fprintf(plotfile, "\033&f1S\033*p0X\033*p+%*dY\033&f0S", (int)DigitsInt(deep * SFactor()), (int) (deep * SFactor())); //printf("IsBlankStrip not MAC filesize: %li DigitsInt: %li\n", filesize, DigitsInt(deep * SFactor())); //fflush(stdout); #endif filesize += DEFAULT_STRIPE_HEIGHT + DigitsInt(deep * SFactor()); } else { /* plotting the actual strip as bitmap data */ x = 1; theend = 1; while (x < width) { x = FirstBlack(stripe, x,deep); /* all-black strip is now */ Skip((x - theend - 1) * 8); /* x..theend */ theend = FirstWhite(stripe, x,deep) - 1;/* like lastblack */ fprintf(plotfile, "\033*r1A"); /* enter gfx mode */ for (j = 0; j < div; j++) { #ifdef MAC fprintf(plotfile, "\033*b%*ldW", (int)DigitsInt(theend - x + 1), theend - x + 1); #endif #ifndef MAC fprintf(plotfile, "\033*b%*dW", (int)DigitsInt(theend - x + 1), (int) (theend - x + 1)); //printf("not IsBlankStrip not MAC\n"); //fflush(stdout); #endif /* dump theend-x+1 bytes */ for (t = x - 1; t < theend; t++) putc(stripe[j][t], plotfile); filesize += theend - x + DigitsInt(theend - x + 1) + 5; } fprintf(plotfile, "\033*rB"); /* end gfx mode */ Skip((theend - x + 1) * 8); filesize += 9; x = theend + 1; } fprintf(plotfile, "\033&f1S"); /* Pop cursor */ #ifdef MAC fprintf(plotfile, "\033*p0X\033*p+%*ldY", (int)DigitsInt(deep * SFactor()), deep * SFactor()); #endif #ifndef MAC 
      fprintf(plotfile, "\033*p0X\033*p+%*dY",
              (int)DigitsInt(deep * SFactor()), (int) (deep * SFactor()));
      //printf("not MAC deep: %li SFactor(): %li\n", deep, SFactor());
      //fflush(stdout);
#endif
      filesize += DEFAULT_STRIPE_HEIGHT + DigitsInt(deep * SFactor());
      fprintf(plotfile, "\033&f0S");   /* Push cursor */
    }
    break;
    /* case for hpcl code */

  case bmp:
    width = ((strpwide -1) / 8) +1;
    //printf("increment: %i width: %li div: %li total_bytes: %i\n",increment,width,div,total_bytes);
    //fflush(stdout);
    translate_stripe_to_bmp (&stripe, full_pic, increment++, width, div, &total_bytes) ;
    break;
    /* case for bmp code */

  case xbm:
    x = 0;   /* count up # of bytes so we can put returns. */
    width = ((strpwide -1) / 8) +1;
    for (j = 0; j < div; j++) {
      for (i = 0; i < width; i++) {
        fprintf(plotfile, "0x%02x,",(unsigned char)stripe[j][i]);
        filesize += 5;
        x++;
        if ((x % 15) == 0) {
          putc('\n', plotfile);
          filesize++;
        }
      }
    }
    putc('\n',plotfile);
    break;

  case lw:
  case hp:
  case tek:
  case ibm:
  case mac:
  case houston:
  case decregis:
  case fig:
  case pict:
  case ray:
  case pov:
  case gif:
  case idraw:
  case other:
    break;

  default:      /* case vrml not handled */
    break;
    /* graphics print code for a new printer goes here */
  }
}  /* striprint */


#ifdef QUICKC
void setupgraphics()
{
  _getvideoconfig(&myscreen);
#ifndef WATCOM
  switch(myscreen.adapter){
  case _CGA:
  case _OCGA:
    _setvideomode(_HRESBW);
    break;
  case _EGA:
  case _OEGA:
    _setvideomode(_ERESNOCOLOR);
    break;   /* without this break, EGA fell through to the VGA mode */
  case _VGA:
  case _OVGA:
  case _MCGA:
    _setvideomode(_VRES2COLOR);
    break;
  case _HGC:
    _setvideomode(_HERCMONO);
    break;
  default:
    printf("Your display hardware is unsupported by this program.\n");
    break;
  }
#endif
#ifdef WATCOM
  switch(myscreen.adapter){
  case _VGA:
  case _SVGA:
    _setvideomode(_VRES16COLOR);
    break;
  case _MCGA:
    _setvideomode(_MRES256COLOR);
    break;
  case _EGA:
    _setvideomode(_ERESNOCOLOR);
    break;
  case _CGA:
    _setvideomode(_MRES4COLOR);
    break;
  case _HERCULES:
    _setvideomode(_HERCMONO);
    break;
  default:
    printf("Your display hardware is unsupported by this program.\n");
exxit(-1); break; } #endif _getvideoconfig(&myscreen); _setlinestyle(0xffff); xunitspercm=myscreen.numxpixels / 25; yunitspercm=myscreen.numypixels / 17.5; xsize = 25.0; ysize = 17.5; } /* setupgraphics */ #endif void loadfont(short *font, char* fontname, char *application) { FILE *fontfile; long i, charstart = 0, dummy; Char ch = 'A'; /* long j; if (javarun) { // clean out the font array if user changed font for(j=0; j= 1 && numtochange <= 4)) break; countup(&loopcount, 10); } return (ch == 'Y') ? -1 : numtochange; } /* showrayparms */ void getrayparms(long *treecolor, long *namecolor, long *backcolor, long *bottomcolor, long *rx,long *ry, long numtochange) { Char ch; long i, loopcount; if (numtochange == 0) { loopcount = 0; do { printf(" Type the number of one that you want to change (1-4):\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%ld%*[^\n]", &numtochange); getchar(); countup(&loopcount, 10); } while (numtochange < 1 || numtochange > 10); } switch (numtochange) { case 1: printf("\nWhich of these colors will the tree be?:\n"); printf(" White, Red, Orange, Yellow, Green, Blue, or Violet\n"); printf(" (W, R, O, Y, G, B, or V)\n"); loopcount = 0; do { printf(" Choose one: \n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); (*treecolor) = 0; for (i = 1; i <= 7; i++) { if (ch == colors[i - 1].name[0]) { (*treecolor) = i; return; } } countup(&loopcount, 10); } while ((*treecolor) == 0); break; case 2: printf("\nWhich of these colors will the species names be?:\n"); printf(" White, Red, Orange, Yellow, Green, Blue, or Violet\n"); printf(" (W, R, O, Y, G, B, or V)\n"); loopcount = 0; do { printf(" Choose one: \n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); (*namecolor) = 0; for (i = 1; i <= 7; i++) { if (ch == colors[i - 1].name[0]) { (*namecolor) = i; return; } 
} countup(&loopcount, 10); } while ((*namecolor) == 0); break; case 3: printf("\nWhich of these colors will the background be?:\n"); printf(" White, Red, Orange, Yellow, Green, Blue, or Violet\n"); printf(" (W, R, O, Y, G, B, or V)\n"); loopcount = 0; do { printf(" Choose one: \n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); (*backcolor) = 0; for (i = 1; i <= 7; i++) { if (ch == colors[i - 1].name[0]) { (*backcolor) = i; return; } } countup(&loopcount, 10); } while ((*backcolor) == 0); break; case 4: /* Dimensions for rayshade, bottom plane for povray */ if (plotter == pov) { printf("\nWhich of these colors will the bottom plane be?:\n"); printf(" White, Red, Orange, Yellow, Green, Blue, Violet, or None (no plane)\n"); printf(" (W, R, O, Y, G, B, V, or N)\n"); loopcount = 0; do { printf(" Choose one: \n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); /* If the user doesn't want a bottom plane. . . 
*/ if (ch == 'N') { (*bottomcolor) = NO_PLANE; return; } else { (*bottomcolor) = 0; for (i = 1; i <= 7; i++) { if (ch == colors[i - 1].name[0]) { (*bottomcolor) = i; return; } } } countup(&loopcount, 10); } while ((*bottomcolor) == 0); } else if (plotter == ray) { printf("\nEnter the X resolution:\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%ld%*[^\n]", rx); getchar(); printf("Enter the Y resolution:\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%ld%*[^\n]",ry); getchar(); } break; } } /* getrayparms */ long showvrmlparms(long vrmltreecolor, long vrmlnamecolor, long vrmlskycolornear, long vrmlskycolorfar, long vrmlgroundcolornear) { long i, loopcount; Char ch,input[32]; long numtochange; for (i = 1; i <= 24; i++) putchar('\n'); printf("Settings for VRML file: \n\n"); printf(" (1) Tree color: %.10s\n",colors[vrmltreecolor-1].name); printf(" (2) Species names color: %.10s\n",colors[vrmlnamecolor-1].name); printf(" (3) Horizon color: %.10s\n",colors[vrmlskycolorfar-1].name); printf(" (4) Zenith color: %.10s\n",colors[vrmlskycolornear-1].name); printf(" (5) Ground color: %.10s\n",colors[vrmlgroundcolornear-1].name); printf(" Do you want to accept these? (Yes or No)\n"); loopcount = 0; for (;;) { printf(" Type Y or N or the number (1-5) of the one to change: \n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); numtochange=atoi(input); uppercase(&input[0]); ch=input[0]; if (ch == 'Y' || ch == 'N' || (numtochange >= 1 && numtochange <= 5)) break; countup(&loopcount, 10); } return (ch == 'Y') ? 
-1 : numtochange;
}  /* showvrmlparms */


void getvrmlparms(long *vrmltreecolor, long *vrmlnamecolor,
                  long *vrmlskycolornear, long *vrmlskycolorfar,
                  long *vrmlgroundcolornear, long *vrmlgroundcolorfar,
                  long numtochange)
{
  Char ch;
  long i, loopcount;

  if (numtochange == 0) {
    loopcount = 0;
    do {
      /* the VRML settings menu offers five items, not four */
      printf(" Type the number of one that you want to change (1-5):\n");
#ifdef WIN32
      phyFillScreenColor();
#endif
      fflush(stdout);
      scanf("%ld%*[^\n]", &numtochange);
      getchar();
      countup(&loopcount, 10);
    } while (numtochange < 1 || numtochange > 5);
  }
  switch (numtochange) {

  case 1:
    printf("\nWhich of these colors will the tree be?:\n");
    printf(" White, Red, Orange, Yellow, Green, Blue, or Violet\n");
    printf(" (W, R, O, Y, G, B, or V)\n");
    loopcount = 0;
    do {
      printf(" Choose one: \n");
#ifdef WIN32
      phyFillScreenColor();
#endif
      fflush(stdout);
      scanf("%c%*[^\n]", &ch);
      getchar();
      if (ch == '\n')
        ch = ' ';
      uppercase(&ch);
      (*vrmltreecolor) = 0;
      for (i = 1; i <= 7; i++) {
        if (ch == colors[i - 1].name[0]) {
          (*vrmltreecolor) = i;
          return;
        }
      }
      countup(&loopcount, 10);
    } while ((*vrmltreecolor) == 0);
    break;

  case 2:
    printf("\nWhich of these colors will the species names be?:\n");
    printf(" White, Red, Orange, Yellow, Green, Blue, or Violet\n");
    printf(" (W, R, O, Y, G, B, or V)\n");
    loopcount = 0;
    do {
      printf(" Choose one: \n");
#ifdef WIN32
      phyFillScreenColor();
#endif
      fflush(stdout);
      scanf("%c%*[^\n]", &ch);
      getchar();
      if (ch == '\n')
        ch = ' ';
      uppercase(&ch);
      (*vrmlnamecolor) = 0;
      for (i = 1; i <= 7; i++) {
        if (ch == colors[i - 1].name[0]) {
          (*vrmlnamecolor) = i;
          return;
        }
      }
      countup(&loopcount, 10);
    } while ((*vrmlnamecolor) == 0);
    break;

  case 3:
    printf("\nWhich of these colors will the horizon be?:\n");
    printf(" White, Red, Orange, Yellow, Green, Blue, or Violet\n");
    printf(" (W, R, O, Y, G, B, or V)\n");
    loopcount = 0;
    do {
      printf(" Choose one: \n");
#ifdef WIN32
      phyFillScreenColor();
#endif
      fflush(stdout);
      scanf("%c%*[^\n]", &ch);
      getchar();
      if (ch == '\n')
        ch = ' ';
      uppercase(&ch);
      (*vrmlskycolorfar) = 0;
      for (i = 1; i <= 7; i++) {
        if (ch == colors[i - 1].name[0]) {
          (*vrmlskycolorfar) = i;
          return;
        }
      }
      countup(&loopcount, 10);
    } while ((*vrmlskycolorfar) == 0);
    break;

  case 4:
    printf("\nWhich of these colors will the zenith be?:\n");
    printf(" White, Red, Orange, Yellow, Green, Blue, or Violet\n");
    printf(" (W, R, O, Y, G, B, or V)\n");
    loopcount = 0;
    do {
      printf(" Choose one: \n");
#ifdef WIN32
      phyFillScreenColor();
#endif
      fflush(stdout);
      scanf("%c%*[^\n]", &ch);
      getchar();
      if (ch == '\n')
        ch = ' ';
      uppercase(&ch);
      (*vrmlskycolornear) = 0;
      for (i = 1; i <= 7; i++) {
        if (ch == colors[i - 1].name[0]) {
          (*vrmlskycolornear) = i;
          return;
        }
      }
      countup(&loopcount, 10);
    } while ((*vrmlskycolornear) == 0);
    break;

  case 5:
    printf("\nWhich of these colors will the ground be?:\n");
    printf(" White, Red, Orange, Yellow, Green, Blue, or Violet\n");
    printf(" (W, R, O, Y, G, B, or V)\n");
    loopcount = 0;
    do {
      printf(" Choose one: \n");
#ifdef WIN32
      phyFillScreenColor();
#endif
      fflush(stdout);
      scanf("%c%*[^\n]", &ch);
      getchar();
      if (ch == '\n')
        ch = ' ';
      uppercase(&ch);
      (*vrmlgroundcolornear) = 0;
      for (i = 1; i <= 7; i++) {
        if (ch == colors[i - 1].name[0]) {
          /* near and far ground colors are always set together */
          (*vrmlgroundcolornear) = i;
          (*vrmlgroundcolorfar) = i;
          return;
        }
      }
      countup(&loopcount, 10);
    } while ((*vrmlgroundcolornear) == 0);
    break;
  }
}  /* getvrmlparms */


void plotrparms(long ntips)
{
  /* set up initial characteristics of plotter or printer */
  long rayresx, rayresy;
  long n, loopcount;
  double xsizehold, ysizehold;

  xsizehold = xsize;
  ysizehold = ysize;
  penchange = no;
  xcorner = 0.0;
  ycorner = 0.0;
  if (dotmatrix)
    strpdiv = 1;
  switch (plotter) {

  case ray:
    penchange = yes;
    xunitspercm = 1.0;
    yunitspercm = 1.0;
    xsize = 10.0;
    ysize = 10.0;
    rayresx = 512;
    rayresy = 512;
    treecolor = 6;
    namecolor = 4;
    backcolor = 1;
    /* MSVC gave warning that bottomcolor was uninitialized.
Unsure what this should be */ bottomcolor = 1; loopcount = 0; if (!javarun) { do { n=showrayparms(treecolor,namecolor,backcolor,bottomcolor,rayresx,rayresy); if (n != -1) getrayparms(&treecolor,&namecolor,&backcolor,&bottomcolor,&rayresx,&rayresy,n); countup(&loopcount, 10); } while (n != -1); xsize = rayresx; ysize = rayresy; } break; case pov: penchange = yes; xunitspercm = 1.0; yunitspercm = 1.0; xsize = 10.0; ysize = 10.0; rayresx = 512; rayresy = 512; treecolor = 6; namecolor = 4; backcolor = 1; bottomcolor = 1; loopcount = 0; if (!javarun) { do { n=showrayparms(treecolor,namecolor,backcolor,bottomcolor,rayresx,rayresy); if (n != -1) getrayparms(&treecolor,&namecolor,&backcolor,&bottomcolor,&rayresx,&rayresy,n); countup(&loopcount, 10); } while (n != -1); } xsize = rayresx; ysize = rayresy; break; case vrml: #ifndef MAC penchange = yes; xunitspercm = 1.0; yunitspercm = 1.0; xsize = 10.0; ysize = 10.0; vrmlplotcolor = treecolor; loopcount = 0; if (!javarun) { do { n=showvrmlparms(treecolor, namecolor, vrmlskycolornear, vrmlskycolorfar, vrmlgroundcolornear); if (n != -1) getvrmlparms(&treecolor, &namecolor, &vrmlskycolornear, &vrmlskycolorfar, &vrmlgroundcolornear, &vrmlgroundcolorfar, n); countup(&loopcount, 10); } while (n != -1); } break; #endif case pict: strcpy(fontname,"Times"); penchange = yes; xunitspercm = 28.346456693; yunitspercm = 28.346456693; /*7.5 x 10 inch default PICT page size*/ xsize = 19.05; ysize = 25.40; break; case lw: penchange = yes; xunitspercm = 28.346456693; yunitspercm = 28.346456693; xsize = pagex; ysize = pagey; break; case idraw: penchange = yes; xunitspercm = 28.346456693; yunitspercm = 28.346456693; xsize = 21.59; ysize = 27.94; break; case hp: penchange = no; xunitspercm = 400.0; yunitspercm = 400.0; xsize = 24.0; ysize = 18.0; break; case tek: xunitspercm = 50.0; yunitspercm = 50.0; xsize = 20.46; ysize = 15.6; break; case ibm: #ifdef TURBOC GraphDriver = 0; detectgraph(&GraphDriver,&GraphMode); 
    getmoderange(GraphDriver,&LoMode,&HiMode);
    initgraph(&GraphDriver,&HiMode,"");
    xunitspercm = getmaxx()/25;
    yunitspercm = getmaxy() / 17.5;
    restorecrtmode();
    xsize = 25.0;
    ysize = 17.5;
#endif
#ifdef QUICKC
    setupgraphics();
#endif
    break;

  case mac:
    penchange = yes;
    xunitspercm = 28.346456693;
    yunitspercm = 28.346456693;
    xsize = winwidth / xunitspercm;
    ysize = winheight / yunitspercm;
    break;

  case houston:
    penchange = yes;
    xunitspercm = 100.0;
    yunitspercm = 100.0;
    xsize = 24.5;
    ysize = 17.5;
    break;

  case decregis:
    xunitspercm = 30.0;
    yunitspercm = 30.0;
    xsize = 25.0;
    ysize = 15.0;
    break;

  case epson:
    penchange = yes;
    xunitspercm = 47.244;
    yunitspercm = 28.346;
    xsize = 18.70;
    ysize = 22.0;
    strpwide = 960;
    strpdeep = 8;
    strpdiv = 1;
    break;

  case oki:
    penchange = yes;
    xunitspercm = 56.692;
    yunitspercm = 28.346;
    xsize = 19.0;
    ysize = 22.0;
    strpwide = 1100;
    strpdeep = 8;
    strpdiv = 1;
    break;

  case citoh:
    penchange = yes;
    xunitspercm = 28.346;
    yunitspercm = 28.346;
    xsize = 22.3;
    ysize = 26.0;
    strpwide = 640;
    strpdeep = 8;
    strpdiv = 1;
    break;

  case toshiba:
    //printf("in plotrparms\n");
    //fflush(stdout);
    penchange = yes;
    xunitspercm = 70.866;
    yunitspercm = 70.866;
    xsize = 19.0;
    ysize = 25.0;
    strpwide = 1350;
    strpdeep = 24;
    strpdiv = 4;
    break;

  case pcl:
    penchange = yes;
    xsize = 21.59;
    ysize = 27.94;
    xunitspercm = 118.11023622;        /* 300 DPI = 118.1 DPC */
    yunitspercm = 118.11023622;
    strpwide = 2550;                   /* 8.5 * 300 DPI */
    strpdeep = DEFAULT_STRIPE_HEIGHT;  /* height of the strip */
    strpdiv = DEFAULT_STRIPE_HEIGHT;   /* in this case == strpdeep */
    /* this is information for 300 DPI resolution */
    switch (hpresolution) {
    case 75:
      strpwide /= 4;
      xunitspercm /= 4.0;
      yunitspercm /= 4.0;
      break;
    case 150:
      strpwide /= 2;
      xunitspercm /= 2.0;
      yunitspercm /= 2.0;
      break;
    case 300:
      break;
    }
    break;

  case bmp:             /* since it's resolution dependent, make 1x1 pixels */
    penchange = yes;    /* per square cm for easier math.
*/ xunitspercm = 1.0; yunitspercm = 1.0; xsize = userxsize / xunitspercm; ysize = userysize / yunitspercm; strpdeep = DEFAULT_STRIPE_HEIGHT; strpdiv = DEFAULT_STRIPE_HEIGHT; strpwide = (long)(xsize * xunitspercm); break; case xbm: /* since it's resolution dependent, make 1x1 pixels */ penchange = yes; /* per square cm for easier math. */ xunitspercm = 1.0; yunitspercm = 1.0; xsize = userxsize / xunitspercm; ysize = userysize / yunitspercm; strpdeep = 10; strpdiv = 10; strpwide = (long)(xsize*xunitspercm); break; case pcx: penchange = yes; xsize = 21.16; ysize = 15.88; strpdeep = 10; strpdiv = 10; xunitspercm = strpwide / xsize; switch (resopts) { case 1: strpwide = 640; yunitspercm = 350 / ysize; break; case 2: strpwide = 800; yunitspercm = 600 / ysize; break; case 3: strpwide = 1024; yunitspercm = 768 / ysize; break; } break; case fig: penchange = yes; xunitspercm = 31.011; yunitspercm = 29.78; xsize = 25.4; ysize = 20.32; break; case gif: case other: break; default: break; /* initial parameter settings for a new plotter go here */ } if (xsizehold != 0.0 && ysizehold != 0.0) { xmargin = xmargin * xsize / xsizehold; ymargin = ymargin * ysize / ysizehold; } } /* plotrparms */ void getplotter() { long loopcount; Char ch,input[100]; clearit() ; printf("\nWhich plotter or printer will the tree be drawn on?\n"); printf("(many other brands or models are compatible with these)\n\n"); printf(" type: to choose one compatible with:\n\n"); printf(" L Postscript printer file format\n"); printf(" M PICT format (for drawing programs)\n"); printf(" J HP Laserjet PCL file format\n"); printf(" W MS-Windows Bitmap\n"); #ifdef DOS printf(" I IBM PC graphics screens\n"); #endif printf(" F FIG 2.0 drawing program format \n"); printf(" A Idraw drawing program format \n"); printf(" Z VRML Virtual Reality Markup Language file\n"); printf(" P PCX file format (for drawing programs)\n"); printf(" K TeKtronix 4010 graphics terminal\n"); printf(" X X Bitmap format\n"); printf(" V POVRAY 3D 
rendering program file\n"); printf(" R Rayshade 3D rendering program file\n"); printf(" H Hewlett-Packard pen plotter (HPGL file format)\n"); printf(" D DEC ReGIS graphics (VT240 terminal)\n"); printf(" E Epson MX-80 dot-matrix printer\n"); printf(" C Prowriter/Imagewriter dot-matrix printer\n"); printf(" T Toshiba 24-pin dot-matrix printer\n"); printf(" O Okidata dot-matrix printer\n"); printf(" B Houston Instruments plotter\n"); printf(" U other: one you have inserted code for\n"); loopcount = 0; do { printf(" Choose one: \n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); countup(&loopcount, 10); } #ifdef DOS while (strchr("LJKHIDBECOTUAZPXRMFWV",ch) == NULL); #endif #ifndef DOS while (strchr("LJKHDBECOTAZUPXRMFWV",ch) == NULL); #endif switch (ch) { case 'L': plotter = lw; strcpy(fontname, "Times-Roman"); break; case 'A': plotter = idraw; strcpy(fontname, "Times-Bold"); break; case 'M': plotter = pict; strcpy(fontname, "Times"); break; case 'R': plotter = ray; strcpy(fontname, "Hershey"); break; case 'V': plotter = pov; strcpy(fontname, "Hershey"); break; case 'Z': plotter = vrml; strcpy(fontname, "Hershey"); treecolor = 5; namecolor = 4; vrmlskycolornear = 6; vrmlskycolorfar = 6; vrmlgroundcolornear = 3; vrmlgroundcolorfar = 3; break; case 'J': plotter = pcl; strcpy(fontname, "Hershey"); printf("Please select Laserjet resolution\n\n"); printf("1: 75 DPI\n2: 150 DPI\n3: 300 DPI\n\n"); loopcount = 0; do { #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); ch = atoi(input); countup(&loopcount, 10); } while (ch != 1 && ch != 2 && ch != 3); hpresolution = 75*(1<<(ch-1)); /* following pcl init code copied here from plotrparms */ xunitspercm = 118.11023622; /* 300 DPI = 118.1 DPC */ yunitspercm = 118.11023622; strpwide = 2550; /* 8.5 * 300 DPI */ strpdeep = DEFAULT_STRIPE_HEIGHT; /* height of the strip */ strpdiv = DEFAULT_STRIPE_HEIGHT; /* in this case == strpdeep */ /* this is information for 
300 DPI resolution */ switch (hpresolution) { case 75: strpwide /= 4; xunitspercm /= 4.0; yunitspercm /= 4.0; break; case 150: strpwide /= 2; xunitspercm /= 2.0; yunitspercm /= 2.0; break; case 300: break; } break; case 'K': plotter = tek; strcpy(fontname, "Hershey"); break; case 'H': plotter = hp; strcpy(fontname, "Hershey"); break; case 'I': plotter = ibm; strcpy(fontname, "Hershey"); break; case 'D': plotter = decregis; strcpy(fontname, "Hershey"); break; case 'B': plotter = houston; strcpy(fontname, "Hershey"); break; case 'E': plotter = epson; strcpy(fontname, "Hershey"); break; case 'C': plotter = citoh; strcpy(fontname, "Hershey"); break; case 'O': plotter = oki; strcpy(fontname, "Hershey"); break; case 'T': plotter = toshiba; strcpy(fontname, "Hershey"); break; case 'P': plotter = pcx; strcpy(fontname, "Hershey"); printf("Please select the PCX file resolution\n\n"); printf("1: EGA 640 X 350\n"); printf("2: VGA 800 X 600\n"); printf("3: VGA 1024 X 768\n\n"); loopcount = 0; do { #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); ch = (char)atoi(input); uppercase(&ch); countup(&loopcount, 10); } while (ch != 1 && ch != 2 && ch != 3); switch (ch) { case 1: strpwide = 640; yunitspercm = 350 / ysize; resopts = 1; break; case 2: strpwide = 800; yunitspercm = 600 / ysize; resopts = 2; break; case 3: strpwide = 1024; yunitspercm = 768 / ysize; resopts = 3; break; } break; case 'W': plotter = bmp; strcpy(fontname, "Hershey"); printf("Please select the MS-Windows bitmap file resolution\n"); printf("X resolution?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &userxsize); getchar(); printf("Y resolution?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &userysize); getchar(); xunitspercm = 1.0; yunitspercm = 1.0; /* Assuming existing reasonable margin values, set the margins to be the same as those in the previous output mode/resolution. 
This corrects the problem of the tree being hard up against the border when large resolutions are entered. */ xmargin = userxsize / xsize * xmargin; ymargin = userysize / ysize * ymargin; xsize = userxsize; ysize = userysize; strpdeep = DEFAULT_STRIPE_HEIGHT; strpdiv = DEFAULT_STRIPE_HEIGHT; strpwide = (long)xsize; break; case 'X': plotter = xbm; strcpy(fontname, "Hershey"); printf("Please select the X-bitmap file resolution\n"); printf("X resolution?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &userxsize); getchar(); printf("Y resolution?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &userysize); getchar(); xunitspercm = 1.0; yunitspercm = 1.0; /* Assuming existing reasonable margin values, set the margins to be the same as those in the previous output mode/resolution. This corrects the problem of the tree being hard up against the border when large resolutions are entered. */ xmargin = userxsize / xsize * xmargin; ymargin = userysize / ysize * ymargin; xsize = userxsize; ysize = userysize; strpdeep = DEFAULT_STRIPE_HEIGHT; strpdiv = DEFAULT_STRIPE_HEIGHT; strpwide = (long)xsize; break; case 'F': plotter = fig; strcpy(fontname, "Times-Roman"); break; case 'U': plotter = other; break; } dotmatrix = (plotter == epson || plotter == oki || plotter == citoh || plotter == toshiba || plotter == pcx || plotter == pcl || plotter == xbm || plotter == bmp); } /* getplotter */ void changepen(pentype pen) { Char picthi, pictlo; long pictint; lastpen = pen; switch (pen) { case treepen: linewidth = treeline; if (plotter == hp) fprintf(plotfile, "SP1;\n"); if (plotter == lw) { fprintf(plotfile, "stroke %8.2f setlinewidth \n", treeline); fprintf(plotfile, " 1 setlinecap 1 setlinejoin \n"); } break; case labelpen: linewidth = labelline; if (plotter == hp) fprintf(plotfile, "SP2;\n"); if (plotter == lw) { fprintf(plotfile, " stroke%8.2f setlinewidth \n", labelline); fprintf(plotfile, "1 setlinecap 1 
setlinejoin \n"); } break; } #ifdef MAC if (plotter == mac){ pictint = ( long)(linewidth + 0.5); if (pictint ==0) pictint = 1; } #endif if (plotter != pict) return; pictint = ( long)(linewidth + 0.5); if (pictint == 0) pictint = 1; picthi = (Char)(pictint / 256); pictlo = (Char)(pictint & 255); fprintf(plotfile, "\007%c%c%c%c", picthi, pictlo, picthi, pictlo); bytewrite += 5; } /* changepen */ int readafmfile(char *filename, short *metric) { char line[256], word1[100], word2[100]; int scanned = 1, nmetrics=0, inmetrics, charnum, charlen, i, capheight=0; FILE *fp; fp = fopen(filename,"r"); if (!fp) return 0; inmetrics = 0; for (i=0;i<256;i++){ metric[i] = (short)0; } for (;;){ scan_eoln(fp); if (scanned != 1 ) break; scanned=sscanf(line,"%s %s",word1,word2); if (scanned == 2 && strcmp(word1,"CapHeight") == 0) capheight = atoi(word2); if (inmetrics){ sscanf(line,"%*s %s %*s %*s %s",word1,word2); charnum = atoi(word1); charlen = atoi(word2); nmetrics--; if (nmetrics == 0) break; if (charnum != -1 && charnum >= 32) metric[charnum-31] = charlen; } else if (scanned == 2 && strcmp(word1,"StartCharMetrics") == 0) nmetrics = atoi(word2), inmetrics = 1; if ((strcmp(word1,"EndCharMetrics") == 0) || (feof(fp))) break; } FClose(fp); metric[0] = capheight; return 1; } /* readafmfile */ void metricforfont(char *fontname, short *fontmetric) { int i; long loopcount; if ((strcmp(fontname,"Helvetica") == 0) || (strcmp(fontname,"Helvetica-Oblique") == 0)) for (i=31;i<256;++i) fontmetric[i-31] = helvetica_metric[i-31]; else if ((strcmp(fontname,"Helvetica-Bold") == 0) || (strcmp(fontname,"Helvetica-BoldOblique") == 0)) for (i=31;i<256;++i) fontmetric[i-31] = helveticabold_metric[i-31]; else if (strcmp(fontname,"Times-Roman") == 0) for (i=31;i<256;++i) fontmetric[i-31] = timesroman_metric[i-31]; else if (strcmp(fontname,"Times") == 0) for (i=31;i<256;++i) fontmetric[i-31] = timesroman_metric[i-31]; else if (strcmp(fontname,"Times-Italic") == 0) for (i=31;i<256;++i) fontmetric[i-31] = 
timesitalic_metric[i-31]; else if (strcmp(fontname,"Times-Bold") == 0) for (i=31;i<256;++i) fontmetric[i-31] = timesbold_metric[i-31]; else if (strcmp(fontname,"Times-BoldItalic") == 0) for (i=31;i<256;++i) fontmetric[i-31] = timesbolditalic_metric[i-31]; else if (strncmp(fontname,"Courier",7) == 0){ fontmetric[0] = 562; for (i=32;i<256;++i) fontmetric[i-31] = (short)600; } else { if (didloadmetric){ for (i=31;i<256;++i) fontmetric[i-31] = unknown_metric[i-31];} else { didloadmetric = 1; sprintf(afmfile,"%s.afm",fontname); /* search current dir */ if (readafmfile(afmfile,unknown_metric)){ for (i=31;i<256;++i) fontmetric[i-31] = unknown_metric[i-31]; return;} sprintf(afmfile,"%s%s.afm",AFMDIR,fontname); /* search afm dir */ if (readafmfile(afmfile,unknown_metric)){ for (i=31;i<256;++i) fontmetric[i-31] = unknown_metric[i-31]; return;} #ifdef NeXT sprintf(afmfile,"%s/Library/Fonts/%s.font/%s.afm",getenv("HOME"), fontname,fontname); if (readafmfile(afmfile,unknown_metric)){ for (i=31;i<256;++i) fontmetric[i-31] = unknown_metric[i-31]; return;} sprintf(afmfile,"/LocalLibrary/Fonts/%s.font/%s.afm",fontname,fontname); if (readafmfile(afmfile,unknown_metric)){ for (i=31;i<256;++i) fontmetric[i-31] = unknown_metric[i-31]; return;} #endif loopcount = 0; if (javarun) { for (i=31;i<256;++i) fontmetric[i-31] = timesroman_metric[i-31], unknown_metric[i-31] = timesroman_metric[i-31], didloadmetric =1; return; } for (;;){ printf("Enter the path of the %s.afm file, or \"none\" for best guess:", fontname); getstryng(afmfile); if (strcmp(afmfile,"none") == 0){ for (i=31;i<256;++i) fontmetric[i-31] = timesroman_metric[i-31], unknown_metric[i-31] = timesroman_metric[i-31], didloadmetric =1; return; } else { if (readafmfile(afmfile,unknown_metric)){ for (i=31;i<256;++i) fontmetric[i-31] = unknown_metric[i-31]; return;} else printf("Can't read that file. 
Please re-enter.\n");
        }
        countup(&loopcount, 10);
      }
    }
  }
}  /* metricforfont */


double heighttext(fonttype font, char *fontname)
{
  /* height of the font: font[2] for the built-in Hershey font, otherwise
     the cap height stored in element 0 of the AFM metric table */
  short afmetric[256];
#ifdef MAC
  FontInfo info;
#endif

  if (strcmp(fontname,"Hershey") == 0)
    return (double)font[2];
#ifdef MAC
  else if (((plotter == pict || plotter == mac) &&
            (((grows == vertical && labelrotation == 0.0) ||
              (grows == horizontal && labelrotation == 90.0))))){
    TextFont(macfontid(fontname));
    TextSize((int)(1000));
    TextFace((int)((pictbold ? 1: 0) | (pictitalic ? 2 : 0)|
                   (pictoutline ? 8 : 0)|(pictshadow ? 16 : 0)));
    GetFontInfo(&info);
    TextFont(macfontid("courier"));
    TextSize(10);
    TextFace(0);
    return (double)info.ascent;
  }
#endif
  else{
    metricforfont(fontname,afmetric);
    return (double)afmetric[0];}
}  /* heighttext */


double lengthtext(char *pstring, long nchars, char *fontname, fonttype font)
{  /* lengthtext */
  long i, j, code;
  static double sumlength;
  long sumbigunits;
  short afmetric[256];

  sumlength = 0.0;
  if (strcmp(fontname,"Hershey") == 0) {
    for (i = 0; i < nchars; i++) {
      code = pstring[i];
      //printf("pstring[i]: %c code: %li\n", pstring[i], code);
      j = 1;
      /* follow the chain of character records until code is found */
      while (font[j] != code && font[j - 1] != 0) {
        j = font[j - 1];
      }
      //printf("j: %li font[j]: %i code: %li\n", j, font[j], code);
      if (font[j] == code) {
        sumlength += font[j + 2];
      }
    }
    return sumlength;
  }
#ifdef MAC
  else if (((plotter == pict || plotter == mac) &&
            (((grows == vertical && labelrotation == 0.0) ||
              (grows == horizontal && labelrotation == 90.0))))){
    TextFont(macfontid(fontname));
    TextSize((int)(1000));
    TextFace((int)((pictbold ? 1: 0) | (pictitalic ? 2 : 0)|
                   (pictoutline ? 8 : 0)|(pictshadow ?
16 : 0))); sumbigunits = 0; for (i = 0; i < nchars; i++) sumbigunits += (long)CharWidth(pstring[i]); TextFace(0); TextSize(10); TextFont(macfontid("courier")); return (double)sumbigunits; } #endif else { metricforfont(fontname,afmetric); sumbigunits = 0; for (i = 0; i < nchars; i++) sumbigunits += afmetric[(int)(1+(unsigned char)pstring[i] - 32)]; sumlength = (double)sumbigunits; } return sumlength; } /* lengthtext */ void plotchar(long *place, struct LOC_plottext *text) { text->heightfont = text->font[*place + 1]; text->yfactor = text->height / text->heightfont; text->xfactor = text->yfactor; *place += 3; do { (*place)++; text->coord = text->font[*place - 1]; if (text->coord > 0) text->penstatus = pendown; else text->penstatus = penup; text->coord = abs(text->coord); text->coord %= 10000; text->xfont = (text->coord / 100 - xstart) * text->xfactor; text->yfont = (text->coord % 100 - ystart) * text->yfactor; text->xplot = text->xx + (text->xfont * text->cosslope + text->yfont * text->sinslope) * text->compress; text->yplot = text->yy - text->xfont * text->sinslope + text->yfont * text->cosslope; plot(text->penstatus, text->xplot, text->yplot); } while (abs(text->font[*place - 1]) < 10000); text->xx = text->xplot; text->yy = text->yplot; } /* plotchar */ void swap_charptr(char **one, char **two) { char *tmp = (*one); (*one)= (*two); (*two) = tmp; } /* swap */ void plotpb() { pagecount++; fprintf ( plotfile, "\n showpage \n%%%%PageTrailer\n" ); fprintf ( plotfile, "%%%%DocumentFonts: %s\n", (strcmp(fontname,"Hershey") == 0) ? 
"" : fontname ); fprintf ( plotfile, "%%%%\n%%%%Page: %ld %ld\n", pagecount, pagecount ); fprintf ( plotfile, "%%%%PageBoundingBox: 0 0 %d %d\n", (int)(xunitspercm*paperx), (int)(yunitspercm*papery) ); fprintf ( plotfile, "%%%%PageFonts: (atend)\n%%%%BeginPageSetup\n%%%%PaperSize: Letter\n" ); /* hack to make changepen work w/o errors */ fprintf ( plotfile, "0 0 moveto\n" ); changepen(lastpen); } /* plotpb */ void drawit ( char *fontname, double *xoffset, double *yoffset, long numlines, node *root ) { long i, j, line, xpag, ypag; long test_long ; /* To get a division out of a loop */ //(*xoffset) = 0.0; //(*yoffset) = 0.0; xpag = (int)((pagex-hpmargin-0.01)/(paperx - hpmargin))+1; ypag = (int)((pagey-vpmargin-0.01)/(papery - vpmargin))+1; //printf("xpag: %li\n", xpag); //printf("ypag: %li\n", ypag); if (dotmatrix){ strptop = (long)(ysize * yunitspercm); strpbottom = numlines*strpdeep + 1; //printf("strptop: %li strpbottom: %li\n", strptop, strpbottom); //fflush(stdout); } else { pagecount = 1; for (j=0; j DEFAULT_STRIPE_HEIGHT){ /* large stripe, do in DEFAULT_STRIPE_HEIGHT (20)-line */ //printf(" large stripe\n"); //fflush(stdout); for (i=0;i b) ? a : b) #ifdef min #undef min #endif #define min(a,b) ((a > b) ? 
b : a) #define max4(a,b,c,d) (max(max(a,b),max(c,d))) #define min4(a,b,c,d) (min(min(a,b),min(c,d))) struct LOC_plottext text; long i, j, code; double pointsize; int epointsize; /* effective pointsize before scale in idraw matrix */ double iscale; double textlen; double px0,py0,px1,py1; /* square bounding box of text */ text.heightfont = font_[2]; pointsize = (((height_ / xunitspercm) / 2.54) * 72.0); if (strcmp(fontname,"Hershey") !=0) pointsize *= ((double)1000.0 / heighttext(font_,fontname)); text.height = height_; text.compress = cmpress2; text.font = font_; text.xx = x; text.yy = y; text.sinslope = sin(pi * slope / 180.0); text.cosslope = cos(pi * slope / 180.0); if (strcmp(fontname,"Hershey") == 0){ for (i = 0; i < nchars; i++) { code = pstring[i]; j = 1; while (text.font[j] != code && text.font[j - 1] != 0) j = text.font[j - 1]; plotchar(&j, &text); } } /* print native font. idraw, PS, pict, and fig. */ else if (plotter == fig) { fprintf(plotfile,"4 0 %d %d 0 -1 0 %1.5f 4 19 163 %d %d %s\001\n", figfontid(fontname), /* font ID */ (int)pointsize, /* font size */ (double)0.0, /* font rotation */ (int)x, /* x position */ (int)(606.0 - y), /* y position */ pstring); } else if (plotter == lw) { /* If there's NO possibility that the line intersects the square bounding * box of the font, leave it out. Otherwise, let postscript clip to region. * Compute text boundary, be REAL generous. 
*/ textlen = (lengthtext(pstring,nchars,fontname,font_)/1000)*pointsize; px0 = min4(x + (text.cosslope * pointsize), x - (text.cosslope * pointsize), x + (text.cosslope * pointsize) + (text.sinslope * textlen), x - (text.cosslope * pointsize) + (text.sinslope * textlen)) /28.346; px1 = max4(x + (text.cosslope * pointsize), x - (text.cosslope * pointsize), x + (text.cosslope * pointsize) + (text.sinslope * textlen), x - (text.cosslope * pointsize) + (text.sinslope * textlen)) /28.346; py0 = min4(y + (text.sinslope * pointsize), y - (text.sinslope * pointsize), y + (text.sinslope * pointsize) + (text.cosslope * textlen), y - (text.sinslope * pointsize) + (text.cosslope * textlen)) /28.346; py1 = max4(y + (text.sinslope * pointsize), y - (text.sinslope * pointsize), y + (text.sinslope * pointsize) + (text.cosslope * textlen), y - (text.sinslope * pointsize) + (text.cosslope * textlen)) /28.346; /* if rectangles intersect, print it. */ if (rectintersects(px0,py0,px1,py1,clipx0,clipy0,clipx1,clipy1)) { fprintf(plotfile,"gsave\n"); fprintf(plotfile,"/%s findfont %f scalefont setfont\n",fontname, pointsize); fprintf(plotfile,"%f %f translate %f rotate\n", x-(clipx0*xunitspercm),y-(clipy0*xunitspercm),-slope); fprintf(plotfile,"0 0 moveto\n"); fprintf(plotfile,"(%s) show\n",pstring); fprintf(plotfile,"grestore\n"); } } else if (plotter == idraw) { iscale = pointsize / 12.0; y += text.height * text.cosslope; x += text.height * text.sinslope; fprintf(plotfile, "Begin %%I Text\n"); fprintf(plotfile, "%%I cfg Black\n"); fprintf(plotfile, "0 0 0 SetCFg\n"); fprintf(plotfile, "%%I f %s\n", findXfont(fontname,pointsize,&iscale,&epointsize)); fprintf(plotfile,"%s %d SetF\n",fontname,epointsize); fprintf(plotfile, "%%I t\n"); fprintf(plotfile, "[ %f %f %f %f %f %f ] concat\n", text.cosslope*iscale, -text.sinslope*iscale, text.sinslope*iscale, text.cosslope*iscale, x+216.0 ,y+285.0); fprintf(plotfile, "%%I\n"); fprintf(plotfile, "[\n(%s)\n] Text\nEnd\n\n",pstring); } else if 
(plotter == pict || plotter == mac) { /* txfont: */ fprintf(plotfile,"%c",(unsigned char)3); pictoutint(plotfile,macfontid(fontname)); /* txsize: */ fprintf(plotfile,"%c",13); pictoutint(plotfile,(int)(pointsize+0.5)); /* txface: */ fprintf(plotfile,"%c%c",4, (int)((pictbold ? 1: 0) | (pictitalic ? 2 : 0)| (pictoutline ? 8 : 0)|(pictshadow ? 16 : 0))); /* txfloc: */ fprintf(plotfile,"%c",40); pictoutint(plotfile,(int)floor(ysize * yunitspercm - y + 0.5)); pictoutint(plotfile,(int)(x+0.5)); fprintf(plotfile,"%c%s",(char)strlen(pstring),pstring); bytewrite+=(14+strlen(pstring)); } } /* plottext */ void makebox(char *fn,double *xo,double *yo,double *scale,long ntips) /* fn--fontname| xo,yo--x and y offsets */ { /* draw the box on screen which represents plotting area. */ char ch; long xpag,ypag,i,j; double xpagecorrection, ypagecorrection; /* if (previewer != winpreview && previewer != mac && previewer != xpreview) { printf("\nWe now will preview the tree. The box that will be\n"); printf("plotted on the screen represents the boundary of the\n"); printf("final plotting surface. 
To see the preview, press on\n"); printf("the ENTER or RETURN key (you may need to do it twice).\n"); printf("When finished viewing it, press on that key again.\n"); } */ oldpenchange = penchange; oldxsize = xsize; oldysize = ysize; oldxunitspercm = xunitspercm; oldyunitspercm = yunitspercm; oldxcorner = xcorner; oldycorner = ycorner; oldxmargin = xmargin; oldymargin = ymargin; oldhpmargin = hpmargin; oldvpmargin = vpmargin; oldplotter = plotter; /* JRMDebug*/ /* printf("***in makebox***\n"); printf("penchange: %i\n", penchange); printf("xsize: %f\n", xsize); printf("ysize: %f\n", ysize); printf("xunitspercm: %f\n", xunitspercm); printf("yunitspercm: %f\n", yunitspercm); printf("xcorner: %f\n", xcorner); printf("ycorner: %f\n", ycorner); printf("xmargin: %f\n", xmargin); printf("ymargin: %f\n", ymargin); printf("hpmargin: %f\n", hpmargin); printf("vpmargin: %f\n", vpmargin); printf("plotter: %i\n", plotter); fflush(stdout); */ /* plotter = lw; if (previewer != winpreview && previewer != mac && previewer != xpreview) { #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); (void)getchar(); if (ch == '\n') ch = ' '; } */ //plotrparms(ntips); //initplotter(ntips,fn); xcorner += 0.05 * xsize; ycorner += 0.05 * ysize; xsize *= 0.9; ysize *= 0.9; (*scale) = ysize / oldysize; if (xsize / oldxsize < (*scale)) (*scale) = xsize / oldxsize; xpagecorrection = oldxsize / pagex; ypagecorrection = oldysize / pagey; (*xo) = (xcorner + (xsize - oldxsize * (*scale)) / 2.0) / (*scale); (*yo) = (ycorner + (ysize - oldysize * (*scale)) / 2.0) / (*scale); /* printf("xcorner: %f\n",xcorner); printf("ycorner: %f\n",ycorner); printf("xsize: %f\n",xsize); printf("ysize: %f\n",ysize); printf("oldxsize: %f\n",oldxsize); printf("oldysize: %f\n",oldysize); printf("xo: %f\n",*xo); printf("yo: %f\n",*yo); printf("scale: %f\n",*scale); fflush(stdout); */ xscale = (*scale) * xunitspercm; yscale = (*scale) * yunitspercm; xmargin *= (*scale); ymargin *= (*scale); hpmargin 
*= (*scale); vpmargin *= (*scale); xpag = (int)((pagex-hpmargin-0.01)/(paperx - hpmargin))+1; ypag = (int)((pagey-vpmargin-0.01)/(papery - vpmargin))+1; /* printf("xscale: %f\n", xscale); printf("yscale: %f\n", yscale); printf("xmargin: %f\n", xmargin); printf("ymargin: %f\n", ymargin); printf("hpmargin: %f\n", hpmargin); printf("vpmargin: %f\n", vpmargin); printf("xpag: %li\n", xpag); printf("ypag: %li\n", ypag); fflush(stdout); */ /* draw the outer borders */ plot(penup, xscale * (*xo), yscale * (*yo)); //printf("line from: (%f, %f)\n", xscale * (*xo), yscale * (*yo)); plot(pendown, xscale * (*xo), yscale * ((*yo) + pagey * ypagecorrection)); //printf(" to: (%f, %f)\n", xscale * (*xo), yscale * ((*yo) + pagey * ypagecorrection)); plot(pendown, xscale * ((*xo) + pagex * xpagecorrection), yscale * ((*yo) + pagey * ypagecorrection)); //printf(" to: (%f, %f)\n", xscale * ((*xo) + pagex * xpagecorrection), yscale * ((*yo) + pagey * ypagecorrection)); plot(pendown, xscale * ((*xo) + pagex * xpagecorrection), yscale * (*yo)); //printf(" to: (%f, %f)\n", xscale * ((*xo) + pagex * xpagecorrection), yscale * (*yo)); plot(pendown, xscale * (*xo), yscale * (*yo)); //printf(" to: (%f, %f)\n", xscale * (*xo) , yscale * (*yo)); //fflush(stdout); /* we've done the extent, now draw the dividing lines: */ for (i=0; i /* Metrowerks for windows defines WIN32 here */ #define swap_m(x,y) temp = y,y=x,x=temp; extern long winheight; extern long winwidth; #ifdef WIN32 #include HDC hdc; /******* Menu Defines *******/ #define IDM_ABOUT 1000 #define IDM_PLOT 1001 #define IDM_CHANGE 1002 #define IDM_QUIT 1003 #define XWINPERCENT 0.66 #define YWINPERCENT 0.66 #endif #ifdef QUICKC extern struct videoconfig myscreen; #endif /* #ifdef OSX_CARBON #include #endif */ #include "draw.h" #include "phylip.h" static long eb[]={ 0 , 1 ,2 ,3 ,55,45,46,47,22,5,37,11,12,13,14,15,16,17,18,19,60,61,50,38, 24, 25,63,39,28,29,30,31,64,90,127,123,91,108,80,125,77,93,92,78,107,96, 
75,97,240,241,242,243,244,245,246,247,248,249,122,94,76,126,110,111, 124, 193,194,195,196,197,198,199,200,201,209,210,211, 212,213,214,215,216,217, 226,227,228,229,230,231,232,233,173,224,189, 95,109,121,129,130,131,132, 133,134,135,136,137,145,146,147,148,149,150, 151, 152,153,162,163,164,165, 166,167,168,169,192,79,208,161,7}; double oldxreal, oldyreal; boolean didenter, didexit, curvetrue; extern long vrmlplotcolor; extern double oldx, oldy ; extern node *root; extern long nmoves, oldpictint ; extern long rootmatrix[51][51]; extern long strpbottom,strptop,strpwide,strpdeep; extern boolean dotmatrix, empty; extern double ynow, ysize, xsize, yunitspercm; extern FILE *plotfile; extern plottertype plotter; extern striptype stripe; extern long treecolor, namecolor, vrmlskycolorfar, vrmlskycolornear, vrmlgroundcolorfar, vrmlgroundcolornear; extern colortype colors[7]; extern vrmllighttype vrmllights[3]; /* Added by Dan F. for the new previewing paradigm */ extern double labelline,linewidth,oldxhigh,oldxlow,oldyhigh,oldylow, vrmllinewidth, raylinewidth,treeline,oldxsize,oldysize,oldxunitspercm, oldyunitspercm,oldxcorner,oldycorner,clipx0,clipx1,clipy0,clipy1; /* func. 
protocol added for vrml - danieyek 981111 */ extern long strpdiv,hpresolution; extern boolean pictbold,pictitalic, pictshadow,pictoutline; extern double expand,xcorner,xnow,xscale,xunitspercm, ycorner,yscale,labelrotation, labelheight,ymargin,pagex,pagey,paperx,papery,hpmargin,vpmargin; extern long filesize; extern growth grows; extern enum {yes,no} penchange,oldpenchange; extern plottertype oldplotter; extern char resopts; extern winactiontype winaction; #ifndef OLDC /* function prototypes */ void plotdot(long, long); void circlepoints(int, int, int, int); void drawpen(long, long, long); void drawfatline(long, long, long, long, long); void idellipse(double, double); void splyne(double,double,double,double,boolean,long,boolean,boolean); static void putshort(FILE *, int); static void putint(FILE *, int); void reverse_bits (byte *, int); void makebox_no_interaction(char *, double *, double *, double *, long); void void_func(void); /* function prototypes */ #endif void plotdot(long ix, long iy) { /* plot one dot at ix, iy */ long ix0, iy0, iy1 = 0, iy2 = 0; iy0 = strptop - iy; if ((unsigned)iy0 > strpdeep || ix <= 0 || ix > strpwide) return; empty = false; ix0 = ix; switch (plotter) { case citoh: iy1 = 1; iy2 = iy0; break; case epson: iy1 = 1; iy2 = 7 - iy0; break; case oki: iy1 = 1; iy2 = 7 - iy0; break; case toshiba: iy1 = iy0 / 6 + 1; iy2 = 5 - iy0 % 6; break; case pcx: iy1 = iy0 + 1; ix0 = (ix - 1) / 8 + 1; iy2 = 7 - ((ix - 1) & 7); break; case pcl: iy1 = iy0 + 1; ix0 = (ix - 1) / 8 + 1; iy2 = 7 - ((ix - 1) & 7); break; case bmp: iy1 = iy0 + 1; ix0 = (ix - 1) / 8 + 1; iy2 = 7 - ((ix - 1) & 7); case xbm: case gif: iy1 = iy0 + 1; ix0 = (ix - 1) / 8 + 1; iy2 = (ix - 1) & 7; break; case lw: case hp: case tek: case mac: case houston: case decregis: case fig: case pict: case ray: case pov: case idraw: case ibm: case other: break; default: /* vrml not handled */ break; /* code for making dot array for a new printer goes here */ } stripe[iy1 - 1][ix0 - 1] |= (unsigned 
char)1< x){ if (d < 0) { d = d + deltaE; deltaE += 2; deltaSE += 2; x++; } else { d+=deltaSE; deltaE += 2; deltaSE += 4; x++; y--; } circlepoints(x,y,x0,y0); } } /* drawpen */ void drawfatline(long ixabs, long iyabs, long ixnow, long iynow, long penwide) { long temp, xdiff, ydiff, err, x1, y1; didenter = false; didexit = false; if (ixabs < ixnow) { temp = ixnow; ixnow = ixabs; ixabs = temp; temp = iynow; iynow = iyabs; iyabs = temp; } xdiff = ixabs - ixnow; ydiff = iyabs - iynow; if (ydiff >= 0) { if (xdiff >= ydiff) { err = -(xdiff / 2); x1 = ixnow; while (x1 <= ixabs && !(didenter && didexit)) { drawpen(x1, iynow, penwide); err += ydiff; if (err > 0) { iynow++; err -= xdiff; } x1++; } return; } err = -(ydiff / 2); y1 = iynow; while (y1 < iyabs && !(didenter && didexit)) { drawpen(ixnow, y1, penwide); err += xdiff; if (err > 0) { ixnow++; err -= ydiff; } y1++; } return; } if (xdiff < -ydiff) { err = ydiff / 2; y1 = iynow; while (y1 >= iyabs && !(didenter && didexit)) { drawpen(ixnow, y1, penwide); err += xdiff; if (err > 0) { ixnow++; err += ydiff; } y1--; } return; } err = -(xdiff / 2); x1 = ixnow; while (x1 <= ixabs && !(didenter && didexit)) { drawpen(x1, iynow, penwide); err -= ydiff; if (err > 0) { iynow--; err -= xdiff; } x1++; } } /* drawfatline */ void plot(pensttstype pen, double xabs, double yabs) { long xhigh, yhigh, xlow, ylow, ixnow, iynow, ixabs, iyabs, cdx, cdy, temp, i; long pictint; double newx, newy, dx, dy, lscale, dxreal, dyreal; Char picthi, pictlo; /* added to give every line a name in vrml! - danieyek 981110 */ static long lineCount = 0; /* Record the first node as the coordinate for viewpoint! */ static int firstNodeP=1; double distance, angle; double episilon = 1.0e-10; /* For povray, added by Dan F. 
*/ char texture_string[7]; /* remember to respect & translate for clipping region, clip{x,y}{0,1} */ if (!dotmatrix) { switch (plotter) { case tek: if (pen == penup) { putc('\035', plotfile); } ixnow = (long)floor(xabs + 0.5); iynow = (long)floor(yabs + 0.5); xhigh = ixnow / 32; yhigh = iynow / 32; xlow = ixnow & 31; ylow = iynow & 31; if (!ebcdic) { if (yhigh != oldyhigh) { putc(yhigh + 32, plotfile); } if (ylow != oldylow || xhigh != oldxhigh) { putc(ylow + 96, plotfile); } if (xhigh != oldxhigh) { putc(xhigh + 32, plotfile); } putc(xlow + 64, plotfile); } else { /* DLS/JMH -- for systems that use EBCDIC coding */ if (yhigh != oldyhigh) { putc(eb[yhigh + 32], plotfile); } if (ylow != oldylow || xhigh != oldxhigh) { putc(eb[ylow + 96], plotfile); } if (xhigh != oldxhigh) { putc(eb[xhigh + 32], plotfile); } putc(eb[xlow + 64], plotfile); } oldxhigh = xhigh; oldxlow = xlow; oldyhigh = yhigh; oldylow = ylow; break; case hp: if (pen == pendown) fprintf(plotfile, "PD"); else fprintf(plotfile, "PU"); pout((long)floor(xabs + 0.5)); putc(',', plotfile); pout((long)floor(yabs + 0.5)); fprintf(plotfile, ";\n"); break; case pict: newx = floor(xabs + 0.5); newy = floor(ysize * yunitspercm - yabs + 0.5); if (pen == pendown) { if (linewidth > 5) { dxreal = xabs - oldxreal; dyreal = yabs - oldyreal; lscale = sqrt(dxreal * dxreal + dyreal * dyreal) / (fabs(dxreal) + fabs(dyreal)); pictint = (long)(lscale * linewidth + 0.5); if (pictint == 0) pictint = 1; if (pictint != oldpictint) { picthi = (Char)(pictint / 256); pictlo = (Char)(pictint & 255); fprintf(plotfile, "\007%c%c%c%c", picthi, pictlo, picthi, pictlo); } oldpictint = pictint; } fprintf(plotfile, " %c%c%c%c", (Char)((long) oldy / 256), (Char)((long) oldy & 255), (Char)((long) oldx / 256), (Char)((long) oldx & 255)); fprintf(plotfile, "%c%c%c%c", (Char)((long)newy / 256), (Char)((long)newy & 255), (Char)((long)newx / 256), (Char)((long)newx & 255)); } oldxreal = xabs; oldyreal = yabs; oldx = newx; oldy = newy; break; case 
ray:
      if (pen == pendown) {
        if (linewidth != treeline) {
          if (raylinewidth > labelline) {
            raylinewidth = labelline;
            fprintf(plotfile, "end\n\n");
            fprintf(plotfile, "name species_names\n");
            fprintf(plotfile, "grid 22 22 22\n");
          }
        }
        if (oldxreal != xabs || oldyreal != yabs) {
          raylinewidth *= 0.99999;
          fprintf(plotfile, "cylinder %8.7f %6.3f 0 %6.3f %6.3f 0 %6.3f\n",
                  raylinewidth, oldxreal, oldyreal, xabs, yabs);
          fprintf(plotfile, "sphere %8.7f %6.3f 0 %6.3f\n",
                  raylinewidth, xabs, yabs);
        }
      }
      oldxreal = xabs;
      oldyreal = yabs;
      break;

    case pov:
      /* Default to writing out tree texture... */
      strcpy (texture_string, TREE_TEXTURE);
      if (pen == pendown) {
        if (linewidth != treeline) {
          /* Change the texture to name texture */
          strcpy (texture_string, NAME_TEXTURE);
          if (raylinewidth > labelline) {
            raylinewidth = labelline;
            fprintf(plotfile, "\n// Now, the species names:\n\n");
          }
        }
        if (oldxreal != xabs || oldyreal != yabs) {
          raylinewidth *= 0.99999;
          fprintf(plotfile, "cylinder { <%6.3f, 0, %6.3f,>, <%6.3f, 0, %6.3f>, %8.7f \n",
                  oldxreal, oldyreal, xabs, yabs, raylinewidth);
          fprintf(plotfile, "\ttexture { %s } }\n", texture_string);
          fprintf(plotfile, "sphere { <%6.3f, 0, %6.3f>, %8.7f \n",
                  xabs, yabs, raylinewidth);
          fprintf(plotfile, "\ttexture { %s } }\n", texture_string);
        }
      }
      oldxreal = xabs;
      oldyreal = yabs;
      break;

    case lw:
      if (pen == pendown){
        /* If there's NO possibility that the line intersects the page,
         * leave it out. Otherwise, let postscript clip it to the page.
*/ /* printf("xabs: %f\n", xabs); printf("oldx: %f\n", oldx); printf("clipx0: %f\n", clipx0); printf("clipx1: %f\n", clipx1); printf("xunitspercm: %f\n", xunitspercm); printf("yabs: %f\n", yabs); printf("oldy: %f\n", oldy); printf("clipy0: %f\n", clipy0); printf("clipy1: %f\n", clipy1); printf("yunitspercm: %f\n", yunitspercm); */ /* this is broken, but now irrelevant for preview if (!((xabs > clipx1*xunitspercm && oldx > clipx1*xunitspercm) || (xabs < clipx0*xunitspercm && oldx < clipx0*xunitspercm) || (yabs > clipy1*yunitspercm && oldy > clipy1*yunitspercm) || (yabs < clipy0*yunitspercm && oldy < clipy0*yunitspercm))) { printf("line: %8.2f %8.2f %8.2f %8.2f\n", oldx-(clipx0*xunitspercm), oldy-(clipy0*yunitspercm), xabs-(clipx0*xunitspercm), yabs-(clipy0*yunitspercm)); */ fprintf(plotfile, "%8.2f %8.2f %8.2f %8.2f l\n", oldx-(clipx0*xunitspercm), oldy-(clipy0*yunitspercm), xabs-(clipx0*xunitspercm), yabs-(clipy0*yunitspercm)); /* } */ } oldx = xabs, oldy = yabs; break; case idraw: if (pen == pendown) { fprintf(plotfile, "Begin %%I Line\n"); fprintf(plotfile, "%%I b 65535\n"); fprintf(plotfile, "%d 0 0 [] 0 SetB\n", ((linewidth>=1.0) ? 
(int)linewidth : 1)); fprintf(plotfile, "%%I cfg Black\n"); fprintf(plotfile, "0 0 0 SetCFg\n"); fprintf(plotfile, "%%I cbg White\n"); fprintf(plotfile, "1 1 1 SetCBg\n"); fprintf(plotfile, "%%I p\n"); fprintf(plotfile, "0 SetP\n"); fprintf(plotfile, "%%I t\n"); fprintf(plotfile, "[ 0.01 0 0 0.01 216 285 ] concat\n"); fprintf(plotfile, "%%I\n"); fprintf(plotfile, "%ld %ld %ld %ld Line\n", (long)(100.0 * (oldxreal+0.5)), (long)(100.0 * (oldyreal+0.5)), (long)(100.0 * (xabs+0.5)), (long)(100.0 * (yabs+0.5))); fprintf(plotfile, "End\n\n"); if (linewidth >= 4.0) { fprintf(plotfile, "Begin %%I Elli\n"); fprintf(plotfile, "%%I b 65535\n"); fprintf(plotfile, "1 0 0 [] 0 SetB\n"); fprintf(plotfile, "%%I cfg Black\n"); fprintf(plotfile, "0 0 0 SetCFg\n"); fprintf(plotfile, "%%I cbg White\n"); fprintf(plotfile, "1 1 1 SetCBg\n"); fprintf(plotfile, "%%I p\n"); fprintf(plotfile, "0 SetP\n"); fprintf(plotfile, "%%I t\n"); fprintf(plotfile, "[ 0.01 0 0 0.01 216 285 ] concat\n"); fprintf(plotfile, "%%I\n"); fprintf(plotfile, "%ld %ld %ld %ld Elli\n", (long)(100.0 * (oldxreal+0.5)), (long)(100.0 * (oldyreal+0.5)), (long)(100.0 * (linewidth/2)) - 100, (long)(100.0 * (linewidth/2)) - 100); fprintf(plotfile, "End\n"); fprintf(plotfile, "Begin %%I Elli\n"); fprintf(plotfile, "%%I b 65535\n"); fprintf(plotfile, "1 0 0 [] 0 SetB\n"); fprintf(plotfile, "%%I cfg Black\n"); fprintf(plotfile, "0 0 0 SetCFg\n"); fprintf(plotfile, "%%I cbg White\n"); fprintf(plotfile, "1 1 1 SetCBg\n"); fprintf(plotfile, "%%I p\n"); fprintf(plotfile, "0 SetP\n"); fprintf(plotfile, "%%I t\n"); fprintf(plotfile, "[ 0.01 0 0 0.01 216 285 ] concat\n"); fprintf(plotfile, "%%I\n"); fprintf(plotfile, "%ld %ld %ld %ld Elli\n", (long)(100.0 * (xabs+0.5)), (long)(100.0 * (yabs+0.5)), (long)(100.0 * (linewidth/2)) - 100, (long)(100.0 * (linewidth/2)) - 100); fprintf(plotfile, "End\n"); } } oldxreal = xabs; oldyreal = yabs; break; case ibm: #ifdef TURBOC newx = floor(xabs + 0.5); newy = fabs(floor(yabs) - getmaxy()); if 
(pen == pendown) line((long)oldx,(long)oldy,(long)newx,(long)newy); oldx = newx; oldy = newy; #endif #ifdef QUICKC newx = floor(xabs + 0.5); newy = fabs(floor(yabs) - myscreen.numypixels); if (pen == pendown) _lineto((long)newx,(long)newy); else _moveto((long)newx,(long)newy); oldx = newx; oldy = newy; #endif break; case mac: #ifdef MAC if (pen == pendown){ LineTo((int)floor((double)xabs + 0.5), winheight - (long)floor((double)yabs + 0.5)+MAC_OFFSET);} else{ MoveTo((int)floor((double)xabs + 0.5), winheight - (long)floor((double)yabs + 0.5)+MAC_OFFSET);} #endif break; case houston: if (pen == pendown) fprintf(plotfile, "D "); else fprintf(plotfile, "U "); pout((long)((long)floor(xabs + 0.5))); putc(',', plotfile); pout((long)((long)floor(yabs + 0.5))); putc('\n', plotfile); break; case decregis: newx = floor(xabs + 0.5); newy = fabs(floor(yabs + 0.5) - 479); if (pen == pendown) { fprintf(plotfile, "P["); pout((long)oldx); putc(',', plotfile); pout((long)oldy); fprintf(plotfile, "]V["); pout((long)newx); putc(',', plotfile); pout((long)newy); putc(']', plotfile); nmoves++; if (nmoves == 3) { nmoves = 0; putc('\n', plotfile); } } oldx = newx; oldy = newy; break; case fig: newx = floor(xabs + 0.5); newy = floor(yabs + 0.5); if (pen == pendown) { fprintf(plotfile, "2 1 0 %5ld 0 0 0 0 0.000 0 0\n", (long)floor(linewidth + 0.5) + 1); fprintf(plotfile, "%5ld%5ld%5ld%5ld 9999 9999\n", (long)oldx, 606 - (long) oldy, (long)newx, 606 - (long)newy); fprintf(plotfile, "1 3 0 1 0 0 0 21 0.00 1 0.0 %5ld%5ld%5ld %5ld %5ld%5ld%5ld 349\n", (long)oldx, 606 - (long) oldy, (long)floor(linewidth / 2 + 0.5), (long)floor(linewidth / 2 + 0.5), (long)oldx, 606 - (long)oldy, 606 - (long)oldy); fprintf(plotfile, "1 3 0 1 0 0 0 21 0.00 1 0.0 %5ld%5ld%5ld %5ld %5ld%5ld%5ld 349\n", (long)newx, 606 - (long)newy, (long)floor(linewidth / 2 + 0.5), (long)floor(linewidth / 2 + 0.5), (long)newx, 606 - (long)newy, 606 - (long)newy); } oldx = newx; oldy = newy; break; case vrml: newx = xabs; newy = yabs; 
      /* if this is the root node, use the coordinates to define the view point */
      if (firstNodeP-- == 1) {
        fprintf(plotfile, "#VRML V2.0 utf8\n");
        fprintf(plotfile, " NavigationInfo {\n");
        fprintf(plotfile, " headlight FALSE\n");
        fprintf(plotfile, " }\n");
        fprintf(plotfile, " Viewpoint\n");
        fprintf(plotfile, " {\n");
        fprintf(plotfile, " position %f %f %f\n",
                xsize/2, ysize/2, ysize*1.2);
        fprintf(plotfile, " description \"Entry View\"\n");
        fprintf(plotfile, " }\n");
        for (i=0; i<3; i++) {
          fprintf(plotfile, " PointLight {\n");
          fprintf(plotfile, " on TRUE\n");
          fprintf(plotfile, " intensity %f\n", vrmllights[i].intensity);
          fprintf(plotfile, " ambientIntensity 0.0\n");
          fprintf(plotfile, " color 1.0 1.0 1.0\n");
          fprintf(plotfile, " location %f %f %f\n",
                  vrmllights[i].x, vrmllights[i].y, vrmllights[i].z);
          fprintf(plotfile, " attenuation 0.0 0.0 0.0\n");
          fprintf(plotfile, " radius 200.0\n");
          fprintf(plotfile, " }\n");
        }
        fprintf(plotfile, " Background\n");
        fprintf(plotfile, " {\n");
        fprintf(plotfile, " skyAngle [1.75]\n");
        fprintf(plotfile, " skyColor [%f %f %f, %f %f %f]\n",
                colors[vrmlskycolornear-1].red,
                colors[vrmlskycolornear-1].green,
                colors[vrmlskycolornear-1].blue,
                colors[vrmlskycolorfar-1].red,
                colors[vrmlskycolorfar-1].green,
                colors[vrmlskycolorfar-1].blue);
        fprintf(plotfile, " groundAngle[0 1.57 3.14]\n");
        fprintf(plotfile, " groundColor [0.9 0.9 0.9, 0.7 0.7 0.7, %f %f %f]\n",
                colors[vrmlgroundcolorfar-1].red,
                colors[vrmlgroundcolorfar-1].green,
                colors[vrmlgroundcolorfar-1].blue);
        fprintf(plotfile, " }\n");
      }
      if (pen == penup) {
        /* pen up = beginning of a new path */
      } else if (pen == pendown) {
        /* pen down = continue, line may not end yet.
*/ if (linewidth != treeline) { if (vrmllinewidth > labelline) { vrmllinewidth = labelline; vrmlplotcolor = namecolor; } } distance = sqrt((newy - oldy)*(newy - oldy) + (newx - oldx)*(newx - oldx)); angle = computeAngle(oldx, oldy, newx, newy); if (distance >= episilon) { fprintf(plotfile, " DEF Line%ld Transform\n", lineCount++); fprintf(plotfile, " {\n"); fprintf(plotfile, " rotation 0 0 1 %f\n", angle); fprintf(plotfile, " translation %f %f 0\n", oldx, oldy); fprintf(plotfile, " children\n"); fprintf(plotfile, " [\n"); fprintf(plotfile, " Shape\n"); fprintf(plotfile, " {\n"); fprintf(plotfile, " appearance Appearance\n"); fprintf(plotfile, " {\n"); fprintf(plotfile, " material Material { diffuseColor %f %f %f}\n", colors[vrmlplotcolor-1].red, colors[vrmlplotcolor-1].green, colors[vrmlplotcolor-1].blue); fprintf(plotfile, " }\n"); fprintf(plotfile, " geometry Sphere\n"); fprintf(plotfile, " {\n"); /* vrmllinewidth *= 0.99999; */ fprintf(plotfile, " radius %f\n", vrmllinewidth); fprintf(plotfile, " }\n"); fprintf(plotfile, " }\n"); fprintf(plotfile, " Transform\n"); fprintf(plotfile, " {\n"); fprintf(plotfile, " rotation 0 0 1 -1.570796327\n" ); fprintf(plotfile, " translation %f 0 0\n", distance/2); fprintf(plotfile, " children\n"); fprintf(plotfile, " [\n"); fprintf(plotfile, " Shape\n"); fprintf(plotfile, " {\n"); fprintf(plotfile, " appearance Appearance\n"); fprintf(plotfile, " {\n"); fprintf(plotfile, " material Material { diffuseColor %f %f %f}\n", colors[vrmlplotcolor-1].red, colors[vrmlplotcolor-1].green, colors[vrmlplotcolor-1].blue ); fprintf(plotfile, " }\n"); fprintf(plotfile, " geometry Cylinder\n"); fprintf(plotfile, " {\n"); /* line radius affects end sphere's size */ /* vrmllinewidth *= 0.99999; */ fprintf(plotfile, " radius %f\n", vrmllinewidth); fprintf(plotfile, " height %f\n", distance); fprintf(plotfile, " }\n"); fprintf(plotfile, " }\n"); fprintf(plotfile, " ]\n"); fprintf(plotfile, " }\n"); fprintf(plotfile, " Transform\n"); 
fprintf(plotfile, " {\n"); fprintf(plotfile, " translation %f 0 0\n", distance); fprintf(plotfile, " children\n"); fprintf(plotfile, " [\n"); fprintf(plotfile, " Shape\n"); fprintf(plotfile, " {\n"); fprintf(plotfile, " appearance Appearance\n"); fprintf(plotfile, " {\n"); fprintf(plotfile, " material Material { diffuseColor %f %f %f}\n", colors[vrmlplotcolor-1].red, colors[vrmlplotcolor-1].green, colors[vrmlplotcolor-1].blue ); fprintf(plotfile, " }\n"); fprintf(plotfile, " geometry Sphere\n"); fprintf(plotfile, " {\n"); /* radius affects line size */ /* vrmllinewidth *= 0.99999; */ fprintf(plotfile, " radius %f\n", vrmllinewidth); fprintf(plotfile, " }\n"); fprintf(plotfile, " }\n"); fprintf(plotfile, " ]\n"); fprintf(plotfile, " }\n"); fprintf(plotfile, " ]\n"); fprintf(plotfile, " }\n"); } } else { fprintf(stderr, "ERROR: Programming error in plot()."); } oldx = newx; oldy = newy; break; case epson: case oki: case citoh: case toshiba: case pcx: case pcl: case bmp: case xbm: case gif: case other: break; /* code for a pen move on a new plotter goes here */ } return; } if (pen == pendown) { ixabs = (long)floor(xabs + 0.5); iyabs = (long)floor(yabs + 0.5); ixnow = (long)floor(xnow + 0.5); iynow = (long)floor(ynow + 0.5); if (ixnow > ixabs) { temp = ixnow; ixnow = ixabs; ixabs = temp; temp = iynow; iynow = iyabs; iyabs = temp; } dx = ixabs - ixnow; dy = iyabs - iynow; /* if (dx + fabs(dy) <= 0.0) c = 0.0; else c = 0.5 * linewidth / sqrt(dx * dx + dy * dy); */ cdx = (long)floor(linewidth + 0.5); cdy = (long)floor(linewidth + 0.5); if ((iyabs + cdx >= strpbottom || iynow + cdx >= strpbottom) && (iyabs - cdx <= strptop || iynow - cdx <= strptop)) { drawfatline(ixnow,iynow,ixabs,iyabs,(long)floor(linewidth+0.5)); } } xnow = xabs; ynow = yabs; /* Bitmap Code to plot (xnow,ynow) to (xabs,yabs) */ } /* plot */ void idellipse(double x, double y) { fprintf(plotfile, "Begin %%I Elli\n"); fprintf(plotfile, "%%I b 65535\n"); fprintf(plotfile, "1 0 0 [] 0 SetB\n"); 
fprintf(plotfile, "%%I cfg Black\n"); fprintf(plotfile, "0 0 0 SetCFg\n"); fprintf(plotfile, "%%I cbg White\n"); fprintf(plotfile, "1 1 1 SetCBg\n"); fprintf(plotfile, "%%I p\n"); fprintf(plotfile, "0 SetP\n"); fprintf(plotfile, "%%I t\n"); fprintf(plotfile, "[ 0.01 0 0 0.01 216 285 ] concat\n"); fprintf(plotfile, "%%I\n"); fprintf(plotfile, "%ld %ld %ld %ld Elli\n", (long)(100.0 * (x+0.5)),(long)(100.0 * (y+0.5)), (long)(100.0 * (linewidth/2)) - 100, (long)(100.0 * (linewidth/2)) - 100); fprintf(plotfile, "End\n"); } /* idellipse */ void splyne(double x1, double y1, double x2, double y2, boolean sense, long segs, boolean head, boolean tail) { /* sense is true if line departing from x1,y1 is tangential to x, false if tangential to y */ long i,fromx,fromy,tox,toy; double f, g, h, x3, y3; long ptop, pleft, pbottom, pright, startangle, arcangle; double dtheta; double sintheta,costheta,sindtheta,cosdtheta,newsintheta,newcostheta; double rx,ry; /* axes of ellipse */ double ox,oy; /* center of ellipse */ double prevx,prevy; long pictint; x1 = x1 - (clipx0 * xunitspercm); x2 = x2 - (clipx0 * xunitspercm); y1 = y1 - (clipy0 * yunitspercm); y2 = y2 - (clipy0 * yunitspercm); /* adjust by clipping region */ switch (plotter) { case lw: fprintf(plotfile,"stroke %8.2f %8.2f moveto\n",x1,y1); if (sense) fprintf(plotfile,"%8.2f %8.2f %8.2f %8.2f %8.2f %8.2f curveto\n", (x1+(0.55*(x2-x1))), y1, x2, (y1+(0.45*(y2-y1))), x2, y2); else fprintf(plotfile,"%8.2f %8.2f %8.2f %8.2f %8.2f %8.2f curveto\n", x1, (y1+(0.55*(y2-y1))), (x1+(0.45*(x2-x1))), y2, x2, y2); break; case pict: { double dtop, dleft, dbottom, dright,temp; if (x1 == x2 || y1 == y2) { plot(penup, x1, y1); plot(pendown, x2, y2); } else { if (x2 > x1 && y2 < y1){ swap_m(x2,x1); swap_m(y2,y1); sense = !sense; } y1 = (ysize * yunitspercm) - y1; y2 = (ysize * yunitspercm) - y2; if (sense) { if (x2 > x1) { dtop = y2 - y1 + y2; dleft = x1 - x2 + x1; dbottom = y1; dright = x2; startangle = 90; } else { dtop = y2 - y1 + y2; dleft = 
x2; dbottom = y1; dright = x1 + (x1 - x2); startangle = 180; } } else { if (x2 > x1) { dtop = y1 + (y1 - y2); dleft = x1; dbottom = y2; dright = x2 + (x2 - x1); startangle = 270; } else { dtop = y2; dleft = x1; dbottom = y1 + y1 - y2;; dright = x2 + (x2 - x1); startangle = 0; } } arcangle = 90; if (dbottom < dtop) {swap_m(dbottom,dtop);} if (dleft> dright) {swap_m(dleft,dright);} ptop = (long)floor((dtop - 0) + 0.5); pleft = (long)floor(dleft + 0.5); pbottom = (long)floor(dbottom + 0.5) + (long)floor(linewidth + 0.5); pright = (long)floor(dright + 0.5) + (long)floor(linewidth + 0.5); if (!sense) pbottom++; else if (x2 < x1) pright++; else pleft--; pictint = 1; fprintf(plotfile,"\140%c%c%c%c%c%c%c%c%c%c%c%c", (Char)(ptop / 256), (Char)(ptop % 256), (Char)(pleft / 256), (Char)(pleft % 256), (Char)(pbottom / 256), (Char)(pbottom % 256), (Char)(pright / 256), (Char)(pright % 256), (Char)(startangle / 256), (Char)(startangle % 256), (Char)(arcangle / 256), (Char)(arcangle % 256)); } } break; case fig: fromx = (long)floor(x1 + 0.5); fromy = (long)floor(y1 + 0.5); tox = (long)floor(x2 + 0.5); toy = (long)floor(y2 + 0.5); fprintf(plotfile, "3 0 0 %5ld 0 0 0 0 0.000 0 0\n", (long)floor(linewidth + 0.5) + 1); if (sense) fprintf(plotfile, "%5ld%5ld%5ld%5ld%5ld%5ld%5ld%5ld 9999 9999\n", fromx, 606 - fromy, (long)floor((x1+(0.55*(x2-x1))) + 0.5), 606 - fromy, tox, 606 - (long)floor((y1+(0.45*(y2-y1))) + 0.5), tox, 606 - toy); else fprintf(plotfile, "%5ld%5ld%5ld%5ld%5ld%5ld%5ld%5ld 9999 9999\n", fromx, 606 - fromy, fromx, 606 - (long)floor((y1+(0.55*(y2-y1))) + 0.5), (long)floor((x1+(0.45*(x2-x1))) + 0.5), 606 - toy, tox, 606 - toy); fprintf(plotfile, "1 3 0 1 0 0 0 21 0.00 1 0.0 "); fprintf(plotfile, "%5ld%5ld%5ld %5ld %5ld%5ld%5ld 349\n", fromx, 606 - fromy, (long)floor(linewidth / 2 + 0.5), (long)floor(linewidth / 2 + 0.5), fromx, 606 - fromy, 606 - fromy); fprintf(plotfile, "1 3 0 1 0 0 0 21 0.00 1 0.0 "); fprintf(plotfile, "%5ld%5ld%5ld %5ld %5ld%5ld%5ld 349\n", tox, 606 - 
toy, (long)floor(linewidth / 2 + 0.5), (long)floor(linewidth / 2 + 0.5), tox, 606 - toy, 606 - toy); break; case idraw: if (head){ fprintf(plotfile,"Begin %%I Pict\n%%I b u\n%%I cfg u\n%%I cbg u\n"); fprintf(plotfile,"%%I f u\n%%I p u \n%%I t u\n\n"); idellipse(x1,y1); fprintf(plotfile, "Begin %%I BSpl\n"); fprintf(plotfile, "%%I b 65535\n"); fprintf(plotfile, "%ld 0 0 [] 0 SetB\n", ((linewidth>=1.0) ? (long)linewidth : 1)); fprintf(plotfile, "%%I cfg Black\n"); fprintf(plotfile, "0 0 0 SetCFg\n"); fprintf(plotfile, "%%I cbg White\n"); fprintf(plotfile, "1 1 1 SetCBg\n"); fprintf(plotfile, "none SetP %%I p n\n"); fprintf(plotfile, "%%I t\n"); fprintf(plotfile, "[ 0.01 0 0 0.01 216 285 ] concat\n"); if (tail) fprintf(plotfile,"%%I %ld\n",segs+1); else fprintf(plotfile,"%%I %ld\n",(segs*2)+1); fprintf(plotfile, "%ld %ld\n", (long)(100.0 * (x1+0.5)), (long)(100.0 * (y1+0.5))); } rx = (fabs(x2 - x1)); ry = (fabs(y2 - y1)); if (!sense){ if (x2 < x1) sintheta = 0.0, costheta = 1.0, dtheta = 90.0 / ((double)segs), ox = x2, oy = y1; else sintheta = 0.0, costheta = -1.0, dtheta = -90.0 / ((double)segs), ox = x2, oy = y1; } else{ if (x2 < x1) sintheta = -1.0, costheta = 0.0, dtheta = -90.0 / ((double)segs), ox = x1, oy = y2; else sintheta = -1.0, costheta = 0.0, dtheta = 90.0 / ((double)segs), ox = x1, oy = y2; } x3 = x1; y3 = y1; sindtheta = sin(dtheta * (3.1415926535897932384626433 / 180.0)); cosdtheta = cos(dtheta * (3.1415926535897932384626433 / 180.0)); for (i = 1; i <= segs; i++) { prevx = x3; prevy = y3; newsintheta = (sintheta * cosdtheta) + (costheta * sindtheta); newcostheta = (costheta * cosdtheta) - (sintheta * sindtheta); sintheta = newsintheta; costheta = newcostheta; x3 = ox + (costheta * rx); y3 = oy + (sintheta * ry); /* adjust spline for better aesthetics: */ if (i == 1){ if (sense) y3 = (y3 + prevy) / 2.0; else x3 = (x3 + prevx) / 2.0;} else if (i == segs - 1){ if (sense) x3 = (x3 + x2) / 2.0; else y3 = (y2 + y3) / 2.0; } fprintf(plotfile, "%ld %ld\n", 
(long)(100.0 * (x3+0.5)), (long)(100.0 * (y3+0.5))); } if (head && tail) fprintf(plotfile," BSpl\nEnd\n\n"); /* changed for gcc */ /*fprintf(plotfile,"%ld BSpl\nEnd\n\n"); This is the original */ else if (tail) fprintf(plotfile," BSpl \nEnd\n\n"); /* changed for gcc */ /*fprintf(plotfile,"%ld BSpl\nEnd\n\n"); This is the original */ if (tail) idellipse(x2,y2), fprintf(plotfile,"\nEnd %%I eop\n\n"); break; case hp: plot(penup,x1,y1); if (sense){ if (x2 > x1) fprintf(plotfile,"PD;AA%ld,%ld,90,1;\n",(long)x1,(long)y2); else fprintf(plotfile,"PD;AA%ld,%ld,-90,1;\n",(long)x1,(long)y2); } else { if (x2 > x1) fprintf(plotfile,"PD;AA%ld,%ld,-90,1;\n",(long)x2,(long)y1); else fprintf(plotfile,"PD;AA%ld,%ld,90,1;\n",(long)x2,(long)y1); } plot(penup,x2,y2); fprintf(plotfile,"PD;PU;"); /* else fprintf(plotfile,"PD;AA%ld,%ld,90,1;\n",(long)x2,(int)y1); */ plot(penup,x2,y2); break; default: for (i = 1; i <= 2*segs; i++) { f = (double)i / (2*segs); g = (double)i / (2*segs); h = 1.0 - sqrt(1.0 - g * g); if (sense) { x3 = x1 * (1.0 - f) + x2 * f; y3 = y1 + (y2 - y1) * h; } else { x3 = x1 + (x2 - x1) * h; y3 = y1 * (1.0 - f) + y2 * f; } plot(pendown, x3, y3); } break; } } /* splyne */ void swoopspline(double x1, double y1, double x2, double y2, double x3, double y3, boolean sense, long segs) { splyne(x1,y1,x2,y2,sense,segs/4,true,false); splyne(x2,y2,x3,y3,(boolean)(!sense),segs/4,false,true); } /* swoopspline */ void curvespline(double x1, double y1, double x2, double y2, boolean sense, long segs) { splyne(x1,y1,x2,y2,sense,segs/2,true,true); } /* curvespline */ /*******************************************/ static void putshort(FILE *fp, int i) { int c, c1; c = ((unsigned int ) i) & 0xff; c1 = (((unsigned int) i)>>8) & 0xff; putc(c, fp); putc(c1,fp); } /* putshort */ /*******************************************/ static void putint(FILE *fp, int i) { int c, c1, c2, c3; c = ((unsigned int ) i) & 0xff; c1 = (((unsigned int) i)>>8) & 0xff; c2 = (((unsigned int) i)>>16) & 0xff; c3 = 
(((unsigned int) i)>>24) & 0xff; putc(c, fp); putc(c1,fp); putc(c2,fp); putc(c3,fp); } /* putint */ void write_bmp_header (FILE *plotfile,int width,int height) { /* * write a 1-bit image header * */ /* putc('T', testBMPfile); */ byte r1[2],g1[2],b1[2] ; int i, bperlin; r1[0] = (long) 255; /* colormap entry 0: white */ g1[0] = (long) 255; b1[0] = (long) 255; r1[1] = 0; /* colormap entry 1: black */ g1[1] = 0; b1[1] = 0; bperlin = ((width + 31) / 32) * 4; /* # bytes written per line */ putc('B', plotfile); putc('M', plotfile); /* BMP file magic number */ /* compute filesize and write it */ i = 14 + /* size of bitmap file header */ 40 + /* size of bitmap info header */ 8 + /* size of colormap */ bperlin * height; /* size of image data */ putint(plotfile, i); putshort(plotfile, 0); /* reserved1 */ putshort(plotfile, 0); /* reserved2 */ putint(plotfile, 14 + 40 + 8); /* offset from BOfile to BObitmap */ putint(plotfile, 40); /* biSize: size of bitmap info header */ putint(plotfile, width); /* Width */ putint(plotfile, height); /* Height */ putshort(plotfile, 1); /* Planes: must be '1' */ putshort(plotfile, 1); /* BitCount: 1 */ putint(plotfile, 0); /* Compression: BI_RGB = 0 */ putint(plotfile, bperlin*height);/* SizeImage: size of raw image data */ putint(plotfile, 75 * 39); /* XPelsPerMeter: (75dpi * 39 in. 
per meter) */ putint(plotfile, 2); /* ClrUsed: # of colors used in cmap */ putint(plotfile, 2); /* ClrImportant: same as above */ /* write out the colormap */ for (i = 0 ; i < 2 ; i++) { putc(b1[i],plotfile); putc(g1[i],plotfile); putc(r1[i],plotfile); putc(0, plotfile); } } /* write_bmp_header */ void reverse_bits (byte *full_pic, int location) { /* Reverse all the bits at location */ int i, loop_end ; byte orig, reversed; /* initialize...*/ orig = full_pic[location] ; reversed = (byte) '\0'; loop_end = sizeof (byte) * 8 ; if (orig == (byte) '\0') { /* No need to do anything for 0 bytes, */ return ; } else { for (i = 0 ; i < loop_end ; i++) { reversed = (reversed << 1) | (orig & 1) ; orig >>= 1 ; } full_pic[location] = reversed ; } } /* reverse_bits */ void turn_rows (byte *full_pic, int padded_width, int height) { int i, j; int midpoint = padded_width / 2 ; byte temp ; /* For the swap call */ for (j = 0 ; j < height ; j++) { for (i = 0 ; i < midpoint ; i++) { reverse_bits (full_pic, (j * padded_width) + i); reverse_bits (full_pic, (j * padded_width) + (padded_width - i)); swap_m (full_pic[(j * padded_width) + i], full_pic[(j * padded_width) + (padded_width - i)]) ; } /* Then do the midpoint */ reverse_bits (full_pic, (j * padded_width) + midpoint); } } /* turn_rows */ void translate_stripe_to_bmp(striptype *stripe, byte *full_pic, int increment, int width, int div, int *total_bytes) { int padded_width, i, j, offset, pad_size, total_stripes, last_stripe_offset, truncated_stripe_height ; if (div == 0) /* For some reason this is called once without valid data */ return ; else if (div == DEFAULT_STRIPE_HEIGHT) { /* For a non-last-stripe, figure out if the last stripe is going to be shorter than the others, to know how far from the bottom things should be offset. 
*/ truncated_stripe_height = (int) ysize % DEFAULT_STRIPE_HEIGHT; if (truncated_stripe_height != 0) /* The last stripe isn't default height */ last_stripe_offset = DEFAULT_STRIPE_HEIGHT - ((int) ysize % DEFAULT_STRIPE_HEIGHT) ; else /* Stripes are all default height */ last_stripe_offset = 0 ; } else { /* For the last stripe, */ last_stripe_offset = 0 ; } total_stripes = (int) ceil (ysize / (double) DEFAULT_STRIPE_HEIGHT); /* width, padded to be a multiple of 32 bits, or 4 bytes */ padded_width = ((width + 3)/4) * 4; pad_size = padded_width - width; /* Include pad_size here, as it'll be turned horizontally later */ offset = ((total_stripes - increment) * (padded_width * DEFAULT_STRIPE_HEIGHT)) - (padded_width * last_stripe_offset) + pad_size ; //testBMPfile= fopen("testBMPfile","ab"); for (j = div; j >= 0; j--) { for (i = 0; i < width; i++) { full_pic[offset + (((div-j) * padded_width) + (width-i))] = (byte) (*stripe)[j][i]; //putc((byte) (*stripe)[j][i], testBMPfile); (*total_bytes)++ ; } //fclose(testBMPfile); /* Take into account the padding */ (*total_bytes) += pad_size ; } } /* translate_stripe_to_bmp */ void write_full_pic(byte *full_pic, int total_bytes) { int i ; for (i = 0; i < total_bytes; i++) { putc (full_pic[i], plotfile); } } /* write_full_pic */ void makebox_no_interaction(char *fn, double *xo, double *yo, double *scale, long ntips) /* fn--fontname xo,yo--x and y offsets */ { /* draw the box on screen which represents plotting area. 
*/ long xpag,ypag,i,j; oldpenchange = penchange; oldxsize = xsize; oldysize = ysize; oldxunitspercm = xunitspercm; oldyunitspercm = yunitspercm; oldxcorner = xcorner; oldycorner = ycorner; oldplotter = plotter; plotrparms(ntips); xcorner += 0.05 * xsize; ycorner += 0.05 * ysize; xsize *= 0.9; ysize *= 0.9; (*scale) = ysize / oldysize; if (xsize / oldxsize < (*scale)) (*scale) = xsize / oldxsize; (*xo) = (xcorner + (xsize - oldxsize * (*scale)) / 2.0) / (*scale); (*yo) = (ycorner + (ysize - oldysize * (*scale)) / 2.0) / (*scale); xscale = (*scale) * xunitspercm; yscale = (*scale) * yunitspercm; initplotter(ntips,fn); plot(penup, xscale * (*xo), yscale * (*yo)); plot(pendown, xscale * (*xo), yscale * ((*yo) + oldysize)); plot(pendown, xscale * ((*xo) + oldxsize), yscale * ((*yo) + oldysize)); plot(pendown, xscale * ((*xo) + oldxsize), yscale * (*yo)); plot(pendown, xscale * (*xo), yscale * (*yo)); /* we've done the extent, now draw the dividing lines: */ xpag = (int)((pagex-hpmargin-0.01)/(paperx - hpmargin))+1; ypag = (int)((pagey-vpmargin-0.01)/(papery - vpmargin))+1; for (i=0;i\n"); fprintf(plotfile, "#declare C_White_trans = color rgbt<1, 1, 1, 0.7>\n"); fprintf(plotfile, "#declare C_Red = color rgb<1, 0, 0>\n"); fprintf(plotfile, "#declare C_Yellow = color rgb<1, 1, 0>\n"); fprintf(plotfile, "#declare C_Green = color rgb<0, 1, 0>\n"); fprintf(plotfile, "#declare C_Black = color rgb<0, 0, 0>\n"); fprintf(plotfile, "#declare C_Blue = color rgb<0, 0, 1>\n"); fprintf(plotfile, "\n// Declare the textures\n\n"); fprintf(plotfile, "#declare T_White = texture { pigment { C_White }}\n"); fprintf(plotfile, "#declare T_White_trans = texture { pigment { C_White_trans }}\n"); fprintf(plotfile, "#declare T_Red = texture { pigment { C_Red }\n"); fprintf(plotfile, "\tfinish { phong 1 phong_size 100 }}\n"); fprintf(plotfile, "#declare T_Red_trans = texture { pigment { C_Red filter 0.7 }\n"); fprintf(plotfile, "\tfinish { phong 1 phong_size 100 }}\n"); fprintf(plotfile, "#declare 
T_Green = texture { pigment { C_Green }\n"); fprintf(plotfile, "\tfinish { phong 1 phong_size 100 }}\n"); fprintf(plotfile, "#declare T_Green_trans = texture { \n"); fprintf(plotfile, "\tpigment { C_Green filter 0.7 }\n"); fprintf(plotfile, "\tfinish { phong 1 phong_size 100 }}\n"); fprintf(plotfile, "#declare T_Blue = texture { pigment { C_Blue }\n"); fprintf(plotfile, "\tfinish { phong 1 phong_size 100 }}\n"); fprintf(plotfile, "#background { color rgb<1, 1, 1> }\n"); } /* void_func */ /* added for vrml - danieyek 981111 */ /* Returns the angle in radians */ /* A related function is "double angleBetVectors(Xu, Yu, Xv, Yv)" in drawtree.c */ double computeAngle(double oldx, double oldy, double newx, double newy) { double angle; if ((newx-oldx) == 0 ) { /* pi/2 or -pi/2! */ if (newy > oldy) angle = pi/2; else if (newy < oldy) angle = -pi/2; else { /* added - danieyek 990130 */ /* newx = oldx; newy = oldy; one point on top of the other! If new and old correspond to 2 points, chances are that the 2 coordinates are not identical under double precision value. */ fprintf(stderr, "ERROR: Angle can't be computed, 2 points on top of each other in computeAngle()!\n"); angle = 0; } } else { angle = atan( (newy-oldy)/(newx-oldx) ); if (newy >= oldy && newx >= oldx) { /* First quadrant - no adjustment */ } else if (newx <= oldx) { /* Second (angle = negative) and third (angle = positive) quadrant */ angle = pi + angle; } else if (newy <= oldy && newx >= oldx) { /* Fourth quadrant; "angle" is negative! */ angle = 2*pi + angle; } else { /* Should never get here. */ fprintf(stderr, "ERROR: Programming error in computeAngle()!\n"); } } return angle; } /* computeAngle */
/* phylip-3.697/src/drawgram.c */
#include "phylip.h"
#include "draw.h"
/* Version 3.696. Written by Joseph Felsenstein and Christopher A. 
Additional code written by Hisashi Horino, Sean Lamont, Andrew Keefe, Daniel Fineman, Akiko Fuseki, Doug Buxton, Michal Palczewski, and James McGill. Copyright (c) 1986-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #ifdef MAC char* about_message = "Drawgram unrooted tree plotting program\r" "PHYLIP version 3.696\r" "(c) Copyright 1986-2014 by Joseph Felsenstein\r" "Written by Joseph Felsenstein and Christopher A. 
Meacham.\r" "Additional code written by Hisashi Horino, Sean Lamont, Andrew Keefe,\r" "Daniel Fineman, Akiko Fuseki, Doug Buxton and Michal Palczewski.\r" #endif //#define JAVADEBUG #define gap 0.5 /* distance in character heights between the end of a branch and the start of the name */ FILE *plotfile; char pltfilename[FNMLNGTH]; char trefilename[FNMLNGTH]; char *progname; long nextnode, strpwide, strpdeep, strpdiv, strptop, strpbottom, payge, numlines, hpresolution, iteration; boolean dotmatrix, haslengths, uselengths, empty, rescaled, firstscreens, pictbold, pictitalic, pictshadow, pictoutline, multiplot, finished; double xmargin, ymargin, topoflabels, bottomoflabels, rightoflabels, leftoflabels, tipspacing,maxheight, scale, xscale, yscale, xoffset, yoffset, nodespace, stemlength, treedepth, xnow, ynow, xunitspercm, yunitspercm, xsize, ysize, xcorner, ycorner, labelheight,labelrotation,expand, rootx, rooty, bscale, xx0, yy0, fontheight, maxx, minx, maxy, miny; double pagex, pagey, paperx, papery, hpmargin, vpmargin; double *textlength, *firstlet; striptype stripe; plottertype plotter, oldplotter; growth grows; treestyle style; node *root; pointarray nodep; pointarray treenode; fonttype font; long filesize; Char ch, resopts; double trweight; /* starting here, needed to make sccs version happy */ boolean goteof; node *grbg; long *zeros; /* ... 
down to here */ boolean canbeplotted; enum {yes, no} penchange, oldpenchange; static enum {weighted, intermediate, centered, inner, vshaped} nodeposition; winactiontype winaction; #ifndef OLDC /* function prototypes */ void initdrawgramnode(node **, node **, node *, long, long, long *, long *, initops, pointarray, pointarray, Char *, Char *, FILE *); void initialparms(void); char showparms(void); void getparms(char); void calctraverse(node *, double, double *); void calculate(void); void rescale(void); void setup_environment(Char *argv[]); void user_loop(void); void drawgram(char* intreename, char* fontfilename, char* plotfilename, char* plotfileopt, char* treegrows, char* stylekind, int usebranchlengths, double labelangle, int scalebranchlength, double branchscale, double breadthdepthratio, double stemltreedratio, double chhttipspratio, double xmarginratio, double ymarginratio, char* ancnodes, int dofinalplot, char* finalplotkind); /* function prototypes */ #endif void initdrawgramnode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ long i; boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnu(grbg, p); (*p)->index = nodei; (*p)->tip = false; for (i=0;i<MAXNCH;i++) (*p)->nayme[i] = '\0'; nodep[(*p)->index - 1] = (*p); break; case nonbottom: gnu(grbg, p); (*p)->index = nodei; break; case tip: (*ntips)++; gnu(grbg, p); nodep[(*ntips) - 1] = *p; setupnode(*p, *ntips); (*p)->tip = true; (*p)->naymlength = len ; strncpy ((*p)->nayme, str, MAXNCH); break; case length: processlength(&valyew, &divisor, ch, &minusread, intree, parens); if (!minusread) (*p)->oldlen = valyew / divisor; else (*p)->oldlen = 0.0; break; case hsnolength: haslengths = false; break; default: /* cases hslength,treewt,unittrwt,iter */ break; /* should never occur */ } } /* initdrawgramnode */ void initialparms() { /* initialize parameters */ 
plotter = DEFPLOTTER; paperx=20.6375; pagex=20.6375; papery=26.9875; pagey=26.9875; strcpy(fontname,"Times-Roman"); plotrparms(spp); /* initial, possibly bogus, parameters */ style = phenogram; grows = horizontal; labelrotation = 90.0; nodespace = 3.0; stemlength = 0.05; treedepth = 0.5 / 0.95; rescaled = true; bscale = 1.0; uselengths = haslengths; if (uselengths) nodeposition = weighted; else nodeposition = centered; xmargin = 0.08 * xsize; ymargin = 0.08 * ysize; hpmargin = 0.02*pagex; vpmargin = 0.02*pagey; } /* initialparms */ char showparms() { char input[32]; Char ch; char cstr[32]; if (!firstscreens) clearit(); printf("\nRooted tree plotting program version %s\n\n", VERSION); printf("Here are the settings: \n"); printf(" 0 Screen type (IBM PC, ANSI): %s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" P Final plotting device: "); switch (plotter) { case lw: printf(" Postscript printer\n"); break; case pcl: printf(" HP Laserjet compatible printer (%d DPI)\n", (int) hpresolution); break; case epson: printf(" Epson dot-matrix printer\n"); break; case pcx: printf(" PCX file for PC Paintbrush drawing program (%s)\n", (resopts == 1) ? "EGA 640x350" : (resopts == 2) ? 
"VGA 800x600" : "VGA 1024x768"); break; case pict: printf(" Macintosh PICT file for drawing program\n"); break; case idraw: printf(" Idraw drawing program\n"); break; case fig: printf(" Xfig drawing program\n"); break; case hp: printf(" HPGL graphics language for HP plotters\n"); break; case xbm: printf(" X Bitmap file format (%d by %d resolution)\n", (int)xsize, (int)ysize); break; case bmp: printf(" MS-Windows Bitmap (%d by %d resolution)\n", (int)xsize, (int)ysize); break; case gif: printf(" Compuserve GIF format (%d by %d)\n",(int)xsize,(int)ysize); break; case ibm: printf(" IBM PC graphics (CGA, EGA, or VGA)\n"); break; case tek: printf(" Tektronix graphics screen\n"); break; case decregis: printf(" DEC ReGIS graphics (VT240 or DECTerm)\n"); break; case houston: printf(" Houston Instruments plotter\n"); break; case toshiba: printf(" Toshiba 24-pin dot matrix printer\n"); break; case citoh: printf(" Imagewriter or C.Itoh/TEC/NEC 9-pin dot matrix printer\n"); break; case oki: printf(" old Okidata 9-pin dot matrix printer\n"); break; case ray: printf(" Rayshade ray-tracing program file format\n"); break; case pov: printf(" POV ray-tracing program file format\n"); break; case vrml: printf(" VRML, Virtual Reality Markup Language\n"); break; case mac: case other: printf(" (Current output device unannounced)\n"); break; default: /*case not handled */ break; } printf(" (Preview no longer available)\n"); printf(" H Tree grows: "); printf((grows == vertical) ? "Vertically\n" : "Horizontally\n"); printf(" S Tree style: %s\n", (style == cladogram) ? "Cladogram" : (style == phenogram) ? "Phenogram" : (style == curvogram) ? "Curvogram" : (style == eurogram) ? "Eurogram" : (style == swoopogram) ? 
"Swoopogram" : "Circular"); printf(" B Use branch lengths: "); if (haslengths) { if (uselengths) printf("Yes\n"); else printf("No\n"); } else printf("(no branch lengths available)\n"); if (style != circular) { printf(" L Angle of labels:"); if (labelrotation < 10.0) printf("%5.1f\n", labelrotation); else printf("%6.1f\n", labelrotation); } printf(" R Scale of branch length:"); if (rescaled) printf(" Automatically rescaled\n"); else printf(" Fixed:%6.2f cm per unit branch length\n", bscale); printf(" D Depth/Breadth of tree:%6.2f\n", treedepth); printf(" T Stem-length/tree-depth:%6.2f\n", stemlength); printf(" C Character ht / tip space:%8.4f\n", 1.0 / nodespace); printf(" A Ancestral nodes: %s\n", (nodeposition == weighted) ? "Weighted" : (nodeposition == intermediate) ? "Intermediate" : (nodeposition == centered) ? "Centered" : (nodeposition == inner) ? "Inner" : "So tree is V-shaped"); if (plotter == lw || plotter == idraw || (plotter == fig && (labelrotation == 90.0 || labelrotation == 180.0 || labelrotation == 270.0 || labelrotation == 0.0)) || (plotter == pict && ((grows == vertical && labelrotation == 0.0) || (grows == horizontal && labelrotation == 90.0)))) printf(" F Font: %s\n",fontname); if ((plotter == pict && ((grows == vertical && labelrotation == 0.0) || (grows == horizontal && labelrotation == 90.0))) && (strcmp(fontname,"Hershey") != 0)) printf(" Q Pict Font Attributes: %s, %s, %s, %s\n", (pictbold ? "Bold" : "Medium"), (pictitalic ? "Italic" : "Regular"), (pictshadow ? "Shadowed": "Unshadowed"), (pictoutline ? "Outlined" : "Unoutlined")); if (plotter == ray) { printf(" M Horizontal margins:%6.2f pixels\n", xmargin); printf(" M Vertical margins:%6.2f pixels\n", ymargin); } else { printf(" M Horizontal margins:%6.2f cm\n", xmargin); printf(" M Vertical margins:%6.2f cm\n", ymargin); } printf(" # Pages per tree: "); /* Add 0.5 to clear up truncation problems. 
*/ if (((int) ((pagex / paperx) + 0.5) == 1) && ((int) ((pagey / papery) + 0.5) == 1)) /* If we're only using one page per tree, */ printf ("one page per tree\n") ; else printf ("%.0f by %.0f pages per tree\n", (pagey-vpmargin) / (papery-vpmargin), (pagex-hpmargin) / (paperx-hpmargin)) ; for (;;) { printf("\n Y to accept these or type the letter for one to change\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); uppercase(&input[0]); ch=input[0]; if (plotter == idraw || plotter == lw) strcpy(cstr,"#Y0PVHSBLMRDTCAF"); else if ((plotter == fig) && ((labelrotation == 0.0) || (labelrotation == 90.0) || (labelrotation == 180.0) || (labelrotation == 270.0))) /* fig fonts are offered only at right-angle label rotations, matching the menu display above */ strcpy(cstr,"#Y0PVHSBLMRDTCAFQ"); else if (plotter == pict){ if (((grows == vertical && labelrotation == 0.0) || (grows == horizontal && labelrotation == 90.0))) strcpy(cstr,"#Y0PVHSBLMRDTCAFQ"); else strcpy(cstr,"#Y0PVHSBLMRDTCA"); } else strcpy(cstr,"#Y0PVHSBLMRDTCA"); if (strchr(cstr,ch)) break; printf(" That letter is not one of the menu choices. Type\n"); } return ch; } /* showparms */ void getparms(char numtochange) { /* get from user the relevant parameters for the plotter and diagram */ long loopcount; Char ch; char input[100]; boolean ok; int i, m, n; n = (int)((pagex-hpmargin-0.01)/(paperx-hpmargin)+1.0); m = (int)((pagey-vpmargin-0.01)/(papery-vpmargin)+1.0); switch (numtochange) { case '0': initterminal(&ibmpc, &ansi); break; case 'P': getplotter(); break; case 'H': if (grows == vertical) grows = horizontal; else grows = vertical; break; case 'S': clearit() ; printf("What style tree is this to be (currently set to %s):\n", (style == cladogram) ? "Cladogram" : (style == phenogram) ? "Phenogram" : (style == curvogram) ? "Curvogram" : (style == eurogram) ? "Eurogram" : (style == swoopogram) ? 
"Swoopogram" : "Circular") ; printf(" C Cladogram -- v-shaped \n") ; printf(" P Phenogram -- branches are square\n") ; printf(" V Curvogram -- branches are 1/4 of an ellipse\n") ; printf(" E Eurogram -- branches angle outward, then up\n"); printf(" S Swoopogram -- branches curve outward then reverse\n") ; printf(" O Circular tree\n"); do { printf("\n Type letter of style to change to (C, P, V, E, S or O):\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); } while (ch != 'C' && ch != 'P' && ch != 'V' && ch != 'E' && ch != 'S' && ch != 'O'); switch (ch) { case 'C': style = cladogram; break; case 'P': style = phenogram; break; case 'E': style = eurogram; break; case 'S': style = swoopogram; break; case 'V': style = curvogram; break; case 'O': style = circular; treedepth = 1.0; break; } break; case 'B': if (haslengths) { uselengths = !uselengths; if (!uselengths) nodeposition = weighted; else nodeposition = intermediate; } else { printf("Cannot use lengths since not all of them exist\n"); uselengths = false; } break; case 'L': clearit(); printf("\n(Considering the tree as if it \"grew\" vertically:)\n"); printf("Are the labels to be plotted vertically (90),\n"); printf(" horizontally (0), or at a 45-degree angle?\n"); loopcount = 0; do { printf(" Choose an angle in degrees from 90 to 0:\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &labelrotation); getchar(); uppercase(&ch); countup(&loopcount, 10); } while (labelrotation < 0.0 && labelrotation > 90.0); break; case 'M': clearit(); printf("\nThe tree will be drawn to fit in a rectangle which has \n"); printf(" margins in the horizontal and vertical directions of:\n"); if (plotter == ray) { printf( "%6.2f pixels (horizontal margin) and%6.2f pixels (vertical margin)\n", xmargin, ymargin); } else { printf("%6.2f cm (horizontal margin) and%6.2f cm (vertical margin)\n", xmargin, ymargin); } putchar('\n'); loopcount = 0; do { 
if (plotter == ray) printf(" New value (in pixels) of horizontal margin?\n"); else printf(" New value (in cm) of horizontal margin?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &xmargin); getchar(); ok = ((unsigned)xmargin < xsize / 2.0); if (!ok) printf(" Impossible value. Please retype it.\n"); countup(&loopcount, 10); } while (!ok); loopcount = 0; do { if (plotter == ray) printf(" New value (in pixels) of vertical margin?\n"); else printf(" New value (in cm) of vertical margin?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &ymargin); getchar(); ok = ((unsigned)ymargin < ysize / 2.0); if (!ok) printf(" Impossible value. Please retype it.\n"); countup(&loopcount, 10); } while (!ok); break; case 'R': rescaled = !rescaled; if (!rescaled) { printf("Centimeters per unit branch length?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &bscale); getchar(); } break; case 'D': printf("New value of depth of tree as fraction of its breadth?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &treedepth); getchar(); break; case 'T': loopcount = 0; do { printf("New value of stem length as fraction of tree depth?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &stemlength); getchar(); countup(&loopcount, 10); } while ((unsigned)stemlength >= 0.9); break; case 'C': printf("New value of character height as fraction of tip spacing?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &nodespace); getchar(); nodespace = 1.0 / nodespace; break; case '#': loopcount = 0; for (;;){ clearit(); printf(" Page Specifications Submenu\n\n"); printf(" L Output size in pages: %.0f down by %.0f across\n", (pagey / papery), (pagex / paperx)); printf(" P Physical paper size: %1.5f by %1.5f cm\n",paperx,papery); printf(" O Overlap Region: %1.5f %1.5f cm\n",hpmargin,vpmargin); printf(" M main 
menu\n"); getstryng(input); ch = input[0]; uppercase(&ch); switch (ch){ case 'L': printf("Number of pages in height:\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); m = atoi(input); printf("Number of pages in width:\n"); getstryng(input); n = atoi(input); break; case 'P': printf("Paper Width (in cm):\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); paperx = atof(input); printf("Paper Height (in cm):\n"); getstryng(input); papery = atof(input); break; case 'O': printf("Horizontal Overlap (in cm):"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); hpmargin = atof(input); printf("Vertical Overlap (in cm):"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); vpmargin = atof(input); case 'M': break; default: printf("Please enter L, P, O , or M.\n"); break; } pagex = ((double)n * (paperx-hpmargin)+hpmargin); pagey = ((double)m * (papery-vpmargin)+vpmargin); if (ch == 'M') break; countup(&loopcount, 10); } break; case 'A': clearit(); printf("Should interior node positions:\n"); printf(" be Intermediate between their immediate descendants,\n"); printf(" Weighted average of tip positions\n"); printf(" Centered among their ultimate descendants\n"); printf(" iNnermost of immediate descendants\n"); printf(" or so that tree is V-shaped\n"); loopcount = 0; do { printf(" (type I, W, C, N or V):\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); countup(&loopcount, 10); } while (ch != 'I' && ch != 'W' && ch != 'C' && ch != 'N' && ch != 'V'); switch (ch) { case 'W': nodeposition = weighted; break; case 'I': nodeposition = intermediate; break; case 'C': nodeposition = centered; break; case 'N': nodeposition = inner; break; case 'V': nodeposition = vshaped; break; } break; case 'F': if (plotter == fig){ for (i=0;i<34;++i) printf("%s\n",figfontname(i)); loopcount = 0; for (;;){ printf("Fontname:"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(fontname); if 
(isfigfont(fontname)) break; printf("Invalid font name for fig.\n"); printf("Enter one of the following fonts or \"Hershey\" for default" " font\n"); countup(&loopcount, 10); } } else { printf("Enter font name or \"Hershey\" for the default font\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(fontname); } break; case 'Q': clearit(); loopcount = 0; do { printf("Italic? (Y/N)\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); input[0] = toupper(input[0]); countup(&loopcount, 10); } while (input[0] != 'Y' && input[0] != 'N'); pictitalic = (input[0] == 'Y'); loopcount = 0; do { printf("Bold? (Y/N)\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); input[0] = toupper(input[0]); countup(&loopcount, 10); } while (input[0] != 'Y' && input[0] != 'N'); pictbold = (input[0] == 'Y'); loopcount = 0; do { printf("Shadow? (Y/N)\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); input[0] = toupper(input[0]); countup(&loopcount, 10); } while (input[0] != 'Y' && input[0] != 'N'); pictshadow = (input[0] == 'Y'); loopcount = 0; do { printf("Outline? 
(Y/N)\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); input[0] = toupper(input[0]); countup(&loopcount, 10); } while (input[0] != 'Y' && input[0] != 'N'); pictoutline = (input[0] == 'Y');break; } } /* getparms */ void calctraverse(node *p, double lengthsum, double *tipx) { /* traverse to establish initial node coordinates */ double x1, y1, x2, y2, x3, x4, x5, w1, w2, sumwx, sumw, nodeheight; node *pp, *plast, *panc; if (p == root) nodeheight = 0.0; else if (uselengths) nodeheight = lengthsum + fabs(p->oldlen); else nodeheight = 1.0; if (nodeheight > maxheight) maxheight = nodeheight; if (p->tip) { p->xcoord = *tipx; p->tipsabove = 1; if (uselengths) p->ycoord = nodeheight; else p->ycoord = 1.0; *tipx += tipspacing; return; } sumwx = 0.0; sumw = 0.0; p->tipsabove = 0; pp = p->next; x3 = 0.0; do { calctraverse(pp->back, nodeheight, tipx); p->tipsabove += pp->back->tipsabove; sumw += pp->back->tipsabove; sumwx += pp->back->tipsabove * pp->back->xcoord; if (fabs(pp->back->xcoord - 0.5) < fabs(x3 - 0.5)) x3 = pp->back->xcoord; plast = pp; pp = pp->next; } while (pp != p); x1 = p->next->back->xcoord; x2 = plast->back->xcoord; y1 = p->next->back->ycoord; y2 = plast->back->ycoord; switch (nodeposition) { case weighted: w1 = y1 - p->ycoord; w2 = y2 - p->ycoord; if (w1 + w2 <= 0.0) p->xcoord = (x1 + x2) / 2.0; else p->xcoord = (w2 * x1 + w1 * x2) / (w1 + w2); break; case intermediate: p->xcoord = (x1 + x2) / 2.0; break; case centered: p->xcoord = sumwx / sumw; break; case inner: p->xcoord = x3; break; case vshaped: if (iteration > 1) { if (!(p == root)) { panc = nodep[p->back->index-1]; w1 = p->ycoord - panc->ycoord; w2 = y1 - p->ycoord; if (w1+w2 < 0.000001) x4 = (x1+panc->xcoord)/2.0; else x4 = (w1*x1+w2*panc->xcoord)/(w1+w2); w2 = y2 - p->ycoord; if (w1+w2 < 0.000001) x5 = (x2+panc->xcoord)/2.0; else x5 = (w1*x2+w2*panc->xcoord)/(w1+w2); if (panc->xcoord < p->xcoord) p->xcoord = x5; else p->xcoord = x4; } else { if ((y1-2*p->ycoord+y2) < 0.000001) 
p->xcoord = (x1+x2)/2; else p->xcoord = ((y2-p->ycoord)*x1+(y1-p->ycoord))/(y1-2*p->ycoord+y2); } } break; } if (uselengths) { p->ycoord = nodeheight; return; } if (nodeposition != inner) { p->ycoord = (y1 + y2 - sqrt((y1 + y2) * (y1 + y2) - 4 * (y1 * y2 - (x2 - p->xcoord) * (p->xcoord - x1)))) / 2.0; /* this formula comes from the requirement that the vector from (x,y) to (x1,y1) be at right angles to that from (x,y) to (x2,y2) */ return; } if (fabs(x1 - 0.5) > fabs(x2 - 0.5)) { p->ycoord = y1 + x1 - x2; w1 = y2 - p->ycoord; } else { p->ycoord = y2 + x1 - x2; w1 = y1 - p->ycoord; } if (w1 < epsilon) p->ycoord -= fabs(x1 - x2); } /* calctraverse */ void calculate() { /* compute coordinates for tree */ double tipx; double sum, temp, maxtextlength, maxfirst=0, leftfirst, angle; double lef = 0.0, rig = 0.0, top = 0.0, bot = 0.0; double *firstlet, *textlength; long i; firstlet = (double *)Malloc(nextnode*sizeof(double)); textlength = (double *)Malloc(nextnode*sizeof(double)); for (i = 0; i < nextnode; i++) { nodep[i]->xcoord = 0.0; nodep[i]->ycoord = 0.0; if (nodep[i]->naymlength > 0) firstlet[i] = lengthtext(nodep[i]->nayme, 1L,fontname,font); else firstlet[i] = 0.0; } i = 0; do i++; while (!nodep[i]->tip); leftfirst = firstlet[i]; maxheight = 0.0; maxtextlength = 0.0; //printf("fontname: %s\n",fontname); for (i = 0; i < nextnode; i++) { if (nodep[i]->tip) { textlength[i] = lengthtext(nodep[i]->nayme, nodep[i]->naymlength, fontname, font); //printf("i: %li textlength[i]: %f nodep[i]->nayme: %s nodep[i]->naymlength: %li\n", i, textlength[i], nodep[i]->nayme, nodep[i]->naymlength); if (textlength[i]-0.5*firstlet[i] > maxtextlength) { maxtextlength = textlength[i]-0.5*firstlet[i]; maxfirst = firstlet[i]; } } } //printf("maxtextlength: %f maxfirst: %f\n", maxtextlength, maxfirst); maxtextlength = maxtextlength + 0.5*maxfirst; fontheight = heighttext(font,fontname); if (style == circular) { if (grows == vertical) angle = pi / 2.0; else angle = 2.0*pi; } else angle = pi * 
labelrotation / 180.0; maxtextlength /= fontheight; maxfirst /= fontheight; leftfirst /= fontheight; for (i = 0; i < nextnode; i++) { if (nodep[i]->tip) { textlength[i] /= fontheight; firstlet[i] /= fontheight; } } if (spp > 1) labelheight = 1.0 / (nodespace * (spp - 1)); else labelheight = 1.0 / nodespace; if (angle < pi / 6.0) tipspacing = (nodespace + cos(angle) * (maxtextlength - 0.5*maxfirst)) * labelheight; else if (spp > 1) { if (style == circular) { tipspacing = 1.0 / spp; } else tipspacing = 1.0 / (spp - 1.0); } else tipspacing = 1.0; finished = false; iteration = 1; do { if (style == circular) tipx = 1.0/(2.0*(double)spp); else tipx = 0.0; sum = 0.0; calctraverse(root, sum, &tipx); iteration++; } while ((nodeposition == vshaped) && (iteration < 4*spp)); rooty = root->ycoord; labelheight *= 1.0 - stemlength; for (i = 0; i < nextnode; i++) { if (rescaled) { if (style != circular) nodep[i]->xcoord *= 1.0 - stemlength; nodep[i]->ycoord = stemlength * treedepth + (1.0 - stemlength) * treedepth * (nodep[i]->ycoord - rooty) / (maxheight - rooty); nodep[i]->oldtheta = angle; } else { nodep[i]->xcoord = nodep[i]->xcoord * (maxheight - rooty) / treedepth; nodep[i]->ycoord = stemlength / (1 - stemlength) * (maxheight - rooty) + nodep[i]->ycoord; nodep[i]->oldtheta = angle; } } topoflabels = 0.0; bottomoflabels = 0.0; leftoflabels = 0.0; rightoflabels = 0.0; if (style == circular) { for (i = 0; i < nextnode; i++) { temp = nodep[i]->xcoord; if (grows == vertical) { nodep[i]->xcoord = (1.0+nodep[i]->ycoord * cos((1.5-2.0*temp)*pi)/treedepth)/2.0; nodep[i]->ycoord = (1.0+nodep[i]->ycoord * sin((1.5-2.0*temp)*pi)/treedepth)/2.0; nodep[i]->oldtheta = (1.5-2.0*temp)*pi; } else { nodep[i]->xcoord = (1.0+nodep[i]->ycoord * cos((1.0-2.0*temp)*pi)/treedepth)/2.0; nodep[i]->ycoord = (1.0+nodep[i]->ycoord * sin((1.0-2.0*temp)*pi)/treedepth)/2.0; nodep[i]->oldtheta = (1.0-2.0*temp)*pi; } } tipspacing *= 2.0*pi; } maxx = nodep[0]->xcoord; maxy = nodep[0]->ycoord; minx = 
nodep[0]->xcoord; if (style == circular) miny = nodep[0]->ycoord; else miny = 0.0; for (i = 1; i < nextnode; i++) { if (nodep[i]->xcoord > maxx) maxx = nodep[i]->xcoord; if (nodep[i]->ycoord > maxy) maxy = nodep[i]->ycoord; if (nodep[i]->xcoord < minx) minx = nodep[i]->xcoord; if (nodep[i]->ycoord < miny) miny = nodep[i]->ycoord; } if (style == circular) { for (i = 0; i < nextnode; i++) { if (nodep[i]->tip) { angle = nodep[i]->oldtheta; if (cos(angle) < 0.0) angle -= pi; top = (nodep[i]->ycoord - maxy) / labelheight + sin(nodep[i]->oldtheta); rig = (nodep[i]->xcoord - maxx) / labelheight + cos(nodep[i]->oldtheta); bot = (miny - nodep[i]->ycoord) / labelheight - sin(nodep[i]->oldtheta); lef = (minx - nodep[i]->xcoord) / labelheight - cos(nodep[i]->oldtheta); if (cos(nodep[i]->oldtheta) > 0) { if (sin(angle) > 0.0) top += sin(angle) * textlength[i]; top += sin(angle - 1.25 * pi) * gap * firstlet[i]; if (sin(angle) < 0.0) bot -= sin(angle) * textlength[i]; bot -= sin(angle - 0.75 * pi) * gap * firstlet[i]; if (sin(angle) > 0.0) rig += cos(angle - 0.75 * pi) * gap * firstlet[i]; else rig += cos(angle - 1.25 * pi) * gap * firstlet[i]; rig += cos(angle) * textlength[i]; if (sin(angle) > 0.0) lef -= cos(angle - 1.25 * pi) * gap * firstlet[i]; else lef -= cos(angle - 0.75 * pi) * gap * firstlet[i]; } else { if (sin(angle) < 0.0) top -= sin(angle) * textlength[i]; top += sin(angle + 0.25 * pi) * gap * firstlet[i]; if (sin(angle) > 0.0) bot += sin(angle) * textlength[i]; bot -= sin(angle - 0.25 * pi) * gap * firstlet[i]; if (sin(angle) > 0.0) rig += cos(angle - 0.25 * pi) * gap * firstlet[i]; else rig += cos(angle + 0.25 * pi) * gap * firstlet[i]; if (sin(angle) < 0.0) rig += cos(angle) * textlength[i]; if (sin(angle) > 0.0) lef -= cos(angle + 0.25 * pi) * gap * firstlet[i]; else lef -= cos(angle - 0.25 * pi) * gap * firstlet[i]; lef += cos(angle) * textlength[i]; } } if (top > topoflabels) topoflabels = top; if (bot > bottomoflabels) bottomoflabels = bot; if (rig > 
rightoflabels) rightoflabels = rig; if (lef > leftoflabels) leftoflabels = lef; } topoflabels *= labelheight; bottomoflabels *= labelheight; leftoflabels *= labelheight; rightoflabels *= labelheight; } if (style != circular) { //printf("labelheight: %f angle: %f maxtextlength: %f maxfirst: %f\n", labelheight, angle, maxtextlength, maxfirst); topoflabels = labelheight * (1.0 + sin(angle) * (maxtextlength - 0.5 * maxfirst) + cos(angle) * 0.5 * maxfirst); rightoflabels = labelheight * (cos(angle) * (textlength[nextnode-1] - 0.5 * maxfirst) + sin(angle) * 0.5); leftoflabels = labelheight * (cos(angle) * leftfirst * 0.5 + sin(angle) * 0.5); } rooty = miny; free(firstlet); free(textlength); } /* calculate */ void rescale() { /* compute coordinates of tree for plot device */ long i; double treeheight, treewidth, extrax, extray, temp; //printf("maxy: %f miny: %f maxx: %f minx: %f\n", maxy, miny, maxx, minx); treeheight = maxy - miny; treewidth = maxx - minx; if (style == circular) { treewidth = 1.0; treeheight = 1.0; if (!rescaled) { if (uselengths) { labelheight *= (maxheight - rooty) / treedepth; topoflabels *= (maxheight - rooty) / treedepth; bottomoflabels *= (maxheight - rooty) / treedepth; leftoflabels *= (maxheight - rooty) / treedepth; rightoflabels *= (maxheight - rooty) / treedepth; treewidth *= (maxheight - rooty) / treedepth; } } } treewidth += rightoflabels + leftoflabels; treeheight += topoflabels + bottomoflabels; //printf("rightoflabels: %f leftoflabels: %f topoflabels: %f bottomoflabels: %f treewidth: %f treeheight: %f\n", rightoflabels, leftoflabels, topoflabels, bottomoflabels, treewidth, treeheight); if (grows == vertical) { if (!rescaled) expand = bscale; else { expand = (xsize - 2 * xmargin) / treewidth; if ((ysize - 2 * ymargin) / treeheight < expand) expand = (ysize - 2 * ymargin) / treeheight; } extrax = (xsize - 2 * xmargin - treewidth * expand) / 2.0; extray = (ysize - 2 * ymargin - treeheight * expand) / 2.0; } else { if (!rescaled) expand = 
bscale; else { expand = (ysize - 2 * ymargin) / treewidth; if ((xsize - 2 * xmargin) / treeheight < expand) expand = (xsize - 2 * xmargin) / treeheight; } extrax = (xsize - 2 * xmargin - treeheight * expand) / 2.0; extray = (ysize - 2 * ymargin - treewidth * expand) / 2.0; } //printf("in rescale rescaled: %i treewidth: %f treeheight: %f expand: %f\n", rescaled, treewidth, treeheight, expand); //printf("xsize: %f xmargin: %f ysize: %f ymargin: %f\n", xsize, xmargin, ysize, ymargin); for (i = 0; i < nextnode; i++) { //printf("before rescale i: %li, nodep[i]: %f, %f\n", i, nodep[i]->xcoord, nodep[i]->ycoord); nodep[i]->xcoord = expand * (nodep[i]->xcoord + leftoflabels); nodep[i]->ycoord = expand * (nodep[i]->ycoord + bottomoflabels); if ((style != circular) && (grows == horizontal)) { temp = nodep[i]->ycoord; nodep[i]->ycoord = expand * treewidth - nodep[i]->xcoord; nodep[i]->xcoord = temp; } nodep[i]->xcoord += xmargin + extrax; nodep[i]->ycoord += ymargin + extray; //printf("after rescale i: %li, nodep[i]: %f, %f\n", i, nodep[i]->xcoord, nodep[i]->ycoord); } if (style == circular) { xx0 = xmargin+extrax+expand*(leftoflabels + 0.5); yy0 = ymargin+extray+expand*(bottomoflabels + 0.5); } else if (grows == vertical) rooty = ymargin + extray; else rootx = xmargin + extrax; } /* rescale */ void plottree(node *p, node *q) { /* plot part or all of tree on the plotting device */ long i; double x00=0, y00=0, x1, y1, x2, y2, x3, y3, x4, y4, cc, ss, f, g, fract=0, minny, miny; node *pp; //printf("p: %p q: %p xscale: %f xoffset: %f yscale: %f yoffset %f\n", p, q, xscale, xoffset, yscale, yoffset); // fflush(stdout); x2 = xscale * (xoffset + p->xcoord); y2 = yscale * (yoffset + p->ycoord); if (style == circular) { x00 = xscale * (xx0 + xoffset); y00 = yscale * (yy0 + yoffset); } if (p != root) { x1 = xscale * (xoffset + q->xcoord); y1 = yscale * (yoffset + q->ycoord); //printf("branch p: %f, %f q: %f, %f\n",p->xcoord, p->ycoord, q->xcoord, q->ycoord); //fflush(stdout); switch 
(style) { case cladogram: plot(penup, x1, y1); plot(pendown, x2, y2); break; case phenogram: plot(penup, x1, y1); if (grows == vertical) plot(pendown, x2, y1); else plot(pendown, x1, y2); plot(pendown, x2, y2); break; case curvogram: plot(penup, x1, y1) ; curvespline(x1,y1,x2,y2,(boolean)(grows == vertical),20); break; case eurogram: plot(penup, x1, y1); if (grows == vertical) plot(pendown, x2, (2 * y1 + y2) / 3); else plot(pendown, (2 * x1 + x2) / 3, y2); plot(pendown, x2, y2); break; case swoopogram: plot(penup, x1, y1); if ((grows == vertical && fabs(y1 - y2) >= epsilon) || (grows == horizontal && fabs(x1 - x2) >= epsilon)) { if (grows == vertical) miny = p->ycoord; else miny = p->xcoord; pp = q->next; while (pp != q) { if (grows == vertical) minny = pp->back->ycoord; else minny = pp->back->xcoord; if (minny < miny) miny = minny; pp = pp->next; } if (grows == vertical) miny = yscale * (yoffset + miny); else miny = xscale * (xoffset + miny); if (grows == vertical) fract = 0.3333 * (miny - y1) / (y2 - y1); else fract = 0.3333 * (miny - x1) / (x2 - x1); } swoopspline(x1,y1,x1+fract*(x2-x1),y1+fract*(y2-y1), x2,y2,(boolean)(grows != vertical),segments); break; case circular: plot(penup, x1, y1); if (fabs(x1-x00)+fabs(y1-y00) > 0.00001) { g = ((x1-x00)*(x2-x00)+(y1-y00)*(y2-y00)) /sqrt(((x1-x00)*(x1-x00)+(y1-y00)*(y1-y00)) *((x2-x00)*(x2-x00)+(y2-y00)*(y2-y00))); if (g > 1.0) g = 1.0; if (g < -1.0) g = -1.0; f = acos(g); if 
((x2-x00)*(y1-y00)>(x1-x00)*(y2-y00)) f = -f; if (fabs(g-1.0) > 0.0001) { cc = cos(f/segments); ss = sin(f/segments); x3 = x1; y3 = y1; for (i = 1; i <= segments; i++) { x4 = x00 + cc*(x3-x00) - ss*(y3-y00); y4 = y00 + ss*(x3-x00) + cc*(y3-y00); x3 = x4; y3 = y4; plot(pendown, x3, y3); } } } plot(pendown, x2, y2); break; } } else { if (style == circular) { x1 = x00; y1 = y00; } else { if (grows == vertical) { x1 = xscale * (xoffset + p->xcoord); y1 = yscale * (yoffset + rooty); } else { x1 = xscale * (xoffset + rootx); y1 = yscale * (yoffset + p->ycoord); } } //printf("root p: %f, %f q: %f, %f\n",x2, y2, x1, y1); //fflush(stdout); plot(penup, x1, y1); plot(pendown, x2, y2); } if (p->tip) return; pp = p->next; while (pp != p) { plottree(pp->back, p); pp = pp->next; } } /* plottree */ void plotlabels(char *fontname) { long i; double compr, dx = 0, dy = 0, labangle, cosl, sinl, cosv, sinv, vec; boolean left, right; node *lp; double *firstlet; firstlet = (double *)Malloc(nextnode*sizeof(double)); textlength = (double *)Malloc(nextnode*sizeof(double)); compr = xunitspercm / yunitspercm; if (penchange == yes) changepen(labelpen); for (i = 0; i < nextnode; i++) { if (nodep[i]->tip) { lp = nodep[i]; firstlet[i] = lengthtext(nodep[i]->nayme,1L,fontname,font) /fontheight; textlength[i] = lengthtext(nodep[i]->nayme, nodep[i]->naymlength, fontname, font)/fontheight; labangle = nodep[i]->oldtheta; if (cos(labangle) < 0.0) labangle += pi; cosl = cos(labangle); sinl = sin(labangle); vec = sqrt(1.0+firstlet[i]*firstlet[i]); cosv = firstlet[i]/vec; sinv = 1.0/vec; if (style == circular) { right = cos(nodep[i]->oldtheta) > 0.0; left = !right; if (right) { dx = labelheight * expand * cos(nodep[i]->oldtheta); dy = labelheight * expand * sin(nodep[i]->oldtheta); dx += labelheight * expand * 0.5 * vec * (-cosl*sinv+sinl*cosv); dy += labelheight * expand * 0.5 * vec * (-sinl*sinv-cosl*cosv); } if (left) { dx = labelheight * expand * cos(nodep[i]->oldtheta); dy = labelheight * expand * 
sin(nodep[i]->oldtheta); dx -= labelheight * expand * textlength[i] * cosl; dy -= labelheight * expand * textlength[i] * sinl; dx += labelheight * expand * 0.5 * vec * (cosl*cosv+sinl*sinv); dy += labelheight * expand * 0.5 * vec * (-sinl*cosv-cosl*sinv); } } else { dx = labelheight * expand * cos(nodep[i]->oldtheta); dy = labelheight * expand * sin(nodep[i]->oldtheta); dx -= labelheight * expand * 0.5 * firstlet[i] * (cosl-sinl*sinv); dy -= labelheight * expand * 0.5 * firstlet[i] * (sinl+cosl*sinv); } if (style == circular) { plottext(lp->nayme, lp->naymlength, labelheight * expand * xscale / compr, compr, xscale * (lp->xcoord + dx + xoffset), yscale * (lp->ycoord + dy + yoffset), 180 * (-labangle) / pi, font,fontname); } else { if (grows == vertical) plottext(lp->nayme, lp->naymlength, labelheight * expand * xscale / compr, compr, xscale * (lp->xcoord + dx + xoffset), yscale * (lp->ycoord + dy + yoffset), -labelrotation, font,fontname); else plottext(lp->nayme, lp->naymlength, labelheight * expand * yscale, compr, xscale * (lp->xcoord + dy + xoffset), yscale * (lp->ycoord - dx + yoffset), 90.0 - labelrotation, font,fontname); } } } if (penchange == yes) changepen(treepen); free(firstlet); free(textlength); } /* plotlabels */ void setup_environment(Char *argv[]) { boolean firsttree; /* Set up all kinds of fun stuff */ #ifdef TURBOC if ((registerbgidriver(EGAVGA_driver) <0) || (registerbgidriver(Herc_driver) <0) || (registerbgidriver(CGA_driver) <0)){ printf("Graphics error: %s ",grapherrormsg(graphresult())); exit(-1);} #endif /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree,INTREE,"input tree file", "rb",argv[0],trefilename); printf("DRAWGRAM from PHYLIP version %s\n",VERSION); printf("Reading tree ... 
\n"); firsttree = true; allocate_nodep(&nodep, &intree, &spp); treeread (intree, &root, treenode, &goteof, &firsttree, nodep, &nextnode, &haslengths, &grbg, initdrawgramnode,true,-1); root->oldlen = 0.0; printf("Tree has been read.\n"); printf("Loading the font .... \n"); loadfont(font,FONTFILE,argv[0]); printf("Font loaded.\n"); ansi = ANSICRT; ibmpc = IBMCRT; firstscreens = true; initialparms(); canbeplotted = false; } /* setup_environment */ void user_loop() { char input_char; long stripedepth; while (!canbeplotted) { do { input_char=showparms(); firstscreens = false; if (input_char != 'Y') getparms(input_char); } while (input_char != 'Y'); #ifdef JAVADEBUG // JRM Debug printf("\n***internal parameters***\n"); printf("fontname: %s\n", fontname); printf("plotter: %i\n",plotter); printf("paperx: %f\n",paperx); printf("papery: %f\n",papery); printf("hpmargin: %f\n",hpmargin); printf("vpmargin: %f\n",vpmargin); printf("pagex: %f\n",pagex); printf("pagey: %f\n",pagey); printf("xmargin: %f\n",xmargin); printf("ymargin: %f\n",ymargin); printf("firstscreens: %i\n",firstscreens); printf("canbeplotted: %i\n",canbeplotted); printf("style: %i\n",style); printf("grows: %i\n",grows); printf("labelrotation: %f\n",labelrotation); printf("nodespace: %f\n",nodespace); printf("stemlength: %f\n",stemlength); printf("treedepth: %f\n",treedepth); printf("bscale: %f\n",bscale); printf("uselengths: %i\n",uselengths); printf("nodeposition: %i\n",nodeposition); printf("dotmatrix: %i\n",dotmatrix); printf("xsize: %f\n", xsize); printf("ysize: %f\n", ysize); fflush(stdout); #endif if (dotmatrix) { stripedepth = allocstripe(stripe,(strpwide/8), ((long)(yunitspercm * ysize))); //printf("strpwide: %li, yunitspercm: %f, ysize: %f\n",strpwide, yunitspercm, ysize); //printf("stripedepth: %li\n", stripedepth); //fflush(stdout); strpdeep = stripedepth; strpdiv = stripedepth; } plotrparms(spp); //printf("after plotrparms\n"); //printf("strpdeep: %li, yunitspercm: %f, ysize: %f\n",strpdeep, 
yunitspercm, ysize); //fflush(stdout); numlines = dotmatrix ? ((long)floor(yunitspercm * ysize + 0.5) / strpdeep) :1; //printf("dotmatrix: %i numlines: %li\n",dotmatrix, numlines); //fflush(stdout); xscale = xunitspercm; yscale = yunitspercm; calculate(); rescale(); canbeplotted = true; } } /* user_loop */ void drawgram( char *intreename, char *fontfilename, char *plotfilename, char *plotfileopt, char *treegrows, char *stylekind, int usebranchlengths, double labelangle, int scalebranchlength, double branchscale, double breadthdepthratio, double stemltreedratio, double chhttipspratio, double xmarginratio, double ymarginratio, char *ancnodes, int dofinalplot, char* finalplotkind) { //printf("hello from DrawGram\n"); // setup before data is translated char *previewfilename = "JavaPreview.ps"; // from init javarun = true; ansi = ANSICRT; ibmpc = IBMCRT; firstscreens = false; canbeplotted = true; dotmatrix = false; boolean wasplotted = false; grbg = NULL; progname = "Drawgram"; initialparms(); // translate Java doubles to Drawgram local variables labelrotation = labelangle; stemlength = stemltreedratio; treedepth = breadthdepthratio; bscale = branchscale; nodespace = 1.0 / chhttipspratio; xmargin = xmarginratio * paperx; ymargin = ymarginratio * papery; // translate Java boolean to Phylip boolean boolean doplot; if (dofinalplot != 0) doplot = true; else doplot = false; if (usebranchlengths != 0) uselengths = true; else uselengths = false; if (scalebranchlength != 0) rescaled = true; else rescaled = false; // plot style style = phenogram; if (!strcmp(stylekind,"cladogram")) style = cladogram; if (!strcmp(stylekind,"phenogram")) style = phenogram; if (!strcmp(stylekind,"curvogram")) style = curvogram; if (!strcmp(stylekind,"eurogram")) style = eurogram; if (!strcmp(stylekind,"swoopogram")) style = swoopogram; if (!strcmp(stylekind,"circular")) style = circular; // tree growth direction grows = horizontal; if (!strcmp(treegrows,"vertical")) grows = vertical; // node 
positions if (!strcmp(ancnodes,"weighted")) nodeposition = weighted; if (!strcmp(ancnodes,"intermediate")) nodeposition = intermediate; if (!strcmp(ancnodes,"centered")) nodeposition = centered; if (!strcmp(ancnodes,"inner")) nodeposition = inner; if (!strcmp(ancnodes,"vshaped")) nodeposition = vshaped; plotter = lw; strcpy(afmfile,"none"); if(dofinalplot) { if (!strcmp(finalplotkind,"lw")) { plotter = lw; } /* // not currently available if(!strcmp(finalplotkind,"svg")) { plotter = svg; } */ else if(!strcmp(finalplotkind,"hp")) { plotter = hp; } if(!strcmp(finalplotkind,"tek")) { plotter = tek; } if(!strcmp(finalplotkind,"ibm")) { plotter = ibm; } if(!strcmp(finalplotkind,"mac")) { plotter = mac; } if(!strcmp(finalplotkind,"houston")) { plotter = houston; } if(!strcmp(finalplotkind,"decregis")) { plotter = decregis; } if(!strcmp(finalplotkind,"epson")) { plotter = epson; } if(!strcmp(finalplotkind,"oki")) { plotter = oki; } if(!strcmp(finalplotkind,"fig")) { plotter = fig; } if(!strcmp(finalplotkind,"citoh")) { plotter = citoh; } if(!strcmp(finalplotkind,"toshiba")) { plotter = toshiba; } if(!strcmp(finalplotkind,"pcx")) { plotter = pcx; } if(!strcmp(finalplotkind,"pcl")) { plotter = pcl; } if(!strcmp(finalplotkind,"pict")) { plotter = pict; } if(!strcmp(finalplotkind,"ray")) { plotter = ray; } if(!strcmp(finalplotkind,"pov")) { plotter = pov; } if(!strcmp(finalplotkind,"xbm")) { plotter = xbm; } if(!strcmp(finalplotkind,"bmp")) { plotter = bmp; } if(!strcmp(finalplotkind,"gif")) { plotter = gif; } if(!strcmp(finalplotkind,"idraw")) { plotter = idraw; } if(!strcmp(finalplotkind,"vrml")) { plotter = vrml; } if(!strcmp(finalplotkind,"other")) { plotter = other; } } else { // preview plotter = lw; // hardwired to ps } // choose the best Hershey approximation of the font int fontid = figfontid(fontfilename); //printf("fontfilename: %s id: %i ", fontfilename, fontid); char* hersheyfontname = "font1"; switch (fontid) { case 4: //AvantGarde-Book case 8: //Bookman-Light 
case 16: //Helvetica case 20: //Helvetica-Narrow hersheyfontname = "font1"; // romans sans break; case 6: //AvantGarde-Demi case 10: //Bookman-Demi case 18: //Helvetica-Bold case 22: //Helvetica-Narrow-Bold hersheyfontname = "font2"; // romans sans bold break; case 0: //Times-Roman case 2: //Times-Bold case 12: //Courier case 14: //Courier-Bold case 24: //NewCenturySchlbk-Roman case 26: //NewCenturySchlbk-Bold case 28: //Palatino-Roman case 30: //Palatino-Bold hersheyfontname = "font3"; // romans serif break; case 1: //Times-Italic case 5: //AvantGarde-BookOblique case 9: //Bookman-LightItalic case 13: //Courier-Italic case 17: //Helvetica-Oblique case 21: //Helvetica-Narrow-Oblique case 25: //NewCenturySchlbk-Italic case 29: //Palatino-Italic case 33: //ZapfChancery-MediumItalic hersheyfontname = "font4"; // romans serif italic break; case 3: //Times-BoldItalic case 7: //AvantGarde-DemiOblique case 11: //Bookman-DemiItalic case 15: //Courier-BoldItalic case 19: //Helvetica-BoldOblique case 23: //Helvetica-Narrow-BoldOblique case 27: //NewCenturySchlbk-BoldItalic case 31: //Palatino-BoldItalic hersheyfontname = "font5"; // romans serif italic bold break; default: /* case 32: //Symbol case 34: //ZapfDingbats */ hersheyfontname = "font1"; // romans sans break; } //printf("hersheyfontname: %s\n", hersheyfontname); loadfont(font,hersheyfontname,progname); #ifdef JAVADEBUG // JRM Debug // dump translated parameters from Java to make sure they made it printf("Starting parameters: \n"); printf("intreename: %s\n",intreename); printf("fontfilename: %s\n",fontfilename); printf("plotfilename: %s\n",plotfilename); printf("plotfileopt: %s\n",plotfileopt); printf("previewfilename: %s\n",previewfilename); printf("rescaled: %i\n",rescaled); printf("finalplotkind: %s\n",finalplotkind); printf("doplot: %i\n",doplot); fflush(stdout); #endif // start drawgram code /* printf("hello from DrawGram\n"); printf("DRAWGRAM from PHYLIP version %s\n",VERSION); fflush(stdout); */ // read tree 
intree = fopen(intreename,"r"); boolean firsttree = true; allocate_nodep(&nodep, &intree, &spp); plotrparms(spp); treeread (intree, &root, treenode, &goteof, &firsttree, nodep, &nextnode, &haslengths, &grbg, initdrawgramnode,true,-1); root->oldlen = 0.0; // if there are no lengths, shut uselengths off if (haslengths==0) uselengths = false; //printf("Tree has been read.\n"); //printf("haslengths: %i\n",haslengths); //fflush(stdout); // printer specific details to simulate interactive setup // mostly from getplotter with some from plotrparms // after treeread as some formats need some of the tree data dotmatrix = false; long stripedepth; switch (plotter) { case lw: strcpy(fontname, fontfilename); break; case hp: strcpy(fontname, "Hershey"); break; case tek: strcpy(fontname, "Hershey"); break; case ibm: strcpy(fontname, "Hershey"); break; case mac: strcpy(fontname, "Hershey"); break; case houston: strcpy(fontname, "Hershey"); break; case decregis: strcpy(fontname, "Hershey"); break; case epson: dotmatrix = true; strcpy(fontname, "Hershey"); strpdiv = 1; stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize))); break; case oki: dotmatrix = true; strcpy(fontname, "Hershey"); strpdiv = 1; stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize))); break; case fig: strcpy(fontname, fontfilename); break; case citoh: dotmatrix = true; strcpy(fontname, "Hershey"); strpdiv = 1; stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize))); break; case toshiba: dotmatrix = true; strcpy(fontname, "Hershey"); strpdiv = 4; stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize))); break; case pcx: dotmatrix = true; strcpy(fontname, "Hershey"); // assume VGA 1024 x 768 strpwide = 1024; yunitspercm = 768 / ysize; resopts = 3; strpdiv = 10; stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize))); break; case pcl: dotmatrix = true; strcpy(fontname, "Hershey"); // assume 300 dpi hpresolution = 
300; xunitspercm = 118.11023622; /* 300 DPI = 118.1 DPC */ yunitspercm = 118.11023622; strpwide = 2550; /* 8.5 * 300 DPI */ strpdeep = DEFAULT_STRIPE_HEIGHT; /* height of the strip */ strpdiv = DEFAULT_STRIPE_HEIGHT; /* in this case == strpdeep */ stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize))); break; case pict: strcpy(fontname, fontfilename); break; case ray: strcpy(fontname, "Hershey"); break; case pov: strcpy(fontname, "Hershey"); break; case xbm: dotmatrix = true; strcpy(fontname, "Hershey"); // assume 1000 x 1000 xunitspercm = 1.0; yunitspercm = 1.0; xsize = 1000; ysize = 1000; xmargin = 0.08 * xsize; ymargin = 0.08 * ysize; strpdeep = DEFAULT_STRIPE_HEIGHT; strpdiv = DEFAULT_STRIPE_HEIGHT; strpwide = (long)xsize; stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize))); break; case bmp: dotmatrix = true; strcpy(fontname, "Hershey"); // assume 1000 x 1000 xunitspercm = 1.0; yunitspercm = 1.0; xsize = 1000; ysize = 1000; xmargin = 0.08 * xsize; ymargin = 0.08 * ysize; strpdeep = DEFAULT_STRIPE_HEIGHT; strpdiv = DEFAULT_STRIPE_HEIGHT; strpwide = (long)xsize; stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize))); /* printf("***bmp setup done\n"); printf("xsize: %f\n", xsize); printf("ysize: %f\n", ysize); printf("xmargin: %f\n",xmargin); printf("ymargin: %f\n",ymargin); printf("strpwide: %li, yunitspercm: %f, ysize: %f\n" ,strpwide, yunitspercm, ysize); fflush(stdout); */ break; case gif: strcpy(fontname, fontfilename); break; case idraw: strcpy(fontname, "Times-Bold"); break; case vrml: strcpy(fontname, "Hershey"); treecolor = 5; namecolor = 4; vrmlskycolornear = 6; vrmlskycolorfar = 6; vrmlgroundcolornear = 3; vrmlgroundcolorfar = 3; break; case other: strcpy(fontname, "Hershey"); break; default: break; } numlines = dotmatrix ? 
((long)floor(yunitspercm * ysize + 0.5) / strpdeep) :1; xscale = xunitspercm; yscale = yunitspercm; #ifdef JAVADEBUG // jrmdebug dump final parameter settings printf("\n***internal parameters***\n"); printf("fontname: %s\n", fontname); printf("plotter: %i\n",plotter); printf("paperx: %f\n",paperx); printf("papery: %f\n",papery); printf("hpmargin: %f\n",hpmargin); printf("vpmargin: %f\n",vpmargin); printf("pagex: %f\n",pagex); printf("pagey: %f\n",pagey); printf("xmargin: %f\n",xmargin); printf("ymargin: %f\n",ymargin); printf("firstscreens: %i\n",firstscreens); printf("canbeplotted: %i\n",canbeplotted); printf("style: %i\n",style); printf("grows: %i\n",grows); printf("labelrotation: %f\n",labelrotation); printf("nodespace: %f\n",nodespace); printf("stemlength: %f\n",stemlength); printf("treedepth: %f\n",treedepth); printf("bscale: %f\n",bscale); printf("uselengths: %i\n",uselengths); printf("nodeposition: %i\n",nodeposition); printf("dotmatrix: %i\n",dotmatrix); printf("xsize: %f\n", xsize); printf("ysize: %f\n", ysize); printf("---internal vars---\n"); printf("nodespace: %f\n",nodespace); printf("spp: %li\n",spp); printf("doplot: %i\n",doplot); printf("dotmatrix: %i numlines: %li\n",dotmatrix, numlines); fflush(stdout); #endif calculate(); rescale(); //printf("passed rescale\n"); //fflush(stdout); int lenname; if (doplot) { lenname = strlen(plotfilename); } else { lenname = strlen(previewfilename); } char* usefilename = (char*) malloc(lenname + 1); if (doplot) { strcpy(usefilename, plotfilename); strcpy(pltfilename, plotfilename); } else { strcpy(usefilename, previewfilename); } strcpy(trefilename, intreename); // Open plot files in binary mode. 
plotfile = fopen(usefilename,plotfileopt); if (plotfile == NULL) { printf("ERROR: cannot open plot file \"%s\"\n", usefilename); free(usefilename); fclose(intree); return; } free(usefilename); /* fopen() does not keep the name buffer */ //printf("initializing plotter\n"); //fflush(stdout); initplotter(spp,fontname); //printf("plotter initialized\n"); //printf("numlines: %li\n",numlines); //fflush(stdout); //printf("\nWriting plot file ...\n"); //fflush(stdout); if (!dofinalplot) { /* printf("calling makebox\n"); printf("fontname: %s\n",fontname); printf("xoffset: %f\n",xoffset); printf("yoffset: %f\n",yoffset); printf("scale: %f\n",scale); printf("spp: %li\n",spp); fflush(stdout); */ changepen(labelpen); makebox(fontname,&xoffset,&yoffset,&scale,spp); /* printf("back from makebox\n"); printf("fontname: %s\n",fontname); printf("xoffset: %f\n",xoffset); printf("yoffset: %f\n",yoffset); printf("scale: %f\n",scale); printf("spp: %li\n",spp); fflush(stdout); */ changepen(treepen); } drawit(fontname,&xoffset,&yoffset,numlines,root); //printf("drawit done\n"); //fflush(stdout); finishplotter(); //printf("finishplotter done\n"); //fflush(stdout); fclose(plotfile); //printf("\nPlot written to file \"%s\"\n\n", pltfilename); wasplotted = true; fclose(intree); //printf("Drawgram Done.\n\n"); return; } int main(int argc, Char *argv[]) { boolean wasplotted = false; javarun = false; argv[0] = "Drawgram"; grbg = NULL; progname = argv[0]; init(argc, argv); setup_environment(argv); user_loop(); if (!(winaction == quitnow)) { /* Open plot files in binary mode. */ openfile(&plotfile,PLOTFILE,"plot file", "wb",argv[0],pltfilename); initplotter(spp,fontname); numlines = dotmatrix ? 
((long)floor(yunitspercm * ysize + 0.5)/strpdeep) : 1; //printf("numlines: %li\n",numlines); if (plotter != ibm) printf("\nWriting plot file ...\n"); drawit(fontname,&xoffset,&yoffset,numlines,root); finishplotter(); FClose(plotfile); wasplotted = true; printf("\nPlot written to file \"%s\"\n\n", pltfilename); } FClose(intree); printf("Done.\n\n"); #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; }
/* phylip-3.697/src/drawtree.c */
#include "phylip.h" #include "draw.h" /* Version 3.696. Written by Joseph Felsenstein and Christopher A. Meacham. Additional code written by Sean Lamont, Andrew Keefe, Hisashi Horino, Akiko Fuseki, Doug Buxton, Michal Palczewski, and James McGill. Copyright (c) 1986-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #ifdef MAC char* about_message = "Drawtree unrooted tree plotting program\r" "PHYLIP version 3.696.\r" "(c) Copyright 1986-2004 by Joseph Felsenstein.\r" "Written by Joseph Felsenstein and Christopher A. Meacham.\r" "Additional code written by Sean Lamont, Andrew Keefe, Hisashi Horino,\r" "Akiko Fuseki, Doug Buxton and Michal Palczewski.\r"; #endif //#define JAVADEBUG #define GAP 0.5 #define MAXITERATIONS 100 #define MINIMUMCHANGE 0.0001 /*When 2 Nodes are on top of each other, this is the max. 
force that's allowed.*/ #ifdef INFINITY #undef INFINITY #endif #define INFINITY (double) 9999999999.0 typedef enum {fixed, radial, along, middle} labelorient; FILE *plotfile; char pltfilename[FNMLNGTH]; long nextnode, strpwide, strpdeep, strptop, strpbottom, payge, numlines,hpresolution; double xmargin, ymargin, topoflabels, rightoflabels, leftoflabels, bottomoflabels, ark, maxx, maxy, minx, miny, scale, xscale, yscale, xoffset, yoffset, charht, xnow, ynow, xunitspercm, yunitspercm, xsize, ysize, xcorner, ycorner,labelheight, labelrotation, treeangle, expand, bscale, maxchange; boolean canbeplotted, dotmatrix, haslengths, uselengths, regular, rotate, empty, rescaled, notfirst, improve, nbody, firstscreens, labelavoid; boolean pictbold,pictitalic,pictshadow,pictoutline; boolean javarun; striptype stripe; plottertype plotter, oldplotter; growth grows; labelorient labeldirec; node *root, *where; pointarray nodep; pointarray treenode; fonttype font; enum { yes, no } penchange, oldpenchange; char ch,resopts; char *progname; long filesize; long strpdiv; double pagex,pagey,paperx,papery,hpmargin,vpmargin; double *textlength, *firstlet; double trweight; /* starting here, needed to make sccs version happy */ boolean goteof; node *grbg; winactiontype winaction; long maxNumOfIter; struct stackElem { /* This is actually equivalent to a reversed link list; pStackElemBack point toward the direction of the bottom of the stack */ struct stackElem *pStackElemBack; node *pNode; }; typedef struct stackElem stackElemType; #ifndef OLDC /* function prototypes */ void initdrawtreenode(node **, node **, node *, long, long, long *, long *, initops, pointarray, pointarray, Char *, Char *, FILE *); void initialparms(void); char showparms(void); void getparms(char); void getwidth(node *); void plrtrans(node *, double, double, double); void coordtrav(node *, double *, double *); double angleof(double , double ); void polartrav(node *, double, double, double, double, double *, double *, double 
*, double *); void tilttrav(node *, double *, double *, double *, double *); void leftrightangle(node *, double, double); void improvtrav(node *); void force_1to1(node *, node *, double *, double *, double); void totalForceOnNode(node *, node *, double *, double *, double); double dotProduct(double, double, double, double ); double capedAngle(double); double angleBetVectors(double, double, double, double); double signOfMoment(double, double, double, double); double forcePerpendicularOnNode(node *, node *, double); void polarizeABranch(node *, double *, double *); void pushNodeToStack(stackElemType **, node *); void popNodeFromStack(stackElemType **, node **); double medianOfDistance(node *, boolean); void leftRightLimits(node *, double *, double *); void branchLRHelper(node *, node *, double *, double *); void branchLeftRightAngles(node *, double *, double *); void improveNodeAngle(node *, double); void improvtravn(node *); void coordimprov(double *, double *); void calculate(void); void rescale(void); void user_loop(void); void setup_environment(int argc, Char *argv[]); void polarize(node *p, double *xx, double *yy); double vCounterClkwiseU(double Xu, double Yu, double Xv, double Yv); void drawtree(char* intreename, char* plotfilename, char* plotfileopt, char* fontfilename, char* treegrows, int usebranchlengths, char* labeldirecstr, double labelangle, double treerotation, double treearc, char* iterationkind, int iterationcount, int regularizeangles, int avoidlabeloverlap, int branchrescale, double branchscaler, double relcharhgt, double xmarginratio, double ymarginratio, int dofinalplot, char* finalplotkind); /* function prototypes */ #endif void initdrawtreenode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ long i; boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnu(grbg, p); 
    (*p)->index = nodei;
    (*p)->tip = false;
    for (i = 0; i < MAXNCH; i++)
      (*p)->nayme[i] = '\0';
    nodep[(*p)->index - 1] = (*p);
    break;
  case nonbottom:
    gnu(grbg, p);
    (*p)->index = nodei;
    break;
  case tip:
    (*ntips)++;
    gnu(grbg, p);
    nodep[(*ntips) - 1] = *p;
    setupnode(*p, *ntips);
    (*p)->tip = true;
    (*p)->naymlength = len ;
    strncpy ((*p)->nayme, str, MAXNCH);
    break;
  case length:
    processlength(&valyew, &divisor, ch, &minusread, intree, parens);
    if (!minusread)
      (*p)->oldlen = valyew / divisor;
    else
      (*p)->oldlen = fabs(valyew/divisor);
    if ((*p)->oldlen < epsilon)
      (*p)->oldlen = epsilon;
    if ((*p)->back != NULL)
      (*p)->back->oldlen = (*p)->oldlen;
    break;
  case hsnolength:
    haslengths = false;
    break;
  default:        /* cases hslength,iter,treewt,unitrwt */
    break;        /* should not occur */
  }
} /* initdrawtreenode */


void initialparms()
{
  /* initialize parameters */
  paperx = 20.6375;
  pagex = 20.6375;
  papery = 26.9875;
  pagey = 26.9875;
  strcpy(fontname,"Times-Roman");
  plotrparms(spp);
  grows = vertical;
  treeangle = pi / 2.0;
  ark = 2 * pi;
  improve = true;
  nbody = false;
  regular = false;
  rescaled = true;
  bscale = 1.0;
  labeldirec = middle;
  xmargin = 0.08 * xsize;
  ymargin = 0.08 * ysize;
  labelrotation = 0.0;
  charht = 0.3333;
  plotter = DEFPLOTTER;
  hpmargin = 0.02*pagex;
  vpmargin = 0.02*pagey;
  labelavoid = false;
  uselengths = haslengths;
} /* initialparms */


char showparms()
{
  long loopcount;
  char numtochange;
  Char ch,input[64];
  double treea;
  char options[32];

  strcpy(options,"#YN0OPVBLRIDSMC");
  if (strcmp(fontname,"Hershey") !=0 &&
      (((plotter == pict || plotter == mac) &&
        (((grows == vertical && labelrotation == 0.0) ||
          (grows == horizontal && labelrotation == 90.0))))))
    strcat(options,"Q");
  if (plotter == lw || plotter == idraw || plotter == pict || plotter == mac)
    strcat(options,"F");
  if (!improve)
    strcat(options,"GA");
  if (!firstscreens)
    clearit();
  printf("\nUnrooted tree plotting program version %s\n", VERSION);
  putchar('\n');
  printf("Here are the settings: \n\n");
  printf(" 0 Screen type (IBM PC, ANSI)? %s\n", ibmpc ?
"IBM PC" : ansi ? "ANSI" : "(none)"); printf(" P Final plotting device: "); switch (plotter) { case lw: printf(" Postscript printer\n"); break; case pcl: printf(" HP Laserjet compatible printer (%d DPI)\n", (int) hpresolution); break; case epson: printf(" Epson dot-matrix printer\n"); break; case pcx: printf(" PCX file for PC Paintbrush drawing program (%s)\n", (resopts == 1) ? "EGA 640x350" : (resopts == 2) ? "VGA 800x600" : "VGA 1024x768"); break; case pict: printf(" Macintosh PICT file for drawing program\n"); break; case idraw: printf(" Idraw drawing program\n"); break; case fig: printf(" Xfig drawing program\n"); break; case hp: printf(" HPGL graphics language for HP plotters\n"); break; case bmp: printf(" MS-Windows Bitmap (%d by %d resolution)\n", (int)xsize,(int)ysize); break; case xbm: printf(" X Bitmap file format (%d by %d resolution)\n", (int)xsize,(int)ysize); break; case ibm: printf(" IBM PC graphics (CGA, EGA, or VGA)\n"); break; case tek: printf(" Tektronix graphics screen\n"); break; case decregis: printf(" DEC ReGIS graphics (VT240 or DECTerm)\n"); break; case houston: printf(" Houston Instruments plotter\n"); break; case toshiba: printf(" Toshiba 24-pin dot matrix printer\n"); break; case citoh: printf(" Imagewriter or C.Itoh/TEC/NEC 9-pin dot matrix printer\n"); break; case oki: printf(" old Okidata 9-pin dot matrix printer\n"); break; case ray: printf(" Rayshade ray-tracing program file format\n"); break; case pov: printf(" POV ray-tracing program file format\n"); break; case vrml: printf(" VRML, Virtual Reality Markup Language\n"); case mac: case gif: case other: break ; default: /* case not handled */ break; } printf(" (Preview not available)\n"); printf(" B Use branch lengths: "); if (haslengths) printf("%s\n",uselengths ? 
"Yes" : "No"); else printf("(no branch lengths available)\n"); printf(" L Angle of labels:"); if (labeldirec == fixed) { printf(" Fixed angle of"); if (labelrotation >= 10.0) printf("%6.1f", labelrotation); else if (labelrotation <= -10.0) printf("%7.1f", labelrotation); else if (labelrotation < 0.0) printf("%6.1f", labelrotation); else printf("%5.1f", labelrotation); printf(" degrees\n"); } else if (labeldirec == radial) printf(" Radial\n"); else if (labeldirec == middle) printf(" branch points to Middle of label\n"); else printf(" Along branches\n"); printf(" R Rotation of tree:"); treea = treeangle * 180 / pi; if (treea >= 100.0) printf("%7.1f\n", treea); else if (treea >= 10.0) printf("%6.1f\n", treea); else if (treea <= -100.0) printf("%8.1f\n", treea); else if (treea <= -10.0) printf("%7.1f\n", treea); else if (treea < 0.0) printf("%6.1f\n", treea); else printf("%5.1f\n", treea); if (!improve) { printf(" A Angle of arc for tree:"); treea = 180 * ark / pi; if (treea >= 100.0) printf("%7.1f\n", treea); else if (treea >= 10.0) printf("%6.1f\n", treea); else if (treea <= -100.0) printf("%8.1f\n", treea); else if (treea <= -10.0) printf("%7.1f\n", treea); else if (treea < 0.0) printf("%6.1f\n", treea); else printf("%5.1f\n", treea); } printf(" I Iterate to improve tree: "); if (improve) { if (nbody) printf("n-Body algorithm\n"); else printf("Equal-Daylight algorithm\n"); } else printf("No\n"); if (improve) printf(" D Try to avoid label overlap? %s\n", (labelavoid? "Yes" : "No")); printf(" S Scale of branch length:"); if (rescaled) printf(" Automatically rescaled\n"); else printf(" Fixed:%6.2f cm per unit branch length\n", bscale); if (!improve) { printf(" G Regularize the angles: %s\n", (regular ? 
"Yes" : "No")); } printf(" C Relative character height:%8.4f\n", charht); if ((((plotter == pict || plotter == mac) && (((grows == vertical && labelrotation == 0.0) || (grows == horizontal && labelrotation == 90.0)))))) printf(" F Font: %s\n Q" " Pict Font Attributes: %s, %s, %s, %s\n", fontname, (pictbold ? "Bold" : "Medium"), (pictitalic ? "Italic" : "Regular"), (pictshadow ? "Shadowed": "Unshadowed"), (pictoutline ? "Outlined" : "Unoutlined")); else if (plotter == lw || plotter == idraw) printf(" F Font: %s\n",fontname); if (plotter == ray) { printf(" M Horizontal margins:%6.2f pixels\n", xmargin); printf(" M Vertical margins:%6.2f pixels\n", ymargin); } else { printf(" M Horizontal margins:%6.2f cm\n", xmargin); printf(" M Vertical margins:%6.2f cm\n", ymargin); } printf(" # Page size submenu: "); /* Add 0.5 to clear up truncation problems. */ if (((int) ((pagex / paperx) + 0.5) == 1) && ((int) ((pagey / papery) + 0.5) == 1)) /* If we're only using one page per tree, */ printf ("one page per tree\n") ; else printf ("%.0f by %.0f pages per tree\n", (pagey-vpmargin) / (papery-vpmargin), (pagex-hpmargin) / (paperx-hpmargin)) ; loopcount = 0; for (;;) { printf("\n Y to accept these or type the letter for one to change\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); uppercase(&input[0]); ch=input[0]; if (strchr(options,ch)) { numtochange = ch; break; } printf(" That letter is not one of the menu choices. 
Type\n"); countup(&loopcount, 100); } return numtochange; } /* showparms */ void getparms(char numtochange) { /* get from user the relevant parameters for the plotter and diagram */ long loopcount2; Char ch; boolean ok; char options[32]; char line[32]; char input[100]; int m, n; n = (int)((pagex-hpmargin-0.01)/(paperx-hpmargin)+1.0); m = (int)((pagey-vpmargin-0.01)/(papery-vpmargin)+1.0); strcpy(options,"YNOPVBLRIDSMC"); if ((((plotter == pict || plotter == mac) && (((grows == vertical && labelrotation == 0.0) || (grows == horizontal && labelrotation == 90.0)))))) strcat(options,"Q"); if (plotter == lw || plotter == idraw) strcat(options,"F"); if (!improve) { strcat(options,"GA"); } if (numtochange == '*') { do { printf(" Type the number of one that you want to change:\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(line); numtochange = line[0]; } while (strchr(options,numtochange)); } switch (numtochange) { case '0': initterminal(&ibmpc, &ansi); break; case 'P': getplotter(); break; case '#': loopcount2 = 0; for (;;){ clearit(); printf(" Page Specifications Submenu\n\n"); printf(" L Output size in pages: %.0f down by %.0f across\n", (pagey / papery), (pagex / paperx)); printf(" P Physical paper size: %1.5f by %1.5f cm\n",paperx,papery); printf(" O Overlap Region: %1.5f %1.5f cm\n",hpmargin,vpmargin); printf(" M main menu\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); ch = input[0]; uppercase(&ch); switch (ch){ case 'L': printf("Number of pages in height:\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); m = atoi(input); printf("Number of pages in width:\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); n = atoi(input); break; case 'P': printf("Paper Width (in cm):\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); paperx = atof(input); printf("Paper Height (in cm):\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); papery = atof(input); break; case 'O': printf("Horizontal Overlap 
(in cm):");
#ifdef WIN32
        phyFillScreenColor();
#endif
        getstryng(input);
        hpmargin = atof(input);
        printf("Vertical Overlap (in cm):");
#ifdef WIN32
        phyFillScreenColor();
#endif
        getstryng(input);
        vpmargin = atof(input);
        break;
      case 'M':
        break;
      default:
        printf("Please enter L, P, O , or M.\n");
        break;
      }
      pagex = ((double)n * (paperx-hpmargin)+hpmargin);
      pagey = ((double)m * (papery-vpmargin)+vpmargin);
      if (ch == 'M')
        break;
      countup(&loopcount2, 20);
    }
    break;

  case 'B':
    if (haslengths)
      uselengths = !uselengths;
    else {
      printf("Cannot use lengths since not all of them exist\n");
      uselengths = false;
    }
    break;

  case 'L':
    printf("\nDo you want labels to be Fixed angle, Radial, Along,");
    printf(" or Middle?\n");
    loopcount2 = 0;
    do {
      printf(" Type F, R, A, or M\n");
#ifdef WIN32
      phyFillScreenColor();
#endif
      fflush(stdout);
      scanf("%c%*[^\n]", &ch);
      (void)getchar();
      if (ch == '\n')
        ch = ' ';
      uppercase(&ch);
      countup(&loopcount2, 10);
    } while (ch != 'F' && ch != 'R' && ch != 'A' && ch != 'M');
    switch (ch) {
    case 'A':
      labeldirec = along;
      break;
    case 'F':
      labeldirec = fixed;
      break;
    case 'R':
      labeldirec = radial;
      break;
    case 'M':
      labeldirec = middle;
      break;
    }
    if (labeldirec == fixed) {
      printf("Are the labels to be plotted vertically (90),\n");
      printf(" horizontally (0), or downwards (-90) ?\n");
      loopcount2 = 0;
      do {
        printf(" Choose an angle in degrees from 90 to -90: \n");
#ifdef WIN32
        phyFillScreenColor();
#endif
        fflush(stdout);
        scanf("%lf%*[^\n]", &labelrotation);
        (void)getchar();
        countup(&loopcount2, 10);
      } while ((labelrotation < -90.0 || labelrotation > 90.0) &&
               labelrotation != -99.0);
    }
    break;

  case 'R':
    printf("\n At what angle is the tree to be plotted?\n");
    loopcount2 = 0;
    do {
      printf(" Choose an angle in degrees from 360 to -360: \n");
#ifdef WIN32
      phyFillScreenColor();
#endif
      fflush(stdout);
      scanf("%lf%*[^\n]", &treeangle);
      (void)getchar();
      countup(&loopcount2, 10);
      /* ask again while the angle lies outside the range -360 to 360 */
    } while (treeangle < -360.0 || treeangle > 360.0);
    treeangle = treeangle * pi / 180;
    break;

  case 'A':
printf(" How many degrees (up to 360) of arc\n"); printf(" should the tree occupy? (Currently it is %5.1f)\n", 180 * ark / pi); loopcount2 = 0; do { printf("Enter a number of degrees from 0 up to 360)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &ark); (void)getchar(); countup(&loopcount2, 10); } while (ark <= 0.0 || ark > 360.0); ark = ark * pi / 180; break; case 'I': if (nbody) { improve = false; nbody = false; } else { if (improve) nbody = true; else improve = true; } break; case 'D': labelavoid = !labelavoid; break; case 'S': rescaled = !rescaled; if (!rescaled) { printf("Centimeters per unit branch length?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &bscale); (void)getchar(); } break; case 'M': clearit(); printf("\nThe tree will be drawn to fit in a rectangle which has \n"); printf(" margins in the horizontal and vertical directions of:\n"); if (plotter == ray) { printf( "%6.2f pixels (horizontal margin) and%6.2f pixels (vertical margin)\n", xmargin, ymargin); } else { printf("%6.2f cm (horizontal margin) and%6.2f cm (vertical margin)\n", xmargin, ymargin); } loopcount2 = 0; do { printf(" New value (in cm) of horizontal margin?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &xmargin); (void)getchar(); ok = ((unsigned)xmargin < xsize / 2.0); if (!ok) printf(" Impossible value. Please retype it.\n"); countup(&loopcount2, 10); } while (!ok); loopcount2 = 0; do { printf(" New value (in cm) of vertical margin?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &ymargin); (void)getchar(); ok = ((unsigned)ymargin < ysize / 2.0); if (!ok) printf(" Impossible value. 
Please retype it.\n"); countup(&loopcount2, 10); } while (!ok); break; case 'C': printf("New value of character height?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &charht); (void)getchar(); break; case 'F': printf("Enter font name or \"Hershey\" for default font\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(fontname); break; case 'G': regular = !regular; break; case 'Q': clearit(); loopcount2 = 0; do { printf("Italic? (Y/N)\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); input[0] = toupper(input[0]); countup(&loopcount2, 10); } while (input[0] != 'Y' && input[0] != 'N'); pictitalic = (input[0] == 'Y'); loopcount2 = 0; do { printf("Bold? (Y/N)\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); input[0] = toupper(input[0]); countup(&loopcount2, 10); } while (input[0] != 'Y' && input[0] != 'N'); pictbold = (input[0] == 'Y'); loopcount2 = 0; do { printf("Shadow? (Y/N)\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); input[0] = toupper(input[0]); countup(&loopcount2, 10); } while (input[0] != 'Y' && input[0] != 'N'); pictshadow = (input[0] == 'Y'); loopcount2 = 0; do { printf("Outline? 
(Y/N)\n"); #ifdef WIN32 phyFillScreenColor(); #endif getstryng(input); input[0] = toupper(input[0]); countup(&loopcount2, 10); } while (input[0] != 'Y' && input[0] != 'N'); pictoutline = (input[0] == 'Y'); break; } } /* getparms */ void getwidth(node *p) { /* get width and depth beyond each node */ double nw, nd; node *pp, *qq; nd = 0.0; if (p->tip) nw = 1.0; else { nw = 0.0; qq = p; pp = p->next; do { getwidth(pp->back); nw += pp->back->width; if (pp->back->depth > nd) nd = pp->back->depth; pp = pp->next; } while (((p != root) && (pp != qq)) || ((p == root) && (pp != p->next))); } p->depth = nd + p->length; p->width = nw; } /* getwidth */ void plrtrans(node *p, double theta, double lower, double upper) { /* polar coordinates of a node relative to start */ long num; double nn, pr, ptheta, angle, angle2, subangle, len; node *pp, *qq; nn = p->width; subangle = (upper - lower) / nn; qq = p; pp = p->next; if (p->tip) return; angle = upper; do { angle -= pp->back->width / 2.0 * subangle; pr = p->r; ptheta = p->theta; if (regular) { num = 1; while (num * subangle < 2 * pi) num *= 2; if (angle >= 0.0) angle2 = 2 * pi / num * (long)(num * angle / (2 * pi) + 0.5); else angle2 = 2 * pi / num * (long)(num * angle / (2 * pi) - 0.5); } else angle2 = angle; if (uselengths) len = fabs(pp->back->oldlen); else len = 1.0; pp->back->r = sqrt(len * len + pr * pr + 2 * len * pr * cos(angle2 - ptheta)); if (fabs(pr * cos(ptheta) + len * cos(angle2)) > epsilon) pp->back->theta = atan((pr * sin(ptheta) + len * sin(angle2)) / (pr * cos(ptheta) + len * cos(angle2))); else if (pr * sin(ptheta) + len * sin(angle2) >= 0.0) pp->back->theta = pi / 2; else pp->back->theta = 1.5 * pi; if (pr * cos(ptheta) + len * cos(angle2) < -epsilon) pp->back->theta += pi; if (!pp->back->tip) plrtrans(pp->back, pp->back->theta, angle - pp->back->width * subangle / 2.0, angle + pp->back->width * subangle / 2.0); else pp->back->oldtheta = angle2; angle -= pp->back->width / 2.0 * subangle; pp = pp->next; } while 
(((p != root) && (pp != qq)) || ((p == root) && (pp != p->next))); } /* plrtrans */ void coordtrav(node *p, double *xx, double *yy) { /* compute x and y coordinates */ node *pp; if (!p->tip) { pp = p->next; while (pp != p) { coordtrav(pp->back, xx, yy); pp = pp->next; if (p == root) coordtrav(p->back, xx, yy); } } (*xx) = p->r * cos(p->theta); (*yy) = p->r * sin(p->theta); if ((*xx) > maxx) maxx = (*xx); if ((*xx) < minx) minx = (*xx); if ((*yy) > maxy) maxy = (*yy); if ((*yy) < miny) miny = (*yy); p->xcoord = (*xx); p->ycoord = (*yy); } /* coordtrav */ double angleof(double x, double y) { /* compute the angle of a vector */ double theta; if (fabs(x) > epsilon) theta = atan(y / x); else if (y >= 0.0) theta = pi / 2; else theta = 1.5 * pi; if (x < -epsilon) theta = pi + theta; while (theta > 2 * pi) theta -= 2 * pi; while (theta < 0.0) theta += 2 * pi; return theta; } /* angleof */ void polartrav(node *p, double xx, double yy, double firstx, double firsty, double *leftx, double *lefty, double *rightx, double *righty) { /* go through subtree getting left and right vectors */ double x, y, xxx, yyy, labangle = 0; boolean lookatit; node *pp; lookatit = true; if (!p->tip) lookatit = (p->next->next != p || p->index != root->index); if (lookatit) { x = nodep[p->index - 1]->xcoord; y = nodep[p->index - 1]->ycoord; if (p->tip) { if (labeldirec == fixed) { labangle = pi * labelrotation / 180.0; if (cos(p->oldtheta) < 0.0) labangle = labangle - pi; } if (labeldirec == radial) labangle = p->theta; else if (labeldirec == along) labangle = p->oldtheta; else if (labeldirec == middle) labangle = 0.0; xxx = x; yyy = y; if (labelavoid) { if (labeldirec == middle) { xxx += GAP * labelheight * cos(p->oldtheta); yyy += GAP * labelheight * sin(p->oldtheta); xxx += labelheight * cos(labangle) * textlength[p->index - 1]; if (textlength[p->index - 1] * sin(p->oldtheta) < 1.0) xxx += labelheight * cos(labangle) * textlength[p->index - 1]; else xxx += 0.5 * labelheight * cos(labangle) * 
textlength[p->index - 1]; yyy += labelheight * sin(labangle) * textlength[p->index - 1]; } else { xxx += GAP * labelheight * cos(p->oldtheta); yyy += GAP * labelheight * sin(p->oldtheta); xxx -= labelheight * cos(labangle) * 0.5 * firstlet[p->index - 1]; yyy -= labelheight * sin(labangle) * 0.5 * firstlet[p->index - 1]; xxx += labelheight * cos(labangle) * textlength[p->index - 1]; yyy += labelheight * sin(labangle) * textlength[p->index - 1]; } } if ((yyy - yy) * firstx - (xxx - xx) * firsty < 0.0) { if ((yyy - yy) * (*rightx) - (xxx - xx) * (*righty) < 0.0) { (*rightx) = xxx - xx; (*righty) = yyy - yy; } } if ((yyy - yy) * firstx - (xxx - xx) * firsty > 0.0) { if ((yyy - yy) * (*leftx) - (xxx - xx) * (*lefty) > 0.0) { (*leftx) = xxx - xx; (*lefty) = yyy - yy; } } } if ((y - yy) * firstx - (x - xx) * firsty < 0.0) { if ((y - yy) * (*rightx) - (x - xx) * (*righty) < 0.0) { (*rightx) = x - xx; (*righty) = y - yy; } } if ((y - yy) * firstx - (x - xx) * firsty > 0.0) { if ((y - yy) * (*leftx) - (x - xx) * (*lefty) > 0.0) { (*leftx) = x - xx; (*lefty) = y - yy; } } } if (p->tip) return; pp = p->next; while (pp != p) { if (pp != NULL) polartrav(pp->back,xx,yy,firstx,firsty,leftx,lefty,rightx,righty); pp = pp->next; } } /* polartrav */ void tilttrav(node *q, double *xx, double *yy, double *sinphi, double *cosphi) { /* traverse to move successive nodes */ double x, y; node *pp; pp = nodep[q->index - 1]; x = pp->xcoord; y = pp->ycoord; pp->xcoord = (*xx) + (x - (*xx)) * (*cosphi) + ((*yy) - y) * (*sinphi); pp->ycoord = (*yy) + (x - (*xx)) * (*sinphi) + (y - (*yy)) * (*cosphi); if (q->tip) return; pp = q->next; while (pp != q) { /* if (pp != root) */ if (pp->back != NULL) tilttrav(pp->back,xx,yy,sinphi,cosphi); pp = pp->next; } } /* tilttrav */ void polarize(node *p, double *xx, double *yy) { double TEMP, TEMP1; if (fabs(p->xcoord - (*xx)) > epsilon) p->oldtheta = atan((p->ycoord - (*yy)) / (p->xcoord - (*xx))); else if (p->ycoord - (*yy) > epsilon) p->oldtheta = pi / 2; if 
(p->xcoord - (*xx) < -epsilon) p->oldtheta += pi; if (fabs(p->xcoord - root->xcoord) > epsilon) p->theta = atan((p->ycoord - root->ycoord) / (p->xcoord - root->xcoord)); else if (p->ycoord - root->ycoord > 0.0) p->theta = pi / 2; else p->theta = 1.5 * pi; if (p->xcoord - root->xcoord < -epsilon) p->theta += pi; TEMP = p->xcoord - root->xcoord; TEMP1 = p->ycoord - root->ycoord; p->r = sqrt(TEMP * TEMP + TEMP1 * TEMP1); } /* polarize */ void leftrightangle(node *p, double xx, double yy) { /* get leftmost and rightmost angle of subtree, put them in node p */ double firstx, firsty, leftx, lefty, rightx, righty; double langle, rangle; firstx = nodep[p->back->index-1]->xcoord - xx; firsty = nodep[p->back->index-1]->ycoord - yy; leftx = firstx; lefty = firsty; rightx = firstx; righty = firsty; if (p->back != NULL) polartrav(p->back,xx,yy,firstx,firsty,&leftx,&lefty,&rightx,&righty); if ((fabs(leftx) < epsilon) && (fabs(lefty) < epsilon)) langle = p->back->oldtheta; else langle = angleof(leftx, lefty); if ((fabs(rightx) < epsilon) && (fabs(righty) < epsilon)) rangle = p->back->oldtheta; else rangle = angleof(rightx, righty); while (langle - rangle > 2*pi) langle -= 2 * pi; while (rangle > langle) { if (rangle > 2*pi) rangle -= 2 * pi; else langle += 2 * pi; } while (langle > 2*pi) { rangle -= 2 * pi; langle -= 2 * pi; } p->lefttheta = langle; p->righttheta = rangle; } /* leftrightangle */ void improvtrav(node *p) { /* traverse tree trying different tiltings at each node */ double xx, yy, cosphi, sinphi; double langle, rangle, sumrot, olddiff; node *pp, *qq, *ppp;; if (p->tip) return; xx = p->xcoord; yy = p->ycoord; pp = p->next; do { leftrightangle(pp, xx, yy); pp = pp->next; } while ((pp != p->next)); if (p == root) { pp = p->next; do { qq = pp; pp = pp->next; } while (pp != root); p->righttheta = qq->righttheta; p->lefttheta = p->next->lefttheta; } qq = p; pp = p->next; ppp = p->next->next; do { langle = qq->righttheta - pp->lefttheta; rangle = pp->righttheta - 
ppp->lefttheta; while (langle > pi) langle -= 2*pi; while (langle < -pi) langle += 2*pi; while (rangle > pi) rangle -= 2*pi; while (rangle < -pi) rangle += 2*pi; olddiff = fabs(langle-rangle); sumrot = (langle - rangle) /2.0; if (sumrot > langle) sumrot = langle; if (sumrot < -rangle) sumrot = -rangle; cosphi = cos(sumrot); sinphi = sin(sumrot); if (p != root) { if (fabs(sumrot) > maxchange) maxchange = fabs(sumrot); pp->back->oldtheta += sumrot; tilttrav(pp->back,&xx,&yy,&sinphi,&cosphi); polarize(pp->back,&xx,&yy); leftrightangle(pp, xx, yy); langle = qq->righttheta - pp->lefttheta; rangle = pp->righttheta - ppp->lefttheta; while (langle > pi) langle -= 2*pi; while (langle < -pi) langle += 2*pi; while (rangle > pi) rangle -= 2*pi; while (rangle < -pi) rangle += 2*pi; while ((fabs(langle-rangle) > olddiff) && (fabs(sumrot) > 0.01)) { sumrot = sumrot /2.0; cosphi = cos(-sumrot); sinphi = sin(-sumrot); pp->back->oldtheta -= sumrot; tilttrav(pp->back,&xx,&yy,&sinphi,&cosphi); polarize(pp->back,&xx,&yy); leftrightangle(pp, xx, yy); langle = qq->righttheta - pp->lefttheta; rangle = pp->righttheta - ppp->lefttheta; if (langle > pi) langle -= 2*pi; if (langle < -pi) langle += 2*pi; if (rangle > pi) rangle -= 2*pi; if (rangle < -pi) rangle += 2*pi; } } qq = pp; pp = pp->next; ppp = ppp->next; } while (((p == root) && (pp != p->next)) || ((p != root) && (pp != p))); pp = p->next; do { improvtrav(pp->back); pp = pp->next; } while (((p == root) && (pp != p->next)) || ((p != root) && (pp != p))); } /* improvtrav */ void force_1to1(node *pFromSubNode, node *pToSubNode, double *pForce, double *pAngle, double medianDistance) { /* calculate force acting between 2 nodes and return the force in pForce. Remember to pass the index subnodes to this function if needed. Force should always be positive for repelling. Angle changes to indicate the direction of the force. The value of INFINITY is the cap to the value of Force. There might have problem (error msg.) 
if pFromSubNode and pToSubNode are the same node or the coordinates
     are identical even with double precision.  */
  double distanceX, distanceY, distance, norminalDistance;

  distanceX = pFromSubNode->xcoord - pToSubNode->xcoord;
  distanceY = pFromSubNode->ycoord - pToSubNode->ycoord;
  distance = sqrt( distanceX*distanceX + distanceY*distanceY );
  norminalDistance = distance/medianDistance;

  if (norminalDistance < epsilon)
  {
    *pForce = INFINITY;
  }
  else
  {
    *pForce = (double)1 / (norminalDistance * norminalDistance);
    if (*pForce > INFINITY) *pForce = INFINITY;
  }

  *pAngle = computeAngle(pFromSubNode->xcoord, pFromSubNode->ycoord,
                         pToSubNode->xcoord, pToSubNode->ycoord);

  return;
} /* force_1to1 */


void totalForceOnNode(node *pPivotSubNode, node *pToSubNode,
                double *pTotalForce, double *pAngle, double medianDistance)
{
  /* pToSubNode is the node to which all the relevant nodes apply forces.
     All branches are visited except the branch that contains pToSubNode.
     pToSubNode must be on one of the branches out of the current node
     (that is, out of one of the subnodes in the current subnode set).
     Most likely pPivotSubNode is not the index subnode!  In any case, only
     the leaves are considered in the repelling force, so there is no need
     to worry about the index subnode.
     pTotalForce and pAngle must be set to 0 before calling this function
     for the first time, or the result will be invalid.
     pPivotSubNode is named for the external interface.  When calling
     totalForceOnNode() recursively, pPivotSubNode should be thought of as
     pFromSubNode.  */
  node *pSubNode;
  double force, angle, forceX, forceY, prevForceX, prevForceY;

  pSubNode = pPivotSubNode;

  /* visit the rest of the branches of the current node; the branch attached
     to the current subnode may be visited in the code down below.
*/ while (pSubNode->next != NULL && pSubNode->next != pPivotSubNode) { pSubNode = pSubNode->next; if ( pSubNode->back != NULL && pSubNode->back != pToSubNode) totalForceOnNode(pSubNode->back, pToSubNode, pTotalForce, pAngle, medianDistance); } /* visit this branch; You need to visit it for the first time - at root only! * * Modified so that all nodes are visited and calculated forces, instead of * just the leafs only. * use pPivotSubNode instead of pSubNode here because pSubNode stop short * just before pPivotSubNode (the entry node) */ if ( pPivotSubNode == root && pPivotSubNode->back != NULL && pPivotSubNode->back != pToSubNode) totalForceOnNode(pPivotSubNode->back, pToSubNode, pTotalForce, pAngle, medianDistance); /* Break down the previous sum of forces to components form */ prevForceX = *pTotalForce * cos(*pAngle); prevForceY = *pTotalForce * sin(*pAngle); force_1to1(nodep[pPivotSubNode->index-1], pToSubNode, &force, &angle, medianDistance); /* force between 2 nodes */ forceX = force * cos(angle); forceY = force * sin(angle); /* Combined force */ forceX = forceX + prevForceX; forceY = forceY + prevForceY; /* Write to output parameters */ *pTotalForce = sqrt( forceX*forceX + forceY*forceY ); *pAngle = computeAngle((double)0, (double)0, forceX, forceY); return; } /* totalForceOnNode */ double dotProduct(double Xu, double Yu, double Xv, double Yv) { return Xu * Xv + Yu * Yv; } /* dotProduct */ double capedAngle(double angle) { /* Return the equivalent value of angle that is within 0 to 2*pi */ while (angle < 0 || angle >= 2*pi) { if(angle < 0) { angle = angle + 2*pi; } else if (angle >= 2*pi) { angle = angle - 2*pi; } } return angle; } /* capedAngle */ double angleBetVectors(double Xu, double Yu, double Xv, double Yv) { /* Calculate angle between 2 vectors; value returned is always between 0 and pi. Use vCounterClkwiseU() to get the relative position of the vectors. 
*/ double dotProd, cosTheta, theta, lengthsProd; dotProd = dotProduct(Xu, Yu, Xv, Yv); lengthsProd = sqrt(Xu*Xu+Yu*Yu) * sqrt(Xv*Xv+Yv*Yv); if (lengthsProd < epsilon) { printf("ERROR: drawtree - division by zero in angleBetVectors()!\n"); printf("Xu %f Yu %f Xv %f Yv %f\n", Xu, Yu, Xv, Yv); exxit(0); } cosTheta = dotProd / lengthsProd; if (cosTheta > 1) /* cosTheta will only be > 1 or < -1 due to rounding errors */ theta = 0; /* cosTheta = 1 */ else if (cosTheta < -1) theta = pi; /* cosTheta = -1 */ else theta = acos(cosTheta); return theta; } /* angleBetVectors */ double signOfMoment(double xReferenceVector, double yReferenceVector, double xForce, double yForce) { /* it return the sign of the moment caused by the force, applied to the tip of the refereceVector; the root of the refereceVector is the pivot. */ double angleReference, angleForce, sign; angleReference = computeAngle((double)0, (double)0, xReferenceVector, yReferenceVector); angleForce = computeAngle((double)0, (double)0, xForce, yForce); angleForce = capedAngle(angleForce); angleReference = capedAngle(angleReference); /* reduce angleReference to 0 */ angleForce = angleForce - angleReference; angleForce = capedAngle(angleForce); if (angleForce > 0 && angleForce < pi) { /* positive sign - force pointing toward the left of the reference line/vector. */ sign = 1; } else { /* negative sign */ sign = -1; } return sign; } /* signOfMoment */ double vCounterClkwiseU(double Xu, double Yu, double Xv, double Yv) { /* Return 1 if vector v is counter clockwise from u */ /* signOfMoment() is doing just that! */ return signOfMoment(Xu, Yu, Xv, Yv); } /* vCounterClkwiseU */ double forcePerpendicularOnNode(node *pPivotSubNode, node *pToSubNode, double medianDistance) { /* Read comment for totalForceOnNode */ /* It supposed to return a positive value to indicate that it has a positive moment; and negative return value to indicate negative moment. force perpendicular at norminal distance 1 is taken to be 1. 
medianDistance is the median of Distances in this graph. */ /* / Force / | ToNode o > alpha | \ yDelta | \ theta = pi/2 + alpha | beta = vector (or angle) from Pivot to ToNode Pivot o----------- xDelta alpha = theta + beta */ double totalForce, forceAngle, xDelta, yDelta; double alpha, theta, forcePerpendicular, sinForceAngle, cosForceAngle; totalForce = (double)0; forceAngle = (double)0; totalForceOnNode(pPivotSubNode, pToSubNode, &totalForce, &forceAngle, medianDistance); xDelta = nodep[pToSubNode->index-1]->xcoord - nodep[pPivotSubNode->index-1]->xcoord; yDelta = nodep[pToSubNode->index-1]->ycoord - nodep[pPivotSubNode->index-1]->ycoord; /* Try to avoid the case where 2 nodes are on top of each other. */ /* if (xDelta < 0) tempx = -xDelta; else tempx = xDelta; if (yDelta < 0) tempy = -yDelta; else tempy = yDelta; if (tempx < epsilon && tempy < epsilon) { return; } */ sinForceAngle = sin(forceAngle); cosForceAngle = cos(forceAngle); theta = angleBetVectors(xDelta, yDelta, cosForceAngle, sinForceAngle); if (theta > pi/2) { alpha = theta - pi/2; } else { alpha = pi/2 - theta; } forcePerpendicular = totalForce * cos(alpha); if (forcePerpendicular < -epsilon) { printf("ERROR: drawtree - forcePerpendicular applied at an angle should" " not be less than zero (in forcePerpendicularOnNode()). \n"); printf("alpha = %f\n", alpha); exxit(1); } /* correct the sign of the moment */ forcePerpendicular = signOfMoment(xDelta, yDelta, cosForceAngle, sinForceAngle) * forcePerpendicular; return forcePerpendicular; } /* forcePerpendicularOnNode */ void polarizeABranch(node *pStartingSubNode, double *xx, double *yy) { /* added - danieyek 990128 */ /* After calling tilttrav(), if you don't polarize all the nodes on the branch to convert the x-y coordinates to theta and radius, you won't get result on the plot! 
This function takes a subnode and branch out of all other subnode except the starting subnode (where the parent is), thus converting the x-y to polar coordinates for the branch only. xx and yy are purely "inherited" features of polarize(). They should have been passed as values not addresses. */ node *pSubNode; pSubNode = pStartingSubNode; /* convert the current node (note: not subnode) to polar coordinates. */ polarize( nodep[pStartingSubNode->index - 1], xx, yy); /* visit the rest of the branches of current node */ while (pSubNode->next != NULL && pSubNode->next != pStartingSubNode) { pSubNode = pSubNode->next; if ( pSubNode->tip != true ) polarizeABranch(pSubNode->back, xx, yy); } return; } /* polarizeABranch */ void pushNodeToStack(stackElemType **ppStackTop, node *pNode) { /* added - danieyek 990204 */ /* pStackTop must be the current top element of the stack, where we add another element on top of it. ppStackTop must be the location where we can find pStackTop. This function "returns" the revised top (element) of the stack through the output parameter, ppStackTop. The last element on the stack has the "back" (pStackElemBack) pointer set to NULL. So, when the last element is poped, ppStackTop will be automatically set to NULL. If popNodeFromStack() is called with ppStackTop = NULL, we assume that it is the error caused by over popping the stack. */ stackElemType *pStackElem; if (ppStackTop == NULL) { /* NULL can be stored in the location, but the location itself can't be NULL! 
       */
    printf("ERROR: drawtree - error using pushNodeToStack(); "
           "ppStackTop is NULL.\n");
    exxit(1);
  }
  pStackElem = (stackElemType*)Malloc( sizeof(stackElemType) );
  pStackElem->pStackElemBack = *ppStackTop;
  /* push an element onto the stack */
  pStackElem->pNode = pNode;
  *ppStackTop = pStackElem;
  return;
} /* pushNodeToStack */


void popNodeFromStack(stackElemType **ppStackTop, node **ppNode)
{
  /* added - danieyek 990205 */
  /* pStackTop must be the current top element of the stack, where we pop an
     element from the top of it.
     ppStackTop must be the location where we can find pStackTop.
     This function "returns" the revised top (element) of the stack through
     the output parameter, ppStackTop.
     The last element on the stack has the "back" (pStackElemBack) pointer
     set to NULL. So, when the last element is popped, ppStackTop will be
     automatically set to NULL. If popNodeFromStack() is called with
     ppStackTop = NULL or with an empty stack (*ppStackTop = NULL), we
     assume that it is the error caused by over popping the stack.
  */
  stackElemType *pStackT;

  /* An empty stack is stored as *ppStackTop == NULL; dereferencing it below
     would be undefined behavior, so treat it as an over-pop as well. */
  if (ppStackTop == NULL || *ppStackTop == NULL)
  {
    printf("ERROR: drawtree - a call to pop while the stack is empty.\n");
    exxit(1);
  }
  pStackT = *ppStackTop;
  *ppStackTop = pStackT->pStackElemBack;
  *ppNode = pStackT->pNode;
  free(pStackT);
  return;
} /* popNodeFromStack */


double medianOfDistance(node *pRootSubNode, boolean firstRecursiveCallP)
{
  /* added - danieyek 990208 */
  /* Find the median of the distance; used to compute the angle to rotate
     in proportion to the size of the graph and forces.
     It is assumed that pRootSubNode is also the pivot (during the first
     call to this function) - the center, with respect to which node the
     distances are calculated.
     If there are only 3 elements, element #2 is returned, i.e. (3+1)/2.
     This function now finds the median of distances of all nodes, not only
     the leaves!
*/ node *pSubNode; double xDelta, yDelta, distance; long i, j; struct dblLinkNode { double value; /* Implement reverse Linked List */ struct dblLinkNode *pBack; } *pLink, *pBackElem, *pMidElem, *pFrontElem, junkLink; /* must use static to retain values over calls */ static node *pReferenceNode; static long count; static struct dblLinkNode *pFrontOfLinkedList; /* Remember the reference node so that it doesn't have to be passed arround in the function parameter. */ if (firstRecursiveCallP == true) { pReferenceNode = pRootSubNode; pFrontOfLinkedList = NULL; count = 0; } pSubNode = pRootSubNode; /* visit the rest of the branches of current node; the branch attaches to the current subNode may be visited in the code further down below. */ while (pSubNode->next != NULL && pSubNode->next != pRootSubNode) { pSubNode = pSubNode->next; if ( pSubNode->back != NULL) medianOfDistance(pSubNode->back, false); } /* visit this branch; You need to visit it for the first time - at root only! use pRootSubNode instead of pSubNode here because pSubNode stop short just before pRootSubNode (the entry node) */ if ( firstRecursiveCallP == true && pRootSubNode->back != NULL) medianOfDistance(pRootSubNode->back, false); /* Why only leafs count? Modifying it! 
*/ xDelta = nodep[pSubNode->index-1]->xcoord - nodep[pReferenceNode->index-1]->xcoord; yDelta = nodep[pSubNode->index-1]->ycoord - nodep[pReferenceNode->index-1]->ycoord; distance = sqrt( xDelta*xDelta + yDelta*yDelta ); /* Similar to pushing onto the stack */ pLink = (struct dblLinkNode*) Malloc( sizeof(struct dblLinkNode) ); if (pLink == NULL) { printf("Fatal ERROR: drawtree - Insufficient Memory in" " medianOfDistance()!\n"); exxit(1); } pLink->value = distance; pLink->pBack = pFrontOfLinkedList; pFrontOfLinkedList = pLink; count = count + 1; if (firstRecursiveCallP == true) { if (count == 0) { return (double)0; } else if (count == 1) { distance = pFrontOfLinkedList->value; free(pFrontOfLinkedList); return distance; } else if (count == 2) { distance = (pFrontOfLinkedList->value + pFrontOfLinkedList->pBack->value)/(double)2; free(pFrontOfLinkedList->pBack); free(pFrontOfLinkedList); return distance; } else { junkLink.pBack = pFrontOfLinkedList; /* SORT first - use bubble sort; we start with at least 3 elements here. */ /* We are matching backward when sorting the list and comparing MidElem and BackElem along the path; junkLink is there just to make a symmetric operation at the front end. */ for (j = 0; j < count - 1; j++) { pFrontElem = &junkLink; pMidElem = junkLink.pBack; pBackElem = junkLink.pBack->pBack; for (i = j; i < count - 1; i++) { if(pMidElem->value < pBackElem->value) { /* Swap - carry the smaller value to the root of the linked list. */ pMidElem->pBack = pBackElem->pBack; pBackElem->pBack = pMidElem; pFrontElem->pBack = pBackElem; /* Correct the order of pFrontElem, pMidElem, pBackElem and match one step */ pFrontElem = pBackElem; pBackElem = pMidElem->pBack; } else { pFrontElem = pMidElem; pMidElem = pBackElem; pBackElem = pBackElem->pBack; } } pFrontOfLinkedList = junkLink.pBack; } /* Sorted; now get the middle element. 
*/ for (i = 1; i < (count + 1)/(long) 2; i++) { /* Similar to Poping the stack */ pLink = pFrontOfLinkedList; pFrontOfLinkedList = pLink->pBack; free(pLink); } /* Get the return value!! - only the last return value is the valid one. */ distance = pFrontOfLinkedList->value; /* Continue from the same i value left off by the previous for loop! */ for (; i <= count; i++) { /* Similar to Poping the stack */ pLink = pFrontOfLinkedList; pFrontOfLinkedList = pLink->pBack; free(pLink); } } } return distance; } /* medianOfDistance */ void leftRightLimits(node *pToSubNode, double *pLeftLimit, double *pRightLimit) /* As usual, pToSubNode->back is the angle leftLimit is the max angle you can rotate on the left and rightLimit vice versa. *pLeftLimit and *pRightLimit must be initialized to 0; without initialization, it would introduce bitter bugs into the program; they are initialized in this routine. */ { /* pPivotNode is nodep[pToSubNode->back->index-1], not pPivotSubNode which is just pToSubNode->back! */ node *pLeftSubNode, *pRightSubNode, *pPivotNode, *pSubNode; double leftLimit, rightLimit, xToNodeVector, yToNodeVector, xLeftVector, yLeftVector, xRightVector, yRightVector, lengthsProd; *pLeftLimit = 0; *pRightLimit = 0; /* Make an assumption first - guess "pToSubNode->back->next" is the right and the opposite direction is the left! */ /* It shouldn't be pivoted at a left, but just checking. */ if (pToSubNode->back->tip == true) { /* Logically this should not happen. But we actually can return pi as the limit. */ printf("ERROR: In leftRightLimits() - Pivoted at a leaf! Unable to " "calculate left and right limit.\n"); exxit(1); } else if (pToSubNode->back->next->next == pToSubNode->back) { *pLeftLimit = 0; /* don't pivot where there is no branching */ *pRightLimit = 0; return; } /* Else, do this */ pPivotNode = nodep[pToSubNode->back->index-1]; /* 3 or more branches - the regular case. 
*/ /* First, initialize the pRightSubNode - non-repeative portion of the code */ pRightSubNode = pToSubNode->back; pLeftSubNode = pToSubNode->back; xToNodeVector = nodep[pToSubNode->index-1]->xcoord - pPivotNode->xcoord; yToNodeVector = nodep[pToSubNode->index-1]->ycoord - pPivotNode->ycoord; /* If both x and y are 0, then the length must be 0; but this check is not enough yet, we need to check the product of length also. */ if ( fabs(xToNodeVector) < epsilon && fabs(yToNodeVector) < epsilon ) { /* If the branch to rotate is too short, don't rotate it. */ *pLeftLimit = 0; *pRightLimit = 0; return; } while( nodep[pRightSubNode->index-1]->tip != true ) { /* Repeative code */ pRightSubNode = pRightSubNode->next->back; xRightVector = nodep[pRightSubNode->index-1]->xcoord - pPivotNode->xcoord; yRightVector = nodep[pRightSubNode->index-1]->ycoord - pPivotNode->ycoord; lengthsProd = sqrt(xToNodeVector*xToNodeVector+yToNodeVector*yToNodeVector) * sqrt(xRightVector*xRightVector+yRightVector*yRightVector); if ( lengthsProd < epsilon ) { continue; } rightLimit = angleBetVectors(xToNodeVector, yToNodeVector, xRightVector, yRightVector); if ( (*pRightLimit) < rightLimit) *pRightLimit = rightLimit; } while( nodep[pLeftSubNode->index-1]->tip != true ) { /* First, let pSubNode be 1 subnode after rightSubNode. 
*/ pSubNode = pLeftSubNode->next->next; /* Then, loop until the last subNode before getting back to the pivot */ while (pSubNode->next != pLeftSubNode) { pSubNode = pSubNode->next; } pLeftSubNode = pSubNode->back; xLeftVector = nodep[pLeftSubNode->index-1]->xcoord - pPivotNode->xcoord; yLeftVector = nodep[pLeftSubNode->index-1]->ycoord - pPivotNode->ycoord; lengthsProd = sqrt(xToNodeVector*xToNodeVector+yToNodeVector*yToNodeVector) * sqrt(xLeftVector*xLeftVector+yLeftVector*yLeftVector); if ( lengthsProd < epsilon ) { continue; } leftLimit = angleBetVectors(xToNodeVector, yToNodeVector, xLeftVector, yLeftVector); if ( (*pLeftLimit) < leftLimit) *pLeftLimit = leftLimit; } return; } /* leftRightLimits */ void branchLRHelper(node *pPivotSubNode, node *pCurSubNode, double *pBranchL, double *pBranchR) { /* added - danieyek 990226 */ /* Recursive helper function for branchLeftRightAngles(). pPivotSubNode->back is the pToNode, to which node you apply the forces! */ /* Abandoned as it is similar to day-light algorithm; the first part is done implementing but not tested, the second part yet to be implemented if necessary. */ double xCurNodeVector, yCurNodeVector, xPivotVector, yPivotVector; /* Base case : a leaf - return 0 & 0. 
*/ if ( nodep[pCurSubNode->index-1]->tip == true ) { xPivotVector = nodep[pPivotSubNode->back->index-1]->xcoord - nodep[pPivotSubNode->index-1]->xcoord; yPivotVector = nodep[pPivotSubNode->back->index-1]->ycoord - nodep[pPivotSubNode->index-1]->ycoord; xCurNodeVector = nodep[pCurSubNode->index-1]->xcoord - nodep[pPivotSubNode->index-1]->xcoord; yCurNodeVector = nodep[pCurSubNode->index-1]->ycoord - nodep[pPivotSubNode->index-1]->ycoord; if ( vCounterClkwiseU(xPivotVector, yPivotVector, xCurNodeVector, yCurNodeVector) == 1) { /* Relevant to Left Angle */ *pBranchL = angleBetVectors(xPivotVector, yPivotVector, xCurNodeVector, yCurNodeVector); *pBranchR = (double)0; } else { /* Relevant to Right Angle */ *pBranchR = angleBetVectors(xPivotVector, yPivotVector, xCurNodeVector, yCurNodeVector); *pBranchL = (double)0; } return; } else { /* not a leaf */ } } /* branchLRHelper */ void improveNodeAngle(node *pToNode, double medianDistance) { double forcePerpendicular, distance, xDistance, yDistance, angleRotate, sinAngleRotate, cosAngleRotate, norminalDistance, leftLimit, rightLimit, limitFactor; node *pPivot; /* Limit factor determinte how close the rotation can approach the absolute limit before colliding with other branches */ limitFactor = (double)4 / (double)5; pPivot = pToNode->back; xDistance = nodep[pPivot->index-1]->xcoord - nodep[pToNode->index-1]->xcoord; yDistance = nodep[pPivot->index-1]->ycoord - nodep[pToNode->index-1]->ycoord; distance = sqrt( xDistance*xDistance + yDistance*yDistance ); /* convert distance to absolute value and test if it is zero */ if ( fabs(distance) < epsilon) { angleRotate = (double)0; } else { leftRightLimits(pToNode, &leftLimit, &rightLimit); norminalDistance = distance / medianDistance; forcePerpendicular = forcePerpendicularOnNode(pPivot, pToNode, medianDistance); angleRotate = forcePerpendicular / norminalDistance; /* Limiting the angle of rotation */ if ( angleRotate > 0 && angleRotate > limitFactor * leftLimit) { /* Left */ 
angleRotate = limitFactor * leftLimit; } else if ( -angleRotate > limitFactor * rightLimit ) /* angleRotate < 0 && */ { /* Right */ angleRotate = - limitFactor * rightLimit; } } angleRotate = (double).1 * angleRotate; sinAngleRotate = sin(angleRotate); cosAngleRotate = cos(angleRotate); tilttrav(pToNode, &(nodep[pPivot->index - 1]->xcoord), &(nodep[pPivot->index - 1]->ycoord), &sinAngleRotate, &cosAngleRotate); polarizeABranch(pToNode, &(nodep[pPivot->index - 1]->xcoord), &(nodep[pPivot->index - 1]->ycoord)); } /* improveNodeAngle */ void improvtravn(node *pStartingSubNode) { /* improvtrav for n-body. */ /* POPStack is the stack that is currently being used (popped); PUSHStack is the stack that is for the use of the next round (is pushed now) */ stackElemType *pPUSHStackTop, *pPOPStackTop, *pTempStack; node *pSubNode, *pBackStartNode, *pBackSubNode; double medianDistance; long noOfIteration; /* Stack starts with no element on it */ pPUSHStackTop = NULL; pPOPStackTop = NULL; /* Get the median to relate force to angle proportionally. */ medianDistance = medianOfDistance(root, true); /* Set max. number of iteration */ for ( noOfIteration = (long)0; noOfIteration < maxNumOfIter; noOfIteration++) { /* First, push all subNodes in the root node onto the stack-to-be-used to kick up the process */ pSubNode = pStartingSubNode; pushNodeToStack(&pPUSHStackTop, pSubNode); while(pSubNode->next != pStartingSubNode) { pSubNode = pSubNode->next; pushNodeToStack(&pPUSHStackTop, pSubNode); } while (true) { /* Finishes with the current POPStack; swap the function of the stacks if PUSHStack is not empty */ if (pPUSHStackTop == NULL) { /* Exit infinity loop here if empty. */ break; } else { /* swap */ pTempStack = pPUSHStackTop; pPUSHStackTop = pPOPStackTop; pPOPStackTop = pTempStack; } while (pPOPStackTop != NULL) { /* We always push the pivot subNode onto the stack! That's when we pop that pivot subNode, subNode.back is the node we apply the force to (ToNode). 
Also, when we pop a pivot subNode, always push all pivot subNodes in the same ToNode onto the stack. */ popNodeFromStack(&pPOPStackTop, &pSubNode); pBackStartNode = pSubNode->back; if (pBackStartNode->tip == true) { /* tip indicates if a node is a leaf */ improveNodeAngle(pSubNode->back, medianDistance); } else { /* Push all subNodes in this pSubNode->back onto the * stack-to-be-used, after poping a pivot subNode. If * pSubNode->back is a leaf, no push on stack. */ pBackSubNode = pBackStartNode; /* Do not push this pBackStartNode onto the stack! Or the * process will never stop. */ while(pBackSubNode->next != pBackStartNode) { pBackSubNode = pBackSubNode->next; pushNodeToStack(&pPOPStackTop, pBackSubNode); } /* improve the node even if it is not a leaf */ improveNodeAngle(pSubNode->back, medianDistance); } } } } } /* improvtravn */ void coordimprov(double *xx, double *yy) { /* use angles calculation to improve node coordinate placement */ long i; if (nbody) /* n-body algorithm */ improvtravn(root); else { /* equal-daylight algorithm */ i = 0; do { maxchange = 0.0; improvtrav(root); i++; } while ((i < MAXITERATIONS) && (maxchange > MINIMUMCHANGE)); } } /* coordimprov */ void calculate() { /* compute coordinates for tree */ double xx, yy; long i; double nttot, fontheight, labangle=0, top, bot, rig, lef; for (i = 0; i < nextnode; i++) nodep[i]->width = 1.0; for (i = 0; i < nextnode; i++) nodep[i]->xcoord = 0.0; for (i = 0; i < nextnode; i++) nodep[i]->ycoord = 0.0; if (!uselengths) { for (i = 0; i < nextnode; i++) nodep[i]->length = 1.0; } else { for (i = 0; i < nextnode; i++) nodep[i]->length = fabs(nodep[i]->oldlen); } getwidth(root); nttot = root->width; for (i = 0; i < nextnode; i++) nodep[i]->width = nodep[i]->width * spp / nttot; if (!improve) plrtrans(root, treeangle, treeangle - ark / 2.0, treeangle + ark / 2.0); else plrtrans(root, treeangle, treeangle - pi, treeangle + pi); maxx = 0.0; minx = 0.0; maxy = 0.0; miny = 0.0; coordtrav(root, &xx,&yy); fontheight 
= heighttext(font,fontname); if (labeldirec == fixed) labangle = pi * labelrotation / 180.0; textlength = (double*) Malloc(nextnode*sizeof(double)); firstlet = (double*) Malloc(nextnode*sizeof(double)); for (i = 0; i < nextnode; i++) { if (nodep[i]->tip) { textlength[i] = lengthtext(nodep[i]->nayme, nodep[i]->naymlength, fontname,font); textlength[i] /= fontheight; firstlet[i] = lengthtext(nodep[i]->nayme,1L,fontname,font) / fontheight; } } if (spp > 1) labelheight = charht * (maxx - minx) / (spp - 1); else labelheight = charht * (maxx - minx); if (improve) { coordimprov(&xx,&yy); maxx = 0.0; minx = 0.0; maxy = 0.0; miny = 0.0; coordtrav(root, &xx,&yy); } topoflabels = 0.0; bottomoflabels = 0.0; rightoflabels = 0.0; leftoflabels = 0.0; for (i = 0; i < nextnode; i++) { if (nodep[i]->tip) { if (labeldirec == radial) labangle = nodep[i]->theta; else if (labeldirec == along) labangle = nodep[i]->oldtheta; else if (labeldirec == middle) labangle = 0.0; if (cos(labangle) < 0.0 && labeldirec != fixed) labangle -= pi; firstlet[i] = lengthtext(nodep[i]->nayme,1L,fontname,font) / fontheight; top = (nodep[i]->ycoord - maxy) / labelheight + sin(nodep[i]->oldtheta); rig = (nodep[i]->xcoord - maxx) / labelheight + cos(nodep[i]->oldtheta); bot = (miny - nodep[i]->ycoord) / labelheight - sin(nodep[i]->oldtheta); lef = (minx - nodep[i]->xcoord) / labelheight - cos(nodep[i]->oldtheta); if (cos(labangle) * cos(nodep[i]->oldtheta) + sin(labangle) * sin(nodep[i]->oldtheta) > 0.0) { if (sin(labangle) > 0.0) top += sin(labangle) * textlength[i]; top += sin(labangle - 1.25 * pi) * GAP * firstlet[i]; if (sin(labangle) < 0.0) bot -= sin(labangle) * textlength[i]; bot -= sin(labangle - 0.75 * pi) * GAP * firstlet[i]; if (sin(labangle) > 0.0) rig += cos(labangle - 0.75 * pi) * GAP * firstlet[i]; else rig += cos(labangle - 1.25 * pi) * GAP * firstlet[i]; rig += cos(labangle) * textlength[i]; if (sin(labangle) > 0.0) lef -= cos(labangle - 1.25 * pi) * GAP * firstlet[i]; else lef -= cos(labangle 
- 0.75 * pi) * GAP * firstlet[i]; } else { if (sin(labangle) < 0.0) top -= sin(labangle) * textlength[i]; top += sin(labangle + 0.25 * pi) * GAP * firstlet[i]; if (sin(labangle) > 0.0) bot += sin(labangle) * textlength[i]; bot -= sin(labangle - 0.25 * pi) * GAP * firstlet[i]; if (sin(labangle) > 0.0) rig += cos(labangle - 0.25 * pi) * GAP * firstlet[i]; else rig += cos(labangle + 0.25 * pi) * GAP * firstlet[i]; if (sin(labangle) < 0.0) rig += cos(labangle) * textlength[i]; if (sin(labangle) > 0.0) lef -= cos(labangle + 0.25 * pi) * GAP * firstlet[i]; else lef -= cos(labangle - 0.25 * pi) * GAP * firstlet[i]; lef += cos(labangle) * textlength[i]; } if (top > topoflabels) topoflabels = top; if (bot > bottomoflabels) bottomoflabels = bot; if (rig > rightoflabels) rightoflabels = rig; if (lef > leftoflabels) leftoflabels = lef; } } topoflabels *= labelheight; bottomoflabels *= labelheight; leftoflabels *= labelheight; rightoflabels *= labelheight; } /* calculate */ void rescale() { /* compute coordinates of tree for plot device */ long i; double treeheight, treewidth, extrax, extray, temp; treeheight = maxy - miny + topoflabels + bottomoflabels; treewidth = maxx - minx + rightoflabels + leftoflabels; if (grows == vertical) { if (!rescaled) expand = bscale; else { expand = (xsize - 2 * xmargin) / treewidth; if ((ysize - 2 * ymargin) / treeheight < expand) expand = (ysize - 2 * ymargin) / treeheight; } extrax = (xsize - 2 * xmargin - treewidth * expand) / 2.0; extray = (ysize - 2 * ymargin - treeheight * expand) / 2.0; } else { if (!rescaled) expand = bscale; else { expand = (ysize - 2 * ymargin) / treewidth; if ((xsize - 2 * xmargin) / treeheight < expand) expand = (xsize - 2 * xmargin) / treeheight; } extrax = (xsize - 2 * xmargin - treeheight * expand) / 2.0; extray = (ysize - 2 * ymargin - treewidth * expand) / 2.0; } for (i = 0; i < (nextnode); i++) { nodep[i]->xcoord = expand * (nodep[i]->xcoord - minx + leftoflabels); nodep[i]->ycoord = expand * (nodep[i]->ycoord 
- miny + bottomoflabels); if (grows == horizontal) { temp = nodep[i]->ycoord; nodep[i]->ycoord = expand * treewidth - nodep[i]->xcoord; nodep[i]->xcoord = temp; } nodep[i]->xcoord += xmargin + extrax; nodep[i]->ycoord += ymargin + extray; } } /* rescale */ void plottree(node *p, node *q) { /* plot part or all of tree on the plotting device */ double x1, y1, x2, y2; node *pp; x2 = xscale * (xoffset + p->xcoord); y2 = yscale * (yoffset + p->ycoord); if (p != root) { x1 = xscale * (xoffset + q->xcoord); y1 = yscale * (yoffset + q->ycoord); plot(penup, x1, y1); plot(pendown, x2, y2); } if (p->tip) return; pp = p->next; do { plottree(pp->back, p); pp = pp->next; } while (((p == root) && (pp != p->next)) || ((p != root) && (pp != p))); } /* plottree */ void plotlabels(char *fontname) { long i; double compr, dx = 0, dy = 0, labangle, sino, coso, cosl, sinl, cosv, sinv, vec; boolean right; node *lp; compr = xunitspercm / yunitspercm; if (penchange == yes) changepen(labelpen); for (i = 0; i < (nextnode); i++) { if (nodep[i]->tip) { lp = nodep[i]; labangle = labelrotation * pi / 180.0; if (labeldirec == radial) labangle = nodep[i]->theta; else if (labeldirec == along) labangle = nodep[i]->oldtheta; else if (labeldirec == middle) labangle = 0.0; if (cos(labangle) < 0.0) labangle -= pi; sino = sin(nodep[i]->oldtheta); coso = cos(nodep[i]->oldtheta); cosl = cos(labangle); sinl = sin(labangle); right = ((coso*cosl+sino*sinl) > 0.0) || (labeldirec == middle); vec = sqrt(1.0+firstlet[i]*firstlet[i]); cosv = firstlet[i]/vec; sinv = 1.0/vec; if (labeldirec == middle) { if ((textlength[i]+1.0)*fabs(tan(nodep[i]->oldtheta)) > 2.0) { dx = -0.5 * textlength[i] * labelheight * expand; if (sino > 0.0) { dy = 0.5 * labelheight * expand; if (fabs(nodep[i]->oldtheta - pi/2.0) > 1000.0) dx += labelheight * expand / (2.0*tan(nodep[i]->oldtheta)); } else { dy = -1.5 * labelheight * expand; if (fabs(nodep[i]->oldtheta - pi/2.0) > 1000.0) dx += labelheight * expand / 
(2.0*tan(nodep[i]->oldtheta)); } } else { if (coso > 0.0) { dx = 0.5 * labelheight * expand; dy = (-0.5 + (0.5*textlength[i]+0.5)*tan(nodep[i]->oldtheta)) * labelheight * expand; } else { dx = -(textlength[i]+0.5) * labelheight * expand; dy = (-0.5 - (0.5*textlength[i]+0.5)*tan(nodep[i]->oldtheta)) * labelheight * expand; } } } else { if (right) { dx = labelheight * expand * coso; dy = labelheight * expand * sino; dx += labelheight * expand * 0.5 * vec * (-cosl*cosv+sinl*sinv); dy += labelheight * expand * 0.5 * vec * (-sinl*cosv-cosl*sinv); } else { dx = labelheight * expand * coso; dy = labelheight * expand * sino; dx += labelheight * expand * 0.5 * vec * (cosl*cosv+sinl*sinv); dy += labelheight * expand * 0.5 * vec * (sinl*cosv-cosl*sinv); dx -= textlength[i] * labelheight * expand * cosl; dy -= textlength[i] * labelheight * expand * sinl; } } plottext(lp->nayme, lp->naymlength, labelheight * expand * xscale / compr, compr, xscale * (lp->xcoord + dx + xoffset), yscale * (lp->ycoord + dy + yoffset), -180 * labangle / pi, font,fontname); } } if (penchange == yes) changepen(treepen); } /* plotlabels */ void user_loop() { /* loop to decide what to do */ long loopcount; char input_char; while (!canbeplotted) { loopcount = 0; do { input_char=showparms(); firstscreens = false; if ( input_char != 'Y') getparms(input_char); countup(&loopcount, 10); } while (input_char != 'Y'); xscale = xunitspercm; yscale = yunitspercm; plotrparms(spp); numlines = dotmatrix ? 
               ((long)floor(yunitspercm * ysize + 0.5) / strpdeep):1;
    calculate();
    rescale();
    canbeplotted = true;
  }
} /* user_loop */


void setup_environment(int argc, Char *argv[])
{
  /* Set up all kinds of fun stuff */
  node *q, *r;
  char *pChar;
  size_t i;
  boolean firsttree;

  treenode = NULL;
#ifdef TURBOC
  if ((registerbgidriver(EGAVGA_driver) <0) ||
      (registerbgidriver(Herc_driver) <0)   ||
      (registerbgidriver(CGA_driver) <0)){
    fprintf(stderr,"Graphics error: %s ",grapherrormsg(graphresult()));
    exxit(-1);}
#endif

  printf("DRAWTREE from PHYLIP version %s\n", VERSION);
  /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */
  openfile(&intree,INTREE,"input tree file", "rb",argv[0],NULL);
  printf("Reading tree ... \n");
  firsttree = true;
  allocate_nodep(&nodep, &intree, &spp);
  treeread (intree, &root, treenode, &goteof, &firsttree,
            nodep, &nextnode, &haslengths,
            &grbg, initdrawtreenode,true,-1);
  q = root;
  r = root;
  while (!(q->next == root))
    q = q->next;
  q->next = root->next;
  root = q;
  chuck(&grbg, r);
  nodep[spp] = q;
  where = root;
  rotate = true;
  printf("Tree has been read.\n");
  printf("Loading the font ... \n");
  loadfont(font,FONTFILE,argv[0]);
  printf("Font loaded.\n");
  ansi = ANSICRT;
  ibmpc = IBMCRT;
  firstscreens = true;
  initialparms();
  canbeplotted = false;
  if (argc > 1)
  {
    pChar = argv[1];
    /* Examine every character of the argument; index with i so that the
       whole string is checked, not just its first character. */
    for (i = 0; i < strlen(pChar); i++)
    {
      if ( isspace((unsigned char)pChar[i]) )
      {
        printf("ERROR: Number of iteration should not contain space!\n");
        exxit(1);
      }
      else if ( ! isdigit((unsigned char)pChar[i]) )
      {
        /* set to default if the 2nd. parameter is not a number */
        maxNumOfIter = 50;
        return;
      }
    }
    sscanf(argv[1], "%li", &maxNumOfIter);
  }
  else
  {
    /* 2nd. argument is not entered; use default.
    */
    maxNumOfIter = 50;
  }
  return;
} /* setup_environment */


void drawtree(
  char* intreename,
  char* plotfilename,
  char* plotfileopt,
  char* fontfilename,
  char* treegrows,
  int usebranchlengths,
  char* labeldirecstr,
  double labelangle,
  double treerotation,
  double treearc,
  char* iterationkind,
  int iterationcount,
  int regularizeangles,
  int avoidlabeloverlap,
  int branchrescale,
  double branchscaler,
  double relcharhgt,
  double xmarginratio,
  double ymarginratio,
  int dofinalplot,
  char* finalplotkind)
{
  //printf("hello from DrawTree\n"); //JRMDebug
  //fflush(stdout);

  char *previewfilename = "JavaPreview.ps";

  // from init
  javarun = true;
  ansi = ANSICRT;
  ibmpc = IBMCRT;
  firstscreens = false;
  canbeplotted = true;
  dotmatrix = false;
  boolean wasplotted = false;
  grbg = NULL;
  progname = "Drawtree";

  initialparms();

  // translate Java doubles and ints to Drawtree local variables
  labelrotation = labelangle;
  treeangle = treerotation * pi / 180;
  ark = treearc * pi / 180;
  bscale = branchscaler;
  charht = relcharhgt;
  maxNumOfIter = iterationcount;
  xmargin = xmarginratio * paperx;
  ymargin = ymarginratio * papery;

  // translate Java boolean to Phylip boolean
  boolean doplot;
  if (dofinalplot != 0)
    doplot = true;
  else
    doplot = false;

  if (usebranchlengths != 0)
    uselengths = true;
  else
    uselengths = false;

  if (regularizeangles != 0)
    regular = true;
  else
    regular = false;

  if (avoidlabeloverlap != 0)
    labelavoid = true;
  else
    labelavoid = false;

  if (branchrescale != 0)
    rescaled = true;
  else
    rescaled = false;

  // label direction
  labeldirec = middle;
  if (!strcmp(labeldirecstr,"fixed"))
    labeldirec = fixed;
  if (!strcmp(labeldirecstr,"middle"))
    labeldirec = middle;
  if (!strcmp(labeldirecstr,"radial"))
    labeldirec = radial;
  if (!strcmp(labeldirecstr,"along"))
    labeldirec = along;

  // iterate to improve tree
  improve = false;
  nbody = false;
  if (!strcmp(iterationkind,"improve"))
  {
    improve = true;
  }
  if (!strcmp(iterationkind,"nbody"))
  {
    improve = true;
    nbody = true;
  }

  // tree growth direction
  grows = horizontal;
  if
     (!strcmp(treegrows,"vertical"))
    grows = vertical;

  // figure out plot type
  plotter = lw;
  strcpy(afmfile,"none");
  if(dofinalplot)
  {
    if (!strcmp(finalplotkind,"lw"))
    {
      plotter = lw;
    }
    /* // not currently available
    if(!strcmp(finalplotkind,"svg"))
    {
      plotter = svg;
    }
    */
    else if(!strcmp(finalplotkind,"hp"))
    {
      plotter = hp;
    }
    if(!strcmp(finalplotkind,"tek"))
    {
      plotter = tek;
    }
    if(!strcmp(finalplotkind,"ibm"))
    {
      plotter = ibm;
    }
    if(!strcmp(finalplotkind,"mac"))
    {
      plotter = mac;
    }
    if(!strcmp(finalplotkind,"houston"))
    {
      plotter = houston;
    }
    if(!strcmp(finalplotkind,"decregis"))
    {
      plotter = decregis;
    }
    if(!strcmp(finalplotkind,"epson"))
    {
      plotter = epson;
    }
    if(!strcmp(finalplotkind,"oki"))
    {
      plotter = oki;
    }
    if(!strcmp(finalplotkind,"fig"))
    {
      plotter = fig;
    }
    if(!strcmp(finalplotkind,"citoh"))
    {
      plotter = citoh;
    }
    if(!strcmp(finalplotkind,"toshiba"))
    {
      plotter = toshiba;
    }
    if(!strcmp(finalplotkind,"pcx"))
    {
      plotter = pcx;
    }
    if(!strcmp(finalplotkind,"pcl"))
    {
      plotter = pcl;
    }
    if(!strcmp(finalplotkind,"pict"))
    {
      plotter = pict;
    }
    if(!strcmp(finalplotkind,"ray"))
    {
      plotter = ray;
    }
    if(!strcmp(finalplotkind,"pov"))
    {
      plotter = pov;
    }
    if(!strcmp(finalplotkind,"xbm"))
    {
      plotter = xbm;
    }
    if(!strcmp(finalplotkind,"bmp"))
    {
      plotter = bmp;
    }
    if(!strcmp(finalplotkind,"gif"))
    {
      plotter = gif;
    }
    if(!strcmp(finalplotkind,"idraw"))
    {
      plotter = idraw;
    }
    if(!strcmp(finalplotkind,"vrml"))
    {
      plotter = vrml;
    }
    if(!strcmp(finalplotkind,"other"))
    {
      plotter = other;
    }
  }
  else // preview
  {
    plotter = lw; // hardwired to ps
  }

  // choose the best Hershey approximation of the font
  int fontid = figfontid(fontfilename);
  //printf("fontfilename: %s id: %i ", fontfilename, fontid);
  char* hersheyfontname = "font1";
  switch (fontid)
  {
    case 4:  //AvantGarde-Book
    case 8:  //Bookman-Light
    case 16: //Helvetica
    case 20: //Helvetica-Narrow
      hersheyfontname = "font1"; // romans sans
      break;
    case 6:  //AvantGarde-Demi
    case 10: //Bookman-Demi
    case 18: //Helvetica-Bold
    case 22: //Helvetica-Narrow-Bold
      hersheyfontname = "font2";
      // romans sans bold
      break;
    case 0:  //Times-Roman
    case 2:  //Times-Bold
    case 12: //Courier
    case 14: //Courier-Bold
    case 24: //NewCenturySchlbk-Roman
    case 26: //NewCenturySchlbk-Bold
    case 28: //Palatino-Roman
    case 30: //Palatino-Bold
      hersheyfontname = "font3"; // romans serif
      break;
    case 1:  //Times-Italic
    case 5:  //AvantGarde-BookOblique
    case 9:  //Bookman-LightItalic
    case 13: //Courier-Italic
    case 17: //Helvetica-Oblique
    case 21: //Helvetica-Narrow-Oblique
    case 25: //NewCenturySchlbk-Italic
    case 29: //Palatino-Italic
    case 33: //ZapfChancery-MediumItalic
      hersheyfontname = "font4"; // romans serif italic
      break;
    case 3:  //Times-BoldItalic
    case 7:  //AvantGarde-DemiOblique
    case 11: //Bookman-DemiItalic
    case 15: //Courier-BoldItalic
    case 19: //Helvetica-BoldOblique
    case 23: //Helvetica-Narrow-BoldOblique
    case 27: //NewCenturySchlbk-BoldItalic
    case 31: //Palatino-BoldItalic
      hersheyfontname = "font5"; // romans serif italic bold
      break;
    default:
      /*
      case 32: //Symbol
      case 34: //ZapfDingbats
      */
      hersheyfontname = "font1"; // romans sans
      break;
  }
  //printf("hersheyfontname: %s\n", hersheyfontname);

  loadfont(font,hersheyfontname,progname);

#ifdef JAVADEBUG
  // dump translated parameters from Java to make sure they made it //JRMDebug
  printf("***Java input parameters***\n");
  printf("intreename: %s\n",intreename);
  printf("plotfilename: %s\n",plotfilename);
  printf("plotfileopt: %s\n",plotfileopt);
  printf("fontfilename: %s\n",fontfilename);
  printf("usebranchlengths: %i\n",usebranchlengths);
  printf("labeldirecstr: %s\n",labeldirecstr);
  printf("labelangle: %f\n",labelangle);
  printf("treerotation: %f\n",treerotation);
  printf("treearc: %f\n",treearc);
  printf("iterationkind: %s\n",iterationkind);
  printf("iterationcount: %i\n",iterationcount);
  printf("regularizeangles: %i\n",regularizeangles);
  printf("avoidlabeloverlap: %i\n",avoidlabeloverlap);
  printf("branchrescale: %i\n",branchrescale);
  printf("branchscaler: %f\n",branchscaler);
  printf("relcharhgt: %f\n",relcharhgt);
  printf("dofinalplot: %i\n",dofinalplot);
  printf("previewfilename: %s\n",previewfilename);
  printf("finalplotkind: %s\n",finalplotkind);
  printf("doplot: %i\n",doplot);
  fflush(stdout);
#endif

  // read tree
  intree = fopen(intreename,"r");
  boolean firsttree = true;
  allocate_nodep(&nodep, &intree, &spp);
  plotrparms(spp);
  treeread (intree, &root, treenode, &goteof, &firsttree,
            nodep, &nextnode, &haslengths,
            &grbg, initdrawtreenode,true,-1);
  root->oldlen = 0.0; //???

  // if there are no lengths, shut uselengths off
  if (haslengths==0)
  {
    uselengths = false;
  }
  //printf("Tree has been read.\n");
  //printf("haslengths: %i\n",haslengths);
  //fflush(stdout);

  // printer specific details to simulate interactive setup
  // mostly from getplotter with some from plotrparms
  // after treeread as some formats need some of the tree data
  dotmatrix = false;
  long stripedepth;
  switch (plotter)
  {
    case lw:
      strcpy(fontname, fontfilename);
      break;
    case hp:
      strcpy(fontname, "Hershey");
      break;
    case tek:
      strcpy(fontname, "Hershey");
      break;
    case ibm:
      strcpy(fontname, "Hershey");
      break;
    case mac:
      strcpy(fontname, "Hershey");
      break;
    case houston:
      strcpy(fontname, "Hershey");
      break;
    case decregis:
      strcpy(fontname, "Hershey");
      break;
    case epson:
      dotmatrix = true;
      strcpy(fontname, "Hershey");
      strpdiv = 1;
      stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize)));
      break;
    case oki:
      dotmatrix = true;
      strcpy(fontname, "Hershey");
      strpdiv = 1;
      stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize)));
      break;
    case fig:
      strcpy(fontname, fontfilename);
      break;
    case citoh:
      dotmatrix = true;
      strcpy(fontname, "Hershey");
      strpdiv = 1;
      stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize)));
      break;
    case toshiba:
      dotmatrix = true;
      strcpy(fontname, "Hershey");
      strpdiv = 4;
      stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize)));
      break;
    case pcx:
      dotmatrix = true;
      strcpy(fontname, "Hershey");
      // assume VGA 1024 x 768
      strpwide = 1024;
      yunitspercm = 768 / ysize;
      resopts =
3; strpdiv = 10; stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize))); break; case pcl: dotmatrix = true; strcpy(fontname, "Hershey"); // assume 300 dpi hpresolution = 300; xunitspercm = 118.11023622; /* 300 DPI = 118.1 DPC */ yunitspercm = 118.11023622; strpwide = 2550; /* 8.5 * 300 DPI */ strpdeep = DEFAULT_STRIPE_HEIGHT; /* height of the strip */ strpdiv = DEFAULT_STRIPE_HEIGHT; /* in this case == strpdeep */ stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize))); break; case pict: strcpy(fontname, fontfilename); break; case ray: strcpy(fontname, "Hershey"); break; case pov: strcpy(fontname, "Hershey"); break; case xbm: dotmatrix = true; strcpy(fontname, "Hershey"); // assume 1000 x 1000 xunitspercm = 1.0; yunitspercm = 1.0; xsize = 1000; ysize = 1000; xmargin = 0.08 * xsize; ymargin = 0.08 * ysize; strpdeep = DEFAULT_STRIPE_HEIGHT; strpdiv = DEFAULT_STRIPE_HEIGHT; strpwide = (long)xsize; stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize))); break; case bmp: dotmatrix = true; strcpy(fontname, "Hershey"); // assume 1000 x 1000 xunitspercm = 1.0; yunitspercm = 1.0; xsize = 1000; ysize = 1000; xmargin = 0.08 * xsize; ymargin = 0.08 * ysize; strpdeep = DEFAULT_STRIPE_HEIGHT; strpdiv = DEFAULT_STRIPE_HEIGHT; strpwide = (long)xsize; stripedepth = allocstripe(stripe,(strpwide/8),((long)(yunitspercm * ysize))); /* printf("***bmp setup done\n"); printf("xsize: %f\n", xsize); printf("ysize: %f\n", ysize); printf("xmargin: %f\n",xmargin); printf("ymargin: %f\n",ymargin); printf("strpwide: %li, yunitspercm: %f, ysize: %f\n" ,strpwide, yunitspercm, ysize); fflush(stdout); */ break; case gif: strcpy(fontname, fontfilename); break; case idraw: strcpy(fontname, "Times-Bold"); break; case vrml: strcpy(fontname, "Hershey"); treecolor = 5; namecolor = 4; vrmlskycolornear = 6; vrmlskycolorfar = 6; vrmlgroundcolornear = 3; vrmlgroundcolorfar = 3; break; case other: strcpy(fontname, "Hershey"); break; default: 
break; } numlines = dotmatrix ? ((long)floor(yunitspercm * ysize + 0.5) / strpdeep) :1; xscale = xunitspercm; yscale = yunitspercm; #ifdef JAVADEBUG printf("\n***internal parameters***\n"); printf("fontname: %s\n", fontname); printf("plotter: %i\n",plotter); printf("paperx: %f\n",paperx); printf("papery: %f\n",papery); printf("hpmargin: %f\n",hpmargin); printf("vpmargin: %f\n",vpmargin); printf("pagex: %f\n",pagex); printf("pagey: %f\n",pagey); printf("xmargin: %f\n",xmargin); printf("ymargin: %f\n",ymargin); printf("rescaled: %i\n",rescaled); printf("firstscreens: %i\n",firstscreens); printf("canbeplotted: %i\n",canbeplotted); printf("labelrotation: %f\n",labelrotation); printf("treeangle: %f\n",treeangle); printf("maxNumOfIter: %li\n",maxNumOfIter); printf("ark: %f\n",ark); printf("bscale: %f\n",bscale); printf("charht: %f\n",charht); printf("uselengths: %i\n",uselengths); printf("regular: %i\n",regular); printf("rescaled: %i\n",rescaled); printf("labeldirec: %i\n",labeldirec); printf("improve: %i\n",improve); printf("nbody: %i\n",nbody); printf("grows: %i\n",grows); printf("dotmatrix: %i numlines: %li\n",dotmatrix, numlines); fflush(stdout); #endif // set up node link structure node *q, *r; q = root; r = root; while (!(q->next == root)) q = q->next; q->next = root->next; root = q; chuck(&grbg, r); nodep[spp] = q; where = root; rotate = true; calculate(); rescale(); int lenname; if (doplot) { lenname = strlen(plotfilename); } else { lenname = strlen(previewfilename); } char* usefilename = (char*) malloc(lenname + 1); if (doplot) { strcpy(usefilename, plotfilename); } else { strcpy(usefilename, previewfilename); } // Open plot files in binary mode. plotfile = fopen(usefilename,plotfileopt); initplotter(spp,fontname); numlines = dotmatrix ? 
((long)floor(yunitspercm * ysize + 0.5)/strpdeep) : 1; if (!dofinalplot) { changepen(labelpen); makebox(fontname,&xoffset,&yoffset,&scale,spp); changepen(treepen); } drawit(fontname,&xoffset,&yoffset,numlines,root); finishplotter(); fclose(plotfile); wasplotted = true; fclose(intree); return; } int main(int argc, Char *argv[]) { long stripedepth; boolean wasplotted = false; javarun = false; #ifdef MAC char filename1[FNMLNGTH]; OSErr retcode; FInfo fndrinfo; #ifdef OSX_CARBON FSRef fileRef; FSSpec fileSpec; #endif #ifdef __MWERKS__ SIOUXSetTitle("\pPHYLIP: Drawtree"); #endif argv[0] = "Drawtree"; #endif init(argc,argv); progname = argv[0]; grbg = NULL; setup_environment(argc, argv); user_loop(); #ifdef JAVADEBUG //JRMDebug printf("\n***internal parameters***\n"); printf("fontname: %s\n", fontname); printf("plotter: %i\n",plotter); printf("paperx: %f\n",paperx); printf("papery: %f\n",papery); printf("hpmargin: %f\n",hpmargin); printf("vpmargin: %f\n",vpmargin); printf("pagex: %f\n",pagex); printf("pagey: %f\n",pagey); printf("xmargin: %f\n",xmargin); printf("ymargin: %f\n",ymargin); printf("rescaled: %i\n",rescaled); printf("firstscreens: %i\n",firstscreens); printf("canbeplotted: %i\n",canbeplotted); printf("labelrotation: %f\n",labelrotation); printf("treeangle: %f\n",treeangle); printf("maxNumOfIter: %li\n",maxNumOfIter); printf("ark: %f\n",ark); printf("bscale: %f\n",bscale); printf("charht: %f\n",charht); printf("uselengths: %i\n",uselengths); printf("regular: %i\n",regular); printf("rescaled: %i\n",rescaled); printf("labeldirec: %i\n",labeldirec); printf("improve: %i\n",improve); printf("nbody: %i\n",nbody); printf("grows: %i\n",grows); printf("dotmatrix: %i\n",dotmatrix); fflush(stdout); #endif if (dotmatrix) { stripedepth = allocstripe(stripe,(strpwide/8), ((long)(yunitspercm * ysize))); strpdeep = stripedepth; strpdiv = stripedepth; } if (!(winaction == quitnow)) { // Open plotfiles in binary mode openfile(&plotfile, PLOTFILE, "plot file", "wb", argv[0], 
pltfilename); initplotter(spp,fontname); numlines = dotmatrix ? ((long)floor(yunitspercm * ysize + 0.5)/strpdeep) : 1; if (plotter != ibm) printf("\nWriting plot file ...\n"); drawit(fontname,&xoffset,&yoffset,numlines,root); finishplotter(); wasplotted = true; FClose(plotfile); printf("\nPlot written to file \"%s\"\n", pltfilename); } FClose(intree); printf("\nDone.\n\n"); #ifdef MAC if (plotter == pict && wasplotted){ #ifdef OSX_CARBON FSPathMakeRef((unsigned char *)pltfilename, &fileRef, NULL); FSGetCatalogInfo(&fileRef, kFSCatInfoNone, NULL, NULL, &fileSpec, NULL); FSpGetFInfo(&fileSpec, &fndrinfo); fndrinfo.fdType='PICT'; fndrinfo.fdCreator='MDRW'; FSpSetFInfo(&fileSpec, &fndrinfo); #else strcpy(filename1, pltfilename); retcode=GetFInfo(CtoPstr(filename1),0,&fndrinfo); fndrinfo.fdType='PICT'; fndrinfo.fdCreator='MDRW'; strcpy(filename1, pltfilename); retcode=SetFInfo(CtoPstr(PLOTFILE),0,&fndrinfo); #endif } if (plotter == lw && wasplotted){ #ifdef OSX_CARBON FSPathMakeRef((unsigned char *)pltfilename, &fileRef, NULL); FSGetCatalogInfo(&fileRef, kFSCatInfoNone, NULL, NULL, &fileSpec, NULL); FSpGetFInfo(&fileSpec, &fndrinfo); fndrinfo.fdType='TEXT'; FSpSetFInfo(&fileSpec, &fndrinfo); #else retcode=GetFInfo(CtoPstr(PLOTFILE),0,&fndrinfo); fndrinfo.fdType='TEXT'; retcode=SetFInfo(CtoPstr(PLOTFILE),0,&fndrinfo); #endif } #endif #ifdef WIN32 phyRestoreConsoleAttributes(); #endif exxit(0); return 1; } phylip-3.697/src/factor.c0000644004732000473200000004112712406201116014754 0ustar joefelsenst_g #include "phylip.h" /* version 3.696. A program to factor multistate character trees. Originally version 29 May 1983 by C. A. Meacham, Botany Department, University of Georgia Additional code by Joe Felsenstein, 1988-1991 C version code by Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Additional code by James McGill. Copyright (c) 1988-2014, Joseph Felsenstein All rights reserved. 
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #define maxstates 20 /* Maximum number of states in multi chars */ #define maxoutput 80 /* Maximum length of output line */ #define sizearray 5000 /* Size of symbarray; must be >= the sum of */ /* squares of the number of states in each multi*/ /* char to be factored */ #define factchar ':' /* character to indicate state connections */ #define unkchar '?' 
/* input character to indicate state unknown */ typedef struct statenode { /* Node of multifurcating tree */ struct statenode *ancstr, *sibling, *descendant; Char state; /* Symbol of character state */ long edge; /* Number of subtending edge */ } statenode; #ifndef OLDC /* function prototypes */ void getoptions(void); void nextch(Char *ch); void readtree(void); void attachnodes(statenode *, Char *); void maketree(statenode *, Char *); void construct(void); void numberedges(statenode *, long *); void factortree(void); void dotrees(void); void writech(Char ch, long *, FILE *outauxfile); void writefactors(long *); void writeancestor(long *); void doeu(long *, long); void dodatamatrix(void); /* function prototypes */ #endif Char infilename[FNMLNGTH], outfilename[FNMLNGTH], outfactfilename[FNMLNGTH], outancfilename[FNMLNGTH]; FILE *outfactfile, *outancfile; long neus, nchars, charindex, lastindex; Char ch; boolean ancstrrequest, factorrequest, rooted, progress; Char symbarray[sizearray]; /* Holds multi symbols and their factored equivs */ long *charnum; /* Multis */ long *chstart; /* Position of each */ long *numstates; /* Number of states */ Char *ancsymbol; /* Ancestral state */ /* local variables for dotrees, propagated to global level. */ long npairs, offset, charnumber, nstates; statenode *root; Char pair[maxstates][2]; statenode *nodes[maxstates]; void getoptions() { /* interactively set options */ Char ch; ibmpc = IBMCRT; ansi = ANSICRT; progress = true; factorrequest = false; ancstrrequest = false; putchar('\n'); for (;;){ #ifdef WIN32 if(ibmpc || ansi){ phyClearScreen(); } else { printf("\n"); } #else printf(ansi ? "\033[2J\033[H" : "\n"); #endif printf("\nFactor -- multistate to binary recoding program, version %s\n\n" ,VERSION); printf("Settings for this run:\n"); printf(" A put ancestral states in output file? %s\n", ancstrrequest ? "Yes" : "No"); printf(" F put factors information in output file? %s\n", factorrequest ? 
"Yes" : "No"); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" 1 Print indications of progress of run %s\n", (progress ? "Yes" : "No")); printf("\nAre these settings correct? (type Y or the letter for one to change)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); if (ch == 'Y') break; if (strchr("AF01", ch) != NULL) { switch (ch) { case 'A': ancstrrequest = !ancstrrequest; break; case 'F': factorrequest = !factorrequest; break; case '0': initterminal(&ibmpc, &ansi); break; case '1': progress = !progress; break; } } else printf("Not a possible option!\n"); } } /* getoptions */ void nextch(Char *ch) { *ch = ' '; while (*ch == ' ' && !eoln(infile)) *ch = gettc(infile); } /* nextch */ void readtree() { /* Reads a single character-state tree; puts adjacent symbol pairs into array 'pairs' */ npairs = 0; while (!eoln(infile)) { nextch(&ch); if (eoln(infile)) break; npairs++; pair[npairs - 1][0] = ch; nextch(&ch); if (eoln(infile) || (ch != factchar)) { printf("\n\nERROR: Character %ld: bad character state tree format\n\n", charnumber); exxit(-1);} nextch(&pair[npairs - 1][1]); if (eoln(infile) && pair[npairs - 1][1] == ' '){ printf("\n\nERROR: Character %ld: bad character state tree format\n\n", charnumber); exxit(-1);} } scan_eoln(infile); } /* readtree */ void attachnodes(statenode *poynter, Char *otherone) { /* Makes linked list of all nodes to which passed node is ancestral. First such node is 'descendant'; second such node is 'sibling' of first; third such node is sibling of second; etc. */ statenode *linker, *ptr; long i, j, k; linker = poynter; for (i = 0; i < (npairs); i++) { for (j = 1; j <= 2; j++) { if (poynter->state == pair[i][j - 1]) { if (j == 1) *otherone = pair[i][1]; else *otherone = pair[i][0]; if (*otherone != '.' 
&& *otherone != poynter->ancstr->state) { k = offset + 1; while (*otherone != symbarray[k - 1]) k++; if (nodes[k - offset - 1] != NULL) exxit(-1); ptr = (statenode *)Malloc(sizeof(statenode)); ptr->ancstr = poynter; ptr->descendant = NULL; ptr->sibling = NULL; ptr->state = *otherone; if (linker == poynter) /* If not first */ poynter->descendant = ptr; /* If first */ else linker->sibling = ptr; nodes[k - offset - 1] = ptr; /* Save pntr to node */ linker = ptr; } } } } } /* attachnodes */ void maketree(statenode *poynter, Char *otherone) { /* Recursively attach nodes */ if (poynter == NULL) return; attachnodes(poynter, otherone); maketree(poynter->descendant, otherone); maketree(poynter->sibling, otherone); } /* maketree */ void construct() { /* Puts tree together from array 'pairs' */ Char rootstate; long i, j, k; boolean done; statenode *poynter; char otherone; rooted = false; ancsymbol[charindex - 1] = '?'; rootstate = pair[0][0]; nstates = 0; for (i = 0; i < (npairs); i++) { for (j = 1; j <= 2; j++) { k = 1; done = false; while (!done) { if (k > nstates) { done = true; break; } if (pair[i][j - 1] == symbarray[offset + k - 1]) done = true; else k++; } if (k > nstates) { if (pair[i][j - 1] == '.') { if (rooted) exxit(-1); rooted = true; ancsymbol[charindex - 1] = '0'; if (j == 1) rootstate = pair[i][1]; else rootstate = pair[i][0]; } else { nstates++; symbarray[offset + nstates - 1] = pair[i][j - 1]; } } } } if ((rooted && nstates != npairs) || (!rooted && nstates != npairs + 1)) exxit(-1); root = (statenode *)Malloc(sizeof(statenode)); root->state = ' '; root->descendant = (statenode *)Malloc(sizeof(statenode)); root->descendant->ancstr = root; root = root->descendant; root->descendant = NULL; root->sibling = NULL; root->state = rootstate; for (i = 0; i < (nstates); i++) nodes[i] = NULL; i = 1; while (symbarray[offset + i - 1] != rootstate) i++; nodes[i - 1] = root; maketree(root, &otherone); for (i = 0; i < (nstates); i++) { if (nodes[i] != root) { if (nodes[i] 
== NULL){ printf( "\n\nERROR: Character %ld: invalid character state tree description\n", charnumber); exxit(-1);} else { poynter = nodes[i]->ancstr; while (poynter != root && poynter != nodes[i]) poynter = poynter->ancstr; if (poynter != root){ printf( "ERROR: Character %ld: invalid character state tree description\n\n", charnumber); exxit(-1);} } } } } /* construct */ void numberedges(statenode *poynter, long *edgenum) { /* Assign to each node a number for the edge below it. The root is zero */ if (poynter == NULL) return; poynter->edge = *edgenum; (*edgenum)++; numberedges(poynter->descendant, edgenum); numberedges(poynter->sibling, edgenum); } /* numberedges */ void factortree() { /* Generate the string of 0's and 1's that will be substituted for each symbol of the multistate char. */ long i, j, place, factoroffset; statenode *poynter; long edgenum=0; numberedges(root, &edgenum); factoroffset = offset + nstates; for (i = 0; i < (nstates); i++) { place = factoroffset + (nstates - 1) * i; for (j = place; j <= (place + nstates - 2); j++) symbarray[j] = '0'; poynter = nodes[i]; while (poynter != root) { symbarray[place + poynter->edge - 1] = '1'; poynter = poynter->ancstr; } } } /* factortree */ void dotrees() { /* Process character-state trees */ long lastchar; charindex = 0; lastchar = 0; offset = 0; charnumber = 0; if (fscanf(infile, "%ld", &charnumber) != 1) { printf("Invalid input file!\n"); exxit(-1); } while (charnumber < 999) { if (charnumber < lastchar) { printf("\n\nERROR: Character state tree"); printf(" for character %ld: out of order\n\n", charnumber); exxit(-1); } charindex++; lastindex = charindex; readtree(); /* Process character-state tree */ if (npairs > 0) { construct(); /* Link tree together */ factortree(); } else { nstates = 0; ancsymbol[charindex - 1] = '?'; } lastchar = charnumber; charnum[charindex - 1] = charnumber; chstart[charindex - 1] = offset; numstates[charindex - 1] = nstates; offset += nstates * nstates; fscanf(infile, "%ld", 
&charnumber); } scan_eoln(infile); /* each multistate character */ /* symbol */ } /* dotrees */ void writech(Char ch, long *chposition, FILE *outauxfile) { /* Writes a single character to output */ if (*chposition > maxoutput) { putc('\n', outauxfile); *chposition = 1; } putc(ch, outauxfile); (*chposition)++; } /* writech */ void writefactors(long *chposition) { /* Writes 'FACTORS' line */ long i, charindex; Char symbol; *chposition = 11; symbol = '-'; for (charindex = 0; charindex < (lastindex); charindex++) { if (symbol == '-') symbol = '+'; else symbol = '-'; if (numstates[charindex] == 0) writech(symbol, chposition, outfactfile); else { for (i = 1; i < (numstates[charindex]); i++) writech(symbol, chposition, outfactfile); } } putc('\n', outfactfile); } /* writefactors */ void writeancestor(long *chposition) { /* Writes 'ANCESTOR' line */ long i, charindex; /* Skip the line entirely if no character has a known ancestral state; the bound check comes first so ancsymbol is never read past lastindex */ charindex = 1; while (charindex <= lastindex && ancsymbol[charindex - 1] == '?') charindex++; if (charindex > lastindex) return; *chposition = 11; for (charindex = 0; charindex < (lastindex); charindex++) { if (numstates[charindex] == 0) writech(ancsymbol[charindex], chposition, outancfile); else { for (i = 1; i < (numstates[charindex]); i++) writech(ancsymbol[charindex], chposition, outancfile); } } putc('\n', outancfile); } /* writeancestor */ void doeu(long *chposition, long eu) { /* Writes factored data for a single species */ long i, charindex, place; Char *multichar; for (i = 1; i <= nmlngth; i++) { ch = gettc(infile); putc(ch, outfile); if ((ch == '(') || (ch == ')') || (ch == ':') || (ch == ',') || (ch == ';') || (ch == '[') || (ch == ']')) { printf( "\n\nERROR: Species name may not contain characters ( ) : ; , [ ] \n"); /* report the species number (eu), not the position within the name */ printf(" In name of species number %ld there is character %c\n\n", eu, ch); exxit(-1); } } multichar = (Char *)Malloc(nchars*sizeof(Char)); *chposition = 11; for (i = 0; i < (nchars); i++) { do { if (eoln(infile)) scan_eoln(infile); ch = gettc(infile); } while (ch == ' ' || ch == '\t'); multichar[i] = ch; } 
scan_eoln(infile); for (charindex = 0; charindex < (lastindex); charindex++) { if (numstates[charindex] == 0) writech(multichar[charnum[charindex] - 1], chposition, outfile); else { /* search the symbols of this character; test the bound before reading symbarray */ i = 1; while (i <= numstates[charindex] && symbarray[chstart[charindex] + i - 1] != multichar[charnum[charindex] - 1]) i++; if (i > numstates[charindex]) { if( multichar[charnum[charindex] - 1] == unkchar){ for (i = 1; i < (numstates[charindex]); i++) writech('?', chposition, outfile); } else { putc('\n', outfile); printf("\n\nERROR: In species %ld, multistate character %ld: ", eu, charnum[charindex]); printf("'%c' is not a documented state\n\n", multichar[charnum[charindex] - 1]); exxit(-1); } } else { place = chstart[charindex] + numstates[charindex] + (numstates[charindex] - 1) * (i - 1); for (i = 0; i <= (numstates[charindex] - 2); i++) writech(symbarray[place + i], chposition, outfile); } } } putc('\n', outfile); free(multichar); } /* doeu */ void dodatamatrix() { /* Reads species information and writes factored data set */ long charindex, totalfactors, eu, chposition; totalfactors = 0; for (charindex = 0; charindex < (lastindex); charindex++) { if (numstates[charindex] == 0) totalfactors++; else totalfactors += numstates[charindex] - 1; } fprintf(outfile, "%5ld %4ld\n", neus, totalfactors); if (factorrequest) writefactors(&chposition); if (ancstrrequest) writeancestor(&chposition); /* process species 1..neus; increment after the call so error messages in doeu report the correct species number */ eu = 1; while (eu <= neus) { doeu(&chposition, eu); eu++; } if (progress) printf("\nData matrix written on file \"%s\"\n\n", outfilename); } /* dodatamatrix */ int main(int argc, Char *argv[]) { #ifdef MAC argc = 1; /* macsetup("Factor",""); */ argv[0] = "Factor"; #endif init(argc,argv); openfile(&infile,INFILE,"input file", "r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename); getoptions(); if(factorrequest) openfile(&outfactfile, "factors", "output factors file", "w", argv[0], outfactfilename); if(ancstrrequest) openfile(&outancfile, "ancestors", "output ancestors 
file", "w", argv[0], outancfilename); fscanf(infile, "%ld%ld", &neus, &nchars); scan_eoln(infile); charnum = (long *)Malloc(nchars*sizeof(long)); chstart = (long *)Malloc(nchars*sizeof(long)); numstates = (long *)Malloc(nchars*sizeof(long)); ancsymbol = (Char *)Malloc(nchars*sizeof(Char)); dotrees(); /* Read and factor character-state trees */ dodatamatrix(); FClose(infile); FClose(outfile); #ifdef MAC fixmacfile(outfilename); #endif printf("Done.\n\n"); #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } /* factor */ phylip-3.697/src/fitch.c0000644004732000473200000007501612406201116014577 0ustar joefelsenst_g #include "phylip.h" #include "dist.h" #include "float.h" /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #define zsmoothings 10 /* number of zero-branch correction iterations */ #define epsilonf 0.000001 /* a very small but not too small number */ #define delta 0.0001 /* a not quite so small number */ #define MAXNUMTREES 100000000 /* a number bigger than conceivable numtrees */ #ifndef OLDC /* function prototypes */ void getoptions(void); void allocrest(void); void doinit(void); void inputoptions(void); void fitch_getinput(void); void secondtraverse(node *, double , long *, double *); void firsttraverse(node *, long *, double *); double evaluate(tree *); void nudists(node *, node *); void makedists(node *); void makebigv(node *); void correctv(node *); void alter(node *, node *); void nuview(node *); void update(node *); void smooth(node *); void filltraverse(node *, node *, boolean); void fillin(node *, node *, boolean); void insert_(node *, node *, boolean); void copynode(node *, node *); void copy_(tree *, tree *); void setuptipf(long, tree *); void buildnewtip(long , tree *, long); void buildsimpletree(tree *, long); void addtraverse(node *, node *, boolean, long *, boolean *); void re_move(node **, node **); void rearrange(node *, long *, long *, boolean *); void describe(node *); void summarize(long); void nodeinit(node *); void initrav(node *); void treevaluate(void); void maketree(void); void globrearrange(long* numtrees,boolean* succeeded); /* function prototypes */ #endif Char infilename[FNMLNGTH], outfilename[FNMLNGTH], intreename[FNMLNGTH], 
outtreename[FNMLNGTH]; long nonodes2, outgrno, nums, col, datasets, ith, njumble, jumb=0; long inseed; vector *x; intvector *reps; boolean minev, global, jumble, lengths, usertree, lower, upper, negallowed, outgropt, replicates, trout, printdata, progress, treeprint, mulsets, firstset; double power; double trweight; /* to make treeread happy */ boolean goteof, haslengths; /* ditto ... */ boolean first; /* ditto ... */ node *addwhere; longer seed; long *enterorder; tree curtree, priortree, bestree, bestree2; Char ch; char *progname; void getoptions() { /* interactively set options */ long inseed0=0, loopcount; Char ch; boolean done=false; putchar('\n'); minev = false; global = false; jumble = false; njumble = 1; lengths = false; lower = false; negallowed = false; outgrno = 1; outgropt = false; power = 2.0; replicates = false; trout = true; upper = false; usertree = false; printdata = false; progress = true; treeprint = true; loopcount = 0; do { cleerhome(); printf("\nFitch-Margoliash method version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" D Method (F-M, Minimum Evolution)? %s\n", (minev ? "Minimum Evolution" : "Fitch-Margoliash")); printf(" U Search for best tree? %s\n", (usertree ? "No, use user trees in input file" : "Yes")); if (usertree) { printf(" N Use lengths from user trees? %s\n", (lengths ? "Yes" : "No")); } printf(" P Power?%9.5f\n",power); printf(" - Negative branch lengths allowed? %s\n", negallowed ? 
"Yes" : "No"); printf(" O Outgroup root?"); if (outgropt) printf(" Yes, at species number%3ld\n", outgrno); else printf(" No, use as outgroup species%3ld\n", outgrno); printf(" L Lower-triangular data matrix?"); if (lower) printf(" Yes\n"); else printf(" No\n"); printf(" R Upper-triangular data matrix?"); if (upper) printf(" Yes\n"); else printf(" No\n"); printf(" S Subreplicates?"); if (replicates) printf(" Yes\n"); else printf(" No\n"); if (!usertree) { printf(" G Global rearrangements?"); if (global) printf(" Yes\n"); else printf(" No\n"); printf(" J Randomize input order of species?"); if (jumble) printf(" Yes (seed =%8ld,%3ld times)\n", inseed0, njumble); else printf(" No. Use input order\n"); } printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld sets\n", datasets); else printf(" No\n"); printf(" 0 Terminal type (IBM PC, ANSI, none)?"); if (ibmpc) printf(" IBM PC\n"); if (ansi) printf(" ANSI\n"); if (!(ibmpc || ansi)) printf(" (none)\n"); printf(" 1 Print out the data at start of run"); if (printdata) printf(" Yes\n"); else printf(" No\n"); printf(" 2 Print indications of progress of run"); if (progress) printf(" Yes\n"); else printf(" No\n"); printf(" 3 Print out tree"); if (treeprint) printf(" Yes\n"); else printf(" No\n"); printf(" 4 Write out trees onto tree file?"); if (trout) printf(" Yes\n"); else printf(" No\n"); printf( "\n Y to accept these or type the letter for one to change\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); done = (ch == 'Y'); if (!done) { if (((!usertree) && (strchr("DJOUNPG-LRSM01234", ch) != NULL)) || (usertree && ((strchr("DOUNPG-LRSM01234", ch) != NULL)))){ switch (ch) { case 'D': minev = !minev; if (minev && (!negallowed)) negallowed = true; break; case '-': negallowed = !negallowed; break; case 'G': global = !global; break; case 'J': jumble = !jumble; if (jumble) initjumble(&inseed, &inseed0, seed, &njumble); else njumble = 1; break; case 
'L': lower = !lower; break; case 'N': lengths = !lengths; break; case 'O': outgropt = !outgropt; if (outgropt) initoutgroup(&outgrno, spp); break; case 'P': initpower(&power); break; case 'R': upper = !upper; break; case 'S': replicates = !replicates; break; case 'U': usertree = !usertree; break; case 'M': mulsets = !mulsets; if (mulsets) initdatasets(&datasets); jumble = true; if (jumble) initseed(&inseed, &inseed0, seed); break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress = !progress; break; case '3': treeprint = !treeprint; break; case '4': trout = !trout; break; } } else printf("Not a possible option!\n"); } countup(&loopcount, 100); } while (!done); if (lower && upper) { printf("ERROR: Data matrix cannot be both uppeR and Lower triangular\n"); exxit(-1); } } /* getoptions */ void allocrest() { long i; x = (vector *)Malloc(spp*sizeof(vector)); reps = (intvector *)Malloc(spp*sizeof(intvector)); for (i=0;i 1) { alloctree(&bestree2.nodep, nonodes2); allocd(nonodes2, bestree2.nodep); allocw(nonodes2, bestree2.nodep); } } allocrest(); } /* doinit */ void inputoptions() { /* print options information */ if (!firstset) samenumsp2(ith); fprintf(outfile, "\nFitch-Margoliash method version %s\n\n",VERSION); if (minev) fprintf(outfile, "Minimum evolution method option\n\n"); fprintf(outfile, " __ __ 2\n"); fprintf(outfile, " \\ \\ (Obs - Exp)\n"); fprintf(outfile, "Sum of squares = /_ /_ ------------\n"); fprintf(outfile, " "); if (power == (long)power) fprintf(outfile, "%2ld\n", (long)power); else fprintf(outfile, "%4.1f\n", power); fprintf(outfile, " i j Obs\n\n"); fprintf(outfile, "Negative branch lengths "); if (!negallowed) fprintf(outfile, "not "); fprintf(outfile, "allowed\n\n"); if (global) fprintf(outfile, "global optimization\n\n"); } /* inputoptions */ void fitch_getinput() { /* reads the input data */ inputoptions(); } /* fitch_getinput */ void secondtraverse(node *q, double y, long *nx, double 
*sum) { /* from each of those places go back to all others */ /* nx comes from firsttraverse */ /* sum comes from evaluate via firsttraverse */ double z=0.0, TEMP=0.0; z = y + q->v; if (q->tip) { TEMP = q->d[(*nx) - 1] - z; *sum += q->w[(*nx) - 1] * (TEMP * TEMP); } else { secondtraverse(q->next->back, z, nx, sum); secondtraverse(q->next->next->back, z, nx,sum); } } /* secondtraverse */ void firsttraverse(node *p, long *nx, double *sum) { /* go through tree calculating branch lengths */ if (minev && (p != curtree.start)) *sum += p->v; if (p->tip) { if (!minev) { *nx = p->index; secondtraverse(p->back, 0.0, nx, sum); } } else { firsttraverse(p->next->back, nx,sum); firsttraverse(p->next->next->back, nx,sum); } } /* firsttraverse */ double evaluate(tree *t) { double sum=0.0; long nx=0; /* evaluate likelihood of a tree */ firsttraverse(t->start->back ,&nx, &sum); firsttraverse(t->start, &nx, &sum); if ((!minev) && replicates && (lower || upper)) sum /= 2; t->likelihood = -sum; return (-sum); } /* evaluate */ void nudists(node *x, node *y) { /* compute distance between an interior node and tips */ long nq=0, nr=0, nx=0, ny=0; double dil=0, djl=0, wil=0, wjl=0, vi=0, vj=0; node *qprime, *rprime; qprime = x->next; rprime = qprime->next->back; qprime = qprime->back; ny = y->index; dil = qprime->d[ny - 1]; djl = rprime->d[ny - 1]; wil = qprime->w[ny - 1]; wjl = rprime->w[ny - 1]; vi = qprime->v; vj = rprime->v; x->w[ny - 1] = wil + wjl; if (wil + wjl <= 0.0) x->d[ny - 1] = 0.0; else x->d[ny - 1] = ((dil - vi) * wil + (djl - vj) * wjl) / (wil + wjl); nx = x->index; nq = qprime->index; nr = rprime->index; dil = y->d[nq - 1]; djl = y->d[nr - 1]; wil = y->w[nq - 1]; wjl = y->w[nr - 1]; y->w[nx - 1] = wil + wjl; if (wil + wjl <= 0.0) y->d[nx - 1] = 0.0; else y->d[nx - 1] = ((dil - vi) * wil + (djl - vj) * wjl) / (wil + wjl); } /* nudists */ void makedists(node *p) { /* compute distances among three neighbors of a node */ long i=0, nr=0, ns=0; node *q, *r, *s; r = p->back; nr = 
r->index; for (i = 1; i <= 3; i++) { q = p->next; s = q->back; ns = s->index; if (s->w[nr - 1] + r->w[ns - 1] <= 0.0) p->dist = 0.0; else p->dist = (s->w[nr - 1] * s->d[nr - 1] + r->w[ns - 1] * r->d[ns - 1]) / (s->w[nr - 1] + r->w[ns - 1]); p = q; r = s; nr = ns; } } /* makedists */ void makebigv(node *p) { /* make new branch length */ long i=0; node *temp, *q, *r; q = p->next; r = q->next; for (i = 1; i <= 3; i++) { if (p->iter) { p->v = (p->dist + r->dist - q->dist) / 2.0; p->back->v = p->v; } temp = p; p = q; q = r; r = temp; } } /* makebigv */ void correctv(node *p) { /* iterate branch lengths if some are to be zero */ node *q, *r, *temp; long i=0, j=0, n=0, nq=0, nr=0, ntemp=0; double wq=0.0, wr=0.0; q = p->next; r = q->next; n = p->back->index; nq = q->back->index; nr = r->back->index; for (i = 1; i <= zsmoothings; i++) { for (j = 1; j <= 3; j++) { if (p->iter) { wr = r->back->w[n - 1] + p->back->w[nr - 1]; wq = q->back->w[n - 1] + p->back->w[nq - 1]; if (wr + wq <= 0.0 && !negallowed) p->v = 0.0; else p->v = ((p->dist - q->v) * wq + (r->dist - r->v) * wr) / (wr + wq); if (p->v < 0 && !negallowed) p->v = 0.0; p->back->v = p->v; } temp = p; p = q; q = r; r = temp; ntemp = n; n = nq; nq = nr; nr = ntemp; } } } /* correctv */ void alter(node *x, node *y) { /* traverse updating these views */ nudists(x, y); if (!y->tip) { alter(x, y->next->back); alter(x, y->next->next->back); } } /* alter */ void nuview(node *p) { /* renew information about subtrees */ long i=0; node *q, *r, *pprime, *temp; q = p->next; r = q->next; for (i = 1; i <= 3; i++) { temp = p; pprime = p->back; alter(p, pprime); p = q; q = r; r = temp; } } /* nuview */ void update(node *p) { /* update branch lengths around a node */ if (p->tip) return; makedists(p); if (p->iter || p->next->iter || p->next->next->iter) { makebigv(p); correctv(p); } nuview(p); } /* update */ void smooth(node *p) { /* go through tree getting new branch lengths and views */ if (p->tip) return; update(p); 
smooth(p->next->back); smooth(p->next->next->back); } /* smooth */ void filltraverse(node *pb, node *qb, boolean contin) { if (qb->tip) return; if (contin) { filltraverse(pb, qb->next->back,contin); filltraverse(pb, qb->next->next->back,contin); nudists(qb, pb); return; } if (!qb->next->back->tip) nudists(qb->next->back, pb); if (!qb->next->next->back->tip) nudists(qb->next->next->back, pb); } /* filltraverse */ void fillin(node *pa, node *qa, boolean contin) { if (!pa->tip) { fillin(pa->next->back, qa, contin); fillin(pa->next->next->back, qa, contin); } filltraverse(pa, qa, contin); } /* fillin */ void insert_(node *p, node *q, boolean contin_) { /* put p and q together and iterate info. on resulting tree */ double x=0.0, oldlike; hookup(p->next->next, q->back); hookup(p->next, q); x = q->v / 2.0; p->v = 0.0; p->back->v = 0.0; p->next->v = x; p->next->back->v = x; p->next->next->back->v = x; p->next->next->v = x; fillin(p->back, p, contin_); evaluate(&curtree); do { oldlike = curtree.likelihood; smooth(p); smooth(p->back); evaluate(&curtree); } while (fabs(curtree.likelihood - oldlike) > delta); } /* insert_ */ void copynode(node *c, node *d) { /* make a copy of a node */ memcpy(d->d, c->d, nonodes2*sizeof(double)); memcpy(d->w, c->w, nonodes2*sizeof(double)); d->v = c->v; d->iter = c->iter; d->dist = c->dist; d->xcoord = c->xcoord; d->ycoord = c->ycoord; d->ymin = c->ymin; d->ymax = c->ymax; } /* copynode */ void copy_(tree *a, tree *b) { /* make copy of a tree a to tree b */ long i, j=0; node *p, *q; for (i = 0; i < spp; i++) { copynode(a->nodep[i], b->nodep[i]); if (a->nodep[i]->back) { if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]; else if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]->next) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next; else b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next->next; } else b->nodep[i]->back = NULL; } 
for (i = spp; i < nonodes2; i++) { p = a->nodep[i]; q = b->nodep[i]; for (j = 1; j <= 3; j++) { copynode(p, q); if (p->back) { if (p->back == a->nodep[p->back->index - 1]) q->back = b->nodep[p->back->index - 1]; else if (p->back == a->nodep[p->back->index - 1]->next) q->back = b->nodep[p->back->index - 1]->next; else q->back = b->nodep[p->back->index - 1]->next->next; } else q->back = NULL; p = p->next; q = q->next; } } b->likelihood = a->likelihood; b->start = a->start; } /* copy_ */ void setuptipf(long m, tree *t) { /* initialize branch lengths and views in a tip */ long i=0; intvector n=(long *)Malloc(spp * sizeof(long)); node *WITH; WITH = t->nodep[m - 1]; memcpy(WITH->d, x[m - 1], (nonodes2 * sizeof(double))); memcpy(n, reps[m - 1], (spp * sizeof(long))); for (i = 0; i < spp; i++) { if (i + 1 != m && n[i] > 0) { if (WITH->d[i] < epsilonf) WITH->d[i] = epsilonf; WITH->w[i] = n[i] / exp(power * log(WITH->d[i])); } else { WITH->w[i] = 0.0; WITH->d[i] = 0.0; } } for (i = spp; i < nonodes2; i++) { WITH->w[i] = 1.0; WITH->d[i] = 0.0; } WITH->index = m; if (WITH->iter) WITH->v = 0.0; free(n); } /* setuptipf */ void buildnewtip(long m, tree *t, long nextsp) { /* initialize and hook up a new tip */ node *p; setuptipf(m, t); p = t->nodep[nextsp + spp - 3]; hookup(t->nodep[m - 1], p); } /* buildnewtip */ void buildsimpletree(tree *t, long nextsp) { /* make and initialize a three-species tree */ curtree.start=curtree.nodep[enterorder[0] - 1]; setuptipf(enterorder[0], t); setuptipf(enterorder[1], t); hookup(t->nodep[enterorder[0] - 1], t->nodep[enterorder[1] - 1]); buildnewtip(enterorder[2], t, nextsp); insert_(t->nodep[enterorder[2] - 1]->back, t->nodep[enterorder[0] - 1], false); } /* buildsimpletree */ void addtraverse(node *p, node *q, boolean contin, long *numtrees, boolean *succeeded) { /* traverse through a tree, finding best place to add p */ insert_(p, q, true); (*numtrees)++; if (evaluate(&curtree) > (bestree.likelihood + epsilonf * fabs(bestree.likelihood))){ 
copy_(&curtree, &bestree); addwhere = q; (*succeeded)=true; } copy_(&priortree, &curtree); if (!q->tip && contin) { addtraverse(p, q->next->back, contin,numtrees,succeeded); addtraverse(p, q->next->next->back, contin,numtrees,succeeded); } } /* addtraverse */ void re_move(node **p, node **q) { /* re_move p and record in q where it was */ *q = (*p)->next->back; hookup(*q, (*p)->next->next->back); (*p)->next->back = NULL; (*p)->next->next->back = NULL; update(*q); update((*q)->back); } /* re_move */ void globrearrange(long* numtrees,boolean* succeeded) { /* does global rearrangements */ tree globtree; tree oldtree; int i,j,k,num_sibs,num_sibs2; node *where,*sib_ptr,*sib_ptr2; double oldbestyet = curtree.likelihood; int success = false; alloctree(&globtree.nodep,nonodes2); alloctree(&oldtree.nodep,nonodes2); setuptree(&globtree,nonodes2); setuptree(&oldtree,nonodes2); allocd(nonodes2, globtree.nodep); allocd(nonodes2, oldtree.nodep); allocw(nonodes2, globtree.nodep); allocw(nonodes2, oldtree.nodep); copy_(&curtree,&globtree); copy_(&curtree,&oldtree); for ( i = spp ; i < nonodes2 ; i++ ) { num_sibs = count_sibs(curtree.nodep[i]); sib_ptr = curtree.nodep[i]; if ( (i - spp) % (( nonodes2 / 72 ) + 1 ) == 0 ) putchar('.'); fflush(stdout); for ( j = 0 ; j <= num_sibs ; j++ ) { re_move(&sib_ptr,&where); copy_(&curtree,&priortree); if (where->tip) { copy_(&oldtree,&curtree); copy_(&oldtree,&bestree); sib_ptr=sib_ptr->next; continue; } else num_sibs2 = count_sibs(where); sib_ptr2 = where; for ( k = 0 ; k < num_sibs2 ; k++ ) { addwhere = NULL; addtraverse(sib_ptr,sib_ptr2->back,true,numtrees,succeeded); if ( addwhere && where != addwhere && where->back != addwhere && bestree.likelihood > globtree.likelihood) { copy_(&bestree,&globtree); success = true; } sib_ptr2 = sib_ptr2->next; } copy_(&oldtree,&curtree); copy_(&oldtree,&bestree); sib_ptr = sib_ptr->next; } } copy_(&globtree,&curtree); copy_(&globtree,&bestree); if (success && globtree.likelihood > oldbestyet) { *succeeded 
= true; } else { *succeeded = false; } freed(nonodes2, globtree.nodep); freed(nonodes2, oldtree.nodep); freew(nonodes2, globtree.nodep); freew(nonodes2, oldtree.nodep); freetree(&globtree.nodep,nonodes2); freetree(&oldtree.nodep,nonodes2); } void rearrange(node *p, long *numtrees, long *nextsp, boolean *succeeded) { node *q, *r; if (!p->tip && !p->back->tip) { r = p->next->next; re_move(&r, &q); copy_(&curtree, &priortree); addtraverse(r, q->next->back, false, numtrees,succeeded); addtraverse(r, q->next->next->back, false, numtrees,succeeded); copy_(&bestree, &curtree); if (global && ((*nextsp) == spp)) { putchar('.'); fflush(stdout); } } if (!p->tip) { rearrange(p->next->back, numtrees,nextsp,succeeded); rearrange(p->next->next->back, numtrees,nextsp,succeeded); } } /* rearrange */ void describe(node *p) { /* print out information for one branch */ long i=0; node *q; q = p->back; fprintf(outfile, "%4ld ", q->index - spp); if (p->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[p->index - 1][i], outfile); } else fprintf(outfile, "%4ld ", p->index - spp); fprintf(outfile, "%15.5f\n", q->v); if (!p->tip) { describe(p->next->back); describe(p->next->next->back); } } /* describe */ void summarize(long numtrees) { /* print out branch lengths etc. 
*/ long i, j, totalnum; fprintf(outfile, "\nremember:"); if (outgropt) fprintf(outfile, " (although rooted by outgroup)"); fprintf(outfile, " this is an unrooted tree!\n\n"); if (!minev) fprintf(outfile, "Sum of squares = %11.5f\n\n", -curtree.likelihood); else fprintf(outfile, "Sum of branch lengths = %11.5f\n\n", -curtree.likelihood); if ((power == 2.0) && !minev) { totalnum = 0; for (i = 1; i <= nums; i++) { for (j = 1; j <= nums; j++) { if (i != j) totalnum += reps[i - 1][j - 1]; } } fprintf(outfile, "Average percent standard deviation = "); fprintf(outfile, "%11.5f\n\n", 100 * sqrt(-curtree.likelihood / (totalnum - 2))); } fprintf(outfile, "Between And Length\n"); fprintf(outfile, "------- --- ------\n"); describe(curtree.start->next->back); describe(curtree.start->next->next->back); describe(curtree.start->back); fprintf(outfile, "\n\n"); if (trout) { col = 0; treeout(curtree.start, &col, 0.43429445222, true, curtree.start); } } /* summarize */ void nodeinit(node *p) { /* initialize a node */ long i, j; for (i = 1; i <= 3; i++) { for (j = 0; j < nonodes2; j++) { p->w[j] = 1.0; p->d[j] = 0.0; } p = p->next; } if ((!lengths) || p->iter) p->v = 1.0; if ((!lengths) || p->back->iter) p->back->v = 1.0; } /* nodeinit */ void initrav(node *p) { /* traverse to initialize */ if (p->tip) return; nodeinit(p); initrav(p->next->back); initrav(p->next->next->back); } /* initrav */ void treevaluate() { /* evaluate user-defined tree, iterating branch lengths */ long i; double oldlike; for (i = 1; i <= spp; i++) setuptipf(i, &curtree); unroot(&curtree,nonodes2); initrav(curtree.start); if (curtree.start->back != NULL) { initrav(curtree.start->back); evaluate(&curtree); do { oldlike = curtree.likelihood; smooth(curtree.start); evaluate(&curtree); } while (fabs(curtree.likelihood - oldlike) > delta); } evaluate(&curtree); } /* treevaluate */ void maketree() { /* construct the tree */ long nextsp,numtrees; boolean succeeded=false; long i, j, which; if (usertree) { 
inputdata(replicates, printdata, lower, upper, x, reps); setuptree(&curtree, nonodes2); for (which = 1; which <= spp; which++) setuptipf(which, &curtree); if (eoln(infile)) scan_eoln(infile); /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree,INTREE,"input tree file","rb",progname,intreename); numtrees = countsemic(&intree); if (numtrees > MAXNUMTREES) { printf("\nERROR: number of input trees is read incorrectly from %s\n", intreename); exxit(-1); } if (treeprint) { fprintf(outfile, "User-defined tree"); if (numtrees > 1) putc('s', outfile); fprintf(outfile, ":\n\n"); } first = true; which = 1; while (which <= numtrees) { treeread2 (intree, &curtree.start, curtree.nodep, lengths, &trweight, &goteof, &haslengths, &spp,false,nonodes2); nums = spp; curtree.start = curtree.nodep[outgrno - 1]->back; treevaluate(); printree(curtree.start, treeprint, false, false); summarize(numtrees); clear_connections(&curtree,nonodes2); which++; } FClose(intree); } else { if (jumb == 1) { inputdata(replicates, printdata, lower, upper, x, reps); setuptree(&curtree, nonodes2); setuptree(&priortree, nonodes2); setuptree(&bestree, nonodes2); if (njumble > 1) setuptree(&bestree2, nonodes2); } for (i = 1; i <= spp; i++) enterorder[i - 1] = i; if (jumble) randumize(seed, enterorder); nextsp = 3; buildsimpletree(&curtree, nextsp); curtree.start = curtree.nodep[enterorder[0] - 1]->back; if (jumb == 1) numtrees = 1; nextsp = 4; if (progress) { printf("Adding species:\n"); writename(0, 3, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } while (nextsp <= spp) { nums = nextsp; buildnewtip(enterorder[nextsp - 1], &curtree, nextsp); copy_(&curtree, &priortree); bestree.likelihood = -DBL_MAX; curtree.start = curtree.nodep[enterorder[0] - 1]->back; addtraverse(curtree.nodep[enterorder[nextsp - 1] - 1]->back, curtree.start, true, &numtrees,&succeeded); copy_(&bestree, &curtree); if (progress) { writename(nextsp - 1, 1, enterorder); #ifdef WIN32 
phyFillScreenColor(); #endif } if (global && nextsp == spp) { if (progress) { printf("Doing global rearrangements\n"); printf(" !"); for (j = spp; j < nonodes2; j++) if ( (j - spp) % (( nonodes2 / 72 ) + 1 ) == 0 ) putchar('-'); printf("!\n"); printf(" "); } } succeeded = true; while (succeeded) { succeeded = false; curtree.start = curtree.nodep[enterorder[0] - 1]->back; if (nextsp == spp && global) globrearrange (&numtrees,&succeeded); else{ rearrange(curtree.start,&numtrees,&nextsp,&succeeded); } if (global && ((nextsp) == spp) && progress) printf("\n "); } if (global && nextsp == spp) { putc('\n', outfile); if (progress) putchar('\n'); } if (njumble > 1) { if (jumb == 1 && nextsp == spp) copy_(&bestree, &bestree2); else if (nextsp == spp) { if (bestree2.likelihood < bestree.likelihood) copy_(&bestree, &bestree2); } } if (nextsp == spp && jumb == njumble) { if (njumble > 1) copy_(&bestree2, &curtree); curtree.start = curtree.nodep[outgrno - 1]->back; printree(curtree.start, treeprint, true, false); summarize(numtrees); } nextsp++; } } if (jumb == njumble && progress) { printf("\nOutput written to file \"%s\"\n", outfilename); if (trout) { printf("\nTree also written onto file \"%s\"\n", outtreename); } } } /* maketree */ int main(int argc, Char *argv[]) { int i; #ifdef MAC argc = 1; /* macsetup("Fitch",""); */ argv[0]="Fitch"; #endif init(argc,argv); progname = argv[0]; openfile(&infile,INFILE,"input file","r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file","w",argv[0],outfilename); ibmpc = IBMCRT; ansi = ANSICRT; mulsets = false; datasets = 1; firstset = true; doinit(); if (trout) openfile(&outtree,OUTTREE,"output tree file","w",argv[0],outtreename); for (i=0;i<spp;++i){ enterorder[i]=0;} for (ith = 1; ith <= datasets; ith++) { if (datasets > 1) { fprintf(outfile, "Data set # %ld:\n\n",ith); if (progress) printf("\nData set # %ld:\n\n",ith); } fitch_getinput(); for (jumb = 1; jumb <= njumble; jumb++) maketree(); firstset = false; if (eoln(infile) && (ith < datasets)) scan_eoln(infile); } if (trout) FClose(outtree); 
FClose(outfile); FClose(infile); #ifdef MAC fixmacfile(outfilename); fixmacfile(outtreename); #endif printf("\nDone.\n\n"); #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } phylip-3.697/src/font10000644004732000473200000001345612406201116014310 0ustar joefelsenst_gCA 501 21 18 28 -1956 1135 -1956 2735 -1442 2442 -12835 CB 502 21 21 28 -1456 1435 -1456 2356 2655 2754 2852 2850 2748 2647 2346 -1446 2346 2645 2744 2842 2839 2737 2636 2335 1435 -13135 CC 503 21 21 28 -2851 2753 2555 2356 1956 1755 1553 1451 1348 1343 1440 1538 1736 1935 2335 2536 2738 2840 -13135 CD 504 21 21 28 -1456 1435 -1456 2156 2455 2653 2751 2848 2843 2740 2638 2436 2135 1435 -13135 CE 505 21 19 28 -1456 1435 -1456 2756 -1446 2246 -1435 2735 -12935 CF 506 21 18 28 -1456 1435 -1456 2756 -1446 2246 -12835 CG 507 21 21 28 -2851 2753 2555 2356 1956 1755 1553 1451 1348 1343 1440 1538 1736 1935 2335 2536 2738 2840 2843 -2343 2843 -13135 CH 508 21 22 28 -1456 1435 -2856 2835 -1446 2846 -13235 CI 509 21 8 28 -1456 1435 -11835 CJ 510 21 16 28 -2256 2240 2137 2036 1835 1635 1436 1337 1240 1242 -12635 CK 511 21 21 28 -1456 1435 -2856 1442 -1947 2835 -13135 CL 512 21 17 28 -1456 1435 -1435 2635 -12735 CM 513 21 24 28 -1456 1435 -1456 2235 -3056 2235 -3056 3035 -13435 CN 514 21 22 28 -1456 1435 -1456 2835 -2856 2835 -13235 CO 515 21 22 28 -1956 1755 1553 1451 1348 1343 1440 1538 1736 1935 2335 2536 2738 2840 2943 2948 2851 2753 2555 2356 1956 -13235 CP 516 21 21 28 -1456 1435 -1456 2356 2655 2754 2852 2849 2747 2646 2345 1445 -13135 CQ 517 21 22 28 -1956 1755 1553 1451 1348 1343 1440 1538 1736 1935 2335 2536 2738 2840 2943 2948 2851 2753 2555 2356 1956 -2239 2833 -13235 CR 518 21 21 28 -1456 1435 -1456 2356 2655 2754 2852 2850 2748 2647 2346 1446 -2146 2835 -13135 CS 519 21 20 28 -2753 2555 2256 1856 1555 1353 1351 1449 1548 1747 2345 2544 2643 2741 2738 2536 2235 1835 1536 1338 -13035 CT 520 21 16 28 -1856 1835 -1156 2556 -12635 CU 521 21 22 28 -1456 1441 1538 1736 2035 2235 2536 2738 2841 
2856 -13235 CV 522 21 18 28 -1156 1935 -2756 1935 -12835 CW 523 21 24 28 -1256 1735 -2256 1735 -2256 2735 -3256 2735 -13435 CX 524 21 20 28 -1356 2735 -2756 1335 -13035 CY 525 21 18 28 -1156 1946 1935 -2756 1946 -12835 CZ 526 21 20 28 -2756 1335 -1356 2756 -1335 2735 -13035 Ca 601 21 19 28 -2549 2535 -2546 2348 2149 1849 1648 1446 1343 1341 1438 1636 1835 2135 2336 2538 -12935 Cb 602 21 19 28 -1456 1435 -1446 1648 1849 2149 2348 2546 2643 2641 2538 2336 2135 1835 1636 1438 -12935 Cc 603 21 18 28 -2546 2348 2149 1849 1648 1446 1343 1341 1438 1636 1835 2135 2336 2538 -12835 Cd 604 21 19 28 -2556 2535 -2546 2348 2149 1849 1648 1446 1343 1341 1438 1636 1835 2135 2336 2538 -12935 Ce 605 21 18 28 -1343 2543 2545 2447 2348 2149 1849 1648 1446 1343 1341 1438 1636 1835 2135 2336 2538 -12835 Cf 606 21 12 28 -2056 1856 1655 1552 1535 -1249 1949 -12235 Cg 607 21 19 28 -2549 2533 2430 2329 2128 1828 1629 -2546 2348 2149 1849 1648 1446 1343 1341 1438 1636 1835 2135 2336 2538 -12935 Ch 608 21 19 28 -1456 1435 -1445 1748 1949 2249 2448 2545 2535 -12935 Ci 609 21 8 28 -1356 1455 1556 1457 1356 -1449 1435 -11835 Cj 610 21 10 28 -1556 1655 1756 1657 1556 -1649 1632 1529 1328 1128 -12035 Ck 611 21 17 28 -1456 1435 -2449 1439 -1843 2535 -12735 Cl 612 21 8 28 -1456 1435 -11835 Cm 613 21 30 28 -1449 1435 -1445 1748 1949 2249 2448 2545 2535 -2545 2848 3049 3349 3548 3645 3635 -14035 Cn 614 21 19 28 -1449 1435 -1445 1748 1949 2249 2448 2545 2535 -12935 Co 615 21 19 28 -1849 1648 1446 1343 1341 1438 1636 1835 2135 2336 2538 2641 2643 2546 2348 2149 1849 -12935 Cp 616 21 19 28 -1449 1428 -1446 1648 1849 2149 2348 2546 2643 2641 2538 2336 2135 1835 1636 1438 -12935 Cq 617 21 19 28 -2549 2528 -2546 2348 2149 1849 1648 1446 1343 1341 1438 1636 1835 2135 2336 2538 -12935 Cr 618 21 13 28 -1449 1435 -1443 1546 1748 1949 2249 -12335 Cs 619 21 17 28 -2446 2348 2049 1749 1448 1346 1444 1643 2142 2341 2439 2438 2336 2035 1735 1436 1338 -12735 Ct 620 21 12 28 -1556 1539 1636 1835 2035 -1249 1949 -12235 
Cu 621 21 19 28 -1449 1439 1536 1735 2035 2236 2539 -2549 2535 -12935 Cv 622 21 16 28 -1249 1835 -2449 1835 -12635 Cw 623 21 22 28 -1349 1735 -2149 1735 -2149 2535 -2949 2535 -13235 Cx 624 21 17 28 -1349 2435 -2449 1335 -12735 Cy 625 21 16 28 -1249 1835 -2449 1835 1631 1429 1228 1128 -12635 Cz 626 21 17 28 -2449 1335 -1349 2449 -1335 2435 -12735 C0 700 21 20 28 -1956 1655 1452 1347 1344 1439 1636 1935 2135 2436 2639 2744 2747 2652 2455 2156 1956 -13035 C1 701 21 20 28 -1652 1853 2156 2135 -13035 C2 702 21 20 28 -1451 1452 1554 1655 1856 2256 2455 2554 2652 2650 2548 2345 1335 2735 -13035 C3 703 21 20 28 -1556 2656 2048 2348 2547 2646 2743 2741 2638 2436 2135 1835 1536 1437 1339 -13035 C4 704 21 20 28 -2356 1342 2842 -2356 2335 -13035 C5 705 21 20 28 -2556 1556 1447 1548 1849 2149 2448 2646 2743 2741 2638 2436 2135 1835 1536 1437 1339 -13035 C6 706 21 20 28 -2653 2555 2256 2056 1755 1552 1447 1442 1538 1736 2035 2135 2436 2638 2741 2742 2645 2447 2148 2048 1747 1545 1442 -13035 C7 707 21 20 28 -2756 1735 -1356 2756 -13035 C8 708 21 20 28 -1856 1555 1453 1451 1549 1748 2147 2446 2644 2742 2739 2637 2536 2235 1835 1536 1437 1339 1342 1444 1646 1947 2348 2549 2651 2653 2555 2256 1856 -13035 C9 709 21 20 28 -2649 2546 2344 2043 1943 1644 1446 1349 1350 1453 1655 1956 2056 2355 2553 2649 2644 2539 2336 2035 1835 1536 1438 -13035 C. 710 21 10 28 -1537 1436 1535 1636 1537 -12035 C, 711 21 10 28 -1636 1535 1436 1537 1636 1634 1532 1431 -12035 C: 712 21 10 28 -1549 1448 1547 1648 1549 -1537 1436 1535 1636 1537 -12035 C; 713 21 10 28 -1549 1448 1547 1648 1549 -1636 1535 1436 1537 1636 1634 1532 1431 -12035 C! 714 21 10 28 -1556 1542 -1537 1436 1535 1636 1537 -12035 C? 
715 21 18 28 -1351 1352 1454 1555 1756 2156 2355 2454 2552 2550 2448 2347 1945 1942 -1937 1836 1935 2036 1937 -12835 C/ 720 21 22 28 -3060 1228 -13235 C( 721 21 14 28 -2160 1958 1755 1551 1446 1442 1537 1733 1930 2128 -12435 C) 722 21 14 28 -1360 1558 1755 1951 2046 2042 1937 1733 1530 1328 -12435 C- 724 21 26 28 -1444 3244 -13635 C* 728 21 16 28 -1850 1838 -1347 2341 -2347 1341 -12635 C 699 21 16 28 -12635 phylip-3.697/src/font20000644004732000473200000002603012406201116014301 0ustar joefelsenst_gCA 2501 21 20 28 -2056 1235 -2053 1335 1235 -2053 2735 2835 -2056 2835 -1541 2541 -1440 2640 -13035 CB 2502 21 20 28 -1456 1435 -1555 1536 -1456 2256 2555 2654 2752 2749 2647 2546 2245 -1555 2255 2554 2652 2649 2547 2246 -1546 2246 2545 2644 2742 2739 2637 2536 2235 1435 -1545 2245 2544 2642 2639 2537 2236 1536 -13035 CC 2503 21 21 28 -2851 2753 2555 2356 1956 1755 1553 1451 1348 1343 1440 1538 1736 1935 2335 2536 2738 2840 -2851 2751 2653 2554 2355 1955 1754 1551 1448 1443 1540 1737 1936 2336 2537 2638 2740 2840 -13135 CD 2504 21 21 28 -1456 1435 -1555 1536 -1456 2156 2455 2653 2751 2848 2843 2740 2638 2436 2135 1435 -1555 2155 2454 2553 2651 2748 2743 2640 2538 2437 2136 1536 -13135 CE 2505 21 19 28 -1456 1435 -1555 1536 -1456 2656 -1555 2655 2656 -1546 2146 2145 -1545 2145 -1536 2636 2635 -1435 2635 -12935 CF 2506 21 18 28 -1456 1435 -1555 1535 1435 -1456 2656 -1555 2655 2656 -1546 2146 2145 -1545 2145 -12835 CG 2507 21 21 28 -2851 2753 2555 2356 1956 1755 1553 1451 1348 1343 1440 1538 1736 1935 2335 2536 2738 2840 2844 2344 -2851 2751 2653 2554 2355 1955 1754 1653 1551 1448 1443 1540 1638 1737 1936 2336 2537 2638 2740 2743 2343 2344 -13135 CH 2508 21 22 28 -1456 1435 -1456 1556 1535 1435 -2856 2756 2735 2835 -2856 2835 -1546 2746 -1545 2745 -13235 CI 2509 21 9 28 -1456 1435 1535 -1456 1556 1535 -11935 CJ 2510 21 17 28 -2256 2240 2137 1936 1736 1537 1440 1340 -2256 2356 2340 2237 2136 1935 1735 1536 1437 1340 -12735 CK 2511 21 21 28 -1456 1435 1535 -1456 1556 1535 
-2856 2756 1544 -2856 1543 -1847 2735 2835 -1947 2835 -13135 CL 2512 21 17 28 -1456 1435 -1456 1556 1536 -1536 2636 2635 -1435 2635 -12735 CM 2513 21 24 28 -1456 1435 -1551 1535 1435 -1551 2235 -1456 2238 -3056 2238 -2951 2235 -2951 2935 3035 -3056 3035 -13435 CN 2514 21 22 28 -1456 1435 -1553 1535 1435 -1553 2835 -1456 2738 -2756 2738 -2756 2856 2835 -13235 CO 2515 21 22 28 -1956 1755 1553 1451 1348 1343 1440 1538 1736 1935 2335 2536 2738 2840 2943 2948 2851 2753 2555 2356 1956 -2055 1754 1551 1448 1443 1540 1737 2036 2236 2537 2740 2843 2848 2751 2554 2255 2055 -13235 CP 2516 21 20 28 -1456 1435 -1555 1535 1435 -1456 2356 2555 2654 2752 2749 2647 2546 2345 1545 -1555 2355 2554 2652 2649 2547 2346 1546 -13035 CQ 2517 21 22 28 -1956 1755 1553 1451 1348 1343 1440 1538 1736 1935 2335 2536 2738 2840 2943 2948 2851 2753 2555 2356 1956 -2055 1754 1551 1448 1443 1540 1737 2036 2236 2537 2740 2843 2848 2751 2554 2255 2055 -2238 2733 2833 -2238 2338 2833 -13235 CR 2518 21 20 28 -1456 1435 -1555 1535 1435 -1456 2256 2555 2654 2752 2749 2647 2546 2245 1545 -1555 2255 2554 2652 2649 2547 2246 1546 -2045 2635 2735 -2145 2735 -13035 CS 2519 21 20 28 -2753 2555 2256 1856 1555 1353 1351 1449 1548 1747 2245 2444 2543 2641 2638 2537 2236 1836 1637 1538 1338 -2753 2553 2454 2255 1855 1554 1453 1451 1549 1748 2246 2445 2643 2741 2738 2536 2235 1835 1536 1338 -13035 CT 2520 21 17 28 -1855 1835 -1955 1935 1835 -1256 2556 2555 -1256 1255 2555 -12735 CU 2521 21 22 28 -1456 1441 1538 1736 2035 2235 2536 2738 2841 2856 -1456 1556 1541 1638 1737 2036 2236 2537 2638 2741 2756 2856 -13235 CV 2522 21 20 28 -1256 2035 -1256 1356 2038 -2856 2756 2038 -2856 2035 -13035 CW 2523 21 26 28 -1256 1835 -1256 1356 1838 -2356 1838 -2353 1835 -2353 2835 -2356 2838 -3456 3356 2838 -3456 2835 -13635 CX 2524 21 20 28 -1356 2635 2735 -1356 1456 2735 -2756 2656 1335 -2756 1435 1335 -13035 CY 2525 21 19 28 -1256 1946 1935 2035 -1256 1356 2046 -2756 2656 1946 -2756 2046 2035 -12935 CZ 2526 21 20 28 -2656 1335 
-2756 1435 -1356 2756 -1356 1355 2655 -1436 2736 2735 -1335 2735 -13035 Ca 2601 21 20 28 -2549 2535 2635 -2549 2649 2635 -2546 2348 2149 1849 1648 1446 1343 1341 1438 1636 1835 2135 2336 2538 -2546 2148 1848 1647 1546 1443 1441 1538 1637 1836 2136 2538 -13035 Cb 2602 21 20 28 -1456 1435 1535 -1456 1556 1535 -1546 1748 1949 2249 2448 2646 2743 2741 2638 2436 2235 1935 1736 1538 -1546 1948 2248 2447 2546 2643 2641 2538 2437 2236 1936 1538 -13035 Cc 2603 21 18 28 -2546 2348 2149 1849 1648 1446 1343 1341 1438 1636 1835 2135 2336 2538 -2546 2445 2347 2148 1848 1647 1546 1443 1441 1538 1637 1836 2136 2337 2439 2538 -12835 Cd 2604 21 20 28 -2556 2535 2635 -2556 2656 2635 -2546 2348 2149 1849 1648 1446 1343 1341 1438 1636 1835 2135 2336 2538 -2546 2148 1848 1647 1546 1443 1441 1538 1637 1836 2136 2538 -13035 Ce 2605 21 18 28 -1442 2542 2545 2447 2348 2149 1849 1648 1446 1343 1341 1438 1636 1835 2135 2336 2538 -1443 2443 2445 2347 2148 1848 1647 1546 1443 1441 1538 1637 1836 2136 2337 2439 2538 -12835 Cf 2606 21 14 28 -2156 1956 1755 1652 1635 1735 -2156 2155 1955 1754 -1855 1752 1735 -1349 2049 2048 -1349 1348 2048 -12435 Cg 2607 21 20 28 -2649 2549 2534 2431 2330 2129 1929 1730 1631 1431 -2649 2634 2531 2329 2128 1828 1629 1431 -2546 2348 2149 1849 1648 1446 1343 1341 1438 1636 1835 2135 2336 2538 -2546 2148 1848 1647 1546 1443 1441 1538 1637 1836 2136 2538 -13035 Ch 2608 21 20 28 -1456 1435 1535 -1456 1556 1535 -1545 1848 2049 2349 2548 2645 2635 -1545 1847 2048 2248 2447 2545 2535 2635 -13035 Ci 2609 21 9 28 -1456 1355 1354 1453 1553 1654 1655 1556 1456 -1455 1454 1554 1555 1455 -1449 1435 1535 -1449 1549 1535 -11935 Cj 2610 21 9 28 -1456 1355 1354 1453 1553 1654 1655 1556 1456 -1455 1454 1554 1555 1455 -1449 1428 1528 -1449 1549 1528 -11935 Ck 2611 21 19 28 -1456 1435 1535 -1456 1556 1535 -2649 2549 1539 -2649 1538 -1842 2435 2635 -1943 2635 -12935 Cl 2612 21 9 28 -1456 1435 1535 -1456 1556 1535 -11935 Cm 2613 21 31 28 -1449 1435 1535 -1449 1549 1535 -1545 1848 2049 
2349 2548 2645 2635 -1545 1847 2048 2248 2447 2545 2535 2635 -2645 2948 3149 3449 3648 3745 3735 -2645 2947 3148 3348 3547 3645 3635 3735 -14135 Cn 2614 21 20 28 -1449 1435 1535 -1449 1549 1535 -1545 1848 2049 2349 2548 2645 2635 -1545 1847 2048 2248 2447 2545 2535 2635 -13035 Co 2615 21 19 28 -1849 1648 1446 1343 1341 1438 1636 1835 2135 2336 2538 2641 2643 2546 2348 2149 1849 -1848 1647 1546 1443 1441 1538 1637 1836 2136 2337 2438 2541 2543 2446 2347 2148 1848 -12935 Cp 2616 21 20 28 -1449 1428 1528 -1449 1549 1528 -1546 1748 1949 2249 2448 2646 2743 2741 2638 2436 2235 1935 1736 1538 -1546 1948 2248 2447 2546 2643 2641 2538 2437 2236 1936 1538 -13035 Cq 2617 21 20 28 -2549 2528 2628 -2549 2649 2628 -2546 2348 2149 1849 1648 1446 1343 1341 1438 1636 1835 2135 2336 2538 -2546 2148 1848 1647 1546 1443 1441 1538 1637 1836 2136 2538 -13035 Cr 2618 21 14 28 -1449 1435 1535 -1449 1549 1535 -1543 1646 1848 2049 2349 -1543 1645 1847 2048 2348 2349 -12435 Cs 2619 21 17 28 -2446 2348 2049 1749 1448 1346 1444 1643 2141 2340 -2241 2339 2338 2236 -2337 2036 1736 1437 -1536 1438 1338 -2446 2346 2248 -2347 2048 1748 1447 -1548 1446 1544 -1445 1644 2142 2341 2439 2438 2336 2035 1735 1436 1338 -12735 Ct 2620 21 11 28 -1556 1535 1635 -1556 1656 1635 -1249 1949 1948 -1249 1248 1948 -12135 Cu 2621 21 20 28 -1449 1439 1536 1735 2035 2236 2539 -1449 1549 1539 1637 1836 2036 2237 2539 -2549 2535 2635 -2549 2649 2635 -13035 Cv 2622 21 16 28 -1249 1835 -1249 1349 1837 -2449 2349 1837 -2449 1835 -12635 Cw 2623 21 24 28 -1349 1835 -1349 1449 1838 -2249 1838 -2246 1835 -2246 2635 -2249 2638 -3149 3049 2638 -3149 2635 -13435 Cx 2624 21 18 28 -1349 2435 2535 -1349 1449 2535 -2549 2449 1335 -2549 1435 1335 -12835 Cy 2625 21 16 28 -1249 1835 -1249 1349 1837 -2449 2349 1837 1428 -2449 1835 1528 1428 -12635 Cz 2626 21 18 28 -2348 1335 -2549 1536 -1349 2549 -1349 1348 2348 -1536 2536 2535 -1335 2535 -12835 C0 2700 21 20 28 -1956 1655 1452 1347 1344 1439 1636 1935 2135 2436 2639 2744 2747 2652 2455 
2156 1956 -1755 1552 1447 1444 1539 1736 -1637 1936 2136 2437 -2336 2539 2644 2647 2552 2355 -2454 2155 1955 1654 -13035 C1 2701 21 20 28 -1652 1853 2156 2135 -1652 1651 1852 2054 2035 2135 -13035 C2 2702 21 20 28 -1451 1452 1554 1655 1856 2256 2455 2554 2652 2650 2548 2345 1435 -1451 1551 1552 1654 1855 2255 2454 2552 2550 2448 2245 1335 -1436 2736 2735 -1335 2735 -13035 C3 2703 21 20 28 -1556 2656 1947 -1556 1555 2555 -2556 1847 -1948 2148 2447 2645 2742 2741 2638 2436 2135 1835 1536 1437 1339 1439 -1847 2147 2446 2643 -2247 2545 2642 2641 2538 2236 -2640 2437 2136 1836 1537 1439 -1736 1438 -13035 C4 2704 21 20 28 -2353 2335 2435 -2456 2435 -2456 1340 2840 -2353 1440 -1441 2841 2840 -13035 C5 2705 21 20 28 -1556 1447 -1655 1548 -1556 2556 2555 -1655 2555 -1548 1849 2149 2448 2646 2743 2741 2638 2436 2135 1835 1536 1437 1339 1439 -1447 1547 1748 2148 2447 2644 -2248 2546 2643 2641 2538 2236 -2640 2437 2136 1836 1537 1439 -1736 1438 -13035 C6 2706 21 20 28 -2455 2553 2653 2555 2256 2056 1755 1552 1447 1442 1538 1736 2035 2135 2436 2638 2741 2742 2645 2447 2148 2048 1747 1545 -2554 2255 2055 1754 -1855 1652 1547 1542 1638 1936 -1540 1737 2036 2136 2437 2640 -2236 2538 2641 2642 2545 2247 -2643 2446 2147 2047 1746 1543 -1947 1645 1542 -13035 C7 2707 21 20 28 -1356 2756 1735 -1356 1355 2655 -2656 1635 1735 -13035 C8 2708 21 20 28 -1856 1555 1453 1451 1549 1648 1847 2246 2445 2544 2642 2639 2537 2236 1836 1537 1439 1442 1544 1645 1846 2247 2448 2549 2651 2653 2555 2256 1856 -1655 1553 1551 1649 1848 2247 2446 2644 2742 2739 2637 2536 2235 1835 1536 1437 1339 1342 1444 1646 1847 2248 2449 2551 2553 2455 -2554 2255 1855 1554 -1438 1736 -2336 2638 -13035 C9 2709 21 20 28 -2546 2344 2043 1943 1644 1446 1349 1350 1453 1655 1956 2056 2355 2553 2649 2644 2539 2336 2035 1835 1536 1438 1538 1636 -2549 2446 2144 -2548 2345 2044 1944 1645 1448 -1844 1546 1449 1450 1553 1855 -1451 1654 1955 2055 2354 2551 -2155 2453 2549 2544 2439 2236 -2337 2036 1836 1537 -13035 C. 
2710 21 11 28 -1538 1437 1436 1535 1635 1736 1737 1638 1538 -1537 1536 1636 1637 1537 -12135 C, 2711 21 11 28 -1736 1635 1535 1436 1437 1538 1638 1737 1734 1632 1431 -1537 1536 1636 1637 1537 -1635 1734 -1736 1632 -12135 C: 2712 21 11 28 -1549 1448 1447 1546 1646 1747 1748 1649 1549 -1548 1547 1647 1648 1548 -1538 1437 1436 1535 1635 1736 1737 1638 1538 -1537 1536 1636 1637 1537 -12135 C; 2713 21 11 28 -1549 1448 1447 1546 1646 1747 1748 1649 1549 -1548 1547 1647 1648 1548 -1736 1635 1535 1436 1437 1538 1638 1737 1734 1632 1431 -1537 1536 1636 1637 1537 -1635 1734 -1736 1632 -12135 C! 2714 21 11 28 -1556 1542 1642 -1556 1656 1642 -1538 1437 1436 1535 1635 1736 1737 1638 1538 -1537 1536 1636 1637 1537 -12135 C? 2715 21 19 28 -1351 1352 1454 1555 1856 2156 2455 2554 2652 2650 2548 2447 2246 1945 -1351 1451 1452 1554 1855 2155 2454 2552 2550 2448 2247 1946 -1453 1755 -2255 2553 -2549 2146 -1946 1942 2042 2046 -1938 1837 1836 1935 2035 2136 2137 2038 1938 -1937 1936 2036 2037 1937 -12935 C/ 2720 21 23 28 -3060 1228 1328 -3060 3160 1328 -13335 C( 2721 21 14 28 -2060 1858 1655 1451 1346 1342 1437 1633 1830 2028 2128 -2060 2160 1958 1755 1551 1446 1442 1537 1733 1930 2128 -12435 C) 2722 21 14 28 -1360 1558 1755 1951 2046 2042 1937 1733 1530 1328 1428 -1360 1460 1658 1855 2051 2146 2142 2037 1833 1630 1428 -12435 C* 2723 21 16 28 -1856 1755 1945 1844 -1856 1844 -1856 1955 1745 1844 -1353 1453 2247 2347 -1353 2347 -1353 1352 2348 2347 -2353 2253 1447 1347 -2353 1347 -2353 2352 1348 1347 -12635 C- 2724 21 25 28 -1445 3145 3144 -1445 1444 3144 -13535 C 2699 21 16 28 -12635 phylip-3.697/src/font30000644004732000473200000004113112406201116014301 0ustar joefelsenst_gCA 3001 21 20 28 -2056 1336 -1953 2535 -2053 2635 -2056 2735 -1541 2441 -1135 1735 -2235 2935 -1336 1235 -1336 1535 -2536 2335 -2537 2435 -2637 2835 -13035 CB 3002 21 22 28 -1556 1535 -1655 1636 -1756 1735 -1256 2456 2755 2854 2952 2950 2848 2747 2446 -2754 2852 2850 2748 -2456 2655 2753 2749 2647 2446 -1746 2446 
2745 2844 2942 2939 2837 2736 2435 1235 -2744 2842 2839 2737 -2446 2645 2743 2738 2636 2435 -1356 1555 -1456 1554 -1856 1754 -1956 1755 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -13235 CC 3003 21 21 28 -2753 2856 2850 2753 2555 2356 2056 1755 1553 1451 1348 1343 1440 1538 1736 2035 2335 2536 2738 2840 -1653 1551 1448 1443 1540 1638 -2056 1855 1652 1548 1543 1639 1836 2035 -13135 CD 3004 21 22 28 -1556 1535 -1655 1636 -1756 1735 -1256 2256 2555 2753 2851 2948 2943 2840 2738 2536 2235 1235 -2653 2751 2848 2843 2740 2638 -2256 2455 2652 2748 2743 2639 2436 2235 -1356 1555 -1456 1554 -1856 1754 -1956 1755 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -13235 CE 3005 21 21 28 -1556 1535 -1655 1636 -1756 1735 -1256 2856 2850 -1746 2346 -2350 2342 -1235 2835 2841 -1356 1555 -1456 1554 -1856 1754 -1956 1755 -2356 2855 -2556 2854 -2656 2853 -2756 2850 -2350 2246 2342 -2348 2146 2344 -2347 1946 2345 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -2335 2836 -2535 2837 -2635 2838 -2735 2841 -13135 CF 3006 21 20 28 -1556 1535 -1655 1636 -1756 1735 -1256 2856 2850 -1746 2346 -2350 2342 -1235 2035 -1356 1555 -1456 1554 -1856 1754 -1956 1755 -2356 2855 -2556 2854 -2656 2853 -2756 2850 -2350 2246 2342 -2348 2146 2344 -2347 1946 2345 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -13035 CG 3007 21 23 28 -2753 2856 2850 2753 2555 2356 2056 1755 1553 1451 1348 1343 1440 1538 1736 2035 2335 2536 2736 2835 2843 -1653 1551 1448 1443 1540 1638 -2056 1855 1652 1548 1543 1639 1836 2035 -2742 2737 -2643 2637 2536 -2343 3143 -2443 2642 -2543 2641 -2943 2841 -3043 2842 -13335 CH 3008 21 24 28 -1556 1535 -1655 1636 -1756 1735 -2756 2735 -2855 2836 -2956 2935 -1256 2056 -2456 3256 -1746 2746 -1235 2035 -2435 3235 -1356 1555 -1456 1554 -1856 1754 -1956 1755 -2556 2755 -2656 2754 -3056 2954 -3156 2955 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -2736 2535 -2737 2635 -2937 3035 -2936 3135 -13435 CI 3009 21 12 28 -1556 1535 -1655 1636 -1756 1735 -1256 2056 -1235 2035 -1356 1555 -1456 1554 -1856 1754 -1956 
1755 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -12235 CJ 3010 21 16 28 -1956 1939 1836 1735 -2055 2039 1936 -2156 2139 2036 1735 1535 1336 1238 1240 1341 1441 1540 1539 1438 1338 -1340 1339 1439 1440 1340 -1656 2456 -1756 1955 -1856 1954 -2256 2154 -2356 2155 -12635 CK 3011 21 22 28 -1556 1535 -1655 1636 -1756 1735 -2855 1744 -2046 2735 -2146 2835 -2148 2935 -1256 2056 -2556 3156 -1235 2035 -2435 3135 -1356 1555 -1456 1554 -1856 1754 -1956 1755 -2756 2855 -3056 2855 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -2737 2535 -2737 3035 -13235 CL 3012 21 18 28 -1556 1535 -1655 1636 -1756 1735 -1256 2056 -1235 2735 2741 -1356 1555 -1456 1554 -1856 1754 -1956 1755 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -2235 2736 -2435 2737 -2535 2738 -2635 2741 -12835 CM 3013 21 26 28 -1556 1536 -1556 2235 -1656 2238 -1756 2338 -2956 2235 -2956 2935 -3055 3036 -3156 3135 -1256 1756 -2956 3456 -1235 1835 -2635 3435 -1356 1555 -3256 3154 -3356 3155 -1536 1335 -1536 1735 -2936 2735 -2937 2835 -3137 3235 -3136 3335 -13635 CN 3014 21 24 28 -1556 1536 -1556 2935 -1656 2838 -1756 2938 -2955 2935 -1256 1756 -2656 3256 -1235 1835 -1356 1555 -2756 2955 -3156 2955 -1536 1335 -1536 1735 -13435 CO 3015 21 22 28 -2056 1755 1553 1451 1347 1344 1440 1538 1736 2035 2235 2536 2738 2840 2944 2947 2851 2753 2555 2256 2056 -1653 1551 1448 1443 1540 1638 -2638 2740 2843 2848 2751 2653 -2056 1855 1652 1548 1543 1639 1836 2035 -2235 2436 2639 2743 2748 2652 2455 2256 -13235 CP 3016 21 22 28 -1556 1535 -1655 1636 -1756 1735 -1256 2456 2755 2854 2952 2949 2847 2746 2445 1745 -2754 2852 2849 2747 -2456 2655 2753 2748 2646 2445 -1235 2035 -1356 1555 -1456 1554 -1856 1754 -1956 1755 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -13235 CQ 3017 21 22 28 -2056 1755 1553 1451 1347 1344 1440 1538 1736 2035 2235 2536 2738 2840 2944 2947 2851 2753 2555 2256 2056 -1653 1551 1448 1443 1540 1638 -2638 2740 2843 2848 2751 2653 -2056 1855 1652 1548 1543 1639 1836 2035 -2235 2436 2639 2743 2748 2652 2455 2256 -1738 1840 2041 
2141 2340 2438 2532 2630 2830 2932 2934 -2534 2632 2731 2831 -2438 2633 2732 2832 2933 -13235 CR 3018 21 22 28 -1556 1535 -1655 1636 -1756 1735 -1256 2456 2755 2854 2952 2950 2848 2747 2446 1746 -2754 2852 2850 2748 -2456 2655 2753 2749 2647 2446 -2146 2345 2443 2637 2735 2935 3037 3039 -2639 2737 2836 2936 -2345 2444 2738 2837 2937 3038 -1235 2035 -1356 1555 -1456 1554 -1856 1754 -1956 1755 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -13235 CS 3019 21 20 28 -2653 2756 2750 2653 2455 2156 1856 1555 1353 1350 1448 1746 2344 2543 2641 2638 2536 -1450 1548 1747 2345 2544 2642 -1555 1453 1451 1549 1748 2346 2644 2742 2739 2637 2536 2235 1935 1636 1438 1341 1335 1438 -13035 CT 3020 21 20 28 -1256 1250 -1956 1935 -2055 2036 -2156 2135 -2856 2850 -1256 2856 -1635 2435 -1356 1250 -1456 1253 -1556 1254 -1756 1255 -2356 2855 -2556 2854 -2656 2853 -2756 2850 -1936 1735 -1937 1835 -2137 2235 -2136 2335 -13035 CU 3021 21 24 28 -1556 1541 1638 1836 2135 2335 2636 2838 2941 2955 -1655 1640 1738 -1756 1740 1837 1936 2135 -1256 2056 -2656 3256 -1356 1555 -1456 1554 -1856 1754 -1956 1755 -2756 2955 -3156 2955 -13435 CV 3022 21 20 28 -1356 2035 -1456 2038 2035 -1556 2138 -2755 2035 -1156 1856 -2356 2956 -1256 1454 -1656 1554 -1756 1555 -2556 2755 -2856 2755 -13035 CW 3023 21 24 28 -1456 1835 -1556 1840 1835 -1656 1940 -2256 1940 1835 -2256 2635 -2356 2640 2635 -2456 2740 -3055 2740 2635 -1156 1956 -2256 2456 -2756 3356 -1256 1555 -1356 1554 -1756 1654 -1856 1655 -2856 3055 -3256 3055 -13435 CX 3024 21 20 28 -1356 2535 -1456 2635 -1556 2735 -2655 1436 -1156 1856 -2356 2956 -1135 1735 -2235 2935 -1256 1554 -1656 1554 -1756 1555 -2456 2655 -2856 2655 -1436 1235 -1436 1635 -2536 2335 -2537 2435 -2537 2835 -13035 CY 3025 21 22 28 -1356 2045 2035 -1456 2145 2136 -1556 2245 2235 -2855 2245 -1156 1856 -2556 3156 -1735 2535 -1256 1455 -1756 1555 -2656 2855 -3056 2855 -2036 1835 -2037 1935 -2237 2335 -2236 2435 -13235 CZ 3026 21 20 28 -2756 1356 1350 -2556 1335 -2656 1435 -2756 1535 -1335 2735 
2741 -1456 1350 -1556 1353 -1656 1354 -1856 1355 -2235 2736 -2435 2737 -2535 2738 -2635 2741 -13035 Ca 3101 21 20 28 -1546 1547 1647 1645 1445 1447 1548 1749 2149 2348 2447 2545 2538 2636 2735 -2347 2445 2438 2536 -2149 2248 2346 2338 2436 2735 2835 -2344 2243 1742 1441 1339 1338 1436 1735 2035 2236 2338 -1541 1439 1438 1536 -2243 1842 1641 1539 1538 1636 1735 -13035 Cb 3102 21 21 28 -1556 1535 1636 1836 -1655 1637 -1256 1756 1736 -1746 1848 2049 2249 2548 2746 2843 2841 2738 2536 2235 2035 1836 1738 -2646 2744 2740 2638 -2249 2448 2547 2644 2640 2537 2436 2235 -1356 1555 -1456 1554 -13135 Cc 3103 21 19 28 -2545 2546 2446 2444 2644 2646 2448 2249 1949 1648 1446 1343 1341 1438 1636 1935 2135 2436 2638 -1546 1444 1440 1538 -1949 1748 1647 1544 1540 1637 1736 1935 -12935 Cd 3104 21 21 28 -2456 2435 2935 -2555 2536 -2156 2656 2635 -2446 2348 2149 1949 1648 1446 1343 1341 1438 1636 1935 2135 2336 2438 -1546 1444 1440 1538 -1949 1748 1647 1544 1540 1637 1736 1935 -2256 2455 -2356 2454 -2637 2735 -2636 2835 -13135 Ce 3105 21 19 28 -1543 2643 2645 2547 2448 2149 1949 1648 1446 1343 1341 1438 1636 1935 2135 2436 2638 -2544 2545 2447 -1546 1444 1440 1538 -2443 2446 2348 2149 -1949 1748 1647 1544 1540 1637 1736 1935 -12935 Cf 3106 21 14 28 -2254 2255 2155 2153 2353 2355 2256 1956 1755 1654 1551 1535 -1754 1651 1636 -1956 1855 1753 1735 -1249 2149 -1235 2035 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -12435 Cg 3107 21 19 28 -2548 2647 2748 2649 2549 2348 2247 -1849 1648 1547 1445 1443 1541 1640 1839 2039 2240 2341 2443 2445 2347 2248 2049 1849 -1647 1545 1543 1641 -2241 2343 2345 2247 -1849 1748 1646 1642 1740 1839 -2039 2140 2242 2246 2148 2049 -1541 1440 1338 1337 1435 1534 1833 2233 2532 2631 -1535 1834 2234 2533 -1337 1436 1735 2235 2534 2632 2631 2529 2228 1628 1329 1231 1232 1334 1635 -1628 1429 1331 1332 1434 1635 -12935 Ch 3108 21 23 28 -1556 1535 -1655 1636 -1256 1756 1735 -1745 1847 1948 2149 2449 2648 2747 2844 2835 -2647 2744 2736 -2449 2548 2645 2635 -1235 2035 
-2335 3135 -1356 1555 -1456 1554 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -2636 2435 -2637 2535 -2837 2935 -2836 3035 -13335 Ci 3109 21 12 28 -1556 1554 1754 1756 1556 -1656 1654 -1555 1755 -1549 1535 -1648 1636 -1249 1749 1735 -1235 2035 -1349 1548 -1449 1547 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -12235 Cj 3110 21 13 28 -1756 1754 1954 1956 1756 -1856 1854 -1755 1955 -1749 1732 1629 1528 -1848 1833 1730 -1449 1949 1933 1830 1729 1528 1228 1129 1131 1331 1329 1229 1230 -1549 1748 -1649 1747 -12335 Ck 3111 21 22 28 -1556 1535 -1655 1636 -1256 1756 1735 -2648 1739 -2143 2835 -2142 2735 -2042 2635 -2349 3049 -1235 2035 -2335 3035 -1356 1555 -1456 1554 -2449 2648 -2949 2648 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -2637 2435 -2537 2935 -13235 Cl 3112 21 12 28 -1556 1535 -1655 1636 -1256 1756 1735 -1235 2035 -1356 1555 -1456 1554 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -12235 Cm 3113 21 34 28 -1549 1535 -1648 1636 -1249 1749 1735 -1745 1847 1948 2149 2449 2648 2747 2844 2835 -2647 2744 2736 -2449 2548 2645 2635 -2845 2947 3048 3249 3549 3748 3847 3944 3935 -3747 3844 3836 -3549 3648 3745 3735 -1235 2035 -2335 3135 -3435 4235 -1349 1548 -1449 1547 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -2636 2435 -2637 2535 -2837 2935 -2836 3035 -3736 3535 -3737 3635 -3937 4035 -3936 4135 -14435 Cn 3114 21 23 28 -1549 1535 -1648 1636 -1249 1749 1735 -1745 1847 1948 2149 2449 2648 2747 2844 2835 -2647 2744 2736 -2449 2548 2645 2635 -1235 2035 -2335 3135 -1349 1548 -1449 1547 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -2636 2435 -2637 2535 -2837 2935 -2836 3035 -13335 Co 3115 21 20 28 -1949 1648 1446 1343 1341 1438 1636 1935 2135 2436 2638 2741 2743 2646 2448 2149 1949 -1546 1444 1440 1538 -2538 2640 2644 2546 -1949 1748 1647 1544 1540 1637 1736 1935 -2135 2336 2437 2540 2544 2447 2348 2149 -13035 Cp 3116 21 21 28 -1549 1528 -1648 1629 -1249 1749 1728 -1746 1848 2049 2249 2548 2746 2843 2841 2738 2536 2235 2035 1836 1738 -2646 2744 2740 2638 -2249 2448 2547 2644 2640 
2537 2436 2235 -1228 2028 -1349 1548 -1449 1547 -1529 1328 -1530 1428 -1730 1828 -1729 1928 -13135 Cq 3117 21 20 28 -2448 2428 -2547 2529 -2348 2548 2649 2628 -2446 2348 2149 1949 1648 1446 1343 1341 1438 1636 1935 2135 2336 2438 -1546 1444 1440 1538 -1949 1748 1647 1544 1540 1637 1736 1935 -2128 2928 -2429 2228 -2430 2328 -2630 2728 -2629 2828 -13035 Cr 3118 21 17 28 -1549 1535 -1648 1636 -1249 1749 1735 -2447 2448 2348 2346 2546 2548 2449 2249 2048 1846 1743 -1235 2035 -1349 1548 -1449 1547 -1536 1335 -1537 1435 -1737 1835 -1736 1935 -12735 Cs 3119 21 17 28 -2347 2449 2445 2347 2248 2049 1649 1448 1347 1345 1443 1642 2141 2340 2437 -1448 1345 -1444 1643 2142 2341 -2440 2336 -1347 1445 1644 2143 2342 2440 2437 2336 2135 1735 1536 1437 1339 1335 1437 -12735 Ct 3120 21 15 28 -1554 1540 1637 1736 1935 2135 2336 2438 -1654 1639 1737 -1554 1756 1739 1836 1935 -1249 2149 -12535 Cu 3121 21 23 28 -1549 1540 1637 1736 1935 2235 2436 2537 2639 -1648 1639 1737 -1249 1749 1739 1836 1935 -2649 2635 3135 -2748 2736 -2349 2849 2835 -1349 1548 -1449 1547 -2837 2935 -2836 3035 -13335 Cv 3122 21 18 28 -1349 1935 -1449 1937 -1549 2037 -2548 2037 1935 -1149 1849 -2149 2749 -1249 1547 -1749 1548 -2349 2548 -2649 2548 -12835 Cw 3123 21 24 28 -1449 1835 -1549 1838 -1649 1938 -2249 1938 1835 -2249 2635 -2349 2638 -2249 2449 2738 -3048 2738 2635 -1149 1949 -2749 3349 -1249 1548 -1849 1648 -2849 3048 -3249 3048 -13435 Cx 3124 21 20 28 -1449 2435 -1549 2535 -1649 2635 -2548 1536 -1249 1949 -2249 2849 -1235 1835 -2135 2835 -1349 1548 -1849 1648 -2349 2548 -2749 2548 -1536 1335 -1536 1735 -2436 2235 -2536 2735 -13035 Cy 3125 21 19 28 -1449 2035 -1549 2037 -1649 2137 -2648 2137 1831 1629 1428 1228 1129 1131 1331 1329 1229 1230 -1249 1949 -2249 2849 -1349 1647 -1849 1648 -2449 2648 -2749 2648 -12935 Cz 3126 21 18 28 -2349 1335 -2449 1435 -2549 1535 -2549 1349 1345 -1335 2535 2539 -1449 1345 -1549 1346 -1649 1347 -1849 1348 -2035 2536 -2235 2537 -2335 2538 -2435 2539 -12835 C0 3200 21 20 28 
-1956 1655 1452 1347 1344 1439 1636 1935 2135 2436 2639 2744 2747 2652 2455 2156 1956 -1654 1552 1448 1443 1539 1637 -2437 2539 2643 2648 2552 2454 -1956 1755 1653 1548 1543 1638 1736 1935 -2135 2336 2438 2543 2548 2453 2355 2156 -13035 C1 3201 21 20 28 -1954 1935 -2054 2036 -2156 2135 -2156 1853 1652 -1535 2535 -1936 1735 -1937 1835 -2137 2235 -2136 2335 -13035 C2 3202 21 20 28 -1452 1451 1551 1552 1452 -1453 1553 1652 1651 1550 1450 1351 1352 1454 1555 1856 2256 2555 2654 2752 2750 2648 2346 1844 1643 1441 1338 1335 -2554 2652 2650 2548 -2256 2455 2552 2550 2448 2246 1844 -1337 1438 1638 2137 2537 2738 -1638 2136 2536 2637 -1638 2135 2535 2636 2738 2740 -13035 C3 3203 21 20 28 -1452 1451 1551 1552 1452 -1453 1553 1652 1651 1550 1450 1351 1352 1454 1555 1856 2256 2555 2653 2650 2548 2247 -2455 2553 2550 2448 -2156 2355 2453 2450 2348 2147 -1947 2247 2446 2644 2742 2739 2637 2536 2235 1835 1536 1437 1339 1340 1441 1541 1640 1639 1538 1438 -2544 2642 2639 2537 -2147 2346 2445 2542 2539 2436 2235 -1440 1439 1539 1540 1440 -13035 C4 3204 21 20 28 -2153 2135 -2254 2236 -2356 2335 -2356 1241 2841 -1835 2635 -2136 1935 -2137 2035 -2337 2435 -2336 2535 -13035 C5 3205 21 20 28 -1556 1346 1548 1849 2149 2448 2646 2743 2741 2638 2436 2135 1835 1536 1437 1339 1340 1441 1541 1640 1639 1538 1438 -2546 2644 2640 2538 -2149 2348 2447 2544 2540 2437 2336 2135 -1440 1439 1539 1540 1440 -1556 2556 -1555 2355 -1554 1954 2355 2556 -13035 C6 3206 21 20 28 -2453 2452 2552 2553 2453 -2554 2454 2353 2352 2451 2551 2652 2653 2555 2356 2056 1755 1553 1451 1347 1341 1438 1636 1935 2135 2436 2638 2741 2742 2645 2447 2148 1948 1747 1646 1544 -1653 1551 1447 1441 1538 1637 -2538 2640 2643 2545 -2056 1855 1754 1652 1548 1541 1638 1736 1935 -2135 2336 2437 2540 2543 2446 2347 2148 -13035 C7 3207 21 20 28 -1356 1350 -2756 2753 2650 2245 2143 2039 2035 -2144 2042 1939 1935 -2650 2145 1942 1839 1835 2035 -1352 1454 1656 1856 2353 2553 2654 2756 -1554 1655 1855 2054 -1352 1453 1654 1854 2353 -13035 
C8 3208 21 20 28 -1856 1555 1453 1450 1548 1847 2247 2548 2650 2653 2555 2256 1856 -1655 1553 1550 1648 -2448 2550 2553 2455 -1856 1755 1653 1650 1748 1847 -2247 2348 2450 2453 2355 2256 -1847 1546 1445 1343 1339 1437 1536 1835 2235 2536 2637 2739 2743 2645 2546 2247 -1545 1443 1439 1537 -2537 2639 2643 2545 -1847 1646 1543 1539 1636 1835 -2235 2436 2539 2543 2446 2247 -13035 C9 3209 21 20 28 -1539 1538 1638 1639 1539 -2547 2445 2344 2143 1943 1644 1446 1349 1350 1453 1655 1956 2156 2455 2653 2750 2744 2640 2538 2336 2035 1735 1536 1438 1439 1540 1640 1739 1738 1637 1537 -1546 1448 1451 1553 -2454 2553 2650 2644 2540 2438 -1943 1744 1645 1548 1551 1654 1755 1956 -2156 2355 2453 2550 2543 2439 2337 2236 2035 -13035 C. 3210 21 11 28 -1538 1437 1436 1535 1635 1736 1737 1638 1538 -1537 1536 1636 1637 1537 -12135 C, 3211 21 11 28 -1736 1635 1535 1436 1437 1538 1638 1737 1734 1632 1431 -1537 1536 1636 1637 1537 -1635 1734 -1736 1632 -12135 C: 3212 21 11 28 -1549 1448 1447 1546 1646 1747 1748 1649 1549 -1548 1547 1647 1648 1548 -1538 1437 1436 1535 1635 1736 1737 1638 1538 -1537 1536 1636 1637 1537 -12135 C; 3213 21 11 28 -1549 1448 1447 1546 1646 1747 1748 1649 1549 -1548 1547 1647 1648 1548 -1736 1635 1535 1436 1437 1538 1638 1737 1734 1632 1431 -1537 1536 1636 1637 1537 -1635 1734 -1736 1632 -12135 C! 3214 21 11 28 -1556 1455 1453 1545 -1556 1542 1642 -1556 1656 1642 -1656 1755 1753 1645 -1538 1437 1436 1535 1635 1736 1737 1638 1538 -1537 1536 1636 1637 1537 -12135 C? 
3215 21 19 28 -1451 1452 1552 1550 1350 1352 1454 1555 1756 2156 2455 2554 2652 2650 2548 2447 2045 -2454 2553 2549 2448 -2156 2355 2453 2449 2347 2246 -1945 1942 2042 2045 1945 -1938 1837 1836 1935 2035 2136 2137 2038 1938 -1937 1936 2036 2037 1937 -12935 C/ 3220 21 23 28 -3060 1228 1328 -3060 3160 1328 -13335 C( 3221 21 14 28 -2060 1858 1655 1451 1346 1342 1437 1633 1830 2028 -1654 1551 1447 1441 1537 1634 -1858 1756 1653 1547 1541 1635 1732 1830 -12435 C) 3222 21 14 28 -1460 1658 1855 2051 2146 2142 2037 1833 1630 1428 -1854 1951 2047 2041 1937 1834 -1658 1756 1853 1947 1941 1835 1732 1630 -12435 C* 3223 21 16 28 -1856 1755 1945 1844 -1856 1844 -1856 1955 1745 1844 -1353 1453 2247 2347 -1353 2347 -1353 1352 2348 2347 -2353 2253 1447 1347 -2353 1347 -2353 2352 1348 1347 -12635 C- 3224 21 25 28 -1445 3145 3144 -1445 1444 3144 -13535 C 3199 21 16 28 -12635 phylip-3.697/src/font40000644004732000473200000002604112406201116014305 0ustar joefelsenst_gCA 2051 21 20 28 -2356 1035 -2356 2435 -2254 2335 -1441 2341 -835 1435 -2035 2635 -13035 CB 2052 21 24 28 -1956 1335 -2056 1435 -1656 2756 3055 3153 3151 3048 2947 2646 -2756 2955 3053 3051 2948 2847 2646 -1746 2646 2845 2943 2941 2838 2636 2235 1035 -2646 2745 2843 2841 2738 2536 2235 -13435 CC 2053 21 21 28 -2854 2954 3056 2950 2952 2854 2755 2556 2256 1955 1753 1550 1447 1343 1340 1437 1536 1835 2135 2336 2538 2640 -2256 2055 1853 1650 1547 1443 1440 1537 1636 1835 -13135 CD 2054 21 23 28 -1956 1335 -2056 1435 -1656 2556 2855 2954 3051 3047 2943 2739 2537 2336 1935 1035 -2556 2755 2854 2951 2947 2843 2639 2437 2236 1935 -13335 CE 2055 21 23 28 -1956 1335 -2056 1435 -2450 2242 -1656 3156 3050 3056 -1746 2346 -1035 2535 2740 2435 -13335 CF 2056 21 22 28 -1956 1335 -2056 1435 -2450 2242 -1656 3156 3050 3056 -1746 2346 -1035 1735 -13235 CG 2057 21 22 28 -2854 2954 3056 2950 2952 2854 2755 2556 2256 1955 1753 1550 1447 1343 1340 1437 1536 1835 2035 2336 2538 2742 -2256 2055 1853 1650 1547 1443 1440 1537 1636 1835 -2035 2236 
2438 2642 -2342 3042 -13235 CH 2058 21 26 28 -1956 1335 -2056 1435 -3256 2635 -3356 2735 -1656 2356 -2956 3656 -1746 2946 -1035 1735 -2335 3035 -13635 CI 2059 21 13 28 -1956 1335 -2056 1435 -1656 2356 -1035 1735 -12335 CJ 2060 21 18 28 -2556 2039 1937 1836 1635 1435 1236 1138 1140 1241 1340 1239 -2456 1939 1837 1635 -2156 2856 -12835 CK 2061 21 23 28 -1956 1335 -2056 1435 -3356 1643 -2347 2735 -2247 2635 -1656 2356 -2956 3556 -1035 1735 -2335 2935 -13335 CL 2062 21 20 28 -1956 1335 -2056 1435 -1656 2356 -1035 2535 2741 2435 -13035 CM 2063 21 27 28 -1956 1335 -1956 2035 -2056 2137 -3356 2035 -3356 2735 -3456 2835 -1656 2056 -3356 3756 -1035 1635 -2435 3135 -13735 CN 2064 21 25 28 -1956 1335 -1956 2638 -1953 2635 -3256 2635 -1656 1956 -2956 3556 -1035 1635 -13535 CO 2065 21 22 28 -2256 1955 1753 1550 1447 1343 1340 1437 1536 1735 2035 2336 2538 2741 2844 2948 2951 2854 2755 2556 2256 -2256 2055 1853 1650 1547 1443 1440 1537 1735 -2035 2236 2438 2641 2744 2848 2851 2754 2556 -13235 CP 2066 21 23 28 -1956 1335 -2056 1435 -1656 2856 3155 3253 3251 3148 2946 2545 1745 -2856 3055 3153 3151 3048 2846 2545 -1035 1735 -13335 CQ 2067 21 22 28 -2256 1955 1753 1550 1447 1343 1340 1437 1536 1735 2035 2336 2538 2741 2844 2948 2951 2854 2755 2556 2256 -2256 2055 1853 1650 1547 1443 1440 1537 1735 -2035 2236 2438 2641 2744 2848 2851 2754 2556 -1537 1538 1640 1841 1941 2140 2238 2231 2330 2530 2632 2633 -2238 2332 2431 2531 2632 -13235 CR 2068 21 24 28 -1956 1335 -2056 1435 -1656 2756 3055 3153 3151 3048 2947 2646 1746 -2756 2955 3053 3051 2948 2847 2646 -2246 2445 2544 2636 2735 2935 3037 3038 -2544 2737 2836 2936 3037 -1035 1735 -13435 CS 2069 21 23 28 -2954 3054 3156 3050 3052 2954 2855 2556 2156 1855 1653 1651 1749 1848 2544 2742 -1651 1849 2545 2644 2742 2739 2637 2536 2235 1835 1536 1437 1339 1341 1235 1337 1437 -13335 CT 2070 21 21 28 -2356 1735 -2456 1835 -1756 1450 1656 3156 3050 3056 -1435 2135 -13135 CU 2071 21 25 28 -1856 1545 1441 1438 1536 1835 2235 2536 2738 2841 3256 
-1956 1645 1541 1538 1636 1835 -1556 2256 -2956 3556 -13535 CV 2072 21 20 28 -1656 1735 -1756 1837 -3056 1735 -1456 2056 -2656 3256 -13035 CW 2073 21 26 28 -1856 1635 -1956 1737 -2656 1635 -2656 2435 -2756 2537 -3456 2435 -1556 2256 -3156 3756 -13635 CX 2074 21 22 28 -1756 2435 -1856 2535 -3156 1135 -1556 2156 -2756 3356 -935 1535 -2135 2735 -13235 CY 2075 21 21 28 -1656 2046 1735 -1756 2146 1835 -3156 2146 -1456 2056 -2756 3356 -1435 2135 -13135 CZ 2076 21 22 28 -3056 1135 -3156 1235 -1856 1550 1756 3156 -1135 2535 2741 2435 -13235 Ca 2151 21 21 28 -2649 2442 2338 2336 2435 2735 2937 3039 -2749 2542 2438 2436 2535 -2442 2445 2348 2149 1949 1648 1445 1342 1339 1437 1536 1735 1935 2136 2339 2442 -1949 1748 1545 1442 1438 1536 -13135 Cb 2152 21 19 28 -1856 1443 1440 1537 1636 -1956 1543 -1543 1646 1848 2049 2249 2448 2547 2645 2642 2539 2336 2035 1835 1636 1539 1543 -2448 2546 2542 2439 2236 2035 -1556 1956 -12935 Cc 2153 21 18 28 -2446 2445 2545 2546 2448 2249 1949 1648 1445 1342 1339 1437 1536 1735 1935 2236 2439 -1949 1748 1545 1442 1438 1536 -12835 Cd 2154 21 21 28 -2856 2442 2338 2336 2435 2735 2937 3039 -2956 2542 2438 2436 2535 -2442 2445 2348 2149 1949 1648 1445 1342 1339 1437 1536 1735 1935 2136 2339 2442 -1949 1748 1545 1442 1438 1536 -2556 2956 -13135 Ce 2155 21 18 28 -1440 1841 2142 2444 2546 2448 2249 1949 1648 1445 1342 1339 1437 1536 1735 1935 2236 2438 -1949 1748 1545 1442 1438 1536 -12835 Cf 2156 21 15 28 -2555 2454 2553 2654 2655 2556 2356 2155 2054 1952 1849 1535 1431 1329 -2356 2154 2052 1948 1739 1635 1532 1430 1329 1128 928 829 830 931 1030 929 -1449 2449 -12535 Cg 2157 21 20 28 -2749 2335 2232 2029 1728 1428 1229 1130 1131 1232 1331 1230 -2649 2235 2132 1929 1728 -2442 2445 2348 2149 1949 1648 1445 1342 1339 1437 1536 1735 1935 2136 2339 2442 -1949 1748 1545 1442 1438 1536 -13035 Ch 2158 21 21 28 -1856 1235 -1956 1335 -1542 1746 1948 2149 2349 2548 2647 2645 2439 2436 2535 -2349 2547 2545 2339 2336 2435 2735 2937 3039 -1556 1956 -13135 Ci 2159 
21 13 28 -1956 1855 1954 2055 1956 -1145 1247 1449 1749 1848 1845 1639 1636 1735 -1649 1748 1745 1539 1536 1635 1935 2137 2239 -12335 Cj 2160 21 13 28 -2056 1955 2054 2155 2056 -1245 1347 1549 1849 1948 1945 1635 1532 1430 1329 1128 928 829 830 931 1030 929 -1749 1848 1845 1535 1432 1330 1128 -12335 Ck 2161 21 20 28 -1856 1235 -1956 1335 -2648 2547 2646 2747 2748 2649 2549 2348 1944 1743 1543 -1743 1942 2136 2235 -1743 1842 2036 2135 2335 2536 2739 -1556 1956 -13035 Cl 2162 21 12 28 -1856 1442 1338 1336 1435 1735 1937 2039 -1956 1542 1438 1436 1535 -1556 1956 -12235 Cm 2163 21 33 28 -1145 1247 1449 1749 1848 1846 1742 1535 -1649 1748 1746 1642 1435 -1742 1946 2148 2349 2549 2748 2847 2845 2535 -2549 2747 2745 2435 -2742 2946 3148 3349 3549 3748 3847 3845 3639 3636 3735 -3549 3747 3745 3539 3536 3635 3935 4137 4239 -14335 Cn 2164 21 23 28 -1145 1247 1449 1749 1848 1846 1742 1535 -1649 1748 1746 1642 1435 -1742 1946 2148 2349 2549 2748 2847 2845 2639 2636 2735 -2549 2747 2745 2539 2536 2635 2935 3137 3239 -13335 Co 2165 21 18 28 -1949 1648 1445 1342 1339 1437 1536 1735 1935 2236 2439 2542 2545 2447 2348 2149 1949 -1949 1748 1545 1442 1438 1536 -1935 2136 2339 2442 2446 2348 -12835 Cp 2166 21 21 28 -1145 1247 1449 1749 1848 1846 1742 1328 -1649 1748 1746 1642 1228 -1742 1845 2048 2249 2449 2648 2747 2845 2842 2739 2536 2235 2035 1836 1739 1742 -2648 2746 2742 2639 2436 2235 -928 1628 -13135 Cq 2167 21 20 28 -2649 2028 -2749 2128 -2442 2445 2348 2149 1949 1648 1445 1342 1339 1437 1536 1735 1935 2136 2339 2442 -1949 1748 1545 1442 1438 1536 -1728 2428 -13035 Cr 2168 21 17 28 -1145 1247 1449 1749 1848 1846 1742 1535 -1649 1748 1746 1642 1435 -1742 1946 2148 2349 2549 2648 2647 2546 2447 2548 -12735 Cs 2169 21 17 28 -2447 2446 2546 2547 2448 2149 1849 1548 1447 1445 1544 2240 2339 -1446 1545 2241 2340 2337 2236 1935 1635 1336 1237 1238 1338 1337 -12735 Ct 2170 21 14 28 -1956 1542 1438 1436 1535 1835 2037 2139 -2056 1642 1538 1536 1635 -1349 2249 -12435 Cu 2171 21 23 28 
-1145 1247 1449 1749 1848 1845 1639 1637 1835 -1649 1748 1745 1539 1537 1636 1835 2035 2236 2438 2642 -2849 2642 2538 2536 2635 2935 3137 3239 -2949 2742 2638 2636 2735 -13335 Cv 2172 21 20 28 -1145 1247 1449 1749 1848 1845 1639 1637 1835 -1649 1748 1745 1539 1537 1636 1835 1935 2236 2438 2641 2745 2749 2649 2747 -13035 Cw 2173 21 29 28 -1145 1247 1449 1749 1848 1845 1639 1637 1835 -1649 1748 1745 1539 1537 1636 1835 2035 2236 2438 2540 -2749 2540 2537 2636 2835 3035 3236 3438 3540 3644 3649 3549 3647 -2849 2640 2637 2835 -13935 Cx 2174 21 20 28 -1345 1548 1749 2049 2147 2144 -1949 2047 2044 1940 1838 1636 1435 1335 1236 1237 1338 1437 1336 -1940 1937 2035 2335 2536 2739 -2748 2647 2746 2847 2848 2749 2649 2448 2246 2144 2040 2037 2135 -13035 Cy 2175 21 21 28 -1145 1247 1449 1749 1848 1845 1639 1637 1835 -1649 1748 1745 1539 1537 1636 1835 2035 2236 2438 2642 -2949 2535 2432 2229 1928 1628 1429 1330 1331 1432 1531 1430 -2849 2435 2332 2129 1928 -13135 Cz 2176 21 20 28 -2749 2647 2445 1639 1437 1335 -1445 1547 1749 2049 2447 -1547 1748 2048 2447 2647 -1437 1637 2036 2336 2537 -1637 2035 2335 2537 2639 -13035 C0 2750 21 21 28 -2256 1955 1753 1550 1447 1343 1340 1437 1536 1735 1935 2236 2438 2641 2744 2848 2851 2754 2655 2456 2256 -2256 2055 1853 1650 1547 1443 1440 1537 1735 -1935 2136 2338 2541 2644 2748 2751 2654 2456 -13135 C1 2751 21 21 28 -2252 1735 -2456 1835 -2456 2153 1851 1650 -2353 1951 1650 -13135 C2 2752 21 21 28 -1752 1851 1750 1651 1652 1754 1855 2156 2456 2755 2853 2851 2749 2547 2245 1843 1541 1339 1135 -2456 2655 2753 2751 2649 2447 1843 -1237 1338 1538 2036 2336 2537 2639 -1538 2035 2335 2536 2639 -13135 C3 2753 21 21 28 -1752 1851 1750 1651 1652 1754 1855 2156 2456 2755 2853 2851 2749 2447 2146 -2456 2655 2753 2751 2649 2447 -1946 2146 2445 2544 2642 2639 2537 2436 2135 1735 1436 1337 1239 1240 1341 1440 1339 -2146 2345 2444 2542 2539 2437 2336 2135 -13135 C4 2754 21 21 28 -2655 2035 -2756 2135 -2756 1241 2841 -13135 C5 2755 21 21 28 -1956 1446 
-1956 2956 -1955 2455 2956 -1446 1547 1848 2148 2447 2546 2644 2641 2538 2336 2035 1735 1436 1337 1239 1240 1341 1440 1339 -2148 2347 2446 2544 2541 2438 2236 2035 -13135 C6 2756 21 21 28 -2753 2652 2751 2852 2853 2755 2556 2256 1955 1753 1550 1447 1343 1339 1437 1536 1735 2035 2336 2538 2640 2643 2545 2446 2247 1947 1746 1544 1442 -2256 2055 1853 1650 1547 1443 1438 1536 -2035 2236 2438 2540 2544 2446 -13135 C7 2757 21 21 28 -1656 1450 -2956 2853 2650 2144 1941 1839 1735 -2650 2044 1841 1739 1635 -1553 1856 2056 2553 -1654 1855 2055 2553 2753 2854 2956 -13135 C8 2758 21 21 28 -2156 1855 1754 1652 1649 1747 1946 2246 2647 2748 2850 2853 2755 2456 2156 -2156 1955 1854 1752 1749 1847 1946 -2246 2547 2648 2750 2753 2655 2456 -1946 1545 1343 1241 1238 1336 1635 2035 2436 2537 2639 2642 2544 2445 2246 -1946 1645 1443 1341 1338 1436 1635 -2035 2336 2437 2539 2543 2445 -13135 C9 2759 21 21 28 -2749 2647 2445 2244 1944 1745 1646 1548 1551 1653 1855 2156 2456 2655 2754 2852 2848 2744 2641 2438 2236 1935 1635 1436 1338 1339 1440 1539 1438 -1745 1647 1651 1753 1955 2156 -2655 2753 2748 2644 2541 2338 2136 1935 -13135 C. 2760 21 11 28 -1337 1236 1335 1436 1337 -12135 C, 2761 21 11 28 -1335 1236 1337 1436 1435 1333 1131 -12135 C: 2762 21 11 28 -1649 1548 1647 1748 1649 -1337 1236 1335 1436 -12135 C; 2763 21 11 28 -1649 1548 1647 1748 1649 -1335 1236 1337 1436 1435 1333 1131 -12135 C! 2764 21 11 28 -1856 1755 1543 -1855 1543 -1856 1955 1543 -1337 1236 1335 1436 1337 -12135 C? 
2765 21 21 28 -1752 1851 1750 1651 1652 1754 1855 2156 2556 2855 2953 2951 2849 2748 2146 1945 1943 2042 2242 -2556 2755 2853 2851 2749 2648 2447 -1837 1736 1835 1936 1837 -13135 C/ 2770 21 22 28 -3460 828 -13235 C( 2771 21 15 28 -2560 2157 1854 1651 1447 1342 1338 1433 1530 1628 -2157 1853 1649 1546 1441 1436 1531 1628 -12535 C) 2772 21 15 28 -1960 2058 2155 2250 2246 2141 1937 1734 1431 1028 -1960 2057 2152 2147 2042 1939 1735 1431 -12535 C* 2773 21 17 28 -2056 2044 -1553 2547 -2553 1547 -12735 C- 2774 21 26 28 -1444 3244 -13635 C 2749 21 16 28 -12635 phylip-3.697/src/font50000644004732000473200000004076612406201116014320 0ustar joefelsenst_gCA 3051 21 20 28 -2356 1136 -2152 2235 -2254 2336 -2356 2354 2437 2435 -1441 2241 -835 1435 -1935 2635 -1136 935 -1136 1335 -2236 2035 -2237 2135 -2437 2535 -13035 CB 3052 21 24 28 -1956 1335 -2056 1435 -2156 1535 -1656 2756 3055 3153 3151 3048 2947 2646 -2955 3053 3051 2948 2847 -2756 2855 2953 2951 2848 2646 -1846 2646 2845 2943 2941 2838 2636 2235 1035 -2745 2843 2841 2738 2536 -2646 2744 2741 2638 2436 2235 -1756 2055 -1856 1954 -2256 2054 -2356 2055 -1436 1135 -1437 1235 -1537 1635 -1436 1735 -13435 CC 3053 21 21 28 -2854 2954 3056 2950 2952 2854 2755 2556 2256 1955 1753 1550 1447 1343 1340 1437 1536 1835 2135 2336 2538 2640 -1954 1752 1650 1547 1443 1439 1537 -2256 2055 1852 1750 1647 1543 1538 1636 1835 -13135 CD 3054 21 23 28 -1956 1335 -2056 1435 -2156 1535 -1656 2556 2855 2954 3051 3047 2943 2739 2537 2336 1935 1035 -2755 2854 2951 2947 2843 2639 2437 -2556 2754 2851 2847 2743 2539 2236 1935 -1756 2055 -1856 1954 -2256 2054 -2356 2055 -1436 1135 -1437 1235 -1537 1635 -1436 1735 -13335 CE 3055 21 23 28 -1956 1335 -2056 1435 -2156 1535 -2550 2342 -1656 3156 3050 -1846 2446 -1035 2535 2740 -1756 2055 -1856 1954 -2256 2054 -2356 2055 -2756 3055 -2856 3054 -2956 3053 -3056 3050 -2550 2346 2342 -2448 2246 2344 -2447 2146 2345 -1436 1135 -1437 1235 -1537 1635 -1436 1735 -2035 2536 -2235 2537 -2537 2740 -13335 CF 3056 21 22 
28 -1956 1335 -2056 1435 -2156 1535 -2550 2342 -1656 3156 3050 -1846 2446 -1035 1835 -1756 2055 -1856 1954 -2256 2054 -2356 2055 -2756 3055 -2856 3054 -2956 3053 -3056 3050 -2550 2346 2342 -2448 2246 2344 -2447 2146 2345 -1436 1135 -1437 1235 -1537 1635 -1436 1735 -13235 CG 3057 21 22 28 -2854 2954 3056 2950 2952 2854 2755 2556 2256 1955 1753 1550 1447 1343 1340 1437 1536 1835 2035 2336 2538 2742 -1954 1752 1650 1547 1443 1439 1537 -2438 2539 2642 -2256 2055 1852 1750 1647 1543 1538 1636 1835 -2035 2236 2439 2542 -2242 3042 -2342 2541 -2442 2539 -2842 2640 -2942 2641 -13235 CH 3058 21 26 28 -1956 1335 -2056 1435 -2156 1535 -3156 2535 -3256 2635 -3356 2735 -1656 2456 -2856 3656 -1746 2946 -1035 1835 -2235 3035 -1756 2055 -1856 1954 -2256 2054 -2356 2055 -2956 3255 -3056 3154 -3456 3254 -3556 3255 -1436 1135 -1437 1235 -1537 1635 -1436 1735 -2636 2335 -2637 2435 -2737 2835 -2636 2935 -13635 CI 3059 21 14 28 -1956 1335 -2056 1435 -2156 1535 -1656 2456 -1035 1835 -1756 2055 -1856 1954 -2256 2054 -2356 2055 -1436 1135 -1437 1235 -1537 1635 -1436 1735 -12435 CJ 3060 21 19 28 -2456 1939 1837 1635 -2556 2143 2040 1938 -2656 2243 2038 1836 1635 1435 1236 1138 1140 1241 1341 1440 1439 1338 1238 -1240 1239 1339 1340 1240 -2156 2956 -2256 2555 -2356 2454 -2756 2554 -2856 2555 -12935 CK 3061 21 23 28 -1956 1335 -2056 1435 -2156 1535 -3255 1744 -2147 2535 -2247 2635 -2348 2736 -1656 2456 -2956 3556 -1035 1835 -2235 2935 -1756 2055 -1856 1954 -2256 2054 -2356 2055 -3056 3255 -3456 3255 -1436 1135 -1437 1235 -1537 1635 -1436 1735 -2536 2335 -2537 2435 -2637 2835 -13335 CL 3062 21 20 28 -1956 1335 -2056 1435 -2156 1535 -1656 2456 -1035 2535 2741 -1756 2055 -1856 1954 -2256 2054 -2356 2055 -1436 1135 -1437 1235 -1537 1635 -1436 1735 -2035 2536 -2235 2638 -2435 2741 -13035 CM 3063 21 28 28 -1956 1336 -1955 2037 2035 -2056 2137 -2156 2238 -3356 2238 2035 -3356 2735 -3456 2835 -3556 2935 -1656 2156 -3356 3856 -1035 1635 -2435 3235 -1756 1955 -1856 1954 -3656 3454 -3756 3455 -1336 1135 
-1336 1535 -2836 2535 -2837 2635 -2937 3035 -2836 3135 -13835 CN 3064 21 25 28 -1956 1336 -1956 2635 -2056 2638 -2156 2738 -3255 2738 2635 -1656 2156 -2956 3556 -1035 1635 -1756 2055 -1856 2054 -3056 3255 -3456 3255 -1336 1135 -1336 1535 -13535 CO 3065 21 22 28 -2256 1955 1753 1550 1447 1343 1340 1437 1536 1735 2035 2336 2538 2741 2844 2948 2951 2854 2755 2556 2256 -1853 1650 1547 1443 1439 1537 -2438 2641 2744 2848 2852 2754 -2256 2055 1852 1750 1647 1543 1538 1636 1735 -2035 2236 2439 2541 2644 2748 2753 2655 2556 -13235 CP 3066 21 23 28 -1956 1335 -2056 1435 -2156 1535 -1656 2856 3155 3253 3251 3148 2946 2545 1745 -3055 3153 3151 3048 2846 -2856 2955 3053 3051 2948 2746 2545 -1035 1835 -1756 2055 -1856 1954 -2256 2054 -2356 2055 -1436 1135 -1437 1235 -1537 1635 -1436 1735 -13335 CQ 3067 21 22 28 -2256 1955 1753 1550 1447 1343 1340 1437 1536 1735 2035 2336 2538 2741 2844 2948 2951 2854 2755 2556 2256 -1853 1650 1547 1443 1439 1537 -2438 2641 2744 2848 2852 2754 -2256 2055 1852 1750 1647 1543 1538 1636 1735 -2035 2236 2439 2541 2644 2748 2753 2655 2556 -1538 1640 1841 1941 2140 2238 2333 2432 2532 2633 -2332 2431 2531 -2238 2231 2330 2530 2633 2634 -13235 CR 3068 21 24 28 -1956 1335 -2056 1435 -2156 1535 -1656 2756 3055 3153 3151 3048 2947 2646 1846 -2955 3053 3051 2948 2847 -2756 2855 2953 2951 2848 2646 -2246 2445 2544 2738 2837 2937 3038 -2737 2836 2936 -2544 2636 2735 2935 3038 3039 -1035 1835 -1756 2055 -1856 1954 -2256 2054 -2356 2055 -1436 1135 -1437 1235 -1537 1635 -1436 1735 -13435 CS 3069 21 23 28 -2954 3054 3156 3050 3052 2954 2855 2556 2156 1855 1653 1650 1748 1946 2543 2641 2638 2536 -1750 1848 2544 2642 -1855 1753 1751 1849 2446 2644 2742 2739 2637 2536 2235 1835 1536 1437 1339 1341 1235 1337 1437 -13335 CT 3070 21 22 28 -2356 1735 -2456 1835 -2556 1935 -1656 1450 -3256 3150 -1656 3256 -1435 2235 -1756 1450 -1956 1553 -2156 1655 -2856 3155 -2956 3154 -3056 3153 -3156 3150 -1836 1535 -1837 1635 -1937 2035 -1836 2135 -13235 CU 3071 21 25 28 -1856 1545 
1441 1438 1536 1835 2235 2536 2738 2841 3255 -1956 1645 1541 1537 1636 -2056 1745 1641 1637 1835 -1556 2356 -2956 3556 -1656 1955 -1756 1854 -2156 1954 -2256 1955 -3056 3255 -3456 3255 -13535 CV 3072 21 20 28 -1656 1654 1737 1735 -1755 1838 -1856 1939 -2955 1735 -1456 2156 -2656 3256 -1556 1654 -1956 1854 -2056 1755 -2756 2955 -3156 2955 -13035 CW 3073 21 26 28 -1856 1854 1637 1635 -1955 1738 -2056 1839 -2656 1839 1635 -2656 2654 2437 2435 -2755 2538 -2856 2639 -3455 2639 2435 -1556 2356 -2656 2856 -3156 3756 -1656 1955 -1756 1854 -2156 1953 -2256 1955 -3256 3455 -3656 3455 -13635 CX 3074 21 22 28 -1756 2335 -1856 2435 -1956 2535 -3055 1236 -1556 2256 -2756 3356 -935 1535 -2035 2735 -1656 1854 -2056 1954 -2156 1955 -2856 3055 -3256 3055 -1236 1035 -1236 1435 -2336 2135 -2337 2235 -2437 2635 -13235 CY 3075 21 22 28 -1656 2046 1735 -1756 2146 1835 -1856 2246 1935 -3155 2246 -1456 2156 -2856 3456 -1435 2235 -1556 1755 -1956 1854 -2056 1755 -2956 3155 -3356 3155 -1836 1535 -1837 1635 -1937 2035 -1836 2135 -13235 CZ 3076 21 22 28 -2956 1135 -3056 1235 -3156 1335 -3156 1756 1550 -1135 2535 2741 -1856 1550 -1956 1653 -2156 1755 -2135 2536 -2335 2638 -2435 2741 -13235 Ca 3151 21 22 28 -2649 2442 2438 2536 2635 2835 3037 3139 -2749 2542 2536 -2649 2849 2642 2538 -2442 2445 2348 2149 1949 1648 1445 1342 1340 1437 1536 1735 1935 2136 2237 2339 2442 -1748 1545 1442 1439 1537 -1949 1747 1645 1542 1539 1636 1735 -13235 Cb 3152 21 19 28 -1756 1549 1443 1439 1537 1636 1835 2035 2336 2539 2642 2644 2547 2448 2249 2049 1848 1747 1645 1542 -1856 1649 1545 1539 1636 -2337 2439 2542 2545 2447 -1456 1956 1749 1542 -2035 2237 2339 2442 2445 2348 2249 -1556 1855 -1656 1754 -12935 Cc 3153 21 18 28 -2445 2446 2346 2344 2544 2546 2448 2249 1949 1648 1445 1342 1340 1437 1536 1735 1935 2236 2439 -1647 1545 1442 1439 1537 -1949 1747 1645 1542 1539 1636 1735 -12835 Cd 3154 21 22 28 -2856 2545 2441 2438 2536 2635 2835 3037 3139 -2956 2645 2541 2536 -2556 3056 2642 2538 -2442 2445 2348 2149 1949 
1648 1445 1342 1340 1437 1536 1735 1935 2136 2237 2339 2442 -1647 1545 1442 1439 1537 -1949 1747 1645 1542 1539 1636 1735 -2656 2955 -2756 2854 -13235 Ce 3155 21 18 28 -1440 1841 2142 2444 2546 2448 2249 1949 1648 1445 1342 1340 1437 1536 1735 1935 2236 2438 -1647 1545 1442 1439 1537 -1949 1747 1645 1542 1539 1636 1735 -12835 Cf 3156 21 16 28 -2654 2655 2555 2553 2753 2755 2656 2456 2255 2053 1951 1848 1744 1535 1432 1330 1128 -2052 1949 1844 1635 1532 -2456 2254 2152 2049 1944 1736 1633 1531 1329 1128 928 829 831 1031 1029 929 930 -1449 2549 -12635 Cg 3157 21 21 28 -2649 2235 2132 1929 1728 -2749 2335 2131 -2649 2849 2435 2231 2029 1728 1428 1229 1130 1132 1332 1330 1230 1231 -2442 2445 2348 2149 1949 1648 1445 1342 1340 1437 1536 1735 1935 2136 2237 2339 2442 -1647 1545 1442 1439 1537 -1949 1747 1645 1542 1539 1636 1735 -13135 Ch 3158 21 22 28 -1856 1235 1435 -1956 1335 -1556 2056 1435 -1642 1846 2048 2249 2449 2648 2746 2743 2538 -2648 2644 2540 2536 -2646 2441 2438 2536 2635 2835 3037 3139 -1656 1955 -1756 1854 -13235 Ci 3159 21 13 28 -1956 1954 2154 2156 1956 -2056 2054 -1955 2155 -1145 1247 1449 1649 1748 1846 1843 1638 -1748 1744 1640 1636 -1746 1541 1538 1636 1735 1935 2137 2239 -12335 Cj 3160 21 13 28 -2056 2054 2254 2256 2056 -2156 2154 -2055 2255 -1245 1347 1549 1749 1848 1946 1943 1736 1633 1531 1329 1128 928 829 831 1031 1029 929 930 -1848 1843 1636 1533 1431 -1846 1742 1535 1432 1330 1128 -12335 Ck 3161 21 22 28 -1856 1235 1435 -1956 1335 -1556 2056 1435 -2847 2848 2748 2746 2946 2948 2849 2649 2448 2044 1843 -1643 1843 2042 2141 2337 2436 2636 -2041 2237 2336 -1843 1942 2136 2235 2435 2636 2839 -1656 1955 -1756 1854 -13235 Cl 3162 21 12 28 -1856 1545 1441 1438 1536 1635 1835 2037 2139 -1956 1645 1541 1536 -1556 2056 1642 1538 -1656 1955 -1756 1854 -12235 Cm 3163 21 35 28 -1145 1247 1449 1649 1748 1846 1843 1635 -1748 1743 1535 -1746 1642 1435 1635 -1843 2046 2248 2449 2649 2848 2946 2943 2735 -2848 2843 2635 -2846 2742 2535 2735 -2943 3146 3348 3549 
3749 3948 4046 4043 3838 -3948 3944 3840 3836 -3946 3741 3738 3836 3935 4135 4337 4439 -14535 Cn 3164 21 24 28 -1145 1247 1449 1649 1748 1846 1843 1635 -1748 1743 1535 -1746 1642 1435 1635 -1843 2046 2248 2449 2649 2848 2946 2943 2738 -2848 2844 2740 2736 -2846 2641 2638 2736 2835 3035 3237 3339 -13435 Co 3165 21 20 28 -1949 1648 1445 1342 1340 1437 1536 1835 2135 2436 2639 2742 2744 2647 2548 2249 1949 -1647 1545 1442 1439 1537 -2437 2539 2642 2645 2547 -1949 1747 1645 1542 1539 1636 1835 -2135 2337 2439 2542 2545 2448 2249 -13035 Cp 3166 21 22 28 -1145 1247 1449 1649 1748 1846 1843 1739 1428 -1748 1743 1639 1328 -1746 1642 1228 -1842 1945 2047 2148 2349 2549 2748 2847 2944 2942 2839 2636 2335 2135 1936 1839 1842 -2747 2845 2842 2739 2637 -2549 2648 2745 2742 2639 2537 2335 -928 1728 -1329 1028 -1330 1128 -1430 1528 -1329 1628 -13235 Cq 3167 21 21 28 -2649 2028 -2749 2128 -2649 2849 2228 -2442 2445 2348 2149 1949 1648 1445 1342 1340 1437 1536 1735 1935 2136 2237 2339 2442 -1647 1545 1442 1439 1537 -1949 1747 1645 1542 1539 1636 1735 -1728 2528 -2129 1828 -2130 1928 -2230 2328 -2129 2428 -13135 Cr 3168 21 18 28 -1145 1247 1449 1649 1748 1846 1842 1635 -1748 1742 1535 -1746 1642 1435 1635 -2647 2648 2548 2546 2746 2748 2649 2449 2248 2046 1842 -12835 Cs 3169 21 17 28 -2446 2447 2347 2345 2545 2547 2448 2149 1849 1548 1447 1445 1543 1742 2041 2240 2338 -1548 1445 -1544 1743 2042 2241 -2340 2236 -1447 1545 1744 2043 2242 2340 2338 2236 1935 1635 1336 1237 1239 1439 1437 1337 1338 -12735 Ct 3170 21 14 28 -1956 1645 1541 1538 1636 1735 1935 2137 2239 -2056 1745 1641 1636 -1956 2156 1742 1638 -1349 2349 -12435 Cu 3171 21 24 28 -1145 1247 1449 1649 1748 1846 1843 1638 -1748 1744 1640 1636 -1746 1541 1538 1636 1835 2035 2236 2438 2641 -2849 2641 2638 2736 2835 3035 3237 3339 -2949 2741 2736 -2849 3049 2842 2738 -13435 Cv 3172 21 20 28 -1145 1247 1449 1649 1748 1846 1843 1638 -1748 1744 1640 1636 -1746 1541 1538 1636 1835 2035 2236 2438 2641 2745 2749 2649 2648 2746 -13035 
Cw 3173 21 30 28 -1145 1247 1449 1649 1748 1846 1843 1638 -1748 1744 1640 1636 -1746 1541 1538 1636 1835 2035 2236 2438 2541 -2749 2541 2538 2636 2835 3035 3236 3438 3641 3745 3749 3649 3648 3746 -2849 2641 2636 -2749 2949 2742 2638 -14035 Cx 3174 21 22 28 -1345 1548 1749 1949 2148 2246 2244 -1949 2048 2044 1940 1838 1636 1435 1235 1136 1138 1338 1336 1236 1237 -2147 2144 2040 2037 -2947 2948 2848 2846 3046 3048 2949 2749 2548 2346 2244 2140 2136 2235 -1940 1938 2036 2235 2435 2636 2839 -13235 Cy 3175 21 22 28 -1145 1247 1449 1649 1748 1846 1843 1638 -1748 1744 1640 1636 -1746 1541 1538 1636 1835 2035 2236 2438 2642 -2849 2435 2332 2129 1928 -2949 2535 2331 -2849 3049 2635 2431 2229 1928 1628 1429 1330 1332 1532 1530 1430 1431 -13235 Cz 3176 21 20 28 -2749 2647 2445 1639 1437 1335 -2647 1747 1546 1444 -2447 2048 1748 1647 -2447 2049 1749 1547 1444 -1437 2337 2538 2640 -1637 2036 2336 2437 -1637 2035 2335 2537 2640 -13035 C0 3250 21 21 28 -2256 1955 1753 1550 1447 1343 1340 1437 1536 1735 1935 2236 2438 2641 2744 2848 2851 2754 2655 2456 2256 -1954 1752 1650 1547 1443 1439 1537 -2237 2439 2541 2644 2748 2752 2654 -2256 2055 1852 1750 1647 1543 1538 1636 1735 -1935 2136 2339 2441 2544 2648 2653 2555 2456 -13135 C1 3251 21 21 28 -2252 1735 1935 -2556 2352 1835 -2556 1935 -2556 2253 1951 1750 -2252 2051 1750 -13135 C2 3252 21 21 28 -1751 1752 1852 1850 1650 1652 1754 1855 2156 2456 2755 2853 2851 2749 2547 1541 1339 1135 -2655 2753 2751 2649 2447 2145 -2456 2555 2653 2651 2549 2347 1541 -1237 1338 1538 2037 2537 2638 -1538 2036 2536 -1538 2035 2335 2536 2638 2639 -13135 C3 3253 21 21 28 -1751 1752 1852 1850 1650 1652 1754 1855 2156 2456 2755 2853 2851 2749 2648 2447 2146 -2655 2753 2751 2649 2548 -2456 2555 2653 2651 2549 2347 2146 -1946 2146 2445 2544 2642 2639 2537 2336 2035 1735 1436 1337 1239 1241 1441 1439 1339 1340 -2444 2542 2539 2437 -2146 2345 2443 2439 2337 2236 2035 -13135 C4 3254 21 21 28 -2552 2035 2235 -2856 2652 2135 -2856 2235 -2856 1241 2841 -13135 C5 
3255 21 21 28 -1956 1446 -1956 2956 -1955 2755 -1854 2354 2755 2956 -1446 1547 1848 2148 2447 2546 2644 2641 2538 2336 1935 1635 1436 1337 1239 1241 1441 1439 1339 1340 -2446 2544 2541 2438 2236 -2148 2347 2445 2441 2338 2136 1935 -13135 C6 3256 21 21 28 -2752 2753 2653 2651 2851 2853 2755 2556 2256 1955 1753 1550 1447 1343 1340 1437 1536 1735 2035 2336 2538 2640 2643 2545 2446 2247 1947 1746 1645 1543 -1853 1650 1547 1443 1439 1537 -2438 2540 2543 2445 -2256 2055 1852 1750 1647 1543 1538 1636 1735 -2035 2236 2337 2440 2444 2346 2247 -13135 C7 3257 21 21 28 -1656 1450 -2956 2853 2650 2245 2042 1939 1835 -2043 1839 1735 -2650 2044 1841 1739 1635 1835 -1553 1856 2056 2553 -1755 2055 2553 -1553 1754 2054 2553 2753 2854 2956 -13135 C8 3258 21 21 28 -2156 1855 1754 1652 1649 1747 1946 2246 2547 2748 2850 2853 2755 2556 2156 -2356 1855 -1854 1752 1748 1847 -1747 2046 -2146 2547 -2648 2750 2753 2655 -2755 2356 -2156 1954 1852 1848 1946 -2246 2447 2548 2650 2654 2556 -1946 1545 1343 1241 1238 1336 1635 2035 2436 2537 2639 2642 2544 2445 2246 -2046 1545 -1645 1443 1341 1338 1436 -1336 1835 2436 -2437 2539 2542 2444 -2445 2146 -1946 1745 1543 1441 1438 1536 1635 -2035 2236 2337 2439 2443 2345 2246 -13135 C9 3259 21 21 28 -2648 2546 2445 2244 1944 1745 1646 1548 1551 1653 1855 2156 2456 2655 2754 2851 2848 2744 2641 2438 2236 1935 1635 1436 1338 1340 1540 1538 1438 1439 -1746 1648 1651 1753 -2654 2752 2748 2644 2541 2338 -1944 1845 1747 1751 1854 1955 2156 -2456 2555 2653 2648 2544 2441 2339 2136 1935 -13135 C. 
3260 21 11 28 -1338 1237 1236 1335 1435 1536 1537 1438 1338 -1337 1336 1436 1437 1337 -12135 C, 3261 21 11 28 -1435 1335 1236 1237 1338 1438 1537 1535 1433 1332 1131 -1337 1336 1436 1437 1337 -1435 1434 1332 -12135 C: 3262 21 11 28 -1649 1548 1547 1646 1746 1847 1848 1749 1649 -1648 1647 1747 1748 1648 -1338 1237 1236 1335 1435 1536 1537 1438 1338 -1337 1336 1436 1437 1337 -12135 C; 3263 21 11 28 -1649 1548 1547 1646 1746 1847 1848 1749 1649 -1648 1647 1747 1748 1648 -1435 1335 1236 1237 1338 1438 1537 1535 1433 1332 1131 -1337 1336 1436 1437 1337 -1435 1434 1332 -12135 C! 3264 21 11 28 -1956 1856 1755 1542 -1955 1855 1542 -1955 1954 1542 -1956 2055 2054 1542 -1338 1237 1236 1335 1435 1536 1537 1438 1338 -1337 1336 1436 1437 1337 -12135 C? 3265 21 21 28 -1751 1752 1852 1850 1650 1652 1754 1855 2156 2556 2855 2953 2951 2849 2748 2547 2146 1945 1943 2142 2242 -2356 2855 -2755 2853 2851 2749 2648 2447 -2556 2655 2753 2751 2649 2548 2146 2045 2043 2142 -1838 1737 1736 1835 1935 2036 2037 1938 1838 -1837 1836 1936 1937 1837 -13135 C/ 3270 21 23 28 -3460 828 928 -3460 3560 928 -13335 C( 3271 21 16 28 -2660 2459 2157 1854 1651 1447 1343 1338 1434 1531 1728 -1954 1751 1547 1442 1434 -2660 2358 2055 1852 1750 1647 1543 1434 -1442 1533 1630 1728 -12635 C) 3272 21 16 28 -1960 2157 2254 2350 2345 2241 2037 1834 1531 1229 1028 -2254 2246 2141 1937 1734 -1960 2058 2155 2246 -2254 2145 2041 1938 1836 1633 1330 1028 -12635 C* 3273 21 17 28 -2056 1955 2145 2044 -2056 2044 -2056 2155 1945 2044 -1553 1653 2447 2547 -1553 2547 -1553 1552 2548 2547 -2553 2453 1647 1547 -2553 1547 -2553 2552 1548 1547 -12735 C- 3274 21 25 28 -1445 3145 3144 -1445 1444 3144 -13535 C 3249 21 16 28 -12635 phylip-3.697/src/font60000644004732000473200000003375212406201116014316 0ustar joefelsenst_gC@ 2801 21 20 28 -2056 1335 -2056 2735 -2053 2635 -1541 2441 -1135 1735 -2335 2935 -13035 CA 2802 21 22 28 -1556 1535 -1656 1635 -1256 2856 2850 2756 -1646 2446 2745 2844 2942 2939 2837 2736 2435 1235 -2446 2645 
2744 2842 2839 2737 2636 2435 -13235 CB 2803 21 22 28 -1556 1535 -1656 1635 -1256 2456 2755 2854 2952 2950 2848 2747 2446 -2456 2655 2754 2852 2850 2748 2647 2446 -1646 2446 2745 2844 2942 2939 2837 2736 2435 1235 -2446 2645 2744 2842 2839 2737 2636 2435 -13235 CC 2804 21 18 28 -1556 1535 -1656 1635 -1256 2756 2750 2656 -1235 1935 -12835 CD 2805 21 24 28 -1856 1850 1742 1638 1536 1435 -2856 2835 -2956 2935 -1556 3256 -1135 3235 -1135 1128 -1235 1128 -3135 3228 -3235 3228 -13435 CE 2806 21 21 28 -1556 1535 -1656 1635 -2250 2242 -1256 2856 2850 2756 -1646 2246 -1235 2835 2841 2735 -13135 CF 2807 21 31 28 -2556 2535 -2656 2635 -2256 2956 -1455 1554 1453 1354 1355 1456 1556 1655 1753 1849 1947 2146 3046 3247 3349 3453 3555 3656 3756 3855 3854 3753 3654 3755 -2146 1945 1843 1738 1636 1535 -2146 2045 1943 1838 1736 1635 1435 1336 1238 -3046 3245 3343 3438 3536 3635 -3046 3145 3243 3338 3436 3535 3735 3836 3938 -2235 2935 -14135 CG 2808 21 20 28 -1453 1356 1350 1453 1655 1856 2256 2555 2653 2650 2548 2247 1947 -2256 2455 2553 2550 2448 2247 -2247 2446 2644 2742 2739 2637 2536 2235 1735 1536 1437 1339 1340 1441 1540 1439 -2545 2642 2639 2537 2436 2235 -13035 CH 2809 21 24 28 -1556 1535 -1656 1635 -2856 2835 -2956 2935 -1256 1956 -2556 3256 -2854 1637 -1235 1935 -2535 3235 -13435 CI 2810 21 24 28 -1556 1535 -1656 1635 -2856 2835 -2956 2935 -1256 1956 -2556 3256 -2854 1637 -1235 1935 -2535 3235 -1862 1863 1763 1762 1860 2059 2459 2660 2762 -13435 CJ 2811 21 24 28 -1556 1535 -1656 1635 -1256 1956 -1646 2346 2547 2649 2753 2855 2956 3056 3155 3154 3053 2954 3055 -2346 2545 2643 2738 2836 2935 -2346 2445 2543 2638 2736 2835 3035 3136 3238 -1235 1935 -13435 CK 2812 21 25 28 -1856 1850 1742 1638 1536 1435 1335 1236 1237 1338 1437 1336 -2956 2935 -3056 3035 -1556 3356 -2635 3335 -13535 CL 2813 21 25 28 -1556 1535 -1656 2238 -1556 2235 -2956 2235 -2956 2935 -3056 3035 -1256 1656 -2956 3356 -1235 1835 -2635 3335 -13535 CM 2814 21 24 28 -1556 1535 -1656 1635 -2856 2835 -2956 2935 
-1256 1956 -2556 3256 -1646 2846 -1235 1935 -2535 3235 -13435 CN 2815 21 22 28 -2056 1755 1553 1451 1347 1344 1440 1538 1736 2035 2235 2536 2738 2840 2944 2947 2851 2753 2555 2256 2056 -2056 1855 1653 1551 1447 1444 1540 1638 1836 2035 -2235 2436 2638 2740 2844 2847 2751 2653 2455 2256 -13235 CN 2816 21 24 28 -1556 1535 -1656 1635 -2856 2835 -2956 2935 -1256 3256 -1235 1935 -2535 3235 -13435 CO 2817 21 22 28 -1556 1535 -1656 1635 -1256 2456 2755 2854 2952 2949 2847 2746 2445 1645 -2456 2655 2754 2852 2849 2747 2646 2445 -1235 1935 -13235 CP 2818 21 21 28 -2753 2850 2856 2753 2555 2256 2056 1755 1553 1451 1348 1343 1440 1538 1736 2035 2235 2536 2738 2840 -2056 1855 1653 1551 1448 1443 1540 1638 1836 2035 -13135 CQ 2819 21 19 28 -1956 1935 -2056 2035 -1356 1250 1256 2756 2750 2656 -1635 2335 -12935 CR 2820 21 21 28 -1356 2040 -1456 2140 -2856 2140 1937 1836 1635 1535 1436 1437 1538 1637 1536 -1156 1756 -2456 3056 -13135 CS 2821 21 25 28 -2256 2235 -2356 2335 -1956 2656 -2053 1652 1450 1347 1344 1441 1639 2038 2538 2939 3141 3244 3247 3150 2952 2553 2053 -2053 1752 1550 1447 1444 1541 1739 2038 -2538 2839 3041 3144 3147 3050 2852 2553 -1935 2635 -13535 CT 2822 21 20 28 -1356 2635 -1456 2735 -2756 1335 -1156 1756 -2356 2956 -1135 1735 -2335 2935 -13035 CU 2823 21 24 28 -1556 1535 -1656 1635 -2856 2835 -2956 2935 -1256 1956 -2556 3256 -1235 3235 -3135 3228 -3235 3228 -13435 CV 2824 21 23 28 -1556 1545 1643 1942 2242 2543 2745 -1656 1645 1743 1942 -2756 2735 -2856 2835 -1256 1956 -2456 3156 -2435 3135 -13335 CW 2825 21 33 28 -1556 1535 -1656 1635 -2656 2635 -2756 2735 -3756 3735 -3856 3835 -1256 1956 -2356 3056 -3456 4156 -1235 4135 -14335 CX 2826 21 33 28 -1556 1535 -1656 1635 -2656 2635 -2756 2735 -3756 3735 -3856 3835 -1256 1956 -2356 3056 -3456 4156 -1235 4135 -4035 4128 -4135 4128 -14335 CY 2827 21 26 28 -2056 2035 -2156 2135 -1356 1250 1256 2456 -2146 2846 3145 3244 3342 3339 3237 3136 2835 1735 -2846 3045 3144 3242 3239 3137 3036 2835 -13635 CZ 2828 21 30 28 -1556 
1535 -1656 1635 -1256 1956 -1646 2346 2645 2744 2842 2839 2737 2636 2335 1235 -2346 2545 2644 2742 2739 2637 2536 2335 -3456 3435 -3556 3535 -3156 3856 -3135 3835 -14035 C[ 2829 21 21 28 -1556 1535 -1656 1635 -1256 1956 -1646 2346 2645 2744 2842 2839 2737 2636 2335 1235 -2346 2545 2644 2742 2739 2637 2536 2335 -13135 C\ 2830 21 21 28 -1453 1356 1350 1453 1655 1956 2156 2455 2653 2751 2848 2843 2740 2638 2436 2135 1835 1536 1437 1339 1340 1441 1540 1439 -2156 2355 2553 2651 2748 2743 2640 2538 2336 2135 -1846 2746 -13135 C] 2831 21 31 28 -1556 1535 -1656 1635 -1256 1956 -1235 1935 -2956 2655 2453 2351 2247 2244 2340 2438 2636 2935 3135 3436 3638 3740 3844 3847 3751 3653 3455 3156 2956 -2956 2755 2553 2451 2347 2344 2440 2538 2736 2935 -3135 3336 3538 3640 3744 3747 3651 3553 3355 3156 -1646 2246 -14135 C_ 2832 21 22 28 -2656 2635 -2756 2735 -3056 1856 1555 1454 1352 1350 1448 1547 1846 2646 -1856 1655 1554 1452 1450 1548 1647 1846 -2146 1945 1844 1537 1436 1336 1237 -1945 1843 1636 1535 1335 1237 1238 -2335 3035 -13235 C^ 2901 21 20 28 -1547 1546 1446 1447 1548 1749 2149 2348 2447 2545 2538 2636 2735 -2447 2438 2536 2735 2835 -2445 2344 1743 1442 1340 1338 1436 1735 2035 2236 2438 -1743 1542 1440 1438 1536 1735 -13035 Ca 2902 21 20 28 -2656 2555 1953 1651 1448 1345 1341 1438 1636 1935 2135 2436 2638 2741 2743 2646 2448 2149 1949 1648 1446 1343 -2656 2554 2353 1952 1650 1448 -1949 1748 1546 1443 1441 1538 1736 1935 -2135 2336 2538 2641 2643 2546 2348 2149 -13035 Cb 2903 21 20 28 -1549 1535 -1649 1635 -1249 2349 2648 2746 2745 2643 2342 -2349 2548 2646 2645 2543 2342 -1642 2342 2641 2739 2738 2636 2335 1235 -2342 2541 2639 2638 2536 2335 -13035 Cc 2904 21 18 28 -1549 1535 -1649 1635 -1249 2649 2644 2549 -1235 1935 -12835 Cd 2905 21 23 28 -1849 1845 1739 1636 1535 -2749 2735 -2849 2835 -1549 3149 -1335 1230 1235 3135 3130 3035 -13335 Ce 2906 21 19 28 -1443 2643 2645 2547 2448 2249 1949 1648 1446 1343 1341 1438 1636 1935 2135 2436 2638 -2543 2546 2448 -1949 1748 1546 
1443 1441 1538 1736 1935 -12935 Cf 2907 21 27 28 -2349 2335 -2449 2435 -2049 2749 -1548 1447 1348 1449 1549 1648 1844 1943 2142 2642 2843 2944 3148 3249 3349 3448 3347 3248 -2142 1941 1840 1636 1535 -2142 1940 1736 1635 1435 1336 1238 -2642 2841 2940 3136 3235 -2642 2840 3036 3135 3335 3436 3538 -2035 2735 -13735 Cg 2908 21 18 28 -1447 1349 1345 1447 1548 1749 2149 2448 2546 2545 2443 2142 -2149 2348 2446 2445 2343 2142 -1842 2142 2441 2539 2538 2436 2135 1735 1436 1338 1339 1440 1539 1438 -2142 2341 2439 2438 2336 2135 -12835 Ch 2909 21 22 28 -1549 1535 -1649 1635 -2649 2635 -2749 2735 -1249 1949 -2349 3049 -1235 1935 -2335 3035 -2648 1636 -13235 Ci 2910 21 22 28 -1549 1535 -1649 1635 -2649 2635 -2749 2735 -1249 1949 -2349 3049 -1235 1935 -2335 3035 -2648 1636 -1855 1856 1756 1755 1853 2052 2252 2453 2555 -13235 Cj 2911 21 20 28 -1549 1535 -1649 1635 -1249 1949 -1642 1842 2143 2244 2448 2549 2649 2748 2647 2548 -1842 2141 2240 2436 2535 -1842 2041 2140 2336 2435 2635 2736 2838 -1235 1935 -13035 Ck 2912 21 22 28 -1749 1745 1639 1536 1435 1335 1236 1337 1436 -2649 2635 -2749 2735 -1449 3049 -2335 3035 -13235 Cl 2913 21 23 28 -1549 1535 -1549 2135 -1649 2137 -2749 2135 -2749 2735 -2849 2835 -1249 1649 -2749 3149 -1235 1835 -2435 3135 -13335 Cm 2914 21 22 28 -1549 1535 -1649 1635 -2649 2635 -2749 2735 -1249 1949 -2349 3049 -1642 2642 -1235 1935 -2335 3035 -13235 Cn 2915 21 20 28 -1949 1648 1446 1343 1341 1438 1636 1935 2135 2436 2638 2741 2743 2646 2448 2149 1949 -1949 1748 1546 1443 1441 1538 1736 1935 -2135 2336 2538 2641 2643 2546 2348 2149 -13035 Co 2916 21 22 28 -1549 1535 -1649 1635 -2649 2635 -2749 2735 -1249 3049 -1235 1935 -2335 3035 -13235 Cp 2917 21 21 28 -1549 1528 -1649 1628 -1646 1848 2049 2249 2548 2746 2843 2841 2738 2536 2235 2035 1836 1638 -2249 2448 2646 2743 2741 2638 2436 2235 -1249 1649 -1228 1928 -13135 Cq 2918 21 19 28 -2546 2445 2544 2645 2646 2448 2249 1949 1648 1446 1343 1341 1438 1636 1935 2135 2436 2638 -1949 1748 1546 1443 1441 1538 1736 
1935 -12935 Cr 2919 21 19 28 -1949 1935 -2049 2035 -1449 1344 1349 2649 2644 2549 -1635 2335 -12935 Cs 2920 21 18 28 -1349 1935 -1449 1937 -2549 1935 1731 1529 1328 1228 1129 1230 1329 -1149 1749 -2149 2749 -12835 Ct 2921 21 21 28 -2056 2028 -2156 2128 -1756 2156 -2046 1948 1849 1649 1448 1345 1339 1436 1635 1835 1936 2038 -1649 1548 1445 1439 1536 1635 -2549 2648 2745 2739 2636 2535 -2146 2248 2349 2549 2748 2845 2839 2736 2535 2335 2236 2138 -1728 2428 -13135 Cu 2922 21 20 28 -1449 2535 -1549 2635 -2649 1435 -1249 1849 -2249 2849 -1235 1835 -2235 2835 -13035 Cv 2923 21 22 28 -1549 1535 -1649 1635 -2649 2635 -2749 2735 -1249 1949 -2349 3049 -1235 3035 3030 2935 -13235 Cw 2924 21 22 28 -1549 1542 1640 1939 2139 2440 2642 -1649 1642 1740 1939 -2649 2635 -2749 2735 -1249 1949 -2349 3049 -2335 3035 -13235 Cx 2925 21 31 28 -1549 1535 -1649 1635 -2549 2535 -2649 2635 -3549 3535 -3649 3635 -1249 1949 -2249 2949 -3249 3949 -1235 3935 -14135 Cy 2926 21 31 28 -1549 1535 -1649 1635 -2549 2535 -2649 2635 -3549 3535 -3649 3635 -1249 1949 -2249 2949 -3249 3949 -1235 3935 3930 3835 -14135 Cz 2927 21 21 28 -1949 1935 -2049 2035 -1449 1344 1349 2349 -2042 2442 2741 2839 2838 2736 2435 1635 -2442 2641 2739 2738 2636 2435 -13135 C{ 2928 21 26 28 -1549 1535 -1649 1635 -1249 1949 -1642 2042 2341 2439 2438 2336 2035 1235 -2042 2241 2339 2338 2236 2035 -3049 3035 -3149 3135 -2749 3449 -2735 3435 -13635 C| 2929 21 17 28 -1549 1535 -1649 1635 -1249 1949 -1642 2042 2341 2439 2438 2336 2035 1235 -2042 2241 2339 2338 2236 2035 -12735 C} 2930 21 19 28 -1447 1349 1345 1447 1548 1749 2049 2348 2546 2643 2641 2538 2336 2035 1735 1536 1338 1339 1440 1539 1438 -2049 2248 2446 2543 2541 2438 2236 2035 -1942 2542 -12935 C~ 2931 21 29 28 -1549 1535 -1649 1635 -1249 1949 -1235 1935 -2849 2548 2346 2243 2241 2338 2536 2835 3035 3336 3538 3641 3643 3546 3348 3049 2849 -2849 2648 2446 2343 2341 2438 2636 2835 -3035 3236 3438 3541 3543 3446 3248 3049 -1642 2242 -13935 C; 2932 21 21 28 -2549 2535 -2649 
2635 -2949 1849 1548 1446 1445 1543 1842 2542 -1849 1648 1546 1545 1643 1842 -2342 2041 1940 1736 1635 -2342 2141 2040 1836 1735 1535 1436 1338 -2235 2935 -13135 C0 2700 21 20 28 -1956 1655 1452 1347 1344 1439 1636 1935 2135 2436 2639 2744 2747 2652 2455 2156 1956 -1755 1552 1447 1444 1539 1736 -1637 1936 2136 2437 -2336 2539 2644 2647 2552 2355 -2454 2155 1955 1654 -13035 C1 2701 21 20 28 -1652 1853 2156 2135 -1652 1651 1852 2054 2035 2135 -13035 C2 2702 21 20 28 -1451 1452 1554 1655 1856 2256 2455 2554 2652 2650 2548 2345 1435 -1451 1551 1552 1654 1855 2255 2454 2552 2550 2448 2245 1335 -1436 2736 2735 -1335 2735 -13035 C3 2703 21 20 28 -1556 2656 1947 -1556 1555 2555 -2556 1847 -1948 2148 2447 2645 2742 2741 2638 2436 2135 1835 1536 1437 1339 1439 -1847 2147 2446 2643 -2247 2545 2642 2641 2538 2236 -2640 2437 2136 1836 1537 1439 -1736 1438 -13035 C4 2704 21 20 28 -2353 2335 2435 -2456 2435 -2456 1340 2840 -2353 1440 -1441 2841 2840 -13035 C5 2705 21 20 28 -1556 1447 -1655 1548 -1556 2556 2555 -1655 2555 -1548 1849 2149 2448 2646 2743 2741 2638 2436 2135 1835 1536 1437 1339 1439 -1447 1547 1748 2148 2447 2644 -2248 2546 2643 2641 2538 2236 -2640 2437 2136 1836 1537 1439 -1736 1438 -13035 C6 2706 21 20 28 -2455 2553 2653 2555 2256 2056 1755 1552 1447 1442 1538 1736 2035 2135 2436 2638 2741 2742 2645 2447 2148 2048 1747 1545 -2554 2255 2055 1754 -1855 1652 1547 1542 1638 1936 -1540 1737 2036 2136 2437 2640 -2236 2538 2641 2642 2545 2247 -2643 2446 2147 2047 1746 1543 -1947 1645 1542 -13035 C7 2707 21 20 28 -1356 2756 1735 -1356 1355 2655 -2656 1635 1735 -13035 C8 2708 21 20 28 -1856 1555 1453 1451 1549 1648 1847 2246 2445 2544 2642 2639 2537 2236 1836 1537 1439 1442 1544 1645 1846 2247 2448 2549 2651 2653 2555 2256 1856 -1655 1553 1551 1649 1848 2247 2446 2644 2742 2739 2637 2536 2235 1835 1536 1437 1339 1342 1444 1646 1847 2248 2449 2551 2553 2455 -2554 2255 1855 1554 -1438 1736 -2336 2638 -13035 C9 2709 21 20 28 -2546 2344 2043 1943 1644 1446 1349 1350 1453 1655 
1956 2056 2355 2553 2649 2644 2539 2336 2035 1835 1536 1438 1538 1636 -2549 2446 2144 -2548 2345 2044 1944 1645 1448 -1844 1546 1449 1450 1553 1855 -1451 1654 1955 2055 2354 2551 -2155 2453 2549 2544 2439 2236 -2337 2036 1836 1537 -13035 C. 2710 21 11 28 -1538 1437 1436 1535 1635 1736 1737 1638 1538 -1537 1536 1636 1637 1537 -12135 C, 2711 21 11 28 -1736 1635 1535 1436 1437 1538 1638 1737 1734 1632 1431 -1537 1536 1636 1637 1537 -1635 1734 -1736 1632 -12135 C: 2712 21 11 28 -1549 1448 1447 1546 1646 1747 1748 1649 1549 -1548 1547 1647 1648 1548 -1538 1437 1436 1535 1635 1736 1737 1638 1538 -1537 1536 1636 1637 1537 -12135 C; 2713 21 11 28 -1549 1448 1447 1546 1646 1747 1748 1649 1549 -1548 1547 1647 1648 1548 -1736 1635 1535 1436 1437 1538 1638 1737 1734 1632 1431 -1537 1536 1636 1637 1537 -1635 1734 -1736 1632 -12135 C! 2714 21 11 28 -1556 1542 1642 -1556 1656 1642 -1538 1437 1436 1535 1635 1736 1737 1638 1538 -1537 1536 1636 1637 1537 -12135 C' 2715 21 19 28 -1351 1352 1454 1555 1856 2156 2455 2554 2652 2650 2548 2447 2246 1945 -1351 1451 1452 1554 1855 2155 2454 2552 2550 2448 2247 1946 -1453 1755 -2255 2553 -2549 2146 -1946 1942 2042 2046 -1938 1837 1836 1935 2035 2136 2137 2038 1938 -1937 1936 2036 2037 1937 -12935 C/ 2720 21 23 28 -3060 1228 1328 -3060 3160 1328 -13335 C( 2721 21 14 28 -2060 1858 1655 1451 1346 1342 1437 1633 1830 2028 2128 -2060 2160 1958 1755 1551 1446 1442 1537 1733 1930 2128 -12435 C) 2722 21 14 28 -1360 1558 1755 1951 2046 2042 1937 1733 1530 1328 1428 -1360 1460 1658 1855 2051 2146 2142 2037 1833 1630 1428 -12435 C* 2723 21 16 28 -1856 1755 1945 1844 -1856 1844 -1856 1955 1745 1844 -1353 1453 2247 2347 -1353 2347 -1353 1352 2348 2347 -2353 2253 1447 1347 -2353 1347 -2353 2352 1348 1347 -12635 C- 2724 21 25 28 -1445 3145 3144 -1445 1444 3144 -13535 C 2699 21 16 28 -12635 phylip-3.697/src/gendist.c0000644004732000473200000002533712406201116015140 0ustar joefelsenst_g#include "phylip.h" /* version 3.696. 
Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ #define epsilong 0.02 /* a small number */ #ifndef OLDC /* function prototypes */ void getoptions(void); void allocrest(void); void doinit(void); void getalleles(void); void inputdata(void); void getinput(void); void makedists(void); void writedists(void); void freerest(void); void freex(void); /* function prototypes */ #endif Char infilename[FNMLNGTH], outfilename[FNMLNGTH]; long loci, totalleles, df, datasets, ith; long nonodes; long *alleles; phenotype3 *x; double **d; boolean all, cavalli, lower, nei, reynolds, mulsets, firstset, progress; void getoptions() { /* interactively set options */ long loopcount; Char ch; all = false; cavalli = false; lower = false; nei = true; reynolds = false; lower = false; progress = true; loopcount = 0; for (;;) { cleerhome(); printf("\nGenetic Distance Matrix program, version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" A Input file contains all alleles at each locus? %s\n", all ? "Yes" : "One omitted at each locus"); printf(" N Use Nei genetic distance? %s\n", nei ? "Yes" : "No"); printf(" C Use Cavalli-Sforza chord measure? %s\n", cavalli ? "Yes" : "No"); printf(" R Use Reynolds genetic distance? %s\n", reynolds ? "Yes" : "No"); printf(" L Form of distance matrix? %s\n", lower ? "Lower-triangular" : "Square"); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld sets\n", datasets); else printf(" No\n"); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" 1 Print indications of progress of run? %s\n", progress ? 
"Yes" : "No"); printf("\n Y to accept these or type the letter for one to change\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); if (ch == 'Y') break; if (strchr("ACNMRL01", ch) != NULL) { switch (ch) { case 'A': all = !all; break; case 'C': cavalli = true; nei = false; reynolds = false; break; case 'N': cavalli = false; nei = true; reynolds = false; break; case 'R': reynolds = true; cavalli = false; nei = false; break; case 'L': lower = !lower; break; case 'M': mulsets = !mulsets; if (mulsets) initdatasets(&datasets); break; case '0': initterminal(&ibmpc, &ansi); break; case '1': progress = !progress; break; } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } putchar('\n'); } /* getoptions */ void allocrest() { long i; x = (phenotype3 *)Malloc(spp*sizeof(phenotype3)); d = (double **)Malloc(spp*sizeof(double *)); for (i = 0; i < (spp); i++) d[i] = (double *)Malloc(spp*sizeof(double)); alleles = (long *)Malloc(loci*sizeof(long)); nayme = (naym *)Malloc(spp*sizeof(naym)); } /* allocrest */ void freerest() { long i; for (i = 0; i < (spp); i++) free(d[i]); free(d); free(alleles); free(nayme); } /* freerest */ void freex() { long i; for (i = 0; i < (spp); i++) free(x[i]); free(x); } void doinit() { /* initializes variables */ inputnumbers(&spp, &loci, &nonodes, 1); getoptions(); } /* doinit */ void getalleles() { long i; if (!firstset) { samenumsp(&loci, ith); free(alleles); alleles = (long *)Malloc(loci*sizeof(long)); } totalleles = 0; scan_eoln(infile); for (i = 0; i < loci; i++) { if (eoln(infile)) scan_eoln(infile); fscanf(infile, "%ld", &alleles[i]); totalleles += alleles[i]; } df = totalleles - loci; } /* getalleles */ void inputdata() { /* read allele frequencies */ long i, j, k, m, m1, n; double sum; for (i = 0; i < spp; i++) x[i] = (phenotype3)Malloc(totalleles*sizeof(double)); for (i = 1; i <= (spp); i++) { scan_eoln(infile); initname(i-1); m = 1; for (j = 1; j <= loci; 
j++) { sum = 0.0; if (all) n = alleles[j - 1]; else n = alleles[j - 1] - 1; for (k = 1; k <= n; k++) { if (eoln(infile)) scan_eoln(infile); fscanf(infile, "%lf", &x[i - 1][m - 1]); sum += x[i - 1][m - 1]; if (x[i - 1][m - 1] < 0.0) { printf("\n\nERROR: Locus %ld in species %ld: an allele", j, i); printf(" frequency is negative\n\n"); exxit(-1); } m++; } if (all && fabs(sum - 1.0) > epsilong) { printf( "\n\nERROR: Locus %ld in species %ld: frequencies do not add up to 1\n\n", j, i); for (m1 = 1; m1 <= n; m1 += 1) { if (m1 == 1) printf("%f", x[i-1][m-n+m1-2]); else { if ((m1 % 8) == 1) printf("\n"); printf("+%f", x[i-1][m-n+m1-2]); } } printf(" = %f\n\n", sum); exxit(-1); } if (!all) { x[i - 1][m - 1] = 1.0 - sum; if (x[i-1][m-1] < -epsilong) { printf("\n\nERROR: Locus %ld in species %ld: ",j,i); printf("frequencies add up to more than 1\n\n"); for (m1 = 1; m1 <= n; m1 += 1) { if (m1 == 1) printf("%f", x[i-1][m-n+m1-2]); else { if ((m1 % 8) == 1) printf("\n"); printf("+%f", x[i-1][m-n+m1-2]); } } printf(" = %f\n\n", sum); exxit(-1); } m++; } } } } /* inputdata */ void getinput() { /* read the input data */ getalleles(); inputdata(); } /* getinput */ void makedists() { long i, j, k; double s, s1, s2, s3, f; double TEMP; if (progress) printf("Distances calculated for species\n"); for (i = 0; i < spp; i++) d[i][i] = 0.0; for (i = 1; i <= spp; i++) { if (progress) { #ifdef WIN32 phyFillScreenColor(); #endif printf(" "); for (j = 0; j < nmlngth; j++) putchar(nayme[i - 1][j]); printf(" "); } for (j = 0; j <= i - 2; j++) { /* ignore the diagonal */ if (cavalli) { s = 0.0; for (k = 0; k < (totalleles); k++) { f = x[i - 1][k] * x[j][k]; if (f > 0.0) s += sqrt(f); } d[i - 1][j] = 4 * (loci - s) / df; } if (nei) { s1 = 0.0; s2 = 0.0; s3 = 0.0; for (k = 0; k < (totalleles); k++) { s1 += x[i - 1][k] * x[j][k]; TEMP = x[i - 1][k]; s2 += TEMP * TEMP; TEMP = x[j][k]; s3 += TEMP * TEMP; } if (s1 <= 1.0e-20) { d[i - 1][j] = -1.0; printf("\nWARNING: INFINITE DISTANCE BETWEEN SPECIES 
");
        /* j is zero-based in this loop, so report species j + 1 */
        printf("%ld AND %ld; -1.0 WAS WRITTEN\n", i, j + 1);
      } else
        d[i - 1][j] = fabs(-log(s1 / sqrt(s2 * s3)));
    }
    if (reynolds) {
      s1 = 0.0;
      s2 = 0.0;
      for (k = 0; k < totalleles; k++) {
        TEMP = x[i - 1][k] - x[j][k];
        s1 += TEMP * TEMP;
        s2 += x[i - 1][k] * x[j][k];
      }
      d[i - 1][j] = s1 / (loci * 2 - 2 * s2);
    }
    if (progress) {
      putchar('.');
      fflush(stdout);
    }
    d[j][i - 1] = d[i - 1][j];
  }
  if (progress) {
    putchar('\n');
    fflush(stdout);
  }
}
if (progress) {
  putchar('\n');
  fflush(stdout);
}
} /* makedists */

void writedists()
{
  long i, j, k;

  fprintf(outfile, "%5ld\n", spp);
  for (i = 0; i < (spp); i++) {
    for (j = 0; j < nmlngth; j++)
      putc(nayme[i][j], outfile);
    if (lower)
      k = i;
    else
      k = spp;
    for (j = 1; j <= k; j++) {
      if (d[i][j-1] < 100.0)
        fprintf(outfile, "%10.6f", d[i][j-1]);
      else if (d[i][j-1] < 1000.0)
        fprintf(outfile, " %10.6f", d[i][j-1]);
      else
        fprintf(outfile, " %11.6f", d[i][j-1]);
      if ((j + 1) % 7 == 0 && j < k)
        putc('\n', outfile);
    }
    putc('\n', outfile);
  }
  if (progress)
    printf("Distances written to file \"%s\"\n\n", outfilename);
} /* writedists */

int main(int argc, Char *argv[])
{ /* main program */
#ifdef MAC
  argc = 1; /* macsetup("Gendist",""); */
  argv[0] = "Gendist";
#endif
  init(argc, argv);
  openfile(&infile,INFILE,"input file", "r",argv[0],infilename);
  openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename);

  ibmpc = IBMCRT;
  ansi = ANSICRT;
  mulsets = false;
  firstset = true;
  datasets = 1;
  doinit();
  for (ith = 1; ith <= (datasets); ith++) {
    allocrest();
    getinput();
    firstset = false;
    if ((datasets > 1) && progress)
      printf("\nData set # %ld:\n\n",ith);
    makedists();
    writedists();
    freerest();
    freex();
  }
  FClose(infile);
  FClose(outfile);
#ifdef MAC
  fixmacfile(outfilename);
#endif
  printf("Done.\n\n");
#ifdef WIN32
  phyRestoreConsoleAttributes();
#endif
  return 0;
}
phylip-3.697/src/infile20000644004732000473200000001010012406201116014570 0ustar joefelsenst_g 14 232 Mouse ACCAAAAAAA CATCCAAACA CCAACCCCAG CCCTTACGCA ATAGCCATAC AAAGAATATT Bovine ACCAAACCTG TCCCCACCAT 
CTAACACCAA CCCACATATA CAAGCTAAAC CAAAAATACC Lemur ACCAAACTAA CATCTAACAA CTACCTCCAA CTCTAAAAAA GCACTCTTAC CAAACCCATC Tarsier ATCTACCTTA TCTCCCCCAA TCAATACCAA CCTAAAAACT CTACAATTAA AAACCCCACC Squir MonkACCCCAGCAA CTCGTTGTGA CCAACATCAA TCCAAAATTA GCAAACGTAC CAACAATCTC Jpn Macaq ACTCCACCTG CTCACCTCAT CCACTACTAC TCCTCAAGCA ATACATAAAC TAAAAACTTC Rhesus MacACTTCACCCG TTCACCTCAT CCACTACTAC TCCTCAAGCG ATACATAAAT CAAAAACTTC Crab-E.MacACCCCACCTA CCCGCCTCGT CCGCTACTGC TTCTCAAACA ATATATAGAC CAACAACTTC BarbMacaq ACCCTATCTA TCTACCTCAC CCGCCACCAC CCCCCAAACA ACACACAAAC CAACAACTTT Gibbon ACTATACCCA CCCAACTCGA CCTACACCAA TCCCCACATA GCACACAGAC CAACAACCTC Orang ACCCCACCCG TCTACACCAG CCAACACCAA CCCCCACCTA CTATACCAAC CAATAACCTC Gorilla ACCCCATTTA TCCATAAAAA CCAACACCAA CCCCCATCTA ACACACAAAC TAATGACCCC Chimp ACCCCATCCA CCCATACAAA CCAACATTAC CCTCCATCCA ATATACAAAC TAACAACCTC Human ACCCCACTCA CCCATACAAA CCAACACCAC TCTCCACCTA ATATACAAAT TAATAACCTC ATACTACTAA AAACTCAAAT TAACTCTTTA ATCTTTATAC AACATTCCAC CAACCTATCC ATACAACCAT AAATAAGACT AATCTATTAA AATAACCCAT TACGATACAA AATCCCTTTC ACAACTCTAT CAACCTAACC AAACTATCAA CATGCCCTCT CCTAATTAAA AACATTGCCA GCTCAATTAC TAGCAAAAAT AGACATTCAA CTCCTCCCAT CATAACATAA AACATTCCTC CCAAATTTAA AAACACATCC TACCTTTACA ATTAATAACC ATTGTCTAGA TATACCCCTA TCACCTCTAA TACTACACAC CACTCCTGAA ATCAATGCCC TCCACTAAAA AACATCACCA TCACCTCCAA TACTACGCAC CGCTCCTAAA ATCAATGCCC CCCACCAAAA AACATCACCA TCACCTTTAA CACTACATAT CACTCCTGAG CTTAACACCC TCCGCTAAAA AACACCACTA TTATCTTTAG CACCACACAT CACCCCCAAA AGCAATACCC TTCACCAAAA AGCACCATCA CCACCTTCCA TACCAAGCCC CGACTTTACC GCCAACGCAC CTCATCAAAA CATACCTACA TCAACCCCTA AACCAAACAC TATCCCCAAA ACCAACACAC TCTACCAAAA TACACCCCCA CCACCCTCAA AGCCAAACAC CAACCCTATA ATCAATACGC CTTATCAAAA CACACCCCCA CCACTCTTCA GACCGAACAC CAATCTCACA ACCAACACGC CCCGTCAAAA CACCCCTTCA CCACCTTCAG AACTGAACGC CAATCTCATA ACCAACACAC CCCATCAAAG CACCCCTCCA ACACAAAAAA ACTCATATTT ATCTAAATAC GAACTTCACA CAACCTTAAC ACATAAACAT GTCTAGATAC AAACCACAAC ACACAATTAA TACACACCAC AATTACAATA CTAAACTCCC CACTAAACCT 
ACACACCTCA TCACCATTAA CGCATAACTC CTCAGTCATA TCTACTACAC GCTCCAATAA ACACATCACA ATCCCAATAA CGCATATACC TAAATACATC ATTTAATAAT AAATAAATGA ATATAAACCC TCGCCGATAA CATA-ACCCC TAAAATCAAG ACATCCTCTC GCCCAAACAA ACACCTATCT ACCCCCCCGG TCCACGCCCC TAACTCCATC ATTCCCCCTC ACCCAAACAA ACACCTACCC ATCCCCCCGG TTCACGCCTC AAACTCCATC ATTCCCCCTC ACCCAAACAA ACACCTATCT ATCCCCCCGG TCCACGCCCC AAACCCCGCT ATTCCCCCCT AATCAAACAA ACACCTATTT ATTCCCCTAA TTCACGTCCC AAATCCCATT ATCTCTCCCC ACACAAACAA ATGCCCCCCC ACCCTCCTTC TTCAAGCCCA CTAGACCATC CTACCTTCCT ATTCACATCC GCACACCCCC ACCCCCCCTG CCCACGTCCA TCCCATCACC CTCTCCTCCC ACATAAACCC ACGCACCCCC ACCCCTTCCG CCCATGCTCA CCACATCATC TCTCCCCTTC GCACAAATTC ATACACCCCT ACCTTTCCTA CCCACGTTCA CCACATCATC CCCCCCTCTC ACACAAACCC GCACACCTCC ACCCCCCTCG TCTACGCTTA CCACGTCATC CCTCCCTCTC ACCCCAGCCC AACACCCTTC CACAAATCCT TAATATACGC ACCATAAATA AC ATCCCACCAA ATCACCCTCC ATCAAATCCA CAAATTACAC AACCATTAAC CC ACCCTAACAA TTTATCCCTC CCATAATCCA AAAACTCCAT AAACACAAAT TC AATACTCCAA CTCCCATAAC ACAGCATACA TAAACTCCAT AAGTTTGAAC AC ACAACGCCAA ACCCCCCTCT CATAACTCTA CAAAATACAC AATCACCAAC AC AATACATCAA ACAATTCCCC CCAATACCCA CAAACTGCAT AAGCAAACAG AC AATACATCAA ACAATTCCCC CCAATACCCA CAAACTACAT AAACAAACAA AC AATACACCAA ACAATTTTCT CCAACACCCA CAAACTGTAT AAACAAACAA AC AACATACCAA ACAATTCTCC CTAATATACA CAAACCACGC AAACAAACAA AC AGCACGCCAA GCTCTCTACC ATCAAACGCA CAACTTACAC ATACAGAACC AC AACACCCTAA GCCACCTTCC TCAAAATCCA AAACCCACAC AACCGAAACA AC AACACCTCAA TCCACCTCCC CCCAAATACA CAATTCACAC AAACAATACC AC AACATCTTGA CTCGCCTCTC TCCAAACACA CAATTCACGC AAACAACGCC AC AACACCTTAA CTCACCTTCT CCCAAACGCA CAATTCGCAC ACACAACGCC AC phylip-3.697/src/io.h0000644004732000473200000000014212406201116014102 0ustar joefelsenst_g#define MAC_OFFSET 60 #include #include #include #define DRAW phylip-3.697/src/javasrc/0000755004732000473200000000000013212363632014766 5ustar joefelsenst_gphylip-3.697/src/javasrc/drawgram/0000755004732000473200000000000013212363632016572 5ustar 
joefelsenst_gphylip-3.697/src/javasrc/drawgram/DrawgramInterface.java0000644004732000473200000000605412406603060023023 0ustar joefelsenst_g
package drawgram;

import javax.swing.JOptionPane;

import util.TestFileNames;
import drawgram.DrawgramUserInterface.DrawgramData;

import com.sun.jna.Library;
import com.sun.jna.Native;

public class DrawgramInterface {
    public interface Drawgram extends Library {
        public void drawgram(
            String intree,
            String usefont,
            String plotfile,
            String plotfileopt,
            String treegrows,
            String treestyle,
            boolean usebranchlengths,
            double labelangle,
            boolean scalebranchlength,
            double branchlength,
            double breadthdepthratio,
            double stemltreedratio,
            double chhttipspratio,
            String ancnodes,
            boolean doplot,
            String finalplotkind);
    }

    public boolean DrawgramRun(DrawgramData inVals){
        TestFileNames test = new TestFileNames();

        if (!test.FileAvailable(inVals.intree, "Intree"))
        {
            return false;
        }

        if (inVals.doplot) // only check if final plot
        {
            String opt = test.FileAlreadyExists(inVals.plotfile, "Plotfile");
            // compare string contents with equals(), not reference identity
            if ("q".equals(opt))
            {
                return false;
            }
            else
            {
                if ("a".equals(opt))
                {
                    inVals.plotfileopt = "ab";
                }
                else
                {
                    inVals.plotfileopt = "wb";
                }
            }
        }

        // at this point we hook into the C code
        String wherestr = "System.load";
        try
        {
            wherestr = "Native.loadLibrary";
            Drawgram Drawgram = (Drawgram) Native.loadLibrary("drawgram", Drawgram.class);
            Drawgram.drawgram(
                inVals.intree,
                inVals.usefont,
                inVals.plotfile,
                inVals.plotfileopt,
                inVals.treegrows,
                inVals.treestyle,
                inVals.usebranchlengths,
                inVals.labelangle,
                inVals.scalebranchlength,
                inVals.branchlength,
                inVals.breadthdepthratio,
                inVals.stemltreedratio,
                inVals.chhttipspratio,
                inVals.ancnodes,
                inVals.doplot,
                inVals.finalplottype);
            return true;
        }
        catch (UnsatisfiedLinkError e)
        {
            String mapedLibName = System.mapLibraryName("drawgram");
            String libpath = inVals.librarypath;
            if (mapedLibName.contains("jnilib"))
            {
                // mac
                libpath += "/libdrawgram.dylib";
            }
            else if (mapedLibName.contains("dll"))
            {
                // windows
                libpath += "\\drawgram.dll";
            }
            else
            {
// unix libpath += "/libdrawgram.so"; } String msg = "Drawgram library not found in : "; msg += libpath; msg += " by "; msg += wherestr; msg += ". Error msg: "; msg += e; System.out.println(msg); JOptionPane.showMessageDialog(null, msg, "Error", JOptionPane.ERROR_MESSAGE); String path = System.getProperty("java.library.path"); JOptionPane.showMessageDialog(null, path, "after error", JOptionPane.INFORMATION_MESSAGE); JOptionPane.showMessageDialog(null, mapedLibName, "after error", JOptionPane.INFORMATION_MESSAGE); } return false; } } phylip-3.697/src/javasrc/drawgram/DrawgramUserInterface.java0000644004732000473200000005512412406603060023664 0ustar joefelsenst_gpackage drawgram; import java.awt.EventQueue; import java.io.File; import javax.swing.DefaultListCellRenderer; import javax.swing.JFrame; import javax.swing.JLabel; import javax.swing.JRadioButton; import javax.swing.JButton; import javax.swing.JTextField; import javax.swing.SwingConstants; import javax.swing.JComboBox; import javax.swing.DefaultComboBoxModel; import javax.swing.UIManager; import java.awt.event.ActionListener; import java.awt.event.ActionEvent; import javax.swing.JFileChooser; import javax.swing.JSeparator; import util.DrawPreview; import java.awt.Font; import java.awt.Color; import java.awt.Graphics; public class DrawgramUserInterface { public class DrawgramData{ String intree; String usefont; String plotfile; String plotfileopt; String treegrows; String treestyle; boolean usebranchlengths; Double labelangle; boolean scalebranchlength; Double branchlength; Double breadthdepthratio; Double stemltreedratio; Double chhttipspratio; String ancnodes; String librarypath; boolean doplot; // false = do preview String finalplottype; } public enum LastPage{COUNT, SIZE, OVERLAP} private String filedir; private Color phylipBG; private String ancNodesCBdefault = new String("Weighted"); private JFrame frmDrawgramControls; private JTextField labelAngleTxt; private JLabel lblAngleLabels; private JTextField 
branchLenTxt; private JTextField depthBreadthTxt; private JTextField stemLenTreeDpthTxt; private JTextField charHgtTipSpTxt; private JRadioButton treeHRB; private JRadioButton treeVRB; private JRadioButton useLenY; private JRadioButton useLenN; private JRadioButton branchScaleAutoRB; private JLabel branchScaleTxt; private JLabel lblCm; private JComboBox cmbxTreeStyle; private JComboBox cmbxAncNodes; private JButton InputTreeBtn; private JTextField IntreeTxt; private JTextField PlotTxt; private JButton plotBtn; private JComboBox cmbxPlotFont; private JLabel lblFinalPlotType; private JComboBox cmbxFinalPlotType; private JButton btnPreview; private JButton btnQuit; private JButton btnPlotFile; /** * Launch the application. */ public static void main(String[] args) { EventQueue.invokeLater(new Runnable() { public void run() { try { DrawgramUserInterface window = new DrawgramUserInterface(); window.frmDrawgramControls.setVisible(true); } catch (Exception e) { e.printStackTrace(); } } }); } /** * Create the application. 
*/ public DrawgramUserInterface() { initialize(); } // event handlers protected void TreeGrowToggle(boolean ishoriz) { if (ishoriz){ treeHRB.setSelected(true); treeVRB.setSelected(false); } else{ treeHRB.setSelected(false); treeVRB.setSelected(true); } } protected void BranchLengthToggle(boolean uselength) { if (uselength){ useLenY.setSelected(true); useLenN.setSelected(false); cmbxAncNodes.setEnabled(true); for (int i=0; i 360) { treeArcTxt.setText(Double.toString(Double.parseDouble(treeArcTxt.getText())%360)); } if (Double.parseDouble(treeArcTxt.getText()) == 0.0) { treeArcTxt.setText(Double.toString(360)); } } protected void RotationLimit() { // doing a mod 360 in case the user gets clever if (Double.parseDouble(treeRotationTxt.getText()) > 360) { treeRotationTxt.setText(Double.toString(Double.parseDouble(treeRotationTxt.getText())%360)); } } protected void TreeGrowToggle(boolean ishoriz) { if (ishoriz){ treeHRB.setSelected(true); treeVRB.setSelected(false); } else{ treeHRB.setSelected(false); treeVRB.setSelected(true); } } protected boolean LaunchDrawtreeInterface(DrawtreeData inputdata){ inputdata.intree = (String)IntreeTxt.getText(); inputdata.plotfile = (String)PlotTxt.getText(); inputdata.plotfileopt = "wb"; inputdata.usefont = cmbxPlotFont.getSelectedItem().toString(); inputdata.usebranchlengths = useLenY.isSelected(); // Angle of Labels inputdata.labeldirec = "middle"; for (int i=0; i 0) { // output previous section if it exists curplot.m_treePart.add(cursec); } cursec = new SectionData(); while (scanline.hasNext()) { if(scanline.hasNextDouble()) { cursec.strokewidth = scanline.nextDouble(); } else { scanline.next(); // throw away misc text } } } else if (curline.contains("Page:")) { pagefound = true; } else if (curline.contains("findfont")) { // text output is on 4 lines Font useFont; Point.Double translation = new Point.Double(0,0); Double rotation = 0.0; String displayText = new String(""); // first line has the font name and the scaling String 
fontstring = scanline.next().replace('/', ' ');
fontstring = fontstring.trim();
String []fontparts = fontstring.split("-");
int fontsize = 0;
while(scanline.hasNext())
{
    if (scanline.hasNextDouble())
    {
        fontsize = (int)scanline.nextDouble();
    }
    else
    {
        scanline.next(); // throw away misc text
    }
}

String fontkind;
if (fontparts.length == 1)
{
    // Courier & Helvetica
    fontkind = "book";
}
else if (fontparts.length == 3)
{
    // Helvetica-Narrow
    fontkind = fontparts[2].toLowerCase();
}
else
{
    fontkind = fontparts[1].toLowerCase();
}

if (fontkind.contains("roman") ||
    fontkind.contains("book") ||
    fontkind.contains("narrow") ||
    fontkind.contains("light"))
{
    useFont = new Font(fontparts[0], Font.PLAIN, fontsize);
}
else if(fontkind.contains("bolditalic") ||
        fontkind.contains("boldoblique") ||
        fontkind.contains("demioblique") ||
        fontkind.contains("demiitalic"))
{
    useFont = new Font(fontparts[0], Font.ITALIC+Font.BOLD, fontsize);
}
else if(fontkind.contains("bold") ||
        fontkind.contains("demi") )
{
    useFont = new Font(fontparts[0], Font.BOLD, fontsize);
}
else if(fontkind.contains("italic") ||
        fontkind.contains("oblique") ||
        fontkind.contains("bookoblique") ||
        fontkind.contains("lightitalic") ||
        fontkind.contains("mediumitalic"))
{
    useFont = new Font(fontparts[0], Font.ITALIC, fontsize);
}
else
{
    // "SansSerif" is the Java logical font name
    useFont = new Font("SansSerif", Font.PLAIN, fontsize);
}
//useFont = new Font("AvantGarde-BookOblique", Font.PLAIN, fontsize);

// second line has the start position and rotation data
curline = scanfile.nextLine();
scanline = new Scanner(curline);
boolean xfound = false;
boolean yfound = false;
while(scanline.hasNext())
{
    if (scanline.hasNextDouble())
    {
        if (!xfound)
        {
            translation.x = scanline.nextDouble();
            xfound = true;
        }
        else if (!yfound)
        {
            translation.y = scanline.nextDouble();
            translation.y = curplot.m_plotheight - translation.y; // inverted y correction
            yfound = true;
        }
        else
        {
            rotation = -scanline.nextDouble(); // inverted y correction
        }
    }
    else
    {
        scanline.next(); // throw away misc text
    }
} // third line is a (0,0) moveto curline = scanfile.nextLine(); scanline = new Scanner(curline); // last line is (name) show curline = scanfile.nextLine(); if(curline.contains("(")) { displayText = curline.substring((curline.indexOf('(') + 1), curline.lastIndexOf(')')); } // make text block LabelData newlabel = new LabelData(useFont, translation, rotation, displayText); cursec.texts.add(newlabel); } } } // output final section if it exists if (cursec.strokewidth > 0) { curplot.m_treePart.add(cursec); } return curplot; } } @SuppressWarnings("serial") class GCanvas extends Canvas // create a canvas for your graphics { private PlotData locplot; public GCanvas(PlotData curplot) { locplot = curplot; } public void paint(Graphics g) // display shapes on canvas { Graphics2D g2D=(Graphics2D) g; // cast to 2D g2D.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON); //System.out.println("in paint"); for (int i=0; i m_treePart; public PlotData() { m_treePart = new ArrayList (); } } phylip-3.697/src/javasrc/util/SectionData.java0000644004732000473200000000072612406603060021005 0ustar joefelsenst_gpackage util; import java.awt.geom.CubicCurve2D; import java.awt.geom.Line2D; import java.util.ArrayList; public class SectionData { public Double strokewidth; public ArrayList lines; public ArrayList curves; public ArrayList texts; public SectionData() { strokewidth = -1.0; texts = new ArrayList(); lines = new ArrayList(); curves = new ArrayList(); } } phylip-3.697/src/javasrc/util/TestFileNames.java0000644004732000473200000000377512406603060021321 0ustar joefelsenst_gpackage util; import java.io.File; import java.io.IOException; import javax.swing.JOptionPane; public class TestFileNames { public boolean DuplicateFileNames(String file1, String file1name, String file2, String file2name) { File testfile1 = new File(file1); File testfile2 = new File(file2); if (testfile1.exists() && testfile2.exists()){ // check if file1 and file2 are the same String 
file1path = ""; try { file1path = testfile1.getCanonicalPath(); } catch (IOException e) { // should never happen e.printStackTrace(); } String file2path = ""; try { file2path = testfile2.getCanonicalPath(); } catch (IOException e) { // should never happen e.printStackTrace(); } if (file1path.equals(file2path)) { String msg = file1name; msg += " and "; msg += file2name; msg += " files are both named \""; msg += file1; msg += "\" which will not work."; JOptionPane.showMessageDialog(null, msg, "Error", JOptionPane.ERROR_MESSAGE); return false; } } return true; } public boolean FileAvailable(String file, String filename) { File infile = new File(file); if (!infile.exists()){ String msg = filename; msg += " File: "; msg += file; msg += " does not exist."; JOptionPane.showMessageDialog(null, msg, "Error", JOptionPane.ERROR_MESSAGE); return false; } return true; } public String FileAlreadyExists(String file, String filename) { Object[] options = {"Quit", "Append", "Replace"}; File outfile = new File(file); if (outfile.exists()){ String msg = filename; msg += " File: "; msg += file; msg += " exists. Overwrite?"; int retval = JOptionPane.showOptionDialog(null, msg, "Warning", JOptionPane.YES_NO_CANCEL_OPTION,JOptionPane.WARNING_MESSAGE, null,options,options[0]); if (retval == JOptionPane.CANCEL_OPTION){ return "w"; } else{ if (retval == JOptionPane.NO_OPTION){ return "a"; } else{ return "q"; } } } return "w"; } } phylip-3.697/src/kitsch.c0000644004732000473200000007003012406201116014756 0ustar joefelsenst_g/* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. 
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include #include "phylip.h" #include "dist.h" #define epsilonk 0.000001 /* a very small but not too small number */ #ifndef OLDC /* function prototypes */ void getoptions(void); void doinit(void); void inputoptions(void); void getinput(void); void input_data(void); void add(node *, node *, node *); void re_move(node **, node **); void scrunchtraverse(node *, node **, double *); void combine(node *, node *); void scrunch(node *); void secondtraverse(node *, node *, node *, node *, long, long, long , double *); void firstraverse(node *, node *, double *); void sumtraverse(node *, double *); void evaluate(node *); void tryadd(node *, node **, node **); void addpreorder(node *, node *, node *); void tryrearr(node *, node **, boolean *); void repreorder(node *, node **, boolean *); void rearrange(node **); void dtraverse(node *); void describe(void); void copynode(node *, node *); void copy_(tree *, tree *); void maketree(void); /* function prototypes */ #endif Char infilename[FNMLNGTH], outfilename[FNMLNGTH], 
intreename[FNMLNGTH], outtreename[FNMLNGTH]; long nonodes, numtrees, col, datasets, ith, njumble, jumb; /* numtrees is used by usertree option part of maketree */ long inseed; tree curtree, bestree; /* pointers to all nodes in tree */ boolean minev, jumble, usertree, lower, upper, negallowed, replicates, trout, printdata, progress, treeprint, mulsets, firstset; longer seed; double power; long *enterorder; /* Local variables for maketree, propagated globally for C version: */ long examined; double like, bestyet; node *there; boolean *names; Char ch; char *progname; double trweight; /* to make treeread happy */ boolean goteof, haslengths, lengths; /* ditto ... */ void getoptions() { /* interactively set options */ long inseed0, loopcount; Char ch; minev = false; jumble = false; njumble = 1; lower = false; negallowed = false; power = 2.0; replicates = false; upper = false; usertree = false; trout = true; printdata = false; progress = true; treeprint = true; loopcount = 0; for(;;) { cleerhome(); printf("\nFitch-Margoliash method "); printf("with contemporary tips, version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" D Method (F-M, Minimum Evolution)? %s\n", (minev ? "Minimum Evolution" : "Fitch-Margoliash")); printf(" U Search for best tree? %s\n", usertree ? "No, use user trees in input file" : "Yes"); printf(" P Power?%9.5f\n",power); printf(" - Negative branch lengths allowed? %s\n", (negallowed ? "Yes" : "No")); printf(" L Lower-triangular data matrix? %s\n", (lower ? "Yes" : "No")); printf(" R Upper-triangular data matrix? %s\n", (upper ? "Yes" : "No")); printf(" S Subreplicates? %s\n", (replicates ? "Yes" : "No")); if (!usertree) { printf(" J Randomize input order of species?"); if (jumble) printf(" Yes (seed =%8ld,%3ld times)\n", inseed0, njumble); else printf(" No. 
Use input order\n"); } printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld sets\n", datasets); else printf(" No\n"); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", (ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)")); printf(" 1 Print out the data at start of run %s\n", (printdata ? "Yes" : "No")); printf(" 2 Print indications of progress of run %s\n", (progress ? "Yes" : "No")); printf(" 3 Print out tree %s\n", (treeprint ? "Yes" : "No")); printf(" 4 Write out trees onto tree file? %s\n", (trout ? "Yes" : "No")); printf("\n Y to accept these or type the letter for one to change\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); if (ch == 'Y') break; if (((!usertree) && (strchr("DJUP-LRSM12340", ch) != NULL)) || (usertree && ((strchr("DUP-LRSM12340", ch) != NULL)))){ switch (ch) { case 'D': minev = !minev; if (!negallowed) negallowed = true; break; case '-': negallowed = !negallowed; break; case 'J': jumble = !jumble; if (jumble) initjumble(&inseed, &inseed0, seed, &njumble); else njumble = 1; break; case 'L': lower = !lower; break; case 'P': initpower(&power); break; case 'R': upper = !upper; break; case 'S': replicates = !replicates; break; case 'U': usertree = !usertree; break; case 'M': mulsets = !mulsets; if (mulsets) initdatasets(&datasets); jumble = true; if (jumble) initseed(&inseed, &inseed0, seed); break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress = !progress; break; case '3': treeprint = !treeprint; break; case '4': trout = !trout; break; } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } if (upper && lower) { printf("ERROR: Data matrix cannot be both uppeR and Lower triangular\n"); exxit(-1); } } /* getoptions */ void doinit() { /* initializes variables */ inputnumbers2(&spp, &nonodes, 1); getoptions(); alloctree(&curtree.nodep, nonodes); allocd(nonodes, 
curtree.nodep); allocw(nonodes, curtree.nodep); if (!usertree && njumble > 1) { alloctree(&bestree.nodep, nonodes); allocd(nonodes, bestree.nodep); allocw(nonodes, bestree.nodep); } nayme = (naym *)Malloc(spp*sizeof(naym)); enterorder = (long *)Malloc(spp*sizeof(long)); } /* doinit */ void inputoptions() { /* print options information */ if (!firstset) samenumsp2(ith); fprintf(outfile, "\nFitch-Margoliash method "); fprintf(outfile, "with contemporary tips, version %s\n\n",VERSION); if (minev) fprintf(outfile, "Minimum evolution method option\n\n"); fprintf(outfile, " __ __ 2\n"); fprintf(outfile, " \\ \\ (Obs - Exp)\n"); fprintf(outfile, "Sum of squares = /_ /_ ------------\n"); fprintf(outfile, " "); if (power == (long)power) fprintf(outfile, "%2ld\n", (long)power); else fprintf(outfile, "%4.1f\n", power); fprintf(outfile, " i j Obs\n\n"); fprintf(outfile, "negative branch lengths"); if (!negallowed) fprintf(outfile, " not"); fprintf(outfile, " allowed\n\n"); } /* inputoptions */ void getinput() { /* reads the input data */ inputoptions(); } /* getinput */ void input_data() { /* read in distance matrix */ long i, j, k, columns, n; boolean skipit, skipother; double x; columns = replicates ? 
4 : 6; if (printdata) { fprintf(outfile, "\nName Distances"); if (replicates) fprintf(outfile, " (replicates)"); fprintf(outfile, "\n---- ---------"); if (replicates) fprintf(outfile, "-------------"); fprintf(outfile, "\n\n"); } setuptree(&curtree, nonodes); if (!usertree && njumble > 1) setuptree(&bestree, nonodes); for (i = 0; i < (spp); i++) { curtree.nodep[i]->d[i] = 0.0; curtree.nodep[i]->w[i] = 0.0; curtree.nodep[i]->weight = 0.0; scan_eoln(infile); initname(i); for (j = 1; j <= (spp); j++) { skipit = ((lower && j >= i + 1) || (upper && j <= i + 1)); skipother = ((lower && i + 1 >= j) || (upper && i + 1 <= j)); if (!skipit) { if (eoln(infile)) scan_eoln(infile); fscanf(infile, "%lf", &x); curtree.nodep[i]->d[j - 1] = x; if (replicates) { if (eoln(infile)) scan_eoln(infile); fscanf(infile, "%ld", &n); } else n = 1; if (n > 0 && x < 0) { printf("NEGATIVE DISTANCE BETWEEN SPECIES%5ld AND %5ld\n", i + 1, j); exxit(-1); } curtree.nodep[i]->w[j - 1] = n; if (skipother) { curtree.nodep[j - 1]->d[i] = curtree.nodep[i]->d[j - 1]; curtree.nodep[j - 1]->w[i] = curtree.nodep[i]->w[j - 1]; } if ((i == j) && (fabs(curtree.nodep[i-1]->d[j-1]) > 0.000000001)) { printf("\nERROR: diagonal element of row %ld of distance matrix ", i+2); printf("is not zero.\n"); printf(" Is it a distance matrix?\n\n"); exxit(-1); } if ((j < i) && (fabs(curtree.nodep[i]->d[j-1]-curtree.nodep[j-1]->d[i]) > 0.000000001)) { printf("ERROR: distance matrix is not symmetric:\n"); printf(" (%ld,%ld) element and (%ld,%ld) element are unequal.\n", i+1, j+1, j+1, i+1); printf(" They are %10.6f and %10.6f, respectively.\n", curtree.nodep[i]->d[j-1], curtree.nodep[j]->d[i-1]); printf(" Is it a distance matrix?\n\n"); exxit(-1); } } } } scan_eoln(infile); if (printdata) { for (i = 0; i < (spp); i++) { for (j = 0; j < nmlngth; j++) putc(nayme[i][j], outfile); putc(' ', outfile); for (j = 1; j <= (spp); j++) { fprintf(outfile, "%10.5f", curtree.nodep[i]->d[j - 1]); if (replicates) fprintf(outfile, " (%3ld)", 
(long)curtree.nodep[i]->w[j - 1]); if (j % columns == 0 && j < spp) { putc('\n', outfile); for (k = 1; k <= nmlngth + 1; k++) putc(' ', outfile); } } putc('\n', outfile); } putc('\n', outfile); } for (i = 0; i < (spp); i++) { for (j = 0; j < (spp); j++) { if (i + 1 != j + 1) { if (curtree.nodep[i]->d[j] < epsilonk) curtree.nodep[i]->d[j] = epsilonk; curtree.nodep[i]->w[j] /= exp(power * log(curtree.nodep[i]->d[j])); } } } } /* inputdata */ void add(node *below, node *newtip, node *newfork) { /* inserts the nodes newfork and its left descendant, newtip, to the tree. below becomes newfork's right descendant */ if (below != curtree.nodep[below->index - 1]) below = curtree.nodep[below->index - 1]; if (below->back != NULL) below->back->back = newfork; newfork->back = below->back; below->back = newfork->next->next; newfork->next->next->back = below; newfork->next->back = newtip; newtip->back = newfork->next; if (curtree.root == below) curtree.root = newfork; curtree.root->back = NULL; } /* add */ void re_move(node **item, node **fork) { /* removes nodes item and its ancestor, fork, from the tree. the new descendant of fork's ancestor is made to be fork's second descendant (other than item). 
Also returns pointers to the deleted nodes, item and fork */ node *p, *q; if ((*item)->back == NULL) { *fork = NULL; return; } *fork = curtree.nodep[(*item)->back->index - 1]; if (curtree.root == *fork) { if (*item == (*fork)->next->back) curtree.root = (*fork)->next->next->back; else curtree.root = (*fork)->next->back; } p = (*item)->back->next->back; q = (*item)->back->next->next->back; if (p != NULL) p->back = q; if (q != NULL) q->back = p; (*fork)->back = NULL; p = (*fork)->next; while (p != *fork) { p->back = NULL; p = p->next; } (*item)->back = NULL; } /* remove */ void scrunchtraverse(node *u, node **closest, double *tmax) { /* traverse to find closest node to the current one */ if (!u->sametime) { if (u->t > *tmax) { *closest = u; *tmax = u->t; } return; } u->t = curtree.nodep[u->back->index - 1]->t; if (!u->tip) { scrunchtraverse(u->next->back, closest,tmax); scrunchtraverse(u->next->next->back, closest,tmax); } } /* scrunchtraverse */ void combine(node *a, node *b) { /* put node b into the set having the same time as a */ if (a->weight + b->weight <= 0.0) a->t = 0.0; else a->t = (a->t * a->weight + b->t * b->weight) / (a->weight + b->weight); a->weight += b->weight; b->sametime = true; } /* combine */ void scrunch(node *s) { /* see if nodes can be combined to prevent negative lengths */ double tmax; node *closest; boolean found; closest = NULL; tmax = -1.0; do { if (!s->tip) { scrunchtraverse(s->next->back, &closest,&tmax); scrunchtraverse(s->next->next->back, &closest,&tmax); } found = (tmax > s->t); if (found) combine(s, closest); tmax = -1.0; } while (found); } /* scrunch */ void secondtraverse(node *a, node *q, node *u, node *v, long i, long j, long k, double *sum) { /* recalculate distances, add to sum */ long l; double wil, wjl, wkl, wli, wlj, wlk, TEMP; if (!(a->processed || a->tip)) { secondtraverse(a->next->back, q,u,v,i,j,k,sum); secondtraverse(a->next->next->back, q,u,v,i,j,k,sum); return; } if (!(a != q && a->processed)) return; l = a->index; 
wil = u->w[l - 1]; wjl = v->w[l - 1]; wkl = wil + wjl; wli = a->w[i - 1]; wlj = a->w[j - 1]; wlk = wli + wlj; q->w[l - 1] = wkl; a->w[k - 1] = wlk; if (wkl <= 0.0) q->d[l - 1] = 0.0; else q->d[l - 1] = (wil * u->d[l - 1] + wjl * v->d[l - 1]) / wkl; if (wlk <= 0.0) a->d[k - 1] = 0.0; else a->d[k - 1] = (wli * a->d[i - 1] + wlj * a->d[j - 1]) / wlk; if (minev) return; if (wkl > 0.0) { TEMP = u->d[l - 1] - v->d[l - 1]; (*sum) += wil * wjl / wkl * (TEMP * TEMP); } if (wlk > 0.0) { TEMP = a->d[i - 1] - a->d[j - 1]; (*sum) += wli * wlj / wlk * (TEMP * TEMP); } } /* secondtraverse */ void firstraverse(node *q_, node *r, double *sum) { /* firsttraverse */ /* go through tree calculating branch lengths */ node *q; long i, j, k; node *u, *v; q = q_; if (q == NULL) return; q->sametime = false; if (!q->tip) { firstraverse(q->next->back, r,sum); firstraverse(q->next->next->back, r,sum); } q->processed = true; if (q->tip) return; u = q->next->back; v = q->next->next->back; i = u->index; j = v->index; k = q->index; if (u->w[j - 1] + v->w[i - 1] <= 0.0) q->t = 0.0; else q->t = (u->w[j - 1] * u->d[j - 1] + v->w[i - 1] * v->d[i - 1]) / (2.0 * (u->w[j - 1] + v->w[i - 1])); q->weight = u->weight + v->weight + u->w[j - 1] + v->w[i - 1]; if (!negallowed) scrunch(q); u->v = q->t - u->t; v->v = q->t - v->t; u->back->v = u->v; v->back->v = v->v; secondtraverse(r,q,u,v,i,j,k,sum); } /* firstraverse */ void sumtraverse(node *q, double *sum) { /* traverse to finish computation of sum of squares */ long i, j; node *u, *v; double TEMP, TEMP1; if (minev && (q != curtree.root)) *sum += q->v; if (q->tip) return; sumtraverse(q->next->back, sum); sumtraverse(q->next->next->back, sum); if (!minev) { u = q->next->back; v = q->next->next->back; i = u->index; j = v->index; TEMP = u->d[j - 1] - 2.0 * q->t; TEMP1 = v->d[i - 1] - 2.0 * q->t; (*sum) += u->w[j - 1] * (TEMP * TEMP) + v->w[i - 1] * (TEMP1 * TEMP1); } } /* sumtraverse */ void evaluate(node *r) { /* fill in times and evaluate sum of squares for 
     tree */
  double sum;
  long i;

  sum = 0.0;
  for (i = 0; i < (nonodes); i++)
    curtree.nodep[i]->processed = curtree.nodep[i]->tip;
  firstraverse(r, r,&sum);
  sumtraverse(r, &sum);
  examined++;
  if (replicates && (lower || upper))
    sum /= 2;
  like = -sum;
}  /* evaluate */


void tryadd(node *p, node **item, node **nufork)
{
  /* temporarily adds one fork and one tip to the tree.
     if the location where they are added yields greater
     "likelihood" than other locations tested up to that
     time, then keeps that location as there */
  add(p, *item, *nufork);
  evaluate(curtree.root);
  if (like > bestyet) {
    bestyet = like;
    there = p;
  }
  re_move(item, nufork);
}  /* tryadd */


void addpreorder(node *p, node *item, node *nufork)
{
  /* traverses a binary tree, calling PROCEDURE tryadd
     at a node before calling tryadd at its descendants */
  if (p == NULL)
    return;
  tryadd(p, &item,&nufork);
  if (!p->tip) {
    addpreorder(p->next->back, item, nufork);
    addpreorder(p->next->next->back, item, nufork);
  }
}  /* addpreorder */


void tryrearr(node *p, node **r, boolean *success)
{
  /* evaluates one rearrangement of the tree.
     if the new tree has greater "likelihood" than the old
     one sets success := TRUE and keeps the new tree.
otherwise, restores the old tree */ node *frombelow, *whereto, *forknode; double oldlike; if (p->back == NULL) return; forknode = curtree.nodep[p->back->index - 1]; if (forknode->back == NULL) return; oldlike = like; if (p->back->next->next == forknode) frombelow = forknode->next->next->back; else frombelow = forknode->next->back; whereto = forknode->back; re_move(&p, &forknode); add(whereto, p, forknode); if ((*r)->back != NULL) *r = curtree.nodep[(*r)->back->index - 1]; evaluate(*r); if (like - oldlike > LIKE_EPSILON) { bestyet = like; *success = true; return; } re_move(&p, &forknode); add(frombelow, p, forknode); if ((*r)->back != NULL) *r = curtree.nodep[(*r)->back->index - 1]; like = oldlike; } /* tryrearr */ void repreorder(node *p, node **r, boolean *success) { /* traverses a binary tree, calling PROCEDURE tryrearr at a node before calling tryrearr at its descendants */ if (p == NULL) return; tryrearr(p,r,success); if (!p->tip) { repreorder(p->next->back,r,success); repreorder(p->next->next->back,r,success); } } /* repreorder */ void rearrange(node **r_) { /* traverses the tree (preorder), finding any local rearrangement which decreases the number of steps. if traversal succeeds in increasing the tree's "likelihood", PROCEDURE rearrange runs traversal again */ node **r; boolean success; r = r_; success = true; while (success) { success = false; repreorder(*r,r,&success); } } /* rearrange */ void dtraverse(node *q) { /* print table of lengths etc. 
*/ long i; if (!q->tip) dtraverse(q->next->back); if (q->back != NULL) { fprintf(outfile, "%4ld ", q->back->index - spp); if (q->index <= spp) { for (i = 0; i < nmlngth; i++) putc(nayme[q->index - 1][i], outfile); } else fprintf(outfile, "%4ld ", q->index - spp); fprintf(outfile, "%13.5f", curtree.nodep[q->back->index - 1]->t - q->t); q->v = curtree.nodep[q->back->index - 1]->t - q->t; q->back->v = q->v; fprintf(outfile, "%16.5f\n", curtree.root->t - q->t); } if (!q->tip) dtraverse(q->next->next->back); } /* dtraverse */ void describe() { /* prints table of lengths, times, sum of squares, etc. */ long i, j; double totalnum; double TEMP; if (!minev) fprintf(outfile, "\nSum of squares = %10.3f\n\n", -like); else fprintf(outfile, "Sum of branch lengths = %10.3f\n\n", -like); if ((fabs(power - 2) < 0.01) && !minev) { totalnum = 0.0; for (i = 0; i < (spp); i++) { for (j = 0; j < (spp); j++) { if (i + 1 != j + 1 && curtree.nodep[i]->d[j] > 0.0) { TEMP = curtree.nodep[i]->d[j]; totalnum += curtree.nodep[i]->w[j] * (TEMP * TEMP); } } } totalnum -= 2; if (replicates && (lower || upper)) totalnum /= 2; fprintf(outfile, "Average percent standard deviation ="); fprintf(outfile, "%10.5f\n\n", 100 * sqrt(-(like / totalnum))); } fprintf(outfile, "From To Length Height\n"); fprintf(outfile, "---- -- ------ ------\n\n"); dtraverse(curtree.root); putc('\n', outfile); if (trout) { col = 0; treeoutr(curtree.root,&col,&curtree); } } /* describe */ void copynode(node *c, node *d) { /* make a copy of a node */ memcpy(d->d, c->d, nonodes*sizeof(double)); memcpy(d->w, c->w, nonodes*sizeof(double)); d->t = c->t; d->sametime = c->sametime; d->weight = c->weight; d->processed = c->processed; d->xcoord = c->xcoord; d->ycoord = c->ycoord; d->ymin = c->ymin; d->ymax = c->ymax; } /* copynode */ void copy_(tree *a, tree *b) { /* make a copy of a tree */ long i, j=0; node *p, *q; for (i = 0; i < spp; i++) { copynode(a->nodep[i], b->nodep[i]); if (a->nodep[i]->back) { if (a->nodep[i]->back == 
a->nodep[a->nodep[i]->back->index - 1]) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]; else if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]->next) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next; else b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next->next; } else b->nodep[i]->back = NULL; } for (i = spp; i < nonodes; i++) { p = a->nodep[i]; q = b->nodep[i]; for (j = 1; j <= 3; j++) { copynode(p, q); if (p->back) { if (p->back == a->nodep[p->back->index - 1]) q->back = b->nodep[p->back->index - 1]; else if (p->back == a->nodep[p->back->index - 1]->next) q->back = b->nodep[p->back->index - 1]->next; else q->back = b->nodep[p->back->index - 1]->next->next; } else q->back = NULL; p = p->next; q = q->next; } } b->root = a->root; } /* copy */ void maketree() { /* constructs a binary tree from the pointers in curtree.nodep. adds each node at location which yields highest "likelihood" then rearranges the tree for greatest "likelihood" */ long i, j, which; double bestlike, bstlike2=0, gotlike; boolean lastrearr; node *item, *nufork; if (!usertree) { if (jumb == 1) { input_data(); examined = 0; } for (i = 1; i <= (spp); i++) enterorder[i - 1] = i; if (jumble) randumize(seed, enterorder); curtree.root = curtree.nodep[enterorder[0] - 1]; add(curtree.nodep[enterorder[0] - 1], curtree.nodep[enterorder[1] - 1], curtree.nodep[spp]); if (progress) { printf("Adding species:\n"); writename(0, 2, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } for (i = 3; i <= (spp); i++) { bestyet = -DBL_MAX; item = curtree.nodep[enterorder[i - 1] - 1]; nufork = curtree.nodep[spp + i - 2]; addpreorder(curtree.root, item, nufork); add(there, item, nufork); like = bestyet; rearrange(&curtree.root); evaluate(curtree.root); examined--; if (progress) { writename(i - 1, 1, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } lastrearr = (i == spp); if (lastrearr) { if (progress) { printf("\nDoing global rearrangements\n"); printf(" 
!"); for (j = 1; j <= (nonodes); j++) if ( j % (( nonodes / 72 ) + 1 ) == 0 ) putchar('-'); printf("!\n"); #ifdef WIN32 phyFillScreenColor(); #endif } bestlike = bestyet; do { gotlike = bestlike; if (progress) printf(" "); for (j = 0; j < (nonodes); j++) { there = curtree.root; bestyet = -DBL_MAX; item = curtree.nodep[j]; if (item != curtree.root) { re_move(&item, &nufork); there = curtree.root; addpreorder(curtree.root, item, nufork); add(there, item, nufork); } if (progress) { if ( j % (( nonodes / 72 ) + 1 ) == 0 ) putchar('.'); fflush(stdout); } } if (progress) { putchar('\n'); #ifdef WIN32 phyFillScreenColor(); #endif } } while (bestlike > gotlike); if (njumble > 1) { if (jumb == 1 || (jumb > 1 && bestlike > bstlike2)) { copy_(&curtree, &bestree); bstlike2 = bestlike; } } } } if (njumble == jumb) { if (njumble > 1) copy_(&bestree, &curtree); evaluate(curtree.root); printree(curtree.root, treeprint, false, true); describe(); } } else { input_data(); /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree,INTREE,"input tree file","rb",progname,intreename); numtrees = countsemic(&intree); if (treeprint) fprintf(outfile, "\n\nUser-defined trees:\n\n"); names = (boolean *)Malloc(spp*sizeof(boolean)); which = 1; while (which <= numtrees ) { treeread2 (intree, &curtree.root, curtree.nodep, lengths, &trweight, &goteof, &haslengths, &spp,false,nonodes); if (curtree.root->back) { printf("Error: Kitsch cannot read unrooted user trees\n"); exxit(-1); } evaluate(curtree.root); printree(curtree.root, treeprint, false, true); describe(); which++; } FClose(intree); free(names); } if (jumb == njumble && progress) { printf("\nOutput written to file \"%s\"\n", outfilename); if (trout) printf("\nTree also written onto file \"%s\"\n", outtreename); } } /* maketree */ int main(int argc, Char *argv[]) { /* Fitch-Margoliash criterion with contemporary tips */ #ifdef MAC argc = 1; /* macsetup("Kitsch",""); */ argv[0] = "Kitsch"; #endif 
init(argc,argv); /* reads in spp, options, and the data, then calls maketree to construct the tree */ progname = argv[0]; openfile(&infile,INFILE,"input file","r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file","w",argv[0],outfilename); ibmpc = IBMCRT; ansi = ANSICRT; mulsets = false; firstset = true; datasets = 1; doinit(); openfile(&outtree,OUTTREE,"output tree file","w",argv[0],outtreename); for (ith = 1; ith <= datasets; ith++) { if (datasets > 1) { fprintf(outfile, "\nData set # %ld:\n",ith); if (progress) printf("\nData set # %ld:\n",ith); } getinput(); for (jumb = 1; jumb <= njumble; jumb++) maketree(); firstset = false; if (eoln(infile) && (ith < datasets)) scan_eoln(infile); } FClose(infile); FClose(outfile); FClose(outtree); #ifdef MAC fixmacfile(outfilename); fixmacfile(outtreename); #endif #ifdef WIN32 phyRestoreConsoleAttributes(); #endif printf("\nDone.\n\n"); return 0; } /* Fitch-Margoliash criterion with contemporary tips */ phylip-3.697/src/linkmac0000644004732000473200000000320612406201116014667 0ustar joefelsenst_gln -s clique.app/Contents/MacOS/clique clique ln -s consense.app/Contents/MacOS/consense consense ln -s contml.app/Contents/MacOS/contml contml ln -s contrast.app/Contents/MacOS/contrast contrast ln -s dnacomp.app/Contents/MacOS/dnacomp dnacomp ln -s dnadist.app/Contents/MacOS/dnadist dnadist ln -s dnainvar.app/Contents/MacOS/dnainvar dnainvar ln -s dnaml.app/Contents/MacOS/dnaml dnaml ln -s dnamlk.app/Contents/MacOS/dnamlk dnamlk ln -s dnamove.app/Contents/MacOS/dnamove dnamove ln -s dnapars.app/Contents/MacOS/dnapars dnapars ln -s dnapenny.app/Contents/MacOS/dnapenny dnapenny ln -s dollop.app/Contents/MacOS/dollop dollop ln -s dolmove.app/Contents/MacOS/dolmove dolmove ln -s dolpenny.app/Contents/MacOS/dolpenny dolpenny ln -s drawgram.app/Contents/MacOS/drawgram drawgram ln -s drawtree.app/Contents/MacOS/drawtree drawtree ln -s factor.app/Contents/MacOS/factor factor ln -s fitch.app/Contents/MacOS/fitch fitch ln -s 
gendist.app/Contents/MacOS/gendist gendist
ln -s kitsch.app/Contents/MacOS/kitsch kitsch
ln -s mix.app/Contents/MacOS/mix mix
ln -s move.app/Contents/MacOS/move move
ln -s neighbor.app/Contents/MacOS/neighbor neighbor
ln -s pars.app/Contents/MacOS/pars pars
ln -s penny.app/Contents/MacOS/penny penny
ln -s proml.app/Contents/MacOS/proml proml
ln -s promlk.app/Contents/MacOS/promlk promlk
ln -s protdist.app/Contents/MacOS/protdist protdist
ln -s protpars.app/Contents/MacOS/protpars protpars
ln -s restdist.app/Contents/MacOS/restdist restdist
ln -s restml.app/Contents/MacOS/restml restml
ln -s retree.app/Contents/MacOS/retree retree
ln -s seqboot.app/Contents/MacOS/seqboot seqboot
ln -s treedist.app/Contents/MacOS/treedist treedist
phylip-3.697/src/Makefile.unx0000644004732000473200000005014412407036730015613 0ustar joefelsenst_g
# Makefile
#
# Unix Makefile for PHYLIP 3.696

PACKAGE=phylip
VERSION=3.696

# We use GNU's version of the make utility. It may be called "gmake" on
# your system.
#
# If you're using a RedHat Linux system with default locations for
# gcc libraries, you probably don't need to change anything. You might
# change the first noncomment statement below to redefine $(EXEDIR)
# if you'd like your executables installed in a different location than
# our default.
#
# Users with systems that differ substantially from ours may need to set
# the following variables: $(CC) $(CFLAGS) $(DFLAGS) $(LIBS) $(DLIBS)
#
# When uncompressed and extracted, the tar archive phylip-3.6x.tar.gz
# produces the following directory structure:
#
#   phylip-3.6x/src -- the source code, including this Makefile
#   phylip-3.6x/exe -- executables, changed by changing $(EXEDIR) value
#   phylip-3.6x/doc -- html documentation
#
# To use the PHYLIP v3.6 Makefile, type from the phylip-3.6x/src directory:
#
#      make install         to compile the whole package and install
#                           the executables in $(EXEDIR), and then
#                           remove the object files to save space
#
#      make all             to compile the whole package but not install it
#                           or remove the object files.
#
#      make put             to move the executables into $(EXEDIR)
#
#      make clean           to remove all object files and executables from the
#                           current directory
#
#      make dnaml           to compile and link one program, (in this example,
#                           DnaML) and leave the executable and object files
#                           in the current directory (where the source code is).
#                           You will have to move the executable into the
#                           executables directory (e.g. "mv dnaml ../exe")
#                           Note that the program name should be lower case.
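#
# A minimal sketch, relying only on standard make behavior (variables given
# on the command line override assignments made in the Makefile): the
# variables set in the section below can be tried for a single run without
# editing this file. The install path and flag values shown here are
# illustrative assumptions, not PHYLIP defaults.
#
#      make install EXEDIR=/usr/local/phylip/exe
#      make dnaml CC=gcc CFLAGS="-O3 -DUNX"
#
# Because DFLAGS is defined in terms of $(CFLAGS), a CFLAGS override given
# this way should also reach the draw programs.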
#
# ----------------------------------------------------------------------------
#  (Starting here is the section where you may want to change things)
# ----------------------------------------------------------------------------
#
# the following specifies the directory where the executables will be placed
EXEDIR = ../exe
#
# ----------------------------------------------------------------------------
#
# The following statements set these variables:
#
#    CC     -- the name (and, optionally, location) of your C compiler
#    CFLAGS -- compiler directives needed to compile most programs
#    DFLAGS -- compiler directives needed to compile draw programs
#    LIBS   -- non-default system libraries needed to compile most programs
#    DLIBS  -- non-default system libraries needed to compile draw programs
#
# We've provided a set of possible values for each variable.
#
# The value used is the one without a "#" at the beginning of the line.
#
# To try an alternate value, make sure the one you want has no "#"
# as its first character and that all other possibilities have "#" for
# their first character.
#
# Advanced users may need to further edit one of the alternatives in
# order to correctly compile on their system.
#
# ----------------------------------------------------------------------------
#
# The next two assignments are the invocations of the compiler
#
# This one specifies the "cc" C compiler
#CC = cc
#
# To use GCC instead:
CC = gcc
#
# ----------------------------------------------------------------------------
#
# This is the CFLAGS statement. It specifies compiler behavior.
#
# Here are some possible CFLAGS statements:
#
#
#A minimal one
CFLAGS =
#
# A basic one for debugging
#CFLAGS = -g
#
# An optimized one for gcc
#CFLAGS = -O3 -DUNX -fomit-frame-pointer
#
# For some serious debugging using Gnu gcc
#
#CFLAGS=-g -DUNX -Wall -Wmain -Wmissing-prototypes -Wreturn-type -Wstrict-prototypes -Wunused -Werror -Wredundant-decls -Waggregate-return -Wcast-align -Wcomment
#
# For doing code coverage with gcov
#
#CFLAGS = -ggdb -DUNX -fprofile-arcs -ftest-coverage
#CFLAGS = -pg -DUNX
#
# For Digital Alpha systems with Compaq Tru64 Unix
# (however, be aware that this may cause floating-point problems in programs
#  like Dnaml owing to not using IEEE floating point standards).
#CFLAGS = -fast -DUNX
#
# ----------------------------------------------------------------------------
#
# This is the DFLAGS statement. It specifies compiler behavior for the
# programs drawgram and drawtree. It adds additional information to
# the $(CFLAGS) value if needed.
#
DFLAGS = $(CFLAGS)
#
# ----------------------------------------------------------------------------
#
# Most of the programs need only the math libraries, specified like this;
#
LIBS = -lm
#
# The drawing programs may also need access to the graphics libraries. This is
# specified with the DLIBS variable.
DLIBS = $(LIBS)
#
# ----------------------------------------------------------------------------
#  (After this point there should not be any reason to change anything)
# ----------------------------------------------------------------------------
#
#
# the list of programs
# draw programs are listed last since they are the most likely to cause
# compilation or linking problems

PROGS = clique \
        consense \
        contml \
        contrast \
        dnacomp \
        dnadist \
        dnainvar \
        dnaml \
        dnamlk \
        dnamove \
        dnapars \
        dnapenny \
        dolmove \
        dollop \
        dolpenny \
        factor \
        fitch \
        gendist \
        kitsch \
        mix \
        move \
        neighbor \
        pars \
        penny \
        proml \
        promlk \
        protdist \
        protpars \
        restdist \
        restml \
        retree \
        seqboot \
        treedist \
        drawgram \
        drawtree

DYLIBS = libdrawgram.so \
         libdrawtree.so

JARS = javajars/DrawGram.jar \
       javajars/DrawTree.jar \
       javajars/DrawGramJava.unx\
       javajars/DrawTreeJava.unx

#
# general commands
#

# The first target is executed if you just type "make". It tells you how to
# use the Makefile.
#
help:
	@echo ""
	@echo " To use the PHYLIP v3.6 Makefile, type"
	@echo " make install to compile the whole package and install"
	@echo " the executables in $(EXEDIR), and then"
	@echo " remove the object files to save space"
	@echo " make all to compile the whole package but not install it"
	@echo " or remove the object files"
	@echo " make put to move the executables into $(EXEDIR)"
	@echo " make clean to remove all object files and executables from the"
	@echo " current directory"
	@echo " make dnaml to compile and link one program, (in this example,"
	@echo " Dnaml) and leave the executable and object files"
	@echo " in the current directory (where the source code is)."
	@echo " You will have to move the executable into the"
	@echo " executables directory (e.g. \"mv dnaml $(EXEDIR)\")"
	@echo " Note that the program name should be lower case."
	@echo " "

introduce:
	@echo "Building PHYLIP version $(VERSION)"

all: introduce $(PROGS) $(DYLIBS)
	@echo "Finished compiling."
	@echo ""

install: all put clean
	@echo "Done."
	@echo ""

put:
	@echo "Installing PHYLIP v3.6 binaries in $(EXEDIR)"
	@mkdir -p $(EXEDIR)
	@cp $(PROGS) $(EXEDIR)
	@echo "Installing dynamic libraries in $(EXEDIR)"
	@cp $(DYLIBS) $(EXEDIR)
	@echo "Installing jar files in $(EXEDIR)"
	@cp $(JARS) $(EXEDIR)
	@echo "Installing font files in $(EXEDIR)"
	@cp font* $(EXEDIR)
	@echo "Finished installation."
	@echo ""

clean:
	@echo "Removing object files to save space"
	@rm -f *.o
	@echo "Finished removing object files. Now will remove"
	@echo "executable files from the current directory, but not from the"
	@echo "executables directory. (If some are not here, the makefile"
	@echo "will terminate with an error message but this is not a problem)"
	@echo ""
	@echo "Removing executables from this directory"
	@rm -f $(PROGS)
	@echo "Finished cleanup."
	@echo ""

#
# compile object files shared between programs
# (make's implicit rule for %.o will take care of these)
#

phylip.o: phylip.h
seq.o: phylip.h seq.h
disc.o: phylip.h disc.h
discrete.o: phylip.h discrete.h
dollo.o: phylip.h dollo.h
wagner.o: phylip.h wagner.h
dist.o: phylip.h dist.h
cont.o: phylip.h cont.h
mlclock.o: phylip.h mlclock.h
moves.o: phylip.h moves.h
printree.o: phylip.h printree.h

#
# compile the individual programs
#

clique.o: clique.c disc.h phylip.h

clique: clique.o disc.o phylip.o
	$(CC) $(CFLAGS) clique.o disc.o phylip.o $(LIBS) -o clique

cons.o: cons.c cons.h phylip.h

consense.o: consense.c cons.h phylip.h

consense: consense.o phylip.o cons.o
	$(CC) $(CFLAGS) consense.o phylip.o cons.o $(LIBS) -o consense

contml.o: contml.c cont.h phylip.h

contml: contml.o cont.o phylip.o
	$(CC) $(CFLAGS) contml.o cont.o phylip.o $(LIBS) -o contml

contrast.o: contrast.c cont.h phylip.h

contrast: contrast.o cont.o phylip.o
	$(CC) $(CFLAGS) contrast.o cont.o phylip.o $(LIBS) -o contrast

dnacomp.o: dnacomp.c seq.h phylip.h

dnacomp: dnacomp.o seq.o phylip.o
	$(CC) $(CFLAGS) dnacomp.o seq.o phylip.o $(LIBS) -o dnacomp

dnadist.o: dnadist.c seq.h phylip.h

dnadist: dnadist.o seq.o phylip.o
	$(CC) $(CFLAGS) dnadist.o seq.o phylip.o $(LIBS) -o dnadist

dnainvar.o: dnainvar.c seq.h phylip.h

dnainvar: dnainvar.o seq.o phylip.o
	$(CC) $(CFLAGS) dnainvar.o seq.o phylip.o $(LIBS) -o dnainvar

dnaml.o: dnaml.c seq.h phylip.h

dnaml: dnaml.o seq.o phylip.o
	$(CC) $(CFLAGS) dnaml.o seq.o phylip.o $(LIBS) -o dnaml

dnamlk.o: dnamlk.c seq.h phylip.h mlclock.h printree.h

dnamlk: dnamlk.o seq.o phylip.o mlclock.o printree.o
	$(CC) $(CFLAGS) dnamlk.o seq.o phylip.o mlclock.o printree.o $(LIBS) -o dnamlk

dnamove.o: dnamove.c seq.h moves.h phylip.h

dnamove: dnamove.o seq.o moves.o phylip.o
	$(CC) $(CFLAGS) dnamove.o seq.o moves.o phylip.o $(LIBS) -o dnamove

dnapenny.o: dnapenny.c seq.h phylip.h

dnapenny: dnapenny.o seq.o phylip.o
	$(CC) $(CFLAGS) dnapenny.o seq.o phylip.o $(LIBS) -o dnapenny

dnapars.o: dnapars.c seq.h phylip.h

dnapars: dnapars.o seq.o phylip.o
	$(CC) $(CFLAGS) dnapars.o seq.o phylip.o $(LIBS) -o dnapars

dolmove.o: dolmove.c disc.h moves.h dollo.h phylip.h

dolmove: dolmove.o disc.o moves.o dollo.o phylip.o
	$(CC) $(CFLAGS) dolmove.o disc.o moves.o dollo.o phylip.o $(LIBS) -o dolmove

dollop.o: dollop.c disc.h dollo.h phylip.h

dollop: dollop.o disc.o dollo.o phylip.o
	$(CC) $(CFLAGS) dollop.o disc.o dollo.o phylip.o $(LIBS) -o dollop

dolpenny.o: dolpenny.c disc.h dollo.h phylip.h

dolpenny: dolpenny.o disc.o dollo.o phylip.o
	$(CC) $(CFLAGS) dolpenny.o disc.o dollo.o phylip.o $(LIBS) -o dolpenny

draw.o: draw.c draw.h phylip.h
	$(CC) $(DFLAGS) -c draw.c

draw2.o: draw2.c draw.h phylip.h
	$(CC) $(DFLAGS) -c draw2.c

drawgram.o: drawgram.c draw.h phylip.h
	$(CC) $(DFLAGS) -c drawgram.c

drawgram: drawgram.o draw.o draw2.o phylip.o
	$(CC) $(DFLAGS) draw.o draw2.o drawgram.o phylip.o $(DLIBS) -o drawgram

# needed by java
libdrawgram.so: drawgram.o draw.o draw2.o phylip.o
	$(CC) $(CFLAGS) -o libdrawgram.so -shared -fPIC drawgram.c draw.c draw2.c phylip.c $(CLIBS)

drawtree.o: drawtree.c draw.h phylip.h
	$(CC) $(DFLAGS) -shared -fPIC -c drawtree.c
drawtree: drawtree.o draw.o draw2.o phylip.o $(CC) $(DFLAGS) draw.o draw2.o drawtree.o phylip.o $(DLIBS) -o drawtree # needed by java libdrawtree.so: drawtree.o draw.o draw2.o phylip.o $(CC) $(CFLAGS) -o libdrawtree.so -shared -fPIC drawtree.c draw.c draw2.c phylip.c $(CLIBS) factor.o: factor.c phylip.h factor: factor.o phylip.o $(CC) $(CFLAGS) factor.o phylip.o $(LIBS) -o factor fitch.o: fitch.c dist.h phylip.h fitch: fitch.o dist.o phylip.o $(CC) $(CFLAGS) fitch.o dist.o phylip.o $(LIBS) -o fitch gendist.o: gendist.c phylip.h gendist: gendist.o phylip.o $(CC) $(CFLAGS) gendist.o phylip.o $(LIBS) -o gendist kitsch.o: kitsch.c dist.h phylip.h kitsch: kitsch.o dist.o phylip.o $(CC) $(CFLAGS) kitsch.o dist.o phylip.o $(LIBS) -o kitsch mix.o: mix.c disc.h wagner.h phylip.h mix: mix.o disc.o wagner.o phylip.o $(CC) $(CFLAGS) mix.o disc.o wagner.o phylip.o $(LIBS) -o mix move.o: move.c disc.h moves.h wagner.h phylip.h move: move.o disc.o moves.o wagner.o phylip.o $(CC) $(CFLAGS) move.o disc.o moves.o wagner.o phylip.o $(LIBS) -o move neighbor.o: neighbor.c dist.h phylip.h neighbor: neighbor.o dist.o phylip.o $(CC) $(CFLAGS) neighbor.o dist.o phylip.o $(LIBS) -o neighbor pars.o: pars.c discrete.h phylip.h pars: pars.o discrete.o phylip.o $(CC) $(CFLAGS) pars.o discrete.o phylip.o $(LIBS) -o pars penny.o: penny.c disc.h wagner.h phylip.h penny: penny.o disc.o wagner.o phylip.o $(CC) $(CFLAGS) penny.o disc.o wagner.o phylip.o $(LIBS) -o penny proml.o: proml.c seq.h phylip.h proml: proml.o seq.o phylip.o $(CC) $(CFLAGS) proml.o seq.o phylip.o $(LIBS) -o proml promlk.o: promlk.c seq.h phylip.h mlclock.h printree.h promlk: promlk.o seq.o phylip.o mlclock.o printree.o $(CC) $(CFLAGS) promlk.o seq.o phylip.o mlclock.o printree.o $(LIBS) -o promlk protdist.o: protdist.c seq.h phylip.h protdist: protdist.o seq.o phylip.o $(CC) $(CFLAGS) protdist.o seq.o phylip.o $(LIBS) -o protdist protpars.o: protpars.c seq.h phylip.h protpars: protpars.o seq.o phylip.o $(CC) $(CFLAGS) 
protpars.o seq.o phylip.o $(LIBS) -o protpars restdist.o: restdist.c seq.h phylip.h restdist: restdist.o seq.o phylip.o $(CC) $(CFLAGS) restdist.o seq.o phylip.o $(LIBS) -o restdist restml.o: restml.c seq.h phylip.h restml: restml.o seq.o phylip.o $(CC) $(CFLAGS) restml.o seq.o phylip.o $(LIBS) -o restml retree.o: retree.c moves.h phylip.h retree: retree.o moves.o phylip.o $(CC) $(CFLAGS) retree.o moves.o phylip.o $(LIBS) -o retree seqboot.o: seqboot.c phylip.h seqboot: seqboot.o seq.o phylip.o $(CC) $(CFLAGS) seqboot.o seq.o phylip.o $(LIBS) -o seqboot treedist.o: treedist.c cons.h phylip.h treedist: treedist.o phylip.o cons.o $(CC) $(CFLAGS) treedist.o cons.o phylip.o $(LIBS) -o treedist # ---------------------------------------------------------------------------- # The following section is used to build a PHYLIP distribution. All sources # and other files except the documentation files must be placed in the # current directory. The HTML documentation files must be in folder "doc" # within this, the Mac icons in folder "mac", and the Windows icons and # resource files must be in folder "icons" # # Usage: # make distdir - Build the distribution dir phylip-/ # make dist - Make a tarred and gzipped phylip-.tar.gz # ---------------------------------------------------------------------------- DIST_COMMON = phylip.html DOCS= doc/clique.html doc/consense.html doc/contchar.html doc/contml.html \ doc/contrast.html doc/discrete.html doc/distance.html doc/dnacomp.html \ doc/dnadist.html doc/dnainvar.html doc/dnaml.html doc/dnamlk.html \ doc/dnamove.html doc/dnapars.html doc/dnapenny.html doc/dollop.html \ doc/dolmove.html doc/dolpenny.html doc/drawgram.html doc/draw.html \ doc/drawtree.html doc/factor.html doc/fitch.html doc/gendist.html \ doc/kitsch.html doc/main.html doc/mix.html doc/move.html \ doc/neighbor.html doc/pars.html doc/penny.html doc/proml.html \ doc/promlk.html doc/protdist.html doc/protpars.html doc/restdist.html \ doc/restml.html doc/retree.html 
doc/seqboot.html doc/sequence.html \ doc/treedist.html doc/phylip.gif SOURCES= COPYRIGHT Makefile Makefile.cyg Makefile.osx Makefile.unx linkmac \ clique.c cons.c consense.c cons.h cont.c \ cont.h contml.c contrast.c disc.c disc.h discrete.c discrete.h dist.c \ dist.h dnacomp.c dnadist.c dnainvar.c dnaml.c dnamlk.c dnamove.c \ dnapars.c dnapenny.c dollo.c dollo.h dollop.c dolmove.c dolpenny.c \ draw2.c draw.c drawgram.c draw.h drawtree.c \ factor.c fitch.c gendist.c \ kitsch.c mix.c move.c \ moves.c moves.h neighbor.c pars.c penny.c \ phylip.c phylip.h proml.c promlk.c protdist.c protpars.c restdist.c \ restml.c retree.c seqboot.c seq.c seq.h treedist.c wagner.c wagner.h \ mlclock.c mlclock.h printree.c printree.h MAC= \ Info.plist.in boot.icns clique.icns command.in consense.icns \ contml.icns contrast.icns disc.icns dist.icns dna.icns dnacomp.icns \ dnadist.icns dnainvar.icns dnaml.icns dnamlk.icns dnamove.icns \ dnapars.icns dnapenny.icns dollo.icns dollop.icns dolmove.icns \ dolpenny.icns drawgram.icns drawtree.icns factor.icns fitch.icns \ gendist.icns kitsch.icns mac.sit mix.icns move.icns neighbor.icns \ pars.icns penny.icns proml.icns promlk.icns protdist.icns protein.icns \ protpars.icns restdist.icns restml.icns restrict.icns retree.icns \ seqboot.icns treedist.icns ICONS= boot.ico clique.ico clique.rc clique.rcb consense.ico \ consense.rc consense.rcb contml.ico contml.rc contml.rcb \ contrast.ico contrast.rc contrast.rcb disc.ico dist.ico dna.ico \ dnacomp.rc dnacomp.rcb dnadist.rc dnadist.rcb dnainvar.rc \ dnainvar.rcb dnaml.rc dnaml.rcb dnamlk.rc dnamlk.rcb dnamove.rc \ dnamove.rcb dnapars.rc dnapars.rcb dnapenny.rc dnapenny.rcb \ dollo.ico dollop.rc dollop.rcb dolmove.rc dolmove.rcb \ dolpenny.rc dolpenny.rcb drawgram.ico drawgram.rc drawgram.rcb \ drawtree.ico drawtree.rc drawtree.rcb factor.rc factor.rcb \ fitch.rc fitch.rcb gendist.ico gendist.rc gendist.rcb kitsch.rc \ kitsch.rcb mix.rc mix.rcb move.rc move.rcb neighbor.rc \ neighbor.rcb pars.rc 
pars.rcb penny.rc penny.rcb proml.rc \ proml.rcb promlk.rc promlk.rcb protdist.rc protdist.rcb \ protein.ico protpars.rc protpars.rcb restdist.rc restdist.rcb \ restml.rc restml.rcb restrict.ico retree.ico retree.rc \ retree.rcb seqboot.rc seqboot.rcb treedist.ico treedist.rc \ treedist.rcb FONTS= font1 font2 font3 font4 font5 font6 TESTDIR= clique consense contml contrast dnacomp \ dnadist dnainvar dnaml dnamlk dnamove dnapars dnapenny dollop \ dolmove dolpenny drawgram drawtree factor fitch gendist \ kitsch mix move neighbor pars penny proml promlk \ protdist protpars restdist restml retree seqboot treedist JARAJAR= javajars/DrawGram.jar javajars/DrawTree.jar \ javajars/DrawGramJava.bat javajars/DrawTreeJava.bat \ javajars/DrawGramJava.exe javajars/DrawTreeJava.exe \ javajars/DrawGramJava.unx javajars/DrawTreeJava.unx DISTDIR=$(PACKAGE)-$(VERSION) dist_SRCDIR=$(DISTDIR)/src dist_DOCDIR=$(DISTDIR)/doc dist_EXEDIR=$(DISTDIR)/exe dist_JAVADIR=$(DISTDIR)/src/javajars MACICONDIR=src/mac SHELL=bash # We use this target to create a tarred and gzipped distribution of PHYLIP dist: distdir -chmod -R a+r $(DISTDIR) tar chozf $(DISTDIR).tar.gz $(DISTDIR) -rm -rf $(DISTDIR) # This target creates the distribution directory distdir: $(DIST_COMMON) $(DOCS) $(SOURCES) -rm -rf $(DISTDIR) mkdir $(DISTDIR) && \ mkdir $(dist_EXEDIR) && \ mkdir $(dist_DOCDIR) && \ mkdir $(dist_SRCDIR) && \ mkdir $(dist_JAVADIR) mkdir $(dist_SRCDIR)/mac mkdir $(dist_SRCDIR)/icons mkdir $(dist_EXEDIR)/testdata for i in $(TESTDIR); do \ mkdir $(dist_EXEDIR)/testdata/$$i; \ cp TestData/$$i/*.txt $(dist_EXEDIR)/testdata/$$i; \ done for i in $(DIST_COMMON) ; do \ cp -r $$i $(DISTDIR) ; \ done for i in $(DOCS) ; do \ cp -r $$i $(dist_DOCDIR) ; \ done for i in $(SOURCES) ; do \ cp -r $$i $(dist_SRCDIR) ; \ done for i in $(MAC) ; do \ cp -r mac/$$i $(dist_SRCDIR)/mac ; \ done for i in $(ICONS) ; do \ cp -r icons/$$i $(dist_SRCDIR)/icons ; \ done for i in $(FONTS) ; do \ cp -r $$i $(dist_SRCDIR) ; \ done for i 
in $(JARAJAR) ; do \ cp $$i $(dist_JAVADIR) ; \ done # This target untars the dist and checks that it can be compiled and remade distcheck: dist -rm -rf $(DISTDIR) tar xzf $(DISTDIR).tar.gz cd $(DISTDIR)/$(SRCDIR) \ && make all -rm -rf $(DISTDIR) @echo "$(DISTDIR).tar.gz is ready for distribution" # Makefile phylip-3.697/src/mix.c0000644004732000473200000007665212406201116014306 0ustar joefelsenst_g #include "phylip.h" #include "disc.h" #include "wagner.h" /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ #define maxtrees 100 /* maximum number of tied trees stored */ typedef long *placeptr; #ifndef OLDC /* function prototypes */ void getoptions(void); void allocrest(void); void doinit(void); void inputoptions(void); void doinput(void); void evaluate(node2 *); void reroot(node2 *); void savetraverse(node2 *); void savetree(void); void mix_addtree(long *pos); void mix_findtree(boolean *, long *, long, long *, long **); void tryadd(node2 *, node2 **, node2 **); void addpreorder(node2 *, node2 *, node2 *); void tryrearr(node2 *, node2 **, boolean *); void repreorder(node2 *, node2 **, boolean *); void rearrange(node2 **r); void mix_addelement(node2 **, long *, long *, boolean *); void mix_treeread(void); void describe(void); void maketree(void); void reallocchars(void); void clearallnodes(node2* p); /* function prototypes */ #endif Char infilename[FNMLNGTH], outfilename[FNMLNGTH], intreename[FNMLNGTH], outtreename[FNMLNGTH], weightfilename[FNMLNGTH], ancfilename[FNMLNGTH], mixfilename[FNMLNGTH]; node2 *root; long outgrno, msets, ith, njumble, jumb; /* outgrno indicates outgroup */ long inseed, inseed0; boolean jumble, usertree, weights, thresh, ancvar, questions, allsokal, allwagner, mixture, trout, noroot, outgropt, didreroot, progress, treeprint, stepbox, ancseq, mulsets, firstset, justwts; boolean *ancone, *anczero, *ancone0, *anczero0; pointptr2 treenode; /* pointers to all nodes in tree */ double threshold; double *threshwt; bitptr wagner, wagner0; longer seed; long *enterorder; double **fsteps; char *guess; long **bestrees; steptr numsteps, numsone, numszero; gbit *garbage; char ch; char *progname; /* Local variables for maketree: */ long minwhich; double like, bestyet, bestlike, bstlike2, minsteps; boolean lastrearr,full; double nsteps[maxuser]; node2 *there; long fullset; bitptr steps, zeroanc, oneanc, fulzeroanc, empzeroanc; long *place, col; void getoptions() { /* interactively set options */ long loopcount, loopcount2; Char ch, ch2; fprintf(outfile, 
"\nMixed parsimony algorithm, version %s\n\n",VERSION); putchar('\n'); jumble = false; njumble = 1; outgrno = 1; outgropt = false; thresh = false; threshold = spp; trout = true; usertree = false; weights = false; justwts = false; ancvar = false; allsokal = false; allwagner = true; mixture = false; printdata = false; progress = true; treeprint = true; stepbox = false; ancseq = false; loopcount = 0; for (;;) { cleerhome(); printf("\nMixed parsimony algorithm, version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" U Search for best tree? %s\n", (usertree ? "No, use user trees in input file" : "Yes")); printf(" X Use Mixed method? %s\n", (mixture ? "Yes" : "No")); printf(" P Parsimony method?"); if (!mixture) { printf(" %s\n",(allwagner ? "Wagner" : "Camin-Sokal")); } else printf(" (methods in mixture)\n"); if (!usertree) { printf(" J Randomize input order of species?"); if (jumble) printf(" Yes (seed =%8ld,%3ld times)\n", inseed0, njumble); else printf(" No. Use input order\n"); } printf(" O Outgroup root?"); if (outgropt) printf(" Yes, at species number%3ld\n", outgrno); else printf(" No, use as outgroup species%3ld\n", outgrno); printf(" T Use Threshold parsimony?"); if (thresh) printf(" Yes, count steps up to%4.1f per char.\n", threshold); else printf(" No, use ordinary parsimony\n"); printf(" A Use ancestral states in input file? %s\n", (ancvar ? "Yes" : "No")); printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld %s\n", msets, (justwts ? "sets of weights" : "data sets")); else printf(" No\n"); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", (ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)")); printf(" 1 Print out the data at start of run %s\n", (printdata ? "Yes" : "No")); printf(" 2 Print indications of progress of run %s\n", (progress ? "Yes" : "No")); printf(" 3 Print out tree %s\n", (treeprint ? 
"Yes" : "No")); printf(" 4 Print out steps in each character %s\n", (stepbox ? "Yes" : "No")); printf(" 5 Print states at all nodes of tree %s\n", (ancseq ? "Yes" : "No")); printf(" 6 Write out trees onto tree file? %s\n", (trout ? "Yes" : "No")); if(weights && justwts){ printf( "WARNING: W option and Multiple Weights options are both on. "); printf( "The W menu option is unnecessary and has no additional effect. \n"); } printf("\nAre these settings correct? "); printf("(type Y or the letter for one to change)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); if (ch == 'Y') break; if (((!usertree) && (strchr("WJOTUMPAX1234560", ch) != NULL)) || (usertree && ((strchr("WOTUMPAX1234560", ch) != NULL)))){ switch (ch) { case 'W': weights = !weights; break; case 'U': usertree = !usertree; break; case 'X': mixture = !mixture; break; case 'P': allwagner = !allwagner; break; case 'A': ancvar = !ancvar; break; case 'J': jumble = !jumble; if (jumble) initjumble(&inseed, &inseed0, seed, &njumble); else njumble = 1; break; case 'O': outgropt = !outgropt; if (outgropt) initoutgroup(&outgrno, spp); break; case 'T': thresh = !thresh; if (thresh) initthreshold(&threshold); break; case 'M': mulsets = !mulsets; if (mulsets){ printf("Multiple data sets or multiple weights?"); loopcount2 = 0; do { printf(" (type D or W)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch2); getchar(); if (ch2 == '\n') ch2 = ' '; uppercase(&ch2); countup(&loopcount2, 10); } while ((ch2 != 'W') && (ch2 != 'D')); justwts = (ch2 == 'W'); if (justwts) justweights(&msets); else initdatasets(&msets); if (!jumble) { jumble = true; initjumble(&inseed, &inseed0, seed, &njumble); } } break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress = !progress; break; case '3': treeprint = !treeprint; break; case '4': stepbox = !stepbox; break; 
case '5': ancseq = !ancseq; break; case '6': trout = !trout; break; } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } allsokal = (!allwagner && !mixture); } /* getoptions */ void reallocchars() { long i; if (usertree) { for (i = 0; i < maxuser; i++) { free (fsteps[i]); fsteps[i] = (double *)Malloc(chars*sizeof(double)); } } free(extras); free(weight); free(threshwt); free(numsteps); free(numszero); free(numsone); free(guess); free(ancone); free(anczero); free(ancone0); free(anczero0); extras = (steptr)Malloc(chars*sizeof(long)); weight = (steptr)Malloc(chars*sizeof(long)); threshwt = (double *)Malloc(chars*sizeof(double)); numsteps = (steptr)Malloc(chars*sizeof(long)); numszero = (steptr)Malloc(chars*sizeof(long)); numsone = (steptr)Malloc(chars*sizeof(long)); guess = (Char *)Malloc(chars*sizeof(Char)); ancone = (boolean *)Malloc(chars*sizeof(boolean)); anczero = (boolean *)Malloc(chars*sizeof(boolean)); ancone0 = (boolean *)Malloc(chars*sizeof(boolean)); anczero0 = (boolean *)Malloc(chars*sizeof(boolean)); } void allocrest() { long i; if (usertree) { fsteps = (double **)Malloc(maxuser*sizeof(double *)); for (i = 0; i < maxuser; i++) fsteps[i] = (double *)Malloc(chars*sizeof(double)); } bestrees = (long **)Malloc(maxtrees*sizeof(long *)); for (i = 1; i <= maxtrees; i++) bestrees[i - 1] = (long *)Malloc(spp*sizeof(long)); extras = (steptr)Malloc(chars*sizeof(long)); weight = (steptr)Malloc(chars*sizeof(long)); threshwt = (double *)Malloc(chars*sizeof(double)); numsteps = (steptr)Malloc(chars*sizeof(long)); numszero = (steptr)Malloc(chars*sizeof(long)); numsone = (steptr)Malloc(chars*sizeof(long)); guess = (Char *)Malloc(chars*sizeof(Char)); nayme = (naym *)Malloc(spp*sizeof(naym)); enterorder = (long *)Malloc(spp*sizeof(long)); ancone = (boolean *)Malloc(chars*sizeof(boolean)); anczero = (boolean *)Malloc(chars*sizeof(boolean)); ancone0 = (boolean *)Malloc(chars*sizeof(boolean)); anczero0 = (boolean *)Malloc(chars*sizeof(boolean)); wagner = 
(bitptr)Malloc(words*sizeof(long)); wagner0 = (bitptr)Malloc(words*sizeof(long)); place = (long *)Malloc(nonodes*sizeof(long)); steps = (bitptr)Malloc(words*sizeof(long)); zeroanc = (bitptr)Malloc(words*sizeof(long)); oneanc = (bitptr)Malloc(words*sizeof(long)); fulzeroanc = (bitptr)Malloc(words*sizeof(long)); empzeroanc = (bitptr)Malloc(words*sizeof(long)); } /* allocrest */ void doinit() { /* initializes variables */ inputnumbers(&spp, &chars, &nonodes, 1); words = chars / bits + 1; getoptions(); if (printdata) fprintf(outfile, "%ld species, %ld characters\n\n", spp, chars); alloctree2(&treenode); setuptree2(treenode); allocrest(); } /* doinit */ void inputoptions() { /* input the information on the options */ long i; if(justwts){ if(firstset){ scan_eoln(infile); if (ancvar) inputancestors(anczero0, ancone0); if (mixture) inputmixture(wagner0); } for (i = 0; i < (chars); i++) weight[i] = 1; inputweights(chars, weight, &weights); for (i = 0; i < (words); i++) { if (mixture) wagner[i] = wagner0[i]; else if (allsokal) wagner[i] = 0; else wagner[i] = (1L << (bits + 1)) - (1L << 1); } } else { if (!firstset) { samenumsp(&chars, ith); reallocchars(); } scan_eoln(infile); for (i = 0; i < (chars); i++) weight[i] = 1; if (ancvar) inputancestors(anczero0, ancone0); if (mixture) inputmixture(wagner0); if (weights) inputweights(chars, weight, &weights); for (i = 0; i < (words); i++) { if (mixture) wagner[i] = wagner0[i]; else if (allsokal) wagner[i] = 0; else wagner[i] = (1L << (bits + 1)) - (1L << 1); } } for (i = 0; i < (chars); i++) { if (!ancvar) { anczero[i] = true; ancone[i] = (((1L << (i % bits + 1)) & wagner[i / bits]) != 0); } else { anczero[i] = anczero0[i]; ancone[i] = ancone0[i]; } } noroot = true; questions = false; for (i = 0; i < (chars); i++) { if (weight[i] > 0) { noroot = (noroot && ancone[i] && anczero[i] && ((((1L << (i % bits + 1)) & wagner[i / bits]) != 0) || threshold <= 2.0)); } questions = (questions || (ancone[i] && anczero[i])); threshwt[i] = 
threshold * weight[i]; } } /* inputoptions */ void doinput() { /* reads the input data */ inputoptions(); if(!justwts || firstset) inputdata2(treenode); } /* doinput */ void evaluate(node2 *r) { /* Determines the number of steps needed for a tree. This is the minimum number needed to evolve chars on this tree */ long i, stepnum, smaller; double sum, term; sum = 0.0; for (i = 0; i < (chars); i++) { numszero[i] = 0; numsone[i] = 0; } full = true; for (i = 0; i < (words); i++) zeroanc[i] = fullset; postorder(r, fullset, full, wagner, zeroanc); cpostorder(r, full, zeroanc, numszero, numsone); count(r->fulstte1, zeroanc, numszero, numsone); for (i = 0; i < (words); i++) zeroanc[i] = 0; full = false; postorder(r, fullset, full, wagner, zeroanc); cpostorder(r, full, zeroanc, numszero, numsone); count(r->empstte0, zeroanc, numszero, numsone); for (i = 0; i < (chars); i++) { smaller = spp * weight[i]; numsteps[i] = smaller; if (anczero[i]) { numsteps[i] = numszero[i]; smaller = numszero[i]; } if (ancone[i] && numsone[i] < smaller) numsteps[i] = numsone[i]; stepnum = numsteps[i] + extras[i]; if (stepnum <= threshwt[i]) term = stepnum; else term = threshwt[i]; sum += term; if (usertree && which <= maxuser) fsteps[which - 1][i] = term; guess[i] = '?'; if (!ancone[i] || (anczero[i] && numszero[i] < numsone[i])) guess[i] = '0'; else if (!anczero[i] || (ancone[i] && numsone[i] < numszero[i])) guess[i] = '1'; } if (usertree && which <= maxuser) { nsteps[which - 1] = sum; if (which == 1) { minwhich = 1; minsteps = sum; } else if (sum < minsteps) { minwhich = which; minsteps = sum; } } like = -sum; } /* evaluate */ void reroot(node2 *outgroup) { /* reorients tree, putting outgroup in desired position. 
*/ node2 *p, *q; if (outgroup->back->index == root->index) return; p = root->next; q = root->next->next; p->back->back = q->back; q->back->back = p->back; p->back = outgroup; q->back = outgroup->back; outgroup->back->back = q; outgroup->back = p; } /* reroot */ void savetraverse(node2 *p) { /* sets BOOLEANs that indicate which way is down */ p->bottom = true; if (p->tip) return; p->next->bottom = false; savetraverse(p->next->back); p->next->next->bottom = false; savetraverse(p->next->next->back); } /* savetraverse */ void savetree() { /* record in place where each species has to be added to reconstruct this tree */ long i, j; node2 *p; boolean done; if (noroot) reroot(treenode[outgrno - 1]); savetraverse(root); for (i = 0; i < (nonodes); i++) place[i] = 0; place[root->index - 1] = 1; for (i = 1; i <= (spp); i++) { p = treenode[i - 1]; while (place[p->index - 1] == 0) { place[p->index - 1] = i; while (!p->bottom) p = p->next; p = p->back; } if (i > 1) { place[i - 1] = place[p->index - 1]; j = place[p->index - 1]; done = false; while (!done) { place[p->index - 1] = spp + i - 1; while (!p->bottom) p = p->next; p = p->back; done = (p == NULL); if (!done) done = (place[p->index - 1] != j); } } } } /* savetree */ void mix_addtree(long *pos) { /* puts tree from ARRAY place in its proper position in ARRAY bestrees */ long i; for (i =nextree - 1; i >= (*pos); i--) memcpy(bestrees[i], bestrees[i - 1], spp*sizeof(long)); for (i = 0; i < (spp); i++) bestrees[(*pos) - 1][i] = place[i]; nextree++; } /* mix_addtree */ void mix_findtree(boolean *found, long *pos, long nextree, long *place, long **bestrees) { /* finds tree given by ARRAY place in ARRAY bestrees by binary search */ /* used by dnacomp, dnapars, dollop, mix, & protpars */ long i, lower, upper; boolean below, done; below = false; lower = 1; upper = nextree - 1; (*found) = false; while (!(*found) && lower <= upper) { (*pos) = (lower + upper) / 2; i = 3; done = false; while (!done) { done = (i > spp); if (!done) done = 
(place[i - 1] != bestrees[(*pos) - 1][i - 1]); if (!done) i++; } (*found) = (i > spp); if (*found) break; below = (place[i - 1] < bestrees[(*pos )- 1][i - 1]); if (below) upper = (*pos) - 1; else lower = (*pos) + 1; } if (!(*found) && !below) (*pos)++; } /* mix_findtree */ void tryadd(node2 *p, node2 **item, node2 **nufork) { /* temporarily adds one fork and one tip to the tree. if the location where they are added yields greater "likelihood" than other locations tested up to that time, then keeps that location as there */ long pos; boolean found; node2 *rute; add3(p, *item, *nufork, &root, treenode); evaluate(root); if (lastrearr) { if (like >= bstlike2) { rute = root->next->back; savetree(); reroot(rute); if (like > bstlike2) { bestlike = bstlike2 = like; pos = 1; nextree = 1; mix_addtree(&pos); } else { pos = 0; mix_findtree(&found, &pos, nextree, place, bestrees); if (!found) { if (nextree <= maxtrees) mix_addtree(&pos); } } } } if (like > bestyet) { bestyet = like; there = p; } re_move3(item, nufork, &root, treenode); } /* tryadd */ void addpreorder(node2 *p, node2 *item, node2 *nufork) { /* traverses a binary tree, calling PROCEDURE tryadd at a node before calling tryadd at its descendants */ if (p == NULL) return; tryadd(p, &item, &nufork); if (!p->tip) { addpreorder(p->next->back, item, nufork); addpreorder(p->next->next->back, item, nufork); } } /* addpreorder */ void tryrearr(node2 *p, node2 **r, boolean *success) { /* evaluates one rearrangement of the tree. if the new tree has greater "likelihood" than the old one sets success := TRUE and keeps the new tree. 
otherwise, restores the old tree */ node2 *frombelow, *whereto, *forknode; double oldlike; if (p->back == NULL) return; forknode = treenode[p->back->index - 1]; if (forknode->back == NULL) return; oldlike = bestyet; if (p->back->next->next == forknode) frombelow = forknode->next->next->back; else frombelow = forknode->next->back; whereto = treenode[forknode->back->index - 1]; re_move3(&p, &forknode, &root, treenode); add3(whereto, p, forknode, &root, treenode); evaluate(*r); if ( like - oldlike > LIKE_EPSILON ) { *success = true; bestyet = like; } else { re_move3(&p, &forknode, &root, treenode); add3(frombelow, p, forknode, &root, treenode); } } /* tryrearr */ void repreorder(node2 *p, node2 **r, boolean *success) { /* traverses a binary tree, calling PROCEDURE tryrearr at a node before calling tryrearr at its descendants */ if (p == NULL) return; tryrearr(p, r, success); if (!p->tip) { repreorder(p->next->back, r,success); repreorder(p->next->next->back, r,success); } } /* repreorder */ void rearrange(node2 **r) { /* traverses the tree (preorder), finding any local rearrangement which decreases the number of steps. 
if traversal succeeds in increasing the tree's "likelihood", PROCEDURE rearrange runs traversal again */ boolean success=true; while (success) { success = false; repreorder(*r,r,&success); } } /* rearrange */ void mix_addelement(node2 **p, long *nextnode, long *lparens, boolean *names) { /* recursive procedure adds nodes to user-defined tree */ node2 *q; long i, n; boolean found; Char str[nmlngth]; getch(&ch, lparens, intree); if (ch == '(' ) { if ((*lparens) >= spp) { printf("\n\nERROR IN USER TREE: Too many left parentheses\n\n"); exxit(-1); } (*nextnode)++; q = treenode[(*nextnode) - 1]; mix_addelement(&q->next->back, nextnode, lparens, names); q->next->back->back = q->next; findch(',', &ch, which); mix_addelement(&q->next->next->back, nextnode, lparens, names); q->next->next->back->back = q->next->next; findch(')', &ch, which); *p = q; return; } for (i = 0; i < nmlngth; i++) str[i] = ' '; n = 1; do { if (ch == '_') ch = ' '; str[n - 1] =ch; if (eoln(intree)) scan_eoln(intree); ch = gettc(intree); if (ch == '\n') ch = ' '; n++; } while (ch != ',' && ch != ')' && ch != ':' && n <= nmlngth); n = 1; do { found = true; for (i = 0; i < nmlngth; i++) found = (found && ((str[i] == nayme[n - 1][i]) || ((nayme[n - 1][i] == '_') && (str[i] == ' ')))); if (found) { if (names[n - 1] == false) { *p = treenode[n - 1]; names[n - 1] = true; } else { printf("\n\nERROR IN USER TREE: Duplicate name found: "); for (i = 0; i < nmlngth; i++) putchar(nayme[n - 1][i]); printf("\n\n"); exxit(-1); } } else n++; } while (!(n > spp || found )); if (n <= spp) return; printf("CANNOT FIND SPECIES: "); for (i = 0; i < nmlngth; i++) putchar(str[i]); putchar('\n'); } /* mix_addelement */ void mix_treeread() { /* read in user-defined tree and set it up */ long nextnode, lparens, i, dummy; boolean *names; root = treenode[spp]; nextnode = spp; root->back = NULL; names = (boolean *)Malloc(spp*sizeof(boolean)); for (i = 0; i < (spp); i++) names[i] = false; lparens = 0; /* Eat everything in the file 
(i.e. digits, tabs) until you encounter an open-paren */ getch(&ch, &dummy, intree); while (ch != '(') { getch(&ch, &dummy, intree); } ungetc(ch, intree); mix_addelement(&root, &nextnode, &lparens, names); if (ch == '[') { do ch = gettc(intree); while (ch != ']'); ch = gettc(intree); } findch(';', &ch, which); if (progress) printf("."); scan_eoln(intree); free(names); } /* mix_treeread */ void describe() { /* prints ancestors, steps and table of numbers of steps in each character */ if (treeprint) fprintf(outfile, "\nrequires a total of %10.3f\n", -like); putc('\n', outfile); if (stepbox) writesteps(weights, numsteps); if (questions && (!noroot || didreroot)) guesstates(guess); if (ancseq) { hypstates(fullset, full, noroot, didreroot, root, wagner, zeroanc, oneanc, treenode, guess, garbage); putc('\n', outfile); } putc('\n', outfile); if (trout) { col = 0; treeout2(root, &col, root); } } /* describe */ void clearallnodes(node2* p) { if (p->tip) { p->visited = false; } else { clearallnodes(p->next->back); clearallnodes(p->next->next->back); p->next->visited = false; p->next->next->visited = false; p->visited = false; } } void maketree() { /* constructs a binary tree from the pointers in treenode. 
adds each node at location which yields highest "likelihood" then rearranges the tree for greatest "likelihood" */ long i, j, numtrees; double gotlike; node2 *item, *nufork, *dummy; fullset = (1L << (bits + 1)) - (1L << 1); for (i=0 ; i gotlike); } } if (progress) putchar('\n'); for (i = spp - 1; i >= 1; i--) re_move3(&treenode[i], &dummy, &root, treenode); if (jumb == njumble) { if (treeprint) { putc('\n', outfile); if (nextree == 2) fprintf(outfile, "One most parsimonious tree found:\n"); else fprintf(outfile, "%6ld trees in all found\n", nextree - 1); } if (nextree > maxtrees + 1) { if (treeprint) fprintf(outfile, "here are the first%4ld of them\n",(long)maxtrees); nextree = maxtrees + 1; } if (treeprint) putc('\n', outfile); for (i = 0; i <= (nextree - 2); i++) { root = treenode[0]; add3(treenode[0], treenode[1], treenode[spp], &root, treenode); for (j = 3; j <= (spp); j++) add3(treenode[bestrees[i][j - 1] - 1], treenode[j - 1], treenode[spp + j - 2], &root, treenode); if (noroot) reroot(treenode[outgrno - 1]); didreroot = (outgropt && noroot); evaluate(root); printree(treeprint, noroot, didreroot, root); describe(); for (j = 1; j < (spp); j++) re_move3(&treenode[j], &dummy, &root, treenode); } } } else { /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree,INTREE,"input tree file", "rb",progname,intreename); numtrees = countsemic(&intree); if (numtrees > 2) initseed(&inseed, &inseed0, seed); if (treeprint) { fprintf(outfile, "User-defined tree"); if (numtrees > 1) putc('s', outfile); fprintf(outfile, ":\n\n"); } which = 1; if (progress) printf(" "); while (which <= numtrees ) { mix_treeread(); didreroot = (outgropt && noroot); if (noroot) reroot(treenode[outgrno - 1]); evaluate(root); printree(treeprint, noroot, didreroot, root); describe(); clearallnodes(root); which++; } if (progress) printf("\n"); FClose(intree); fprintf(outfile, "\n\n"); if (numtrees > 2 && chars > 1 ) { if (progress) printf(" sampling for SH test\n"); 
standev(numtrees, minwhich, minsteps, nsteps, fsteps, seed); } } if (jumb == njumble) { if (progress) { printf("\nOutput written to file \"%s\"\n", outfilename); if (trout) printf("\nTrees also written onto file \"%s\"\n", outtreename); putchar('\n'); } } if (ancseq) freegarbage(&garbage); } /* maketree */ int main(int argc, Char *argv[]) { /* Mixed parsimony by uphill search */ #ifdef MAC argc = 1; /* macsetup("Mix",""); */ argv[0] = "Mix"; #endif init(argc, argv); progname = argv[0]; openfile(&infile,INFILE,"input file", "r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename); ibmpc = IBMCRT; ansi = ANSICRT; mulsets = false; msets = 1; firstset = true; garbage = NULL; bits = 8*sizeof(long) - 1; doinit(); if (weights || justwts) openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename); if (trout) openfile(&outtree,OUTTREE,"output tree file", "w",argv[0],outtreename); if(ancvar) openfile(&ancfile,ANCFILE,"ancestors file", "r",argv[0],ancfilename); if(mixture) openfile(&mixfile,MIXFILE,"mixture file", "r",argv[0],mixfilename); for (ith = 1; ith <= msets; ith++) { if(firstset){ if (allsokal && !mixture) fprintf(outfile, "Camin-Sokal parsimony method\n\n"); if (allwagner && !mixture) fprintf(outfile, "Wagner parsimony method\n\n"); if (mixture) fprintf(outfile, "Mixture of Wagner and Camin-Sokal parsimony methods\n\n"); } doinput(); if (msets > 1 && !justwts) { fprintf(outfile, "Data set # %ld:\n\n",ith); if (progress) printf("\nData set # %ld:\n",ith); } if (justwts){ if(firstset && mixture && (printdata || stepbox || ancseq)) printmixture(outfile, wagner); fprintf(outfile, "Weights set # %ld:\n\n", ith); if (progress) printf("\nWeights set # %ld:\n\n", ith); } else if (mixture && (printdata || stepbox || ancseq)) printmixture(outfile, wagner); if (printdata){ if (weights || justwts) printweights(outfile, 0, chars, weight, "Characters"); if (ancvar) printancestors(outfile, anczero, ancone); } if (ith == 1) firstset 
= false;
    for (jumb = 1; jumb <= njumble; jumb++)
      maketree();
  }
  free(place);
  free(steps);
  free(zeroanc);
  free(oneanc);
  free(fulzeroanc);
  free(empzeroanc);
  FClose(outfile);
  FClose(infile);
  FClose(outtree);
#ifdef MAC
  fixmacfile(outtreename);
  fixmacfile(outfilename);
#endif
#ifdef WIN32
  phyRestoreConsoleAttributes();
#endif
  return 0;
}  /* Mixed parsimony by uphill search */

phylip-3.697/src/mlclock.c

/* version 3.696.
   Written by Joseph Felsenstein and Michal Palzewski.

   Copyright (c) 1993-2014, Joseph Felsenstein
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions are met:

   1. Redistributions of source code must retain the above copyright notice,
      this list of conditions and the following disclaimer.

   2. Redistributions in binary form must reproduce the above copyright notice,
      this list of conditions and the following disclaimer in the documentation
      and/or other materials provided with the distribution.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
   AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
   SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE.
*/

#include "phylip.h"
#include "seq.h"
#include "mlclock.h"

/* Define the minimum branch length to be enforced for clocklike trees */
const double MIN_BRANCH_LENGTH = 1e-6;
/* MIN_ROOT_TYME is added to the current root tyme and used as the lower
 * bound when optimizing at the root. */
const double MIN_ROOT_TYME = -10;

static evaluator_t evaluate = NULL;
static tree *curtree = NULL;            /* current tree in use */
static node *current_node = NULL;       /* current node being optimized */

static double cur_node_eval(double x);
static double evaluate_tyme(tree *t, node *p, double tyme);


void mlclock_init(tree *t, evaluator_t f)
{
  curtree = t;
  evaluate = f;
}


boolean all_tymes_valid(node *p, double minlength, boolean fix)
{
  /* Ensures that all node tymes at node p and descending from it are
   * valid, with all branches being not less than minlength. Returns true
   * if the subtree is consistent, and false if any inconsistencies are
   * found. If 'fix' is true, adjustments are made to make the subtree
   * consistent. Otherwise, if assertions are enabled, all inconsistencies
   * are fatal. No effort is made to check that the parent node tyme
   * p->back->tyme is less than p->tyme. */

  node *q;
  double max_tyme;
  boolean ret = true;

  /* All tips should have tyme == 0.0 */
  if ( p->tip ) {
    if ( p->tyme == 0.0 )
      return true;
    else { /* this would be very bad.
*/ if ( fix ) p->tyme = 0.0; else assert( p->tyme == 0 ); return false; } } for ( q = p->next; q != p; q = q->next ) { /* All nodes in ring should have same tyme */ if ( q && q->tyme != p->tyme ) { if ( fix ) q->tyme = p->tyme; else assert( q->tyme == p->tyme ); ret = false; } /* All subtrees should be OK too */ if (!q->back) continue; if ( all_tymes_valid(q->back, minlength, fix) == false ) ret = false; } /* Tymes cannot be greater than the minimum child time, less * branch length */ max_tyme = min_child_tyme(p) - minlength; if ( p->tyme > max_tyme ) { if ( fix ) setnodetymes(p, max_tyme); else assert( p->tyme < max_tyme ); return false; } return ret; } void setnodetymes(node* p, double newtyme) { /* Set node tyme for an entire fork. Also clears initialized flags on this * fork, but not recursively. inittrav() must be called before evaluating * elsewhere. */ node * q; curtree->likelihood = UNDEFINED; p->tyme = newtyme; p->initialized = false; if ( p->tip ) return; for ( q = p->next; q != p; q = q->next ) { assert(q); q->tyme = newtyme; q->initialized = false; } } /* setnodetymes */ double min_child_tyme(node *p) { /* Return the minimum tyme of all children. p must be a parent nodelet */ double min; node *q; min = 1.0; /* Tymes are always nonpositive */ for ( q = p->next; q != p; q = q->next ) { if ( q->back == NULL ) continue; if ( q->back->tyme < min ) min = q->back->tyme; } return min; } /* min_child_tyme */ double parent_tyme(node *p) { /* Return the tyme of the parent of node p. p must be a parent nodelet. */ if ( p->back ) { return p->back->tyme; } else { return p->tyme + MIN_ROOT_TYME; } } /* parent_tyme */ boolean valid_tyme(node *p, double tyme) { /* Return true if tyme is a valid tyme to assign to node p. tyme must be * finite, not greater than any of p's children, and not less than p's * parent. Also, tip nodes can only be assigned 0. Otherwise false is * returned. */ /* p must be the parent nodelet of its node group. 
*/
  assert( p->tip != true || tyme == 0.0 );
  assert( tyme <= min_child_tyme(p) );
  assert( tyme >= parent_tyme(p) );

  return true;
} /* valid_tyme */


static long node_max_depth(tree *t, node *p)
{
  /* Return the largest number of branches between node p and any tip node. */
  long max_depth = 0;
  long cdep;
  node *q;

  /* Use the rootward nodelet. The assignment is kept outside assert() so
   * that it still takes place when assertions are disabled with NDEBUG. */
  p = pnode(t, p);
  assert(p != NULL);

  if (p->tip)
    return 0;

  for (q = p->next; q != p; q = q->next) {
    cdep = node_max_depth(t, q->back) + 1;
    if (cdep > max_depth)
      max_depth = cdep;
  }
  return max_depth;
}


static double node_max_tyme(tree *t, node *p)
{
  /* Return the absolute maximum tyme a node can be pushed to. */
  return -node_max_depth(t, p) * MIN_BRANCH_LENGTH;
}


void save_tymes(tree* save_tree, double tymes[])
{
  /* Save the current node tymes of save_tree in tymes[]. tymes must point to
   * an array of (nonodes - spp) elements. Tyme for node i gets saved in
   * tymes[i-spp]. */
  int i;

  assert( all_tymes_valid(curtree->root, 0.0, false) );
  for ( i = spp ; i < nonodes ; i++) {
    tymes[i - spp] = save_tree->nodep[i]->tyme;
  }
}


void restore_tymes(tree *load_tree, double tymes[])
{
  /* Restore the tymes saved in tymes[] to tree load_tree. See save_tymes(). */
  int i;

  for ( i = spp ; i < nonodes ; i++) {
    if (load_tree->nodep[i]->tyme != tymes[i-spp])
      setnodetymes(load_tree->nodep[i], tymes[i-spp]);
  }

  /* Check for invalid tymes */
  assert( all_tymes_valid(curtree->root, 0.0, false) );
}


static void push_tymes_to_root(tree *t, node *p, double tyme)
{
  /* Set tyme for node p to tyme. Ancestors of p are moved down if necessary
   * to prevent negative branch lengths. */
  node *q, *r;

  /* Use the rootward nodelet; see the note in node_max_depth(). */
  p = pnode(t, p);
  assert(p != NULL);

  setnodetymes(p, tyme);
  r = p;
  while (r->back != NULL) {
    q = pnode(t, r->back);      /* q = parent(r); */
    if (q->tyme > r->tyme - MIN_BRANCH_LENGTH)
      setnodetymes(q, r->tyme - MIN_BRANCH_LENGTH);
    else
      break;
    r = q;
  }
}


static void push_tymes_to_tips(tree *t, node *p, double tyme)
{
  /* Set tyme for node p to tyme. Descendants of p are moved up if necessary to
   * prevent negative branch lengths.
*/ node *q; assert( p == pnode(t, p) ); setnodetymes(p, tyme); for (q = p->next; q != p; q = q->next) { if (q->back->tyme < p->tyme + MIN_BRANCH_LENGTH) { if (q->back->tip && q->back->tyme < p->tyme) { fprintf(stderr, "Error: Attempt to move node past tips.\n" "%s line %d\n", __FILE__, __LINE__); exxit(-1); } else { if(!( q->back->tip ) ) { push_tymes_to_tips(t, q->back, p->tyme + MIN_BRANCH_LENGTH); } } } } } static void set_tyme(tree *t, node *p, double tyme) { /* Set the tyme for node p, pushing others out of the way */ /* Use rootward node in fork */ p = pnode(t, p); /* Set node tyme and push other nodes out of the way */ if (tyme < p->tyme) push_tymes_to_root(t, p, tyme); else push_tymes_to_tips(t, p, tyme); } static double evaluate_tyme(tree *t, node *p, double tyme) { /* Evaluate curtree if node p is at tyme. Return the score. Leaves original * tymes intact. */ static double *savetymes = NULL; static long savetymes_sz = 0; long nforks = nonodes - spp; double score = 1.0; if (savetymes_sz < nforks + 1) { if (savetymes != NULL) free(savetymes); savetymes_sz = nforks; savetymes = (double *)Malloc(savetymes_sz * sizeof(double)); } /* Save the current tymes */ save_tymes(t, savetymes); set_tyme(t, p, tyme); /* Evaluate the tree */ score = evaluate(p); /* Restore original tymes */ restore_tymes(t, savetymes); assert( all_tymes_valid(curtree->root, 0.0, false) ); return score; } static double cur_node_eval(double x) { return evaluate_tyme(curtree, current_node, x); } double maximize(double min_tyme, double cur, double max_tyme, double(*f)(double), double eps, boolean *success) { /* Find the maximum of function f by parabolic interpolation and golden section search. * (based on Brent method in NR) */ /* [min_tyme, max_tyme] is the domain, cur is the best guess and must be * within the domain, eps is the fractional accuracy of the result, i.e. the * returned value x will be accurate to +/- x*eps. 
*/ boolean bracket = false; static long max_iterations = 100; /* maximum iterations */ long it; /* iteration counter */ double x[3], lnl[3]; /* tyme (x) and log likelihood (lnl) points below, at, and above the current tyme */ double xn, yn; /* New point */ double d; /* delta x to new point */ double mid; /* Midpoint of (x[0], x[2]) */ double xmax, lnlmax; double tdelta; /* uphill step for bracket finding */ double last_d = 0.0; double prec; /* epsilon * tyme */ double t1, t2, t3, t4; /* temps for parabolic fit */ /* Bracket our maximum; We will assume that we are already close and move * uphill by exponentially increasing steps until we find a smaller value. * The initial step should be small to allow us to finish quickly if we're * still on the maximum from previous smoothings */ x[1] = cur; tdelta = fabs(10.0 * cur * eps); x[0] = cur - tdelta; if (x[0] < min_tyme) x[0] = min_tyme; lnl[1] = (*f)(x[1]); lnl[0] = (*f)(x[0]); if (lnl[0] < lnl[1]) { do { x[2] = x[1] + tdelta; if (x[2] > max_tyme) x[2] = max_tyme; lnl[2] = (*f)(x[2]); if (lnl[2] < lnl[1]) break; x[0] = x[1]; lnl[0] = lnl[1]; x[1] = x[2]; lnl[1] = lnl[2]; tdelta *= 2; } while (x[2] < max_tyme); } else { /* lnl[0] > lnl[1] */ /* shift points (0, 1) -> (1, 2) */ x[2] = x[1]; x[1] = x[0]; lnl[2] = lnl[1]; lnl[1] = lnl[0]; do { x[0] = x[1] - tdelta; if (x[0] < min_tyme) x[0] = min_tyme; lnl[0] = (*f)(x[0]); if (lnl[0] < lnl[1]) break; x[2] = x[1]; lnl[2] = lnl[1]; x[1] = x[0]; lnl[1] = lnl[0]; tdelta *= 2; } while (x[0] > min_tyme); } /* FIXME: this should not be necessary. 
Somewhere we fail to enforce
   * MIN_BRANCH_LENGTH */
  if ( x[1] < x[0] || x[2] < x[1] ) {
    x[1] = (x[2] + x[0]) / 2.0;
    lnl[1] = (*f)(x[1]);
  }
  assert(x[0] <= x[1] && x[1] <= x[2]);

  xmax = x[1];
  lnlmax = lnl[1];
  if (lnl[0] > lnlmax) {
    xmax = x[0];
    lnlmax = lnl[0];
  }
  if (lnl[2] > lnlmax) {
    xmax = x[2];
    lnlmax = lnl[2];
  }

  bracket = false;
  for (it = 0; it < max_iterations; it++) {
    assert(x[0] <= x[1] && x[1] <= x[2]);
    prec = fabs(x[1] * eps) + 1e-7;
    if (x[2] - x[0] < 4.0*prec)
      break;

    d = 0.0;
    mid = (x[2] + x[0]) / 2.0;

    /* The middle point must beat both end points to bracket a maximum */
    if (lnl[0] < lnl[1] && lnl[1] > lnl[2]) {
      /* We have a bracket */
      bracket = true;
      /* Try parabolic interpolation */
      t1 = (x[1] - x[0]) * (lnl[1] - lnl[2]);
      t2 = (x[1] - x[2]) * (lnl[1] - lnl[0]);
      t3 = t1*(x[1] - x[0]) - t2*(x[1] - x[2]);
      t4 = 2.0*(t1 - t2);
      if (t4 > 0.0)
        t3 = -t3;
      t4 = fabs(t4);
      if ( fabs(t3) < fabs(0.5*t4*last_d)
           && t3 > t4 * (x[0] - x[1])
           && t3 < t4 * (x[2] - x[1]) ) {
        d = t3 / t4;
        xn = x[1] + d;
        /* Keep the new point from getting too close to the end points */
        if (xn - x[0] < 2.0*prec || x[2] - xn < 2.0*prec)
          d = xn - mid > 0 ? -prec : prec;
      }
    }
    else {
      /* We should never lose our bracket once we've found it. */
      assert( !bracket );
    }

    if (d == 0.0) {
      /* Bisect larger interval using golden ratio */
      d = x[1] > mid ? 0.38 * (x[0] - x[1]) : 0.38 * (x[2] - x[1]);
    }

    /* Keep the new point from getting too close to the middle one */
    if (fabs(d) < prec)
      d = d > 0 ? prec : -prec;

    xn = x[1] + d;
    last_d = d;

    yn = (*f)(xn);

    if (yn > lnlmax) {
      *success = true;
      xmax = xn;
      lnlmax = yn;
    }

    if (yn > lnl[1]) {
      /* (xn, yn) is the new middle point */
      if (xn > x[1])
        x[0] = x[1];
      else
        x[2] = x[1];
      x[1] = xn;
      lnl[1] = yn;
    }
    else {
      /* xn is the new bound */
      if (xn > x[1])
        x[2] = xn;
      else
        x[0] = xn;
    }
  }

  return xmax;
}


boolean makenewv(node *p)
{
  /* Try to improve tree by moving node p.
Returns true if a better likelihood
   * was found */
  double min_tyme, max_tyme;     /* absolute tyme limits */
  double new_tyme;               /* result from maximize() */
  boolean success = false;       /* return value */

  node *s = curtree->nodep[p->index - 1];

  assert( valid_tyme(s, s->tyme) );

  /* Tyme cannot be less than parent */
  if (s == curtree->root)
    min_tyme = s->tyme + MIN_ROOT_TYME;
  else
    min_tyme = parent_tyme(s) + MIN_BRANCH_LENGTH;

  /* Tyme cannot be greater than any children */
  max_tyme = min_child_tyme(s) - MIN_BRANCH_LENGTH;

  /*
   * EXPERIMENTAL:
   * Allow nodes to move pretty much anywhere by pushing others outta the way.
   */

  /* First, find the absolute maximum and minimum tymes. */
  /* Minimum tyme is somewhere past the root */
  min_tyme = curtree->root->tyme + MIN_ROOT_TYME;
  /* Max tyme is the minimum branch length times the maximal number of branches
   * to any tip node. */
  max_tyme = node_max_tyme(curtree, s);

  /* Nothing to do if we can't move */
  if ( max_tyme < min_tyme + 2.0 * MIN_BRANCH_LENGTH ) {
    return false;
  }

  /* Fix a failure to enforce minimum branch lengths which occurs somewhere in
   * dnamlk_add() */
  if (s->tyme > max_tyme)
    set_tyme(curtree, s, max_tyme);

  current_node = s;
  new_tyme = maximize(min_tyme, s->tyme, max_tyme, &cur_node_eval, epsilon,
                      &success);
  set_tyme(curtree, s, new_tyme);

  return success;
}  /* makenewv */

/* phylip-3.697/src/mlclock.h */

#include "phylip.h"

/* Uncomment this line to dump details to dnamlk.log */
/* #define DEBUG */

#ifdef DEBUG
extern double dump_likelihood_graph(FILE *fp, node *p, double min, double max,
                                    double step);
extern double dump_likelihood_graph_twonode(FILE *fp, node *p1, double npoints);
double dump_likelihood_graph_2d(FILE *fp, node *p1, double npoints);
extern void get_limits(node **nodea, double *min, double *max);
extern double set_tyme_evaluate(node *, double);
#endif /* DEBUG */

typedef double (*evaluator_t)(node *);

extern const double MIN_BRANCH_LENGTH;
extern
const double MIN_ROOT_TYME;

/* module initialization */
extern void mlclock_init(tree *t, evaluator_t f);

/* check or fix node tymes and branch lengths */
extern boolean all_tymes_valid(node *, double, boolean);

/* change node tymes */
extern void setnodetymes(node* p, double newtyme);

/* limits of node movement */
extern double min_child_tyme(node *p);
extern double parent_tyme(node *p);
extern boolean valid_tyme(node *p, double tyme);

/* save/restore tymes */
extern void save_tymes(tree* save_tree, double tymes[]);
extern void restore_tymes(tree *load_tree, double tymes[]);

/* optimize a node tyme */
double maximize(double min_tyme, double cur, double max_tyme,
    double(*f)(double), double eps, boolean *success);
extern boolean makenewv(node *p);

/* phylip-3.697/src/move.c */

#include "phylip.h"
#include "disc.h"
#include "moves.h"
#include "wagner.h"

/* version 3.696.
   Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe.

   Copyright (c) 1993-2014, Joseph Felsenstein
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions are met:

   1. Redistributions of source code must retain the above copyright notice,
      this list of conditions and the following disclaimer.

   2. Redistributions in binary form must reproduce the above copyright notice,
      this list of conditions and the following disclaimer in the documentation
      and/or other materials provided with the distribution.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
   AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   ARE DISCLAIMED.
   IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
   SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE.
*/

#define overr           4
#define which           1

typedef enum {
  horiz, vert, up, overt, upcorner, downcorner, onne, zerro, question
} chartype;

typedef enum { arb, use, spec } howtree;

typedef enum { rearr, flipp, reroott, none } rearrtype;

#ifndef OLDC
/*function prototypes */
void   getoptions(void);
void   inputoptions(void);
void   allocrest(void);
void   doinput(void);
void   configure(void);
void   prefix(chartype);
void   postfix(chartype);
void   makechar(chartype);
void   move_fillin(node *);
void   move_postorder(node *);
void   evaluate(node *);
void   reroot(node *);
void   move_filltrav(node *);
void   move_hyptrav(node *);
void   move_hypstates(void);
void   grwrite(chartype, long, long *);
void   move_drawline(long, long);
void   move_printree(void);
void   arbitree(void);
void   yourtree(void);
void   initmovenode(node **, node **, node *, long, long, long *,
         long *, initops, pointarray, pointarray, Char *, Char *, FILE *);
void   buildtree(void);
void   rearrange(void);
void   tryadd(node *, node *, node *, double *);
void   addpreorder(node *, node *, node *, double *);
void   try(void);
void   undo(void);
void   treewrite(boolean);
void   clade(void);
void   flip(void);
void   changeoutgroup(void);
void   redisplay(void);
void   treeconstruct(void);
void   get_usertree(void);
void   initboolnames(node *p, boolean *names);
/*function prototypes */
#endif

char infilename[FNMLNGTH],intreename[FNMLNGTH],outtreename[FNMLNGTH],
     weightfilename[FNMLNGTH], ancfilename[FNMLNGTH], mixfilename[FNMLNGTH],
     factfilename[FNMLNGTH];
node
*root;
long outgrno, screenlines, col, treelines, leftedge, topedge, vmargin,
     hscroll, vscroll, scrollinc, screenwidth, farthest;
/* outgrno indicates outgroup */
boolean weights, thresh, outgropt, ancvar, questions, allsokal,
        allwagner, mixture, factors, noroot, waswritten;
boolean *ancone, *anczero, *ancone0, *anczero0;
Char *factor;
pointptr treenode;   /* pointers to all nodes in tree */
double threshold;
double *threshwt;
bitptr wagner, wagner0;
unsigned char che[9];
boolean reversed[9];
boolean graphic[9];
howtree how;
gbit *garbage;
char* progname;
Char ch;

/* Variables for treeread */
boolean usertree, goteof, firsttree, haslengths;
pointarray nodep;
node *grbg;
long *zeros;

/* Local variables for treeconstruct, propagated globally for C version: */
long dispchar, dispword, dispbit, atwhat, what, fromwhere, towhere,
     oldoutgrno, compatible;
double like, bestyet, gotlike;
Char *guess;
boolean display, newtree, changed, subtree, written, oldwritten, restoring,
        wasleft, oldleft, earlytree;
boolean *in_tree;
steptr numsteps, numsone, numszero;
long fullset;
bitptr steps, zeroanc, oneanc;
node *nuroot;
rearrtype lastop;
boolean *names;


void getoptions()
{
  /* interactively set options */
  long loopcount;
  Char ch;
  boolean done, gotopt;

  how = arb;
  usertree = false;
  goteof = false;
  outgrno = 1;
  outgropt = false;
  thresh = false;
  threshold = spp;
  weights = false;
  ancvar = false;
  allsokal = false;
  allwagner = true;
  mixture = false;
  factors = false;
  loopcount = 0;
  do {
#ifdef WIN32
    if(ansi || ibmpc){
      phyClearScreen();
    } else {
      printf("\n");
    }
#else
    printf((ansi || ibmpc) ? "\033[2J\033[H" : "\n");
#endif
    printf("\n\nInteractive mixed parsimony algorithm, version %s\n\n", VERSION);
    printf("Settings for this run:\n");
    printf(" X Use Mixed method? %s\n",
           mixture ? "Yes" : "No");
    printf(" P Parsimony method? %s\n",
           (allwagner && !mixture) ? "Wagner" :
           (!(allwagner || mixture)) ? "Camin-Sokal" : "(methods in mixture)");
    printf(" A Use ancestral states? %s\n",
           ancvar ?
"Yes" : "No"); printf(" F Use factors information? %s\n", factors ? "Yes" : "No"); printf(" O Outgroup root? %s %3ld\n", outgropt ? "Yes, at species number" : "No, use as outgroup species", outgrno); printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); printf(" T Use Threshold parsimony?"); if (thresh) printf(" Yes, count steps up to%4.1f\n", threshold); else printf(" No, use ordinary parsimony\n"); printf(" U Initial tree (arbitrary, user, specify)? %s\n", (how == arb) ? "Arbitrary" : (how == use) ? "User tree from tree file" : "Tree you specify"); printf(" 0 Graphics type (IBM PC, ANSI, none)? %s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" S Width of terminal screen?"); printf("%4ld\n", screenwidth); printf(" L Number of lines on screen?%4ld",screenlines); printf("\n\nAre these settings correct?"); printf(" (type Y or the letter for one to change)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); done = (ch == 'Y'); gotopt = (strchr("SFOTXPAU0WL",ch) != NULL) ? 
true : false; if (gotopt) { switch (ch) { case 'F': factors = !factors; break; case 'X': mixture = !mixture; break; case 'W': weights = !weights; break; case 'P': allwagner = !allwagner; break; case 'A': ancvar = !ancvar; break; case 'O': outgropt = !outgropt; if (outgropt) initoutgroup(&outgrno, spp); break; case 'T': thresh = !thresh; if (thresh) initthreshold(&threshold); break; case 'U': if (how == arb) how = use; else if (how == use) how = spec; else how = arb; break; case '0': initterminal(&ibmpc, &ansi); break; case 'S': screenwidth= readlong("Width of terminal screen (in characters)?\n"); break; case 'L': initnumlines(&screenlines); break; } } if (!(gotopt || done)) printf("Not a possible option!\n"); countup(&loopcount, 100); } while (!done); allsokal = (!allwagner && !mixture); if (scrollinc < screenwidth / 2.0) hscroll = scrollinc; else hscroll = screenwidth / 2; if (scrollinc < screenlines / 2.0) vscroll = scrollinc; else vscroll = screenlines / 2; } /* getoptions */ void inputoptions() { /* input the information on the options */ long i; scan_eoln(infile); for (i = 0; i < (chars); i++) weight[i] = 1; if (ancvar) inputancestors(anczero0, ancone0); if (factors) { factor = (Char *)Malloc(chars*sizeof(Char)); inputfactors(chars, factor, &factors); } if (mixture) inputmixture(wagner0); if (weights) inputweights(chars, weight, &weights); putchar('\n'); if (weights) printweights(stdout, 0, chars, weight, "Characters"); for (i = 0; i < (words); i++) { if (mixture) wagner[i] = wagner0[i]; else if (allsokal) wagner[i] = 0; else wagner[i] = (1L << (bits + 1)) - (1L << 1); } if (allsokal && !mixture) printf("Camin-Sokal parsimony method\n\n"); if (allwagner && !mixture) printf("Wagner parsimony method\n\n"); if (mixture) printmixture(stdout, wagner); for (i = 0; i < (chars); i++) { if (!ancvar) { anczero[i] = true; ancone[i] = (((1L << (i % bits + 1)) & wagner[i / bits]) != 0); } else { anczero[i] = anczero0[i]; ancone[i] = ancone0[i]; } } if (factors) 
printfactors(stdout, chars, factor, ""); if (ancvar) printancestors(stdout, anczero, ancone); noroot = true; questions = false; for (i = 0; i < (chars); i++) { if (weight[i] > 0) { noroot = (noroot && ancone[i] && anczero[i] && ((((1L << (i % bits + 1)) & wagner[i / bits]) != 0) || threshold <= 2.0)); } questions = (questions || (ancone[i] && anczero[i])); threshwt[i] = threshold * weight[i]; } } /* inputoptions */ void allocrest() { nayme = (naym *)Malloc(spp*sizeof(naym)); in_tree = (boolean *)Malloc(nonodes*sizeof(boolean)); extras = (steptr)Malloc(chars*sizeof(long)); weight = (steptr)Malloc(chars*sizeof(long)); numsteps = (steptr)Malloc(chars*sizeof(long)); numsone = (steptr)Malloc(chars*sizeof(long)); numszero = (steptr)Malloc(chars*sizeof(long)); threshwt = (double *)Malloc(chars*sizeof(double)); guess = (Char *)Malloc(chars*sizeof(Char)); ancone = (boolean *)Malloc(chars*sizeof(boolean)); anczero = (boolean *)Malloc(chars*sizeof(boolean)); ancone0 = (boolean *)Malloc(chars*sizeof(boolean)); anczero0 = (boolean *)Malloc(chars*sizeof(boolean)); wagner = (bitptr)Malloc(words*sizeof(long)); wagner0 = (bitptr)Malloc(words*sizeof(long)); steps = (bitptr)Malloc(words*sizeof(long)); zeroanc = (bitptr)Malloc(words*sizeof(long)); oneanc = (bitptr)Malloc(words*sizeof(long)); } /* allocrest */ void doinput() { /* reads the input data */ inputnumbers(&spp, &chars, &nonodes, 1); words = chars / bits + 1; printf("%2ld species, %3ld characters\n", spp, chars); printf("\nReading input file ...\n\n"); getoptions(); if (weights) openfile(&weightfile,WEIGHTFILE,"weights file","r",progname,weightfilename); if(ancvar) openfile(&ancfile,ANCFILE,"ancestors file", "r",progname,ancfilename); if(mixture) openfile(&mixfile,MIXFILE,"mixture file", "r",progname,mixfilename); if(factors) openfile(&factfile,FACTFILE,"factors file", "r",progname,factfilename); alloctree(&treenode); setuptree(treenode); allocrest(); inputoptions(); inputdata(treenode, true, false, stdout); } /* doinput */ 
void configure() { /* configure to machine -- set up special characters */ chartype a; for (a = horiz; (long)a <= (long)question; a = (chartype)((long)a + 1)) reversed[(long)a] = false; for (a = horiz; (long)a <= (long)question; a = (chartype)((long)a + 1)) graphic[(long)a] = false; if (ibmpc) { che[(long)horiz] = 205; graphic[(long)horiz] = true; che[(long)vert] = 186; graphic[(long)vert] = true; che[(long)up] = 186; graphic[(long)up] = true; che[(long)overt] = 205; graphic[(long)overt] = true; che[(long)onne] = 219; reversed[(long)onne] = true; che[(long)zerro] = 176; graphic[(long)zerro] = true; che[(long)question] = 178; /* or try CHR(177) */ graphic[(long)question] = true; che[(long)upcorner] = 200; graphic[(long)upcorner] = true; che[(long)downcorner] = 201; graphic[(long)downcorner] = true; return; } if (ansi) { che[(long)onne] = ' '; reversed[(long)onne] = true; che[(long)horiz] = che[(long)onne]; reversed[(long)horiz] = true; che[(long)vert] = che[(long)onne]; reversed[(long)vert] = true; che[(long)up] = 'x'; graphic[(long)up] = true; che[(long)overt] = 'q'; graphic[(long)overt] = true; che[(long)zerro] = 'a'; graphic[(long)zerro] = true; reversed[(long)zerro] = true; che[(long)question] = '?'; reversed[(long)question] = true; che[(long)upcorner] = 'm'; graphic[(long)upcorner] = true; che[(long)downcorner] = 'l'; graphic[(long)downcorner] = true; return; } che[(long)horiz] = '='; che[(long)vert] = ' '; che[(long)up] = '!'; che[(long)overt] = '-'; che[(long)onne] = '*'; che[(long)zerro] = '='; che[(long)question] = '.'; che[(long)upcorner] = '`'; che[(long)downcorner] = ','; } /* configure */ void prefix(chartype a) { /* give prefix appropriate for this character */ if (reversed[(long)a]) prereverse(ansi); if (graphic[(long)a]) pregraph(ansi); } /* prefix */ void postfix(chartype a) { /* give postfix appropriate for this character */ if (reversed[(long)a]) postreverse(ansi); if (graphic[(long)a]) postgraph(ansi); } /* postfix */ void makechar(chartype a) { 
/* print out a character with appropriate prefix or postfix */ prefix(a); putchar(che[(long)a]); postfix(a); } /* makechar */ void move_fillin(node *p) { /* Sets up for each node in the tree two statesets. stateone and statezero are the sets of character states that must be 1 or must be 0, respectively, in a most parsimonious reconstruction, based on the information at or above this node. Note that this state assignment may change based on information further down the tree. If a character is in both sets it is in state "P". If in neither, it is "?". */ long i; long l0, l1, r0, r1, st, wa, za, oa; for (i = 0; i < (words); i++) { l0 = p->next->back->statezero[i]; l1 = p->next->back->stateone[i]; r0 = p->next->next->back->statezero[i]; r1 = p->next->next->back->stateone[i]; wa = wagner[i]; za = zeroanc[i]; oa = oneanc[i]; st = (l1 & r0) | (l0 & r1); steps[i] = st; p->stateone[i] = (l1 | r1) & (~(st & (wa | za))); p->statezero[i] = (l0 | r0) & (~(st & (wa | oa))); } } /* move_fillin */ void move_postorder(node *p) { /* traverses a binary tree, calling function fillin at a node's descendants before calling fillin at the node */ if (p->tip) return; move_postorder(p->next->back); move_postorder(p->next->next->back); move_fillin(p); count(steps, zeroanc, numszero, numsone); } /* move_postorder */ void evaluate(node *r) { /* Determines the number of steps needed for a tree. 
This is the minimum number needed to evolve chars on this tree */ long i, stepnum, smaller; double sum; boolean nextcompat, thiscompat, done; sum = 0.0; for (i = 0; i < (chars); i++) { numszero[i] = 0; numsone[i] = 0; } for (i = 0; i < (words); i++) { zeroanc[i] = fullset; oneanc[i] = 0; } compatible = 0; nextcompat = true; move_postorder(r); count(r->stateone, zeroanc, numszero, numsone); for (i = 0; i < (words); i++) { zeroanc[i] = 0; oneanc[i] = fullset; } move_postorder(r); count(r->statezero, zeroanc, numszero, numsone); for (i = 0; i < (chars); i++) { smaller = spp * weight[i]; numsteps[i] = smaller; if (anczero[i]) { numsteps[i] = numszero[i]; smaller = numszero[i]; } if (ancone[i] && numsone[i] < smaller) numsteps[i] = numsone[i]; stepnum = numsteps[i] + extras[i]; if (stepnum <= threshwt[i]) sum += stepnum; else sum += threshwt[i]; thiscompat = (stepnum <= weight[i]); if (factors) { done = (i + 1 == chars); if (!done) done = (factor[i + 1] != factor[i]); nextcompat = (nextcompat && thiscompat); if (done) { if (nextcompat) compatible += weight[i]; nextcompat = true; } } else if (thiscompat) compatible += weight[i]; guess[i] = '?'; if (!ancone[i] || (anczero[i] && numszero[i] < numsone[i])) guess[i] = '0'; else if (!anczero[i] || (ancone[i] && numsone[i] < numszero[i])) guess[i] = '1'; } like = -sum; } /* evaluate */ void reroot(node *outgroup) { /* reorients tree, putting outgroup in desired position. 
*/ node *p, *q, *newbottom, *oldbottom; boolean onleft; if (outgroup->back->index == root->index) return; newbottom = outgroup->back; p = treenode[newbottom->index - 1]->back; while (p->index != root->index) { oldbottom = treenode[p->index - 1]; treenode[p->index - 1] = p; p = oldbottom->back; } onleft = (p == root->next); if (restoring) if (!onleft && wasleft){ p = root->next->next; q = root->next; } else { p = root->next; q = root->next->next; } else { if (onleft) oldoutgrno = root->next->next->back->index; else oldoutgrno = root->next->back->index; wasleft = onleft; p = root->next; q = root->next->next; } p->back->back = q->back; q->back->back = p->back; p->back = outgroup; q->back = outgroup->back; if (restoring) { if (!onleft && wasleft) { outgroup->back->back = root->next; outgroup->back = root->next->next; } else { outgroup->back->back = root->next->next; outgroup->back = root->next; } } else { outgroup->back->back = root->next->next; outgroup->back = root->next; } treenode[newbottom->index - 1] = newbottom; } /* reroot */ void move_filltrav(node *r) { /* traverse to fill in interior node states */ if (r->tip) return; move_filltrav(r->next->back); move_filltrav(r->next->next->back); move_fillin(r); } /* move_filltrav */ void move_hyptrav(node *r) { /* compute states at one interior node */ long i; boolean bottom; long l0, l1, r0, r1, s0, s1, a0, a1, temp, wa; gbit *zerobelow = NULL, *onebelow = NULL; disc_gnu(&zerobelow, &garbage); disc_gnu(&onebelow, &garbage); bottom = (r->back == NULL); if (bottom) { memcpy(zerobelow->bits_, zeroanc, words*sizeof(long)); memcpy(onebelow->bits_, oneanc, words*sizeof(long)); } else { memcpy(zerobelow->bits_, treenode[r->back->index - 1]->statezero, words*sizeof(long)); memcpy(onebelow->bits_, treenode[r->back->index - 1]->stateone, words*sizeof(long)); } for (i = 0; i < (words); i++) { s0 = r->statezero[i]; s1 = r->stateone[i]; a0 = zerobelow->bits_[i]; a1 = onebelow->bits_[i]; if (!r->tip) { wa = wagner[i]; l0 = 
r->next->back->statezero[i]; l1 = r->next->back->stateone[i]; r0 = r->next->next->back->statezero[i]; r1 = r->next->next->back->stateone[i]; s0 = (wa & ((a0 & l0) | (a0 & r0) | (l0 & r0))) | (fullset & (~wa) & s0); s1 = (wa & ((a1 & l1) | (a1 & r1) | (l1 & r1))) | (fullset & (~wa) & s1); temp = fullset & (~(s0 | s1 | l1 | l0 | r1 | r0)); s0 |= temp & a0; s1 |= temp & a1; r->statezero[i] = s0; r->stateone[i] = s1; } } if (((1L << dispbit) & r->stateone[dispword - 1]) != 0) { if (((1L << dispbit) & r->statezero[dispword - 1]) != 0) r->state = '?'; else r->state = '1'; } else { if (((1L << dispbit) & r->statezero[dispword - 1]) != 0) r->state = '0'; else r->state = '?'; } if (!r->tip) { move_hyptrav(r->next->back); move_hyptrav(r->next->next->back); } disc_chuck(zerobelow, &garbage); disc_chuck(onebelow, &garbage); } /* move_hyptrav */ void move_hypstates() { /* fill in and describe states at interior nodes */ long i, j, k; for (i = 0; i < (words); i++) { zeroanc[i] = 0; oneanc[i] = 0; } for (i = 0; i < (chars); i++) { j = i / bits + 1; k = i % bits + 1; if (guess[i] == '0') zeroanc[j - 1] = ((long)zeroanc[j - 1]) | (1L << k); if (guess[i] == '1') oneanc[j - 1] = ((long)oneanc[j - 1]) | (1L << k); } move_filltrav(root); move_hyptrav(root); } /* move_hypstates */ void grwrite(chartype c, long num, long *pos) { long i; prefix(c); for (i = 1; i <= num; i++) { if ((*pos) >= leftedge && (*pos) - leftedge + 1 < screenwidth) putchar(che[(long)c]); (*pos)++; } postfix(c); } /* grwrite */ void move_drawline(long i, long lastline) { /* draws one row of the tree diagram by moving up tree */ node *p, *q, *r, *first =NULL, *last =NULL; long n, j, pos; boolean extra, done; Char st; chartype c, d; pos = 1; p = nuroot; q = nuroot; extra = false; if (i == (long)p->ycoord && (p == root || subtree)) { extra = true; c = overt; if (display) { switch (p->state) { case '1': c = onne; break; case '0': c = zerro; break; case '?': c = question; break; } } if ((subtree)) stwrite("Subtree:", 8, 
&pos, leftedge, screenwidth); if (p->index >= 100) nnwrite(p->index, 3, &pos, leftedge, screenwidth); else if (p->index >= 10) { grwrite(c, 1, &pos); nnwrite(p->index, 2, &pos, leftedge, screenwidth); } else { grwrite(c, 2, &pos); nnwrite(p->index, 1, &pos, leftedge, screenwidth); } } else { if (subtree) stwrite(" ", 10, &pos, leftedge, screenwidth); else stwrite(" ", 2, &pos, leftedge, screenwidth); } do { if (!p->tip) { r = p->next; done = false; do { if (i >= r->back->ymin && i <= r->back->ymax) { q = r->back; done = true; } r = r->next; } while (!(done || r == p)); first = p->next->back; r = p->next; while (r->next != p) r = r->next; last = r->back; } done = (p == q); n = (long)p->xcoord - (long)q->xcoord; if (n < 3 && !q->tip) n = 3; if (extra) { n--; extra = false; } if ((long)q->ycoord == i && !done) { if ((long)q->ycoord > (long)p->ycoord) d = upcorner; else d = downcorner; c = overt; if (display) { switch (q->state) { case '1': c = onne; break; case '0': c = zerro; break; case '?': c = question; break; } d = c; } if (n > 1) { grwrite(d, 1, &pos); grwrite(c, n - 3, &pos); } if (q->index >= 100) nnwrite(q->index, 3, &pos, leftedge, screenwidth); else if (q->index >= 10) { grwrite(c, 1, &pos); nnwrite(q->index, 2, &pos, leftedge, screenwidth); } else { grwrite(c, 2, &pos); nnwrite(q->index, 1, &pos, leftedge, screenwidth); } extra = true; } else if (!q->tip) { if ((long)last->ycoord > i && (long)first->ycoord < i && i != (long)p->ycoord) { c = up; if (i < (long)p->ycoord) st = p->next->back->state; else st = p->next->next->back->state; if (display) { switch (st) { case '1': c = onne; break; case '0': c = zerro; break; case '?': c = question; break; } } grwrite(c, 1, &pos); chwrite(' ', n - 1, &pos, leftedge, screenwidth); } else chwrite(' ', n, &pos, leftedge, screenwidth); } else chwrite(' ', n, &pos, leftedge, screenwidth); if (p != q) p = q; } while (!done); if ((long)p->ycoord == i && p->tip) { n = 0; for (j = 1; j <= nmlngth; j++) { if (nayme[p->index - 
1][j - 1] != '\0') n = j; } chwrite(':', 1, &pos, leftedge, screenwidth); for (j = 0; j < n; j++) chwrite(nayme[p->index - 1][j], 1, &pos, leftedge, screenwidth); } putchar('\n'); } /* move_drawline */ void move_printree() { /* prints out diagram of the tree */ long tipy, i, dow; if (!subtree) nuroot = root; if (changed || newtree) evaluate(root); if (display) move_hypstates(); if (ansi || ibmpc) printf("\033[2J\033[H"); else putchar('\n'); tipy = 1; dow = down; if (spp * dow > screenlines && !subtree) { dow--; } if (noroot) printf("(unrooted)"); if (display) { printf(" "); makechar(onne); printf(":1 "); makechar(question); printf(":? "); makechar(zerro); printf(":0 "); } else printf(" "); if (!earlytree) { printf("%10.1f Steps", -like); } if (display) printf(" CHAR%3ld", dispchar); else printf(" "); if (!earlytree) { printf(" %3ld chars compatible\n", compatible); } printf(" "); if (changed && !earlytree) { if (-like < bestyet) { printf(" BEST YET!"); bestyet = -like; } else if (fabs(-like - bestyet) < 0.000001) printf(" (as good as best)"); else { if (-like < gotlike) printf(" better"); else if (-like > gotlike) printf(" worse!"); } } printf("\n"); farthest = 0; coordinates(nuroot, &tipy, 1.5, &farthest); vmargin = 4; treelines = tipy - dow; if (topedge != 1) { printf("** %ld lines above screen **\n", topedge - 1); vmargin++; } if ((treelines - topedge + 1) > (screenlines - vmargin)) vmargin++; for (i = 1; i <= treelines; i++) { if (i >= topedge && i < topedge + screenlines - vmargin) move_drawline(i, treelines); } if ((treelines - topedge + 1) > (screenlines - vmargin)) { printf("** %ld", treelines - (topedge - 1 + screenlines - vmargin)); printf(" lines below screen **\n"); } if (treelines - topedge + vmargin + 1 < screenlines) putchar('\n'); gotlike = -like; changed = false; } /* move_printree */ void arbitree() { long i; root = treenode[0]; add2(treenode[0], treenode[1], treenode[spp], &root, restoring, wasleft, treenode); for (i = 3; i <= (spp); i++) 
add2(treenode[spp+ i - 3], treenode[i - 1], treenode[spp + i - 2], &root, restoring, wasleft, treenode); for (i = 0; i < (nonodes); i++) in_tree[i] = true; } /* arbitree */ void yourtree() { long i, j; boolean ok; root = treenode[0]; add2(treenode[0], treenode[1], treenode[spp], &root, restoring, wasleft, treenode); i = 2; do { i++; move_printree(); printf("\nAdd species%3ld: \n", i); printf(" \n"); for (j = 0; j < nmlngth; j++) putchar(nayme[i - 1][j]); do { printf("\nbefore node (type number): "); inpnum(&j, &ok); ok = (ok && ((j >= 1 && j < i) || (j > spp && j < spp + i - 1))); if (!ok) printf("Impossible number. Please try again:\n"); } while (!ok); add2(treenode[j - 1], treenode[i - 1], treenode[spp + i - 2], &root, restoring, wasleft, treenode); } while (i != spp); for (i = 0; i < (nonodes); i++) in_tree[i] = true; } /* yourtree */ void initmovenode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ /* LM 7/27 I added this function and the commented lines around */ /* treeread() to get the program running, but all 4 move programs*/ /* are improperly integrated into the v4.0 support files. 
As is */ /* this is a patchwork function */ boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnutreenode(grbg, p, nodei, chars, zeros); treenode[nodei - 1] = *p; break; case nonbottom: gnutreenode(grbg, p, nodei, chars, zeros); break; case tip: match_names_to_data (str, treenode, p, spp); break; case length: processlength(&valyew, &divisor, ch, &minusread, intree, parens); /* process lengths and discard */ default: /*cases hslength,hsnolength,treewt,unittrwt,iter,*/ break; } } /* initmovenode */ void initboolnames(node *p, boolean *names) { /* sets BOOLEANs that indicate tips */ node *q; if (p->tip) { names[p->index - 1] = true; return; } q = p->next; while (q != p) { initboolnames(q->back, names); q = q->next; } } /* initboolnames */ void get_usertree() { long i, j, nextnode; node *p; /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree,INTREE,"input tree file", "rb",progname,intreename); names = (boolean *)Malloc(spp*sizeof(boolean)); firsttree = true; nodep = NULL; nextnode = 0; haslengths = 0; zeros = (long *)Malloc(chars*sizeof(long)); for (i = 0; i < chars; i++) zeros[i] = 0; treeread(intree, &root, treenode, &goteof, &firsttree, nodep, &nextnode, &haslengths, &grbg, initmovenode,false,nonodes); for (i = spp; i < (nonodes); i++) { p = treenode[i]; for (j = 1; j <= 3; j++) { p->stateone = (bitptr)Malloc(words*sizeof(long)); p->statezero = (bitptr)Malloc(words*sizeof(long)); p = p->next; } } for (i = 0; i < spp; i++) names[i] = false; initboolnames(root, names); for (i = 0; i < (spp); i++) in_tree[i] = names[i]; free(names); FClose(intree); } void buildtree() { changed = false; newtree = false; switch (how) { case arb: arbitree(); break; case use: get_usertree(); break; case spec: yourtree(); break; } if (!outgropt) outgrno = root->next->back->index; if (outgropt && in_tree[outgrno - 1]) reroot(treenode[outgrno - 1]); } /* buildtree */ void rearrange() { long i, j; boolean ok1, ok2; node *p, *q; 
printf("Remove everything to the right of which node? "); inpnum(&i, &ok1); ok1 = (ok1 && i >= 1 && i < spp * 2 && i != root->index); if (ok1) { printf("Add before which node? "); inpnum(&j, &ok2); ok2 = (ok2 && j >= 1 && j < spp * 2); if (ok2) { ok2 = (treenode[j - 1] != treenode[treenode[i - 1]->back->index - 1]); p = treenode[j - 1]; while (p != root) { ok2 = (ok2 && p != treenode[i - 1]); p = treenode[p->back->index - 1]; } if (ok1 && ok2) { what = i; q = treenode[treenode[i - 1]->back->index - 1]; if (q->next->back->index == i) fromwhere = q->next->next->back->index; else fromwhere = q->next->back->index; towhere = j; re_move2(&treenode[i - 1], &q, &root, &wasleft, treenode); add2(treenode[j - 1], treenode[i - 1], q, &root, restoring, wasleft, treenode); } lastop = rearr; } } changed = (ok1 && ok2); move_printree(); if (!(ok1 && ok2)) printf("Not a possible rearrangement. Try again: "); else { oldwritten =written; written = false; } } /* rearrange */ void tryadd(node *p, node *item, node *nufork, double *place) { /* temporarily adds one fork and one tip to the tree. Records scores in array place */ add2(p, item, nufork, &root, restoring, wasleft, treenode); evaluate(root); place[p->index - 1] = -like; re_move2(&item, &nufork, &root, &wasleft, treenode); } /* tryadd */ void addpreorder(node *p, node *item, node *nufork, double *place) { /* traverses a binary tree, calling function tryadd at a node before calling tryadd at its descendants */ if (p == NULL) return; tryadd(p,item,nufork,place); if (!p->tip) { addpreorder(p->next->back, item, nufork, place); addpreorder(p->next->next->back, item, nufork, place); } } /* addpreorder */ void try() { /* Remove node, try it in all possible places */ double *place; long i, j, oldcompat; double current; node *q, *dummy, *rute; boolean tied, better, ok; printf("Try other positions for which node? "); inpnum(&i, &ok); if (!(ok && i >= 1 && i <= nonodes && i != root->index)) { printf("Not a possible choice! 
"); return; } printf("WAIT ...\n"); place = (double *)Malloc(nonodes*sizeof(double)); for (j = 0; j < (nonodes); j++) place[j] = -1.0; evaluate(root); current = -like; oldcompat = compatible; what = i; q = treenode[treenode[i - 1]->back->index - 1]; if (q->next->back->index == i) fromwhere = q->next->next->back->index; else fromwhere = q->next->back->index; rute = root; if (root->index == treenode[i - 1]->back->index) { if (treenode[treenode[i - 1]->back->index - 1]->next->back == treenode[i - 1]) rute = treenode[treenode[i - 1]->back->index - 1]->next->next->back; else rute = treenode[treenode[i - 1]->back->index - 1]->next->back; } re_move2(&treenode[i - 1], &dummy, &root, &wasleft, treenode); oldleft = wasleft; root = rute; addpreorder(root, treenode[i - 1], dummy, place); wasleft =oldleft; restoring = true; add2(treenode[fromwhere - 1], treenode[what - 1],dummy, &root, restoring, wasleft, treenode); like = -current; compatible = oldcompat; restoring = false; better = false; printf(" BETTER: "); for (j = 1; j <= (nonodes); j++) { if (place[j - 1] < current && place[j - 1] >= 0.0) { printf("%3ld:%6.2f", j, place[j - 1]); better = true; } } if (!better) printf(" NONE"); printf("\n TIED: "); tied = false; for (j = 1; j <= (nonodes); j++) { if (fabs(place[j - 1] - current) < 1.0e-6 && j != fromwhere) { if (j < 10) printf("%2ld", j); else printf("%3ld", j); tied = true; } } if (tied) printf(":%6.2f\n", current); else printf("NONE\n"); changed = true; free(place); } /* try */ void undo() { /* restore to tree before last rearrangement */ long temp; boolean btemp; node *q; switch (lastop) { case rearr: restoring = true; oldleft = wasleft; re_move2(&treenode[what - 1], &q, &root, &wasleft, treenode); btemp = wasleft; wasleft = oldleft; add2(treenode[fromwhere - 1], treenode[what - 1],q, &root, restoring, wasleft, treenode); wasleft = btemp; restoring = false; temp = fromwhere; fromwhere = towhere; towhere = temp; changed = true; break; case flipp: q = treenode[atwhat - 
1]->next->back;
    treenode[atwhat - 1]->next->back =
      treenode[atwhat - 1]->next->next->back;
    treenode[atwhat - 1]->next->next->back = q;
    treenode[atwhat - 1]->next->back->back = treenode[atwhat - 1]->next;
    treenode[atwhat - 1]->next->next->back->back =
      treenode[atwhat - 1]->next->next;
    break;

  case reroott:
    restoring = true;
    temp = oldoutgrno;
    oldoutgrno = outgrno;
    outgrno = temp;
    reroot(treenode[outgrno - 1]);
    restoring = false;
    break;

  case none:
    /* blank case */
    break;
  }
  move_printree();
  if (lastop == none) {
    printf("No operation to undo! \n");
    return;
  }
  btemp = oldwritten;
  oldwritten = written;
  written = btemp;
}  /* undo */


void treewrite(boolean done)
{
  /* write out tree to a file */
  Char ch = ' ';   /* treeoptions reads *ch even when it never prompts */

  treeoptions(waswritten, &ch, &outtree, outtreename, progname);
  if (!done)
    move_printree();
  if (waswritten && ch == 'N')
    return;
  col = 0;
  treeout(root, 1, &col, root);
  printf("\nTree written to file \"%s\"\n\n", outtreename);
  waswritten = true;
  written = true;
  FClose(outtree);
#ifdef MAC
  fixmacfile(outtreename);
#endif
}  /* treewrite */


void clade()
{
  /* pick a subtree and show only that on screen */
  long i;
  boolean ok;

  printf("Select subtree rooted at which node (0 for whole tree)? ");
  inpnum(&i, &ok);
  /* accept 0 (whole tree) or a valid node number; reject negatives */
  ok = (ok && i >= 0 && i <= nonodes);
  if (ok) {
    subtree = (i > 0);
    if (subtree)
      nuroot = treenode[i - 1];
    else
      nuroot = root;
  }
  move_printree();
  if (!ok)
    printf("Not possible to use this node. ");
}  /* clade */


void flip()
{
  /* flip at a node left-right */
  long i;
  boolean ok;
  node *p;

  printf("Flip branches at which node? 
");
  inpnum(&i, &ok);
  ok = (ok && i > spp && i <= nonodes);
  if (ok) {
    p = treenode[i - 1]->next->back;
    treenode[i - 1]->next->back = treenode[i - 1]->next->next->back;
    treenode[i - 1]->next->next->back = p;
    treenode[i - 1]->next->back->back = treenode[i - 1]->next;
    treenode[i - 1]->next->next->back->back = treenode[i - 1]->next->next;
    atwhat = i;
    lastop = flipp;
  }
  move_printree();
  if (ok) {
    oldwritten = written;
    written = false;
    return;
  }
  if (i >= 1 && i <= spp)
    printf("Can't flip there. ");
  else
    printf("No such node. ");
}  /* flip */


void changeoutgroup()
{
  long i;
  boolean ok;

  oldoutgrno = outgrno;
  do {
    printf("Which node should be the new outgroup? ");
    inpnum(&i, &ok);
    /* range-check i before it is used to index in_tree */
    ok = (ok && i >= 1 && i <= nonodes && i != root->index &&
          in_tree[i - 1]);
    if (ok)
      outgrno = i;
  } while (!ok);
  if (in_tree[outgrno - 1])
    reroot(treenode[outgrno - 1]);
  changed = true;
  lastop = reroott;
  move_printree();
  oldwritten = written;
  written = false;
}  /* changeoutgroup */


void redisplay()
{
  boolean done=false;

  waswritten = false;
  do {
    printf("\nNEXT? (Options: R # + - S . T U W O F H J K L C ? X Q) ");
    printf("(? 
for Help) "); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); if (strchr("R#+-S.TUWOFHJKLC?XQ",ch) != NULL){ switch (ch) { case 'R': rearrange(); break; case '#': nextinc(&dispchar, &dispword, &dispbit, chars, bits, &display, numsteps, weight); move_printree(); break; case '+': nextchar(&dispchar, &dispword, &dispbit, chars, bits, &display); move_printree(); break; case '-': prevchar(&dispchar, &dispword, &dispbit, chars, bits, &display); move_printree(); break; case 'S': show(&dispchar, &dispword, &dispbit, chars, bits, &display); move_printree(); break; case '.': move_printree(); break; case 'T': try(); break; case 'U': undo(); break; case 'W': treewrite(done); break; case 'O': changeoutgroup(); break; case 'F': flip(); break; case 'H': window(left, &leftedge, &topedge, hscroll, vscroll, treelines, screenlines, screenwidth, farthest, subtree); move_printree(); break; case 'J': window(downn, &leftedge, &topedge, hscroll, vscroll, treelines, screenlines, screenwidth, farthest, subtree); move_printree(); break; case 'K': window(upp, &leftedge, &topedge, hscroll, vscroll, treelines, screenlines, screenwidth, farthest, subtree); move_printree(); break; case 'L': window(right, &leftedge, &topedge, hscroll, vscroll, treelines, screenlines, screenwidth, farthest, subtree); move_printree(); break; case 'C': clade(); break; case '?': help("character"); move_printree(); break; case 'X': done = true; break; case 'Q': done = true; break; } } } while (!done); if (!written) { do { printf("Do you want to write out the tree to a file? (Y or N) "); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; } while (ch != 'Y' && ch != 'y' && ch != 'N' && ch != 'n'); } if (ch == 'Y' || ch == 'y') treewrite(done); } /* redisplay */ void treeconstruct() { /* constructs a binary tree from the pointers in treenode. 
*/ restoring = false; subtree = false; display = false; dispchar = 0; fullset = (1L << (bits + 1)) - (1L << 1); earlytree = true; buildtree(); waswritten = false; printf("\nComputing steps needed for compatibility in characters...\n\n"); newtree = true; earlytree = false; move_printree(); bestyet = -like; gotlike = -like; lastop = none; newtree = false; written = false; redisplay(); } /* treeconstruct */ int main(int argc, Char *argv[]) { /* Interactive mixed parsimony */ /* reads in spp, chars, and the data. Then calls treeconstruct to */ /* construct the tree and query the user */ #ifdef MAC argc = 1; /* macsetup("Move",""); */ argv[0] = "Move"; #endif init(argc, argv); progname = argv[0]; strcpy(infilename,INFILE); strcpy(intreename,INTREE); strcpy(outtreename,OUTTREE); openfile(&infile,infilename,"input file", "r",argv[0],infilename); screenlines = 24; scrollinc = 20; screenwidth = 80; topedge = 1; leftedge = 1; ibmpc = IBMCRT; ansi = ANSICRT; root = NULL; bits = 8*sizeof(long) - 1; doinput(); configure(); treeconstruct(); FClose(outtree); #ifdef MAC fixmacfile(outtreename); #endif #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } /* Interactive mixed parsimony */ phylip-3.697/src/moves.c0000644004732000473200000001640112406201116014624 0ustar joefelsenst_g #include "phylip.h" #include "moves.h" void inpnum(long *n, boolean *success) { /* used by dnamove, dolmove, move, & retree */ int fields; char line[100]; #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); getstryng(line); *n = atof(line); fields = sscanf(line,"%ld",n); *success = (fields == 1); } /* inpnum */ void prereverse(boolean ansi) { /* turn on reverse video */ printf(ansi ? "\033[7m": ""); } /* prereverse */ void postreverse(boolean ansi) { /* turn off reverse video */ printf(ansi ? 
"\033[0m" : "");
}  /* postreverse */


void chwrite(Char ch, long num, long *pos, long leftedge, long screenwidth)
{
  long i;

  for (i = 1; i <= num; i++) {
    if ((*pos) >= leftedge && (*pos) - leftedge + 1 < screenwidth)
      putchar(ch);
    (*pos)++;
  }
}  /* chwrite */


void nnwrite(long nodenum,long num,long *pos,long leftedge,long screenwidth)
{
  long i, leftx;

  leftx = leftedge - (*pos);
  if ((*pos) >= leftedge && (*pos) - leftedge + num < screenwidth)
    printf("%*ld", (int)num, nodenum);
  else if (leftx > 0 && leftx < 3)
    /* field starts left of the window: pad only the visible columns
       (loop bound assumed to be num - leftx) */
    for (i = 0; i < num - leftx; i++)
      putchar(' ');
  (*pos) += num;
}  /* nnwrite */


void stwrite(const char *s, long length, long *pos, long leftedge,
             long screenwidth)
{
  if ((*pos) >= leftedge && (*pos) - leftedge + 1 < screenwidth)
    printf("%*s", (int)length, s);
  (*pos) += length;
}  /* stwrite */


void help(const char *letters)
{
  /* display help information */
  char input[100];

  printf("\n\nR Rearrange a tree by moving a node or group\n");
  printf("# Show the states of the next %s that doesn't fit tree\n", letters);
  printf("+ Show the states of the next %s\n", letters);
  printf("- ... of the previous %s\n", letters);
  printf("S Show the states of a given %s\n", letters);
  printf(". redisplay the same tree again\n");
  printf("T Try all possible positions of a node or group\n");
  printf("U Undo the most recent rearrangement\n");
  printf("W Write tree to a file\n");
  printf("O select an Outgroup for the tree\n");
  printf("F Flip (rotate) branches at a node\n");
  printf("H Move viewing window to the left\n");
  printf("J Move viewing window downward\n");
  printf("K Move viewing window upward\n");
  printf("L Move viewing window to the right\n");
  printf("C show only one Clade (subtree) (useful if tree is too big)\n");
  printf("? 
Help (this screen)\n"); printf("Q (Quit) Exit from program\n"); printf("X Exit from program\n\n\n"); printf("TO CONTINUE, PRESS ON THE Return OR Enter KEY"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); getstryng(input); } /* help */ void treeoptions(boolean waswritten, Char *ch, FILE **outtree, Char *outtreename, Char *progname) { /* interactively get options for writing a tree */ char input[100]; if (waswritten) { printf("\nTree file already was open.\n"); printf(" A Add to this tree to tree file\n"); printf(" R Replace tree file contents by this tree\n"); printf(" F Write out tree to a different tree file\n"); printf(" N Do Not write out this tree\n"); do { printf("Which should we do? "); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); getstryng(input); *ch = input[0]; uppercase(ch); } while (*ch != 'A' && *ch != 'R' && *ch != 'N' && *ch != 'F'); } if (*ch == 'F'){ outtreename[0] = '\0'; while (outtreename[0] =='\0'){ printf("Please enter a tree file name>"); #ifdef MAC fixmacfile(outtreename); #endif #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); getstryng(outtreename); } FClose(*outtree); } if (*ch == 'R' || *ch == 'A' || *ch == 'F' || !waswritten){ openfile(outtree,outtreename,"output tree file", (*ch == 'A' && waswritten) ? "a" : "w", progname,outtreename); } } /* treeoptions */ void window(adjwindow action, long *leftedge, long *topedge, long hscroll, long vscroll, long treelines, long screenlines, long screenwidth, long farthest, boolean subtree) { /* move viewing window of tree */ switch (action) { case left: if (*leftedge != 1) *leftedge -= hscroll; break; case downn: /* The 'topedge + 6' is needed to allow downward scrolling when part of the tree is above the screen and only 1 or 2 lines are below it. */ if (treelines - *topedge + 6 >= screenlines) *topedge += vscroll; break; case upp: if (*topedge != 1) *topedge -= vscroll; break; case right: if ((farthest + 6 + nmlngth + ((subtree) ? 
8 : 0)) > (*leftedge + screenwidth)) *leftedge += hscroll; break; } } /* window */ void pregraph(boolean ansi) { /* turn on graphic characters */ /* used in move & dolmove */ printf(ansi ? "\033(0" : ""); } /* pregraph */ void pregraph2(boolean ansi) { /* turn on graphic characters */ /* used in dnamove & retree */ if (ansi) { printf("\033(0"); printf("\033[10m"); } } /* pregraph2 */ void postgraph(boolean ansi) { /* turn off graphic characters */ /* used in move & dolmove */ printf(ansi ? "\033(B" : ""); } /* postgraph */ void postgraph2(boolean ansi) { /* turn off graphic characters */ /* used in dnamove & retree */ if (ansi) { printf("\033[11m"); printf("\033(B"); } } /* postgraph2 */ void nextinc(long *dispchar, long *dispword, long *dispbit, long chars, long bits, boolean *display, steptr numsteps, steptr weight) { /* show next incompatible character */ /* used in move & dolmove */ long disp0; boolean done; *display = true; disp0 = *dispchar; done = false; do { (*dispchar)++; if (*dispchar > chars) { *dispchar = 1; done = (disp0 == 0); } } while (!(numsteps[*dispchar - 1] > weight[*dispchar - 1] || *dispchar == disp0 || done)); *dispword = (*dispchar - 1) / bits + 1; *dispbit = (*dispchar - 1) % bits + 1; } /* nextinc */ void nextchar(long *dispchar, long *dispword, long *dispbit, long chars, long bits, boolean *display) { /* show next character */ /* used in move & dolmove */ *display = true; (*dispchar)++; if (*dispchar > chars) *dispchar = 1; *dispword = (*dispchar - 1) / bits + 1; *dispbit = (*dispchar - 1) % bits + 1; } /* nextchar */ void prevchar(long *dispchar, long *dispword, long *dispbit, long chars, long bits, boolean *display) { /* show previous character */ /* used in move & dolmove */ *display = true; (*dispchar)--; if (*dispchar < 1) *dispchar = chars; *dispword = (*dispchar - 1) / bits + 1; *dispbit = (*dispchar - 1) % bits + 1; } /* prevchar */ void show(long *dispchar, long *dispword, long *dispbit, long chars, long bits, boolean *display) { 
/* used in move & dolmove */ long i; boolean ok; do { printf("SHOW: (Character number or 0 to see none)? "); inpnum(&i, &ok); ok = (ok && (i == 0 || (i >= 1 && i <= chars))); if (ok && i != 0) { *display = true; *dispchar = i; *dispword = (i - 1) / bits + 1; *dispbit = (i - 1) % bits + 1; } if (ok && i == 0) *display = false; } while (!ok); } /* show */ phylip-3.697/src/moves.h0000644004732000473200000000176712406201116014642 0ustar joefelsenst_g /* moves.h: included in dnamove, move, dolmove, & retree */ typedef enum { left, downn, upp, right } adjwindow; #ifndef OLDC /* function prototypes */ void inpnum(long *, boolean *); void prereverse(boolean); void postreverse(boolean); void chwrite(Char, long, long *, long, long); void nnwrite(long, long, long *, long, long); void stwrite(const char *,long,long *,long,long); void help(const char *); void treeoptions(boolean, Char *, FILE **, Char *, Char *); void window(adjwindow, long *, long *, long, long, long, long, long, long, boolean); void pregraph(boolean); void pregraph2(boolean); void postgraph(boolean); void postgraph2(boolean); void nextinc(long *, long *, long *, long, long, boolean *, steptr, steptr); void nextchar(long *, long *, long *, long, long, boolean *); void prevchar(long *, long *, long *, long, long, boolean *); void show(long *, long *, long *, long, long, boolean *); /* function prototypes */ #endif phylip-3.697/src/neighbor.c0000644004732000473200000004054012406201116015271 0ustar joefelsenst_g/* version 3.696. Written by Mary Kuhner, Jon Yamato, Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. 
Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
*/

#include <float.h>   /* DBL_MAX, used in jointree */
#include "phylip.h"
#include "dist.h"

#ifndef OLDC
/* function prototypes */
void getoptions(void);
void allocrest(void);
void doinit(void);
void inputoptions(void);
void getinput(void);
void describe(node *, double);
void summarize(void);
void nodelabel(boolean);
void jointree(void);
void maketree(void);
void freerest(void);
/* function prototypes */
#endif

Char infilename[FNMLNGTH], outfilename[FNMLNGTH], outtreename[FNMLNGTH];
long nonodes2, outgrno, col, datasets, ith;
long inseed;
vector *x;
intvector *reps;
boolean jumble, lower, upper, outgropt, replicates, trout, printdata,
        progress, treeprint, mulsets, njoin;
tree curtree;
longer seed;
long *enterorder;
Char progname[20];

/* variables for maketree, propagated globally for C version: */
node **cluster;


void getoptions()
{
  /* interactively set options */
  long inseed0 = 0, loopcount;
  Char ch;

  fprintf(outfile, "\nNeighbor-Joining/UPGMA method version %s\n\n",VERSION);
  putchar('\n');
  jumble = false;
  lower = false;
  outgrno = 1;
  outgropt = false;
  replicates = 
false; trout = true; upper = false; printdata = false; progress = true; treeprint = true; njoin = true; loopcount = 0; for(;;) { cleerhome(); printf("\nNeighbor-Joining/UPGMA method version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" N Neighbor-joining or UPGMA tree? %s\n", (njoin ? "Neighbor-joining" : "UPGMA")); if (njoin) { printf(" O Outgroup root?"); if (outgropt) printf(" Yes, at species number%3ld\n", outgrno); else printf(" No, use as outgroup species%3ld\n", outgrno); } printf(" L Lower-triangular data matrix? %s\n", (lower ? "Yes" : "No")); printf(" R Upper-triangular data matrix? %s\n", (upper ? "Yes" : "No")); printf(" S Subreplicates? %s\n", (replicates ? "Yes" : "No")); printf(" J Randomize input order of species?"); if (jumble) printf(" Yes (random number seed =%8ld)\n", inseed0); else printf(" No. Use input order\n"); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld sets\n", datasets); else printf(" No\n"); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", (ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)")); printf(" 1 Print out the data at start of run %s\n", (printdata ? "Yes" : "No")); printf(" 2 Print indications of progress of run %s\n", (progress ? "Yes" : "No")); printf(" 3 Print out tree %s\n", (treeprint ? "Yes" : "No")); printf(" 4 Write out trees onto tree file? %s\n", (trout ? 
"Yes" : "No")); printf("\n\n Y to accept these or type the letter for one to change\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); if (ch == 'Y') break; if (strchr("NJOULRSM01234",ch) != NULL){ switch (ch) { case 'J': jumble = !jumble; if (jumble) initseed(&inseed, &inseed0, seed); break; case 'L': lower = !lower; break; case 'O': outgropt = !outgropt; if (outgropt) initoutgroup(&outgrno, spp); else outgrno = 1; break; case 'R': upper = !upper; break; case 'S': replicates = !replicates; break; case 'N': njoin = !njoin; break; case 'M': mulsets = !mulsets; if (mulsets) initdatasets(&datasets); jumble = true; if (jumble) initseed(&inseed, &inseed0, seed); break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress = !progress; break; case '3': treeprint = !treeprint; break; case '4': trout = !trout; break; } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } } /* getoptions */ void allocrest() { long i; x = (vector *)Malloc(spp*sizeof(vector)); for (i = 0; i < spp; i++) x[i] = (vector)Malloc(spp*sizeof(double)); reps = (intvector *)Malloc(spp*sizeof(intvector)); for (i = 0; i < spp; i++) reps[i] = (intvector)Malloc(spp*sizeof(long)); nayme = (naym *)Malloc(spp*sizeof(naym)); enterorder = (long *)Malloc(spp*sizeof(long)); cluster = (node **)Malloc(spp*sizeof(node *)); } /* allocrest */ void freerest() { long i; for (i = 0; i < spp; i++) free(x[i]); free(x); for (i = 0; i < spp; i++) free(reps[i]); free(reps); free(nayme); free(enterorder); free(cluster); } /* freerest */ void doinit() { /* initializes variables */ node *p; inputnumbers2(&spp, &nonodes2, 2); nonodes2 += (njoin ? 
0 : 1); getoptions(); alloctree(&curtree.nodep, nonodes2+1); p = curtree.nodep[nonodes2]->next; curtree.nodep[nonodes2]->next = curtree.nodep[nonodes2]; free(p->next); free(p); allocrest(); } /* doinit */ void inputoptions() { /* read options information */ if (ith != 1) samenumsp2(ith); putc('\n', outfile); if (njoin) fprintf(outfile, " Neighbor-joining method\n"); else fprintf(outfile, " UPGMA method\n"); fprintf(outfile, "\n Negative branch lengths allowed\n\n"); } /* inputoptions */ void describe(node *p, double height) { /* print out information for one branch */ long i; node *q; q = p->back; if (njoin) fprintf(outfile, "%4ld ", q->index - spp); else fprintf(outfile, "%4ld ", q->index - spp); if (p->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[p->index - 1][i], outfile); putc(' ', outfile); } else { if (njoin) fprintf(outfile, "%4ld ", p->index - spp); else { fprintf(outfile, "%4ld ", p->index - spp); } } if (njoin) fprintf(outfile, "%12.5f\n", q->v); else fprintf(outfile, "%10.5f %10.5f\n", q->v, q->v+height); if (!p->tip) { describe(p->next->back, height+q->v); describe(p->next->next->back, height+q->v); } } /* describe */ void summarize() { /* print out branch lengths etc. 
*/ putc('\n', outfile); if (njoin) { fprintf(outfile, "remember:"); if (outgropt) fprintf(outfile, " (although rooted by outgroup)"); fprintf(outfile, " this is an unrooted tree!\n"); } if (njoin) { fprintf(outfile, "\nBetween And Length\n"); fprintf(outfile, "------- --- ------\n"); } else { fprintf(outfile, "From To Length Height\n"); fprintf(outfile, "---- -- ------ ------\n"); } describe(curtree.start->next->back, 0.0); describe(curtree.start->next->next->back, 0.0); if (njoin) describe(curtree.start->back, 0.0); fprintf(outfile, "\n\n"); } /* summarize */ void nodelabel(boolean isnode) { if (isnode) printf("node"); else printf("species"); } /* nodelabel */ void jointree() { /* calculate the tree */ long nc, nextnode, mini=0, minj=0, i, j, ia, ja, ii, jj, nude, iter; double fotu2, total, tmin, dio, djo, bi, bj, bk, dmin=0, da; long el[3]; vector av; intvector oc; double *R; /* added in revisions by Y. Ina */ R = (double *)Malloc(spp * sizeof(double)); for (i = 0; i <= spp - 2; i++) { for (j = i + 1; j < spp; j++) { da = (x[i][j] + x[j][i]) / 2.0; x[i][j] = da; x[j][i] = da; } } /* First initialization */ fotu2 = spp - 2.0; nextnode = spp + 1; av = (vector)Malloc(spp*sizeof(double)); oc = (intvector)Malloc(spp*sizeof(long)); for (i = 0; i < spp; i++) { av[i] = 0.0; oc[i] = 1; } /* Enter the main cycle */ if (njoin) iter = spp - 3; else iter = spp - 1; for (nc = 1; nc <= iter; nc++) { for (j = 2; j <= spp; j++) { for (i = 0; i <= j - 2; i++) x[j - 1][i] = x[i][j - 1]; } tmin = DBL_MAX; /* Compute sij and minimize */ if (njoin) { /* many revisions by Y. Ina from here ... */ for (i = 0; i < spp; i++) R[i] = 0.0; for (ja = 2; ja <= spp; ja++) { jj = enterorder[ja - 1]; if (cluster[jj - 1] != NULL) { for (ia = 0; ia <= ja - 2; ia++) { ii = enterorder[ia]; if (cluster[ii - 1] != NULL) { R[ii - 1] += x[ii - 1][jj - 1]; R[jj - 1] += x[ii - 1][jj - 1]; } } } } } /* ... 
to here */ for (ja = 2; ja <= spp; ja++) { jj = enterorder[ja - 1]; if (cluster[jj - 1] != NULL) { for (ia = 0; ia <= ja - 2; ia++) { ii = enterorder[ia]; if (cluster[ii - 1] != NULL) { if (njoin) { total = fotu2 * x[ii - 1][jj - 1] - R[ii - 1] - R[jj - 1]; /* this statement part of revisions by Y. Ina */ } else total = x[ii - 1][jj - 1]; if (total < tmin) { tmin = total; mini = ii; minj = jj; } } } } } /* compute lengths and print */ if (njoin) { dio = 0.0; djo = 0.0; for (i = 0; i < spp; i++) { dio += x[i][mini - 1]; djo += x[i][minj - 1]; } dmin = x[mini - 1][minj - 1]; dio = (dio - dmin) / fotu2; djo = (djo - dmin) / fotu2; bi = (dmin + dio - djo) * 0.5; bj = dmin - bi; bi -= av[mini - 1]; bj -= av[minj - 1]; } else { bi = x[mini - 1][minj - 1] / 2.0 - av[mini - 1]; bj = x[mini - 1][minj - 1] / 2.0 - av[minj - 1]; av[mini - 1] += bi; } if (progress) { printf("Cycle %3ld: ", iter - nc + 1); if (njoin) nodelabel((boolean)(av[mini - 1] > 0.0)); else nodelabel((boolean)(oc[mini - 1] > 1.0)); printf(" %ld (%10.5f) joins ", mini, bi); if (njoin) nodelabel((boolean)(av[minj - 1] > 0.0)); else nodelabel((boolean)(oc[minj - 1] > 1.0)); printf(" %ld (%10.5f)\n", minj, bj); #ifdef WIN32 phyFillScreenColor(); #endif } hookup(curtree.nodep[nextnode - 1]->next, cluster[mini - 1]); hookup(curtree.nodep[nextnode - 1]->next->next, cluster[minj - 1]); cluster[mini - 1]->v = bi; cluster[minj - 1]->v = bj; cluster[mini - 1]->back->v = bi; cluster[minj - 1]->back->v = bj; cluster[mini - 1] = curtree.nodep[nextnode - 1]; cluster[minj - 1] = NULL; nextnode++; if (njoin) av[mini - 1] = dmin * 0.5; /* re-initialization */ fotu2 -= 1.0; for (j = 0; j < spp; j++) { if (cluster[j] != NULL) { if (njoin) { da = (x[mini - 1][j] + x[minj - 1][j]) * 0.5; if (mini - j - 1 < 0) x[mini - 1][j] = da; if (mini - j - 1 > 0) x[j][mini - 1] = da; } else { da = x[mini - 1][j] * oc[mini - 1] + x[minj - 1][j] * oc[minj - 1]; da /= oc[mini - 1] + oc[minj - 1]; x[mini - 1][j] = da; x[j][mini - 1] = da; } } 
    }
    for (j = 0; j < spp; j++) {
      x[minj - 1][j] = 0.0;
      x[j][minj - 1] = 0.0;
    }
    oc[mini - 1] += oc[minj - 1];
  }
  /* the last cycle */
  nude = 1;
  for (i = 1; i <= spp; i++) {
    if (cluster[i - 1] != NULL) {
      el[nude - 1] = i;
      nude++;
    }
  }
  if (!njoin) {
    curtree.start = cluster[el[0] - 1];
    curtree.start->back = NULL;
    free(av);
    free(oc);
    free(R);   /* R is allocated for both methods, so free it on the UPGMA path too */
    return;
  }
  bi = (x[el[0] - 1][el[1] - 1] + x[el[0] - 1][el[2] - 1] -
        x[el[1] - 1][el[2] - 1]) * 0.5;
  bj = x[el[0] - 1][el[1] - 1] - bi;
  bk = x[el[0] - 1][el[2] - 1] - bi;
  bi -= av[el[0] - 1];
  bj -= av[el[1] - 1];
  bk -= av[el[2] - 1];
  if (progress) {
    printf("last cycle:\n");
    putchar(' ');
    nodelabel((boolean)(av[el[0] - 1] > 0.0));
    printf(" %ld (%10.5f) joins ", el[0], bi);
    nodelabel((boolean)(av[el[1] - 1] > 0.0));
    printf(" %ld (%10.5f) joins ", el[1], bj);
    nodelabel((boolean)(av[el[2] - 1] > 0.0));
    printf(" %ld (%10.5f)\n", el[2], bk);
#ifdef WIN32
    phyFillScreenColor();
#endif
  }
  hookup(curtree.nodep[nextnode - 1], cluster[el[0] - 1]);
  hookup(curtree.nodep[nextnode - 1]->next, cluster[el[1] - 1]);
  hookup(curtree.nodep[nextnode - 1]->next->next, cluster[el[2] - 1]);
  cluster[el[0] - 1]->v = bi;
  cluster[el[1] - 1]->v = bj;
  cluster[el[2] - 1]->v = bk;
  cluster[el[0] - 1]->back->v = bi;
  cluster[el[1] - 1]->back->v = bj;
  cluster[el[2] - 1]->back->v = bk;
  curtree.start = cluster[el[0] - 1]->back;
  free(av);
  free(oc);
  free(R);
}  /* jointree */


void maketree()
{
  /* construct the tree */
  long i ;

  inputdata(replicates, printdata, lower, upper, x, reps);
  if (njoin && (spp < 3)) {
    printf("\nERROR: Neighbor-Joining runs must have at least 3 species\n\n");
    exxit(-1);
  }
  if (progress)
    putchar('\n');
  if (ith == 1)
    setuptree(&curtree, nonodes2 + 1);
  for (i = 1; i <= spp; i++)
    enterorder[i - 1] = i;
  if (jumble)
    randumize(seed, enterorder);
  for (i = 0; i < spp; i++)
    cluster[i] = curtree.nodep[i];
  jointree();
  if (njoin)
    curtree.start = curtree.nodep[outgrno - 1]->back;
  printree(curtree.start, treeprint, njoin, (boolean)(!njoin));
  if (treeprint)
    summarize();
  if (trout) {
col = 0; if (njoin) treeout(curtree.start, &col, 0.43429448222, njoin, curtree.start); else curtree.root = curtree.start, treeoutr(curtree.start,&col,&curtree); } if (progress) { printf("\nOutput written on file \"%s\"\n\n", outfilename); if (trout) printf("Tree written on file \"%s\"\n\n", outtreename); } } /* maketree */ int main(int argc, Char *argv[]) { /* main program */ #ifdef MAC argc = 1; /* macsetup("Neighbor",""); */ argv[0] = "Neighbor"; #endif init(argc, argv); openfile(&infile,INFILE,"input file", "r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename); ibmpc = IBMCRT; ansi = ANSICRT; mulsets = false; datasets = 1; doinit(); if (trout) openfile(&outtree,OUTTREE,"output tree file", "w",argv[0],outtreename); ith = 1; while (ith <= datasets) { if (datasets > 1) { fprintf(outfile, "Data set # %ld:\n",ith); if (progress) printf("Data set # %ld:\n",ith); } inputoptions(); maketree(); if (eoln(infile) && (ith < datasets)) scan_eoln(infile); ith++; } FClose(infile); FClose(outfile); FClose(outtree); freerest(); freetree(&curtree.nodep, nonodes2+1); #ifdef MAC fixmacfile(outfilename); fixmacfile(outtreename); #endif printf("Done.\n\n"); #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } phylip-3.697/src/pars.c0000644004732000473200000014256012406201116014446 0ustar joefelsenst_g #include "phylip.h" #include "discrete.h" /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. 
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #define MAXNUMTREES 1000000 /* bigger than number of user trees can be */ #ifndef OLDC /* function prototypes */ void getoptions(void); void allocrest(void); void doinit(void); void makeweights(void); void doinput(void); void initparsnode(node **, node **, node *, long, long, long *, long *, initops, pointarray, pointarray, Char *, Char *, FILE *); void evaluate(node *); void tryadd(node *, node *, node *); void addpreorder(node *, node *, node *); void trydescendants(node *, node *, node *, node *, boolean); void trylocal(node *, node *); void trylocal2(node *, node *, node *); void tryrearr(node *p, boolean *); void repreorder(node *p, boolean *); void rearrange(node **); void describe(void); void pars_coordinates(node *, double, long *, double *); void pars_printree(void); void globrearrange(void); void grandrearr(void); void maketree(void); void freerest(void); void load_tree(long treei); void reallocchars(void); /* function prototypes */ #endif Char infilename[FNMLNGTH], outfilename[FNMLNGTH], intreename[FNMLNGTH], 
outtreename[FNMLNGTH], weightfilename[FNMLNGTH]; node *root; long chars, col, msets, ith, njumble, jumb, maxtrees; /* chars = number of sites in actual sequences */ long inseed, inseed0; double threshold; boolean jumble, usertree, thresh, weights, thorough, rearrfirst, trout, progress, stepbox, ancseq, mulsets, justwts, firstset, mulf, multf; steptr oldweight; longer seed; pointarray treenode; /* pointers to all nodes in tree */ long *enterorder; char *progname; long *zeros; unsigned char *zeros2; /* local variables for Pascal maketree, propagated globally for C version: */ long minwhich; double like, minsteps, bestyet, bestlike, bstlike2; boolean lastrearr, recompute; double nsteps[maxuser]; long **fsteps; node *there, *oldnufork; long *place; bestelm *bestrees; long *threshwt; discbaseptr nothing; gbases *garbage; node *temp, *temp1, *temp2, *tempsum, *temprm, *tempadd, *tempf, *tmp, *tmp1, *tmp2, *tmp3, *tmprm, *tmpadd; boolean *names; node *grbg; void getoptions() { /* interactively set options */ long inseed0, loopcount, loopcount2; Char ch, ch2; fprintf(outfile, "\nDiscrete character parsimony algorithm, version %s\n\n", VERSION); jumble = false; njumble = 1; outgrno = 1; outgropt = false; thresh = false; thorough = true; rearrfirst = false; maxtrees = 100; trout = true; usertree = false; weights = false; mulsets = false; printdata = false; progress = true; treeprint = true; stepbox = false; ancseq = false; dotdiff = true; interleaved = true; loopcount = 0; for (;;) { cleerhome(); printf("\nDiscrete character parsimony algorithm, version %s\n\n",VERSION); printf("Setting for this run:\n"); printf(" U Search for best tree? %s\n", (usertree ? "No, use user trees in input file" : "Yes")); if (!usertree) { printf(" S Search option? "); if (thorough) printf("More thorough search\n"); else if (rearrfirst) printf("Rearrange on one best tree\n"); else printf("Less thorough\n"); printf(" V Number of trees to save? 
%ld\n", maxtrees); printf(" J Randomize input order of species?"); if (jumble) printf(" Yes (seed =%8ld,%3ld times)\n", inseed0, njumble); else printf(" No. Use input order\n"); } printf(" O Outgroup root?"); if (outgropt) printf(" Yes, at species number %ld\n", outgrno); else printf(" No, use as outgroup species %ld\n", outgrno); printf(" T Use Threshold parsimony?"); if (thresh) printf(" Yes, count steps up to%4.1f per site\n", threshold); else printf(" No, use ordinary parsimony\n"); printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld %s\n", msets, (justwts ? "sets of weights" : "data sets")); else printf(" No\n"); printf(" I Input species interleaved? %s\n", (interleaved ? "Yes" : "No, sequential")); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" 1 Print out the data at start of run %s\n", (printdata ? "Yes" : "No")); printf(" 2 Print indications of progress of run %s\n", progress ? "Yes" : "No"); printf(" 3 Print out tree %s\n", treeprint ? "Yes" : "No"); printf(" 4 Print out steps in each site %s\n", stepbox ? "Yes" : "No"); printf(" 5 Print character at all nodes of tree %s\n", ancseq ? "Yes" : "No"); if (ancseq || printdata) printf(" . Use dot-differencing to display them %s\n", dotdiff ? "Yes" : "No"); printf(" 6 Write out trees onto tree file? %s\n", trout ? 
"Yes" : "No"); printf("\n Y to accept these or type the letter for one to change\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); if (ch == 'Y') break; if (((!usertree) && (strchr("WSVJOTUMI12345.60", ch) != NULL)) || (usertree && (strchr("WSVOTUMI12345.60", ch) != NULL))){ switch (ch) { case 'J': jumble = !jumble; if (jumble) initjumble(&inseed, &inseed0, seed, &njumble); else njumble = 1; break; case 'O': outgropt = !outgropt; if (outgropt) initoutgroup(&outgrno, spp); break; case 'T': thresh = !thresh; if (thresh) initthreshold(&threshold); break; case 'W': weights = !weights; break; case 'M': mulsets = !mulsets; if (mulsets) { printf("Multiple data sets or multiple weights?"); loopcount2 = 0; do { printf(" (type D or W)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch2); getchar(); if (ch2 == '\n') ch2 = ' '; uppercase(&ch2); countup(&loopcount2, 10); } while ((ch2 != 'W') && (ch2 != 'D')); justwts = (ch2 == 'W'); if (justwts) justweights(&msets); else initdatasets(&msets); if (!jumble) { jumble = true; initjumble(&inseed, &inseed0, seed, &njumble); } } break; case 'U': usertree = !usertree; break; case 'S': thorough = !thorough; if (!thorough) { printf("Rearrange on just one best tree?"); loopcount2 = 0; do { printf(" (type Y or N)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch2); getchar(); if (ch2 == '\n') ch2 = ' '; uppercase(&ch2); countup(&loopcount2, 10); } while ((ch2 != 'Y') && (ch2 != 'N')); rearrfirst = (ch2 == 'Y'); } break; case 'V': loopcount2 = 0; do { printf("type the number of trees to save\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%ld%*[^\n]", &maxtrees); getchar(); if (maxtrees > MAXNUMTREES) maxtrees = MAXNUMTREES; countup(&loopcount2, 10); } while (maxtrees < 1); break; case 'I': interleaved = !interleaved; break; case '0': initterminal(&ibmpc, 
                     &ansi);
        break;

      case '1':
        printdata = !printdata;
        break;

      case '2':
        progress = !progress;
        break;

      case '3':
        treeprint = !treeprint;
        break;

      case '4':
        stepbox = !stepbox;
        break;

      case '5':
        ancseq = !ancseq;
        break;

      case '.':
        dotdiff = !dotdiff;
        break;

      case '6':
        trout = !trout;
        break;
      }
    } else
      printf("Not a possible option!\n");
    countup(&loopcount, 100);
  }
}  /* getoptions */


void reallocchars()
{
  long i;

  for (i = 0; i < spp; i++) {
    free(y[i]);
    y[i] = (Char *)Malloc(chars*sizeof(Char));
  }
  for (i = 0; i < spp; i++){
    free(convtab[i]);
    convtab[i] = (Char *)Malloc(chars*sizeof(Char));
  }
  free(weight);
  free(oldweight);
  free(alias);
  free(ally);
  free(location);
  weight = (long *)Malloc(chars*sizeof(long));
  oldweight = (long *)Malloc(chars*sizeof(long));
  alias = (long *)Malloc(chars*sizeof(long));
  ally = (long *)Malloc(chars*sizeof(long));
  location = (long *)Malloc(chars*sizeof(long));
}


void allocrest()
{
  long i;

  y = (Char **)Malloc(spp*sizeof(Char *));
  for (i = 0; i < spp; i++)
    y[i] = (Char *)Malloc(chars*sizeof(Char));
  convtab = (Char **)Malloc(spp*sizeof(Char *));
  for (i = 0; i < spp; i++)
    convtab[i] = (Char *)Malloc(chars*sizeof(Char));
  bestrees = (bestelm *)Malloc(maxtrees*sizeof(bestelm));
  for (i = 1; i <= maxtrees; i++)
    bestrees[i - 1].btree = (long *)Malloc(nonodes*sizeof(long));
  nayme = (naym *)Malloc(spp*sizeof(naym));
  enterorder = (long *)Malloc(spp*sizeof(long));
  place = (long *)Malloc(nonodes*sizeof(long));
  weight = (long *)Malloc(chars*sizeof(long));
  oldweight = (long *)Malloc(chars*sizeof(long));
  alias = (long *)Malloc(chars*sizeof(long));
  ally = (long *)Malloc(chars*sizeof(long));
  location = (long *)Malloc(chars*sizeof(long));
}  /* allocrest */


void doinit()
{
  /* initializes variables */

  inputnumbers(&spp, &chars, &nonodes, 1);
  getoptions();
  if (printdata)
    fprintf(outfile, "%2ld species, %3ld sites\n\n", spp, chars);
  alloctree(&treenode, nonodes, usertree);
  allocrest();
}  /* doinit */


void makeweights()
{
  /* make up weights vector to avoid duplicate computations */
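  /* Illustrative note, not part of the original source: a worked example
     of the threshold scaling that this function performs further down
     (weight[i] *= 10; threshwt[i] = (long)(threshold * weight[i] + 0.5)).
     It assumes, without checking disc.c here, that numsteps[] is
     accumulated in the same scaled weight units.

       site weight 1, T option on with threshold 2.0:
         weight[i]   = 1 * 10                  = 10
         threshwt[i] = (long)(2.0 * 10 + 0.5)  = 20

     evaluate() adds min(numsteps[i], threshwt[i]) for each site, so under
     that assumption a site needing 3 steps (30 scaled units) would
     contribute 20, and describe() divides the total by 10 when printing,
     which would report 2.0 for that site. With the T option off,
     threshold is set to spp, so the cap becomes 10 * spp per unit of
     site weight. */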
long i; for (i = 1; i <= chars; i++) { alias[i - 1] = i; oldweight[i - 1] = weight[i - 1]; ally[i - 1] = i; } sitesort(chars, weight); sitecombine(chars); sitescrunch(chars); endsite = 0; for (i = 1; i <= chars; i++) { if (ally[i - 1] == i) endsite++; } for (i = 1; i <= endsite; i++) location[alias[i - 1] - 1] = i; if (!thresh) threshold = spp; threshwt = (long *)Malloc(endsite*sizeof(long)); for (i = 0; i < endsite; i++) { weight[i] *= 10; threshwt[i] = (long)(threshold * weight[i] + 0.5); } zeros = (long *)Malloc(endsite*sizeof(long)); for (i = 0; i < endsite; i++) zeros[i] = 0; zeros2 = (unsigned char *)Malloc(endsite*sizeof(unsigned char)); for (i = 0; i < endsite; i++) zeros2[i] = 0; } /* makeweights */ void doinput() { /* reads the input data */ long i; if (justwts) { if (firstset) inputdata(chars); for (i = 0; i < chars; i++) weight[i] = 1; inputweights(chars, weight, &weights); if (justwts) { fprintf(outfile, "\n\nWeights set # %ld:\n\n", ith); if (progress) printf("\nWeights set # %ld:\n\n", ith); } if (printdata) printweights(outfile, 0, chars, weight, "Sites"); } else { if (!firstset) { samenumsp(&chars, ith); reallocchars(); } inputdata(chars); for (i = 0; i < chars; i++) weight[i] = 1; if (weights) { inputweights(chars, weight, &weights); if (printdata) printweights(outfile, 0, chars, weight, "Sites"); } } makeweights(); makevalues(treenode, zeros, zeros2, usertree); if (!usertree) { allocdiscnode(&temp, zeros, zeros2, endsite); allocdiscnode(&temp1, zeros, zeros2, endsite); allocdiscnode(&temp2, zeros, zeros2, endsite); allocdiscnode(&tempsum, zeros, zeros2, endsite); allocdiscnode(&temprm, zeros, zeros2, endsite); allocdiscnode(&tempadd, zeros, zeros2, endsite); allocdiscnode(&tempf, zeros, zeros2, endsite); allocdiscnode(&tmp, zeros, zeros2, endsite); allocdiscnode(&tmp1, zeros, zeros2, endsite); allocdiscnode(&tmp2, zeros, zeros2, endsite); allocdiscnode(&tmp3, zeros, zeros2, endsite); allocdiscnode(&tmprm, zeros, zeros2, endsite); 
allocdiscnode(&tmpadd, zeros, zeros2, endsite); } } /* doinput */ void initparsnode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnudisctreenode(grbg, p, nodei, endsite, zeros, zeros2); treenode[nodei - 1] = *p; break; case nonbottom: gnudisctreenode(grbg, p, nodei, endsite, zeros, zeros2); break; case tip: match_names_to_data (str, treenode, p, spp); break; case length: /* if there is a length, read it and discard value */ processlength(&valyew, &divisor, ch, &minusread, intree, parens); break; default: /*cases hslength,hsnolength,treewt,unittrwt,iter,*/ break; /*length should never occur */ } } /* initparsnode */ void evaluate(node *r) { /* determines the number of steps needed for a tree. this is the minimum number of steps needed to evolve sequences on this tree */ long i, steps; long term; double sum; sum = 0.0; for (i = 0; i < endsite; i++) { steps = r->numsteps[i]; if ((long)steps <= threshwt[i]) term = steps; else term = threshwt[i]; sum += (double)term; if (usertree && which <= maxuser) fsteps[which - 1][i] = term; } if (usertree && which <= maxuser) { nsteps[which - 1] = sum; if (which == 1) { minwhich = 1; minsteps = sum; } else if (sum < minsteps) { minwhich = which; minsteps = sum; } } like = -sum; } /* evaluate */ void tryadd(node *p, node *item, node *nufork) { /* temporarily adds one fork and one tip to the tree. 
if the location where they are added yields greater "likelihood" than other locations tested up to that time, then keeps that location as there */ long pos; double belowsum, parentsum; boolean found, collapse, changethere, trysave; if (!p->tip) { memcpy(temp->discbase, p->discbase, endsite*sizeof(unsigned char)); memcpy(temp->numsteps, p->numsteps, endsite*sizeof(long)); memcpy(temp->discnumnuc, p->discnumnuc, endsite*sizeof(discnucarray)); temp->numdesc = p->numdesc + 1; if (p->back) { multifillin(temp, tempadd, 1); sumnsteps2(tempsum, temp, p->back, 0, endsite, threshwt); } else { multisumnsteps(temp, tempadd, 0, endsite, threshwt); tempsum->sumsteps = temp->sumsteps; } if (tempsum->sumsteps <= -bestyet) { if (p->back) sumnsteps2(tempsum, temp, p->back, endsite+1, endsite, threshwt); else { multisumnsteps(temp, temp1, endsite+1, endsite, threshwt); tempsum->sumsteps = temp->sumsteps; } } p->sumsteps = tempsum->sumsteps; } if (p == root) sumnsteps2(temp, item, p, 0, endsite, threshwt); else { sumnsteps(temp1, item, p, 0, endsite); sumnsteps2(temp, temp1, p->back, 0, endsite, threshwt); } if (temp->sumsteps <= -bestyet) { if (p == root) sumnsteps2(temp, item, p, endsite+1, endsite, threshwt); else { sumnsteps(temp1, item, p, endsite+1, endsite); sumnsteps2(temp, temp1, p->back, endsite+1, endsite, threshwt); } } belowsum = temp->sumsteps; multf = false; like = -belowsum; if (!p->tip && belowsum >= p->sumsteps) { multf = true; like = -p->sumsteps; } trysave = true; if (!multf && p != root) { parentsum = treenode[p->back->index - 1]->sumsteps; if (belowsum >= parentsum) trysave = false; } if (lastrearr) { changethere = true; if (like >= bstlike2 && trysave) { if (like > bstlike2) found = false; else { addnsave(p, item, nufork, &root, &grbg, multf, treenode, place, zeros, zeros2); pos = 0; findtree(&found, &pos, nextree, place, bestrees); } if (!found) { collapse = collapsible(item, p, temp, temp1, temp2, tempsum, temprm, tmpadd, multf, root, zeros, zeros2, treenode); 
if (!thorough) changethere = !collapse; if (thorough || !collapse || like > bstlike2 || (nextree == 1)) { if (like > bstlike2) { addnsave(p, item, nufork, &root, &grbg, multf, treenode, place, zeros, zeros2); bestlike = bstlike2 = like; addbestever(&pos, &nextree, maxtrees, collapse, place, bestrees); } else addtiedtree(pos, &nextree, maxtrees, collapse, place, bestrees); } } } if (like >= bestyet) { if (like > bstlike2) bstlike2 = like; if (changethere && trysave) { bestyet = like; there = p; mulf = multf; } } } else if ((like > bestyet) || (like >= bestyet && trysave)) { bestyet = like; there = p; mulf = multf; } } /* tryadd */ void addpreorder(node *p, node *item, node *nufork) { /* traverses a n-ary tree, calling PROCEDURE tryadd at a node before calling tryadd at its descendants */ node *q; if (p == NULL) return; tryadd(p, item, nufork); if (!p->tip) { q = p->next; while (q != p) { addpreorder(q->back, item, nufork); q = q->next; } } } /* addpreorder */ void trydescendants(node *item, node *forknode, node *parent, node *parentback, boolean trybelow) { /* tries rearrangements at parent and below parent's descendants */ node *q, *tempblw; boolean bestever=0, belowbetter, multf=0, saved, trysave; double parentsum=0, belowsum; memcpy(temp->discbase, parent->discbase, endsite*sizeof(unsigned char)); memcpy(temp->numsteps, parent->numsteps, endsite*sizeof(long)); memcpy(temp->discnumnuc, parent->discnumnuc, endsite*sizeof(discnucarray)); temp->numdesc = parent->numdesc + 1; multifillin(temp, tempadd, 1); sumnsteps2(tempsum, parentback, temp, 0, endsite, threshwt); belowbetter = true; if (lastrearr) { parentsum = tempsum->sumsteps; if (-tempsum->sumsteps >= bstlike2) { belowbetter = false; bestever = false; multf = true; if (-tempsum->sumsteps > bstlike2) bestever = true; savelocrearr(item, forknode, parent, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros, zeros2); if (saved) { like = 
bstlike2 = -tempsum->sumsteps; there = parent; mulf = true; } } } else if (-tempsum->sumsteps >= like) { there = parent; mulf = true; like = -tempsum->sumsteps; } if (trybelow) { sumnsteps(temp, parent, tempadd, 0, endsite); sumnsteps2(tempsum, temp, parentback, 0, endsite, threshwt); if (lastrearr) { belowsum = tempsum->sumsteps; if (-tempsum->sumsteps >= bstlike2 && belowbetter && (forknode->numdesc > 2 || (forknode->numdesc == 2 && parent->back->index != forknode->index))) { trysave = false; memcpy(temp->discbase, parentback->discbase, endsite*sizeof(unsigned char)); memcpy(temp->numsteps, parentback->numsteps, endsite*sizeof(long)); memcpy(temp->discnumnuc, parentback->discnumnuc, endsite*sizeof(discnucarray)); temp->numdesc = parentback->numdesc + 1; multifillin(temp, tempadd, 1); sumnsteps2(tempsum, parent, temp, 0, endsite, threshwt); if (-tempsum->sumsteps < bstlike2) { multf = false; bestever = false; trysave = true; } if (-belowsum > bstlike2) { multf = false; bestever = true; trysave = true; } if (trysave) { if (treenode[parent->index - 1] != parent) tempblw = parent->back; else tempblw = parent; savelocrearr(item, forknode, tempblw, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros, zeros2); if (saved) { like = bstlike2 = -belowsum; there = tempblw; mulf = false; } } } } else if (-tempsum->sumsteps > like) { like = -tempsum->sumsteps; if (-tempsum->sumsteps > bestyet) { if (treenode[parent->index - 1] != parent) tempblw = parent->back; else tempblw = parent; there = tempblw; mulf = false; } } } q = parent->next; while (q != parent) { if (q->back && q->back != item) { memcpy(temp1->discbase, q->discbase, endsite*sizeof(unsigned char)); memcpy(temp1->numsteps, q->numsteps, endsite*sizeof(long)); memcpy(temp1->discnumnuc, q->discnumnuc, endsite*sizeof(discnucarray)); temp1->numdesc = q->numdesc; multifillin(temp1, parentback, 0); if (lastrearr) belowbetter = (-parentsum < 
bstlike2); if (!q->back->tip) { memcpy(temp->discbase, q->back->discbase, endsite*sizeof(unsigned char)); memcpy(temp->numsteps, q->back->numsteps, endsite*sizeof(long)); memcpy(temp->discnumnuc, q->back->discnumnuc, endsite*sizeof(discnucarray)); temp->numdesc = q->back->numdesc + 1; multifillin(temp, tempadd, 1); sumnsteps2(tempsum, temp1, temp, 0, endsite, threshwt); if (lastrearr) { if (-tempsum->sumsteps >= bstlike2) { belowbetter = false; bestever = false; multf = true; if (-tempsum->sumsteps > bstlike2) bestever = true; savelocrearr(item, forknode, q->back, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros, zeros2); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = q->back; mulf = true; } } } else if (-tempsum->sumsteps >= like) { like = -tempsum->sumsteps; there = q->back; mulf = true; } } sumnsteps(temp, q->back, tempadd, 0, endsite); sumnsteps2(tempsum, temp, temp1, 0, endsite, threshwt); if (lastrearr) { if (-tempsum->sumsteps >= bstlike2) { trysave = false; multf = false; if (belowbetter) { bestever = false; trysave = true; } if (-tempsum->sumsteps > bstlike2) { bestever = true; trysave = true; } if (trysave) { if (treenode[q->back->index - 1] != q->back) tempblw = q; else tempblw = q->back; savelocrearr(item, forknode, tempblw, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros, zeros2); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = tempblw; mulf = false; } } } } else if (-tempsum->sumsteps > like) { like = -tempsum->sumsteps; if (-tempsum->sumsteps > bestyet) { if (treenode[q->back->index - 1] != q->back) tempblw = q; else tempblw = q->back; there = tempblw; mulf = false; } } } q = q->next; } } /* trydescendants */ void trylocal(node *item, node *forknode) { /* rearranges below forknode, below descendants of forknode when there are more than 2 descendants, then unroots the 
back of forknode and rearranges on its descendants */ node *q; boolean bestever, multf, saved; memcpy(temprm->discbase, zeros2, endsite*sizeof(unsigned char)); memcpy(temprm->numsteps, zeros, endsite*sizeof(long)); memcpy(temprm->olddiscbase, item->discbase, endsite*sizeof(unsigned char)); memcpy(temprm->oldnumsteps, item->numsteps, endsite*sizeof(long)); memcpy(tempf->discbase, forknode->discbase, endsite*sizeof(unsigned char)); memcpy(tempf->numsteps, forknode->numsteps, endsite*sizeof(long)); memcpy(tempf->discnumnuc, forknode->discnumnuc, endsite*sizeof(discnucarray)); tempf->numdesc = forknode->numdesc - 1; multifillin(tempf, temprm, -1); if (!forknode->back) { sumnsteps2(tempsum, tempf, tempadd, 0, endsite, threshwt); if (lastrearr) { if (-tempsum->sumsteps > bstlike2) { bestever = true; multf = false; savelocrearr(item, forknode, forknode, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros, zeros2); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = forknode; mulf = false; } } } else if (-tempsum->sumsteps > like) { like = -tempsum->sumsteps; if (-tempsum->sumsteps > bestyet) { there = forknode; mulf = false; } } } else { sumnsteps(temp, tempf, tempadd, 0, endsite); sumnsteps2(tempsum, temp, forknode->back, 0, endsite, threshwt); if (lastrearr) { if (-tempsum->sumsteps > bstlike2) { bestever = true; multf = false; savelocrearr(item, forknode, forknode, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros, zeros2); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = forknode; mulf = false; } } } else if (-tempsum->sumsteps > like) { like = -tempsum->sumsteps; if (-tempsum->sumsteps > bestyet) { there = forknode; mulf = false; } } trydescendants(item, forknode, forknode->back, tempf, false); } q = forknode->next; while (q != forknode) { if (q->back != item) { memcpy(temp2->discbase, 
q->discbase, endsite*sizeof(unsigned char)); memcpy(temp2->numsteps, q->numsteps, endsite*sizeof(long)); memcpy(temp2->discnumnuc, q->discnumnuc, endsite*sizeof(discnucarray)); temp2->numdesc = q->numdesc - 1; multifillin(temp2, temprm, -1); if (!q->back->tip) { trydescendants(item, forknode, q->back, temp2, true); } else { sumnsteps(temp1, q->back, tempadd, 0, endsite); sumnsteps2(tempsum, temp1, temp2, 0, endsite, threshwt); if (lastrearr) { if (-tempsum->sumsteps > bstlike2) { multf = false; bestever = true; savelocrearr(item, forknode, q->back, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros, zeros2); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = q->back; mulf = false; } } } else if (-tempsum->sumsteps > like) { like = -tempsum->sumsteps; if (-tempsum->sumsteps > bestyet) { there = q->back; mulf = false; } } } } q = q->next; } } /* trylocal */ void trylocal2(node *item, node *forknode, node *other) { /* rearranges below forknode, below descendants of forknode when there are more than 2 descendants, then unroots the back of forknode and rearranges on its descendants. 
Used if forknode has binary descendants */ node *q; boolean bestever=0, multf, saved, belowbetter, trysave; memcpy(tempf->discbase, other->discbase, endsite*sizeof(unsigned char)); memcpy(tempf->numsteps, other->numsteps, endsite*sizeof(long)); memcpy(tempf->olddiscbase, forknode->discbase, endsite*sizeof(unsigned char)); memcpy(tempf->oldnumsteps, forknode->numsteps, endsite*sizeof(long)); tempf->numdesc = other->numdesc; if (forknode->back) trydescendants(item, forknode, forknode->back, tempf, false); if (!other->tip) { memcpy(temp->discbase, other->discbase, endsite*sizeof(unsigned char)); memcpy(temp->numsteps, other->numsteps, endsite*sizeof(long)); memcpy(temp->discnumnuc, other->discnumnuc, endsite*sizeof(discnucarray)); temp->numdesc = other->numdesc + 1; multifillin(temp, tempadd, 1); if (forknode->back) sumnsteps2(tempsum, forknode->back, temp, 0, endsite, threshwt); else sumnsteps2(tempsum, NULL, temp, 0, endsite, threshwt); belowbetter = true; if (lastrearr) { if (-tempsum->sumsteps >= bstlike2) { belowbetter = false; bestever = false; multf = true; if (-tempsum->sumsteps > bstlike2) bestever = true; savelocrearr(item, forknode, other, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros, zeros2); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = other; mulf = true; } } } else if (-tempsum->sumsteps >= like) { there = other; mulf = true; like = -tempsum->sumsteps; } if (forknode->back) { memcpy(temprm->discbase, forknode->back->discbase, endsite*sizeof(unsigned char)); memcpy(temprm->numsteps, forknode->back->numsteps, endsite*sizeof(long)); } else { memcpy(temprm->discbase, zeros2, endsite*sizeof(unsigned char)); memcpy(temprm->numsteps, zeros, endsite*sizeof(long)); } memcpy(temprm->olddiscbase, other->back->discbase, endsite*sizeof(unsigned char)); memcpy(temprm->oldnumsteps, other->back->numsteps, endsite*sizeof(long)); q = other->next; while (q != other) { 
memcpy(temp2->discbase, q->discbase, endsite*sizeof(unsigned char)); memcpy(temp2->numsteps, q->numsteps, endsite*sizeof(long)); memcpy(temp2->discnumnuc, q->discnumnuc, endsite*sizeof(discnucarray)); if (forknode->back) { temp2->numdesc = q->numdesc; multifillin(temp2, temprm, 0); } else { temp2->numdesc = q->numdesc - 1; multifillin(temp2, temprm, -1); } if (!q->back->tip) trydescendants(item, forknode, q->back, temp2, true); else { sumnsteps(temp1, q->back, tempadd, 0, endsite); sumnsteps2(tempsum, temp1, temp2, 0, endsite, threshwt); if (lastrearr) { if (-tempsum->sumsteps >= bstlike2) { trysave = false; multf = false; if (belowbetter) { bestever = false; trysave = true; } if (-tempsum->sumsteps > bstlike2) { bestever = true; trysave = true; } if (trysave) { savelocrearr(item, forknode, q->back, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, &root, maxtrees, &nextree, multf, bestever, &saved, place, bestrees, treenode, &grbg, zeros, zeros2); if (saved) { like = bstlike2 = -tempsum->sumsteps; there = q->back; mulf = false; } } } } else if (-tempsum->sumsteps > like) { like = -tempsum->sumsteps; if (-tempsum->sumsteps > bestyet) { there = q->back; mulf = false; } } } q = q->next; } } } /* trylocal2 */ void tryrearr(node *p, boolean *success) { /* evaluates one rearrangement of the tree. if the new tree has greater "likelihood" than the old one sets success = TRUE and keeps the new tree. 
otherwise, restores the old tree */ node *forknode, *newfork, *other, *oldthere; double oldlike; boolean oldmulf; if (p->back == NULL) return; forknode = treenode[p->back->index - 1]; if (!forknode->back && forknode->numdesc <= 2 && alltips(forknode, p)) return; oldlike = bestyet; like = -10.0 * spp * chars; memcpy(tempadd->discbase, p->discbase, endsite*sizeof(unsigned char)); memcpy(tempadd->numsteps, p->numsteps, endsite*sizeof(long)); memcpy(tempadd->olddiscbase, zeros2, endsite*sizeof(unsigned char)); memcpy(tempadd->oldnumsteps, zeros, endsite*sizeof(long)); if (forknode->numdesc > 2) { oldthere = there = forknode; oldmulf = mulf = true; trylocal(p, forknode); } else { findbelow(&other, p, forknode); oldthere = there = other; oldmulf = mulf = false; trylocal2(p, forknode, other); } if ((like <= oldlike) || (there == oldthere && mulf == oldmulf)) return; recompute = true; re_move(p, &forknode, &root, recompute, treenode, &grbg, zeros, zeros2); if (mulf) add(there, p, NULL, &root, recompute, treenode, &grbg, zeros, zeros2); else { if (forknode->numdesc > 0) getnufork(&newfork, &grbg, treenode, zeros, zeros2); else newfork = forknode; add(there, p, newfork, &root, recompute, treenode, &grbg, zeros, zeros2); } if (like - oldlike > LIKE_EPSILON) { *success = true; bestyet = like; } } /* tryrearr */ void repreorder(node *p, boolean *success) { /* traverses a binary tree, calling PROCEDURE tryrearr at a node before calling tryrearr at its descendants */ node *q, *this; if (p == NULL) return; if (!p->visited) { tryrearr(p, success); p->visited = true; } if (!p->tip) { q = p; while (q->next != p) { this = q->next->back; repreorder(q->next->back,success); if (q->next->back == this) q = q->next; } } } /* repreorder */ void rearrange(node **r) { /* traverses the tree (preorder), finding any local rearrangement which decreases the number of steps. 
if traversal succeeds in increasing the tree's "likelihood", PROCEDURE rearrange runs traversal again */ boolean success=true; while (success) { success = false; clearvisited(treenode); repreorder(*r, &success); } } /* rearrange */ void describe() { /* prints ancestors, steps and table of numbers of steps in each site */ if (treeprint) { fprintf(outfile, "\nrequires a total of %10.3f\n", like / -10.0); fprintf(outfile, "\n between and length\n"); fprintf(outfile, " ------- --- ------\n"); printbranchlengths(root); } if (stepbox) writesteps(chars, weights, oldweight, root); if (ancseq) { hypstates(chars, root, treenode, &garbage); putc('\n', outfile); } putc('\n', outfile); if (trout) { col = 0; treeout3(root, nextree, &col, root); } } /* describe */ void pars_coordinates(node *p, double lengthsum, long *tipy, double *tipmax) { /* establishes coordinates of nodes */ node *q, *first, *last; double xx; if (p == NULL) return; if (p->tip) { p->xcoord = (long)(over * lengthsum + 0.5); p->ycoord = (*tipy); p->ymin = (*tipy); p->ymax = (*tipy); (*tipy) += down; if (lengthsum > (*tipmax)) (*tipmax) = lengthsum; return; } q = p->next; do { xx = q->v; if (xx > 100.0) xx = 100.0; pars_coordinates(q->back, lengthsum + xx, tipy,tipmax); q = q->next; } while (p != q); first = p->next->back; q = p; while (q->next != p) q = q->next; last = q->back; p->xcoord = (long)(over * lengthsum + 0.5); if ((p == root) || count_sibs(p) > 2) p->ycoord = p->next->next->back->ycoord; else p->ycoord = (first->ycoord + last->ycoord) / 2; p->ymin = first->ymin; p->ymax = last->ymax; } /* pars_coordinates */ void pars_printree() { /* prints out diagram of the tree2 */ long tipy; double scale, tipmax; long i; if (!treeprint) return; putc('\n', outfile); tipy = 1; tipmax = 0.0; pars_coordinates(root, 0.0, &tipy, &tipmax); scale = 1.0 / (long)(tipmax + 1.000); for (i = 1; i <= (tipy - down); i++) drawline3(i, scale, root); putc('\n', outfile); } /* pars_printree */ void globrearrange() { /* does global 
rearrangements */ long j; double gotlike; boolean frommulti; node *item, *nufork; recompute = true; do { printf(" "); gotlike = bestlike; for (j = 0; j < nonodes; j++) { bestyet = -10.0 * spp * chars; if (j < spp) item = treenode[enterorder[j] -1]; else item = treenode[j]; if ((item != root) && ((j < spp) || ((j >= spp) && (item->numdesc > 0))) && !((item->back->index == root->index) && (root->numdesc == 2) && alltips(root, item))) { re_move(item, &nufork, &root, recompute, treenode, &grbg, zeros, zeros2); frommulti = (nufork->numdesc > 0); clearcollapse(treenode); there = root; memcpy(tempadd->discbase, item->discbase, endsite*sizeof(unsigned char)); memcpy(tempadd->numsteps, item->numsteps, endsite*sizeof(long)); memcpy(tempadd->olddiscbase, zeros2, endsite*sizeof(unsigned char)); memcpy(tempadd->oldnumsteps, zeros, endsite*sizeof(long)); if (frommulti){ oldnufork = nufork; getnufork(&nufork, &grbg, treenode, zeros, zeros2); } addpreorder(root, item, nufork); if (frommulti) oldnufork = NULL; if (!mulf) add(there, item, nufork, &root, recompute, treenode, &grbg, zeros, zeros2); else add(there, item, NULL, &root, recompute, treenode, &grbg, zeros, zeros2); } if (progress) { if (j % ((nonodes / 72) + 1) == 0) putchar('.'); fflush(stdout); } } if (progress) putchar('\n'); } while (bestlike > gotlike); } /* globrearrange */ void load_tree(long treei) { /* restores a tree from bestrees */ long j, nextnode; boolean recompute = false; node *dummy; for (j = spp - 1; j >= 1; j--) re_move(treenode[j], &dummy, &root, recompute, treenode, &grbg, zeros, zeros2); root = treenode[0]; recompute = true; add(treenode[0], treenode[1], treenode[spp], &root, recompute, treenode, &grbg, zeros, zeros2); nextnode = spp + 2; for (j = 3; j <= spp; j++) { if (bestrees[treei].btree[j - 1] > 0) add(treenode[bestrees[treei].btree[j - 1] - 1], treenode[j - 1], treenode[nextnode++ - 1], &root, recompute, treenode, &grbg, zeros, zeros2); else 
      add(treenode[treenode[-bestrees[treei].btree[j-1]-1]->back->index-1],
          treenode[j - 1], NULL, &root, recompute, treenode, &grbg,
          zeros, zeros2);
  }
}  /* load_tree */


void grandrearr()
{
  /* calls either global rearrangement or local rearrangement on best trees */
  long treei;
  boolean done;

  done = false;
  do {
    treei = findunrearranged(bestrees, nextree, true);
    if (treei < 0)
      done = true;
    else
      bestrees[treei].gloreange = true;

    if (!done) {
      load_tree(treei);
      globrearrange();
      done = rearrfirst;
    }
  } while (!done);
}  /* grandrearr */


void maketree()
{
  /* constructs a binary tree from the pointers in treenode.
     adds each node at location which yields highest "likelihood"
     then rearranges the tree for greatest "likelihood" */
  long i, j, numtrees, nextnode;
  boolean done, firsttree, goteof, haslengths;
  node *item, *nufork, *dummy;
  pointarray nodep;

  if (!usertree) {
    for (i = 1; i <= spp; i++)
      enterorder[i - 1] = i;
    if (jumble)
      randumize(seed, enterorder);
    recompute = true;
    root = treenode[enterorder[0] - 1];
    add(treenode[enterorder[0] - 1], treenode[enterorder[1] - 1],
        treenode[spp], &root, recompute, treenode, &grbg, zeros, zeros2);
    if (progress) {
      printf("Adding species:\n");
      writename(0, 2, enterorder);
#ifdef WIN32
      phyFillScreenColor();
#endif
    }
    lastrearr = false;
    oldnufork = NULL;
    for (i = 3; i <= spp; i++) {
      bestyet = -10.0 * spp * chars;
      item = treenode[enterorder[i - 1] - 1];
      getnufork(&nufork, &grbg, treenode, zeros, zeros2);
      there = root;
      memcpy(tempadd->discbase, item->discbase, endsite*sizeof(unsigned char));
      memcpy(tempadd->numsteps, item->numsteps, endsite*sizeof(long));
      memcpy(tempadd->olddiscbase, zeros2, endsite*sizeof(unsigned char));
      memcpy(tempadd->oldnumsteps, zeros, endsite*sizeof(long));
      addpreorder(root, item, nufork);
      if (!mulf)
        add(there, item, nufork, &root, recompute, treenode, &grbg,
            zeros, zeros2);
      else
        add(there, item, NULL, &root, recompute, treenode, &grbg,
            zeros, zeros2);
      like = bestyet;
      rearrange(&root);
      if (progress) {
        writename(i - 1, 1,
                  enterorder);
#ifdef WIN32
        phyFillScreenColor();
#endif
      }
      lastrearr = (i == spp);
      if (lastrearr) {
        bestlike = bestyet;
        if (jumb == 1) {
          bstlike2 = bestlike;
          nextree = 1;
          initbestrees(bestrees, maxtrees, true);
          initbestrees(bestrees, maxtrees, false);
        }
        if (progress) {
          printf("\nDoing global rearrangements");
          if (rearrfirst)
            printf(" on the first of the trees tied for best\n");
          else
            printf(" on all trees tied for best\n");
          printf(" !");
          for (j = 0; j < nonodes; j++)
            if (j % ((nonodes / 72) + 1) == 0)
              putchar('-');
          printf("!\n");
#ifdef WIN32
          phyFillScreenColor();
#endif
        }
        globrearrange();
        rearrange(&root);
      }
    }
    done = false;
    while (!done && findunrearranged(bestrees, nextree, true) >= 0) {
      grandrearr();
      done = rearrfirst;
    }
    if (progress) {
      putchar('\n');
#ifdef WIN32
      phyFillScreenColor();
#endif
    }
    recompute = false;
    for (i = spp - 1; i >= 1; i--)
      re_move(treenode[i], &dummy, &root, recompute, treenode, &grbg,
              zeros, zeros2);
    if (jumb == njumble) {
      collapsebestrees(&root, &grbg, treenode, bestrees, place, zeros,
                       zeros2, chars, recompute, progress);
      if (treeprint) {
        putc('\n', outfile);
        if (nextree == 2)
          fprintf(outfile, "One most parsimonious tree found:\n");
        else
          fprintf(outfile, "%6ld trees in all found\n", nextree - 1);
      }
      if (nextree > maxtrees + 1) {
        if (treeprint)
          fprintf(outfile, "here are the first %4ld of them\n", (long)maxtrees);
        nextree = maxtrees + 1;
      }
      if (treeprint)
        putc('\n', outfile);
      for (i = 0; i <= (nextree - 2); i++) {
        root = treenode[0];
        add(treenode[0], treenode[1], treenode[spp], &root, recompute,
            treenode, &grbg, zeros, zeros2);
        nextnode = spp + 2;
        for (j = 3; j <= spp; j++) {
          if (bestrees[i].btree[j - 1] > 0)
            add(treenode[bestrees[i].btree[j - 1] - 1], treenode[j - 1],
                treenode[nextnode++ - 1], &root, recompute, treenode, &grbg,
                zeros, zeros2);
          else
            add(treenode[treenode[-bestrees[i].btree[j - 1]-1]->back->index-1],
                treenode[j - 1], NULL, &root, recompute, treenode, &grbg,
                zeros, zeros2);
        }
        reroot(treenode[outgrno - 1], root);
        postorder(root);
evaluate(root); treelength(root, chars, treenode); pars_printree(); describe(); for (j = 1; j < spp; j++) re_move(treenode[j], &dummy, &root, recompute, treenode, &grbg, zeros, zeros2); } } } else { /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree,INTREE,"input tree", "rb",progname,intreename); numtrees = countsemic(&intree); if (numtrees > MAXNUMTREES) { printf( "\n\nERROR: number of input trees is read incorrectly from %s\n\n", intreename); exxit(-1); } if (numtrees > 2) initseed(&inseed, &inseed0, seed); if (treeprint) { fprintf(outfile, "User-defined tree"); if (numtrees > 1) putc('s', outfile); fprintf(outfile, ":\n"); } fsteps = (long **)Malloc(maxuser*sizeof(long *)); for (j = 1; j <= maxuser; j++) fsteps[j - 1] = (long *)Malloc(endsite*sizeof(long)); nodep = NULL; which = 1; while (which <= numtrees) { firsttree = true; nextnode = 0; haslengths = true; treeread(intree, &root, treenode, &goteof, &firsttree, nodep, &nextnode, &haslengths, &grbg, initparsnode,false,nonodes); if (treeprint) fprintf(outfile, "\n\n"); if (outgropt) reroot(treenode[outgrno - 1], root); postorder(root); evaluate(root); treelength(root, chars, treenode); pars_printree(); describe(); if (which < numtrees) gdispose(root, &grbg, treenode); which++; } FClose(intree); putc('\n', outfile); if (numtrees > 1 && chars > 1 ) standev(chars, numtrees, minwhich, minsteps, nsteps, fsteps, seed); for (j = 1; j <= maxuser; j++) free(fsteps[j - 1]); free(fsteps); } if (jumb == njumble) { if (progress) { printf("\nOutput written to file \"%s\"\n", outfilename); if (trout) { printf("\nTree"); if ((usertree && numtrees > 1) || (!usertree && nextree != 2)) printf("s"); printf(" also written onto file \"%s\"\n", outtreename); } } } } /* maketree */ void freerest() { if (!usertree) { freenode(&temp); freenode(&temp1); freenode(&temp2); freenode(&tempsum); freenode(&temprm); freenode(&tempadd); freenode(&tempf); freenode(&tmp); freenode(&tmp1); freenode(&tmp2); 
    freenode(&tmp3);
    freenode(&tmprm);
    freenode(&tmpadd);
  }
  freegrbg(&grbg);
  if (ancseq)
    freegarbage(&garbage);
  free(threshwt);
  free(zeros);
  free(zeros2);
  freenodes(nonodes, treenode);
}  /* freerest*/


int main(int argc, Char *argv[])
{  /* Discrete character parsimony by uphill search */

  /* reads in spp, chars, and the data. Then calls maketree to
     construct the tree */
#ifdef MAC
  argc = 1;                /* macsetup("Pars","");                */
  argv[0]="Pars";
#endif
  init(argc, argv);
  progname = argv[0];
  openfile(&infile,INFILE,"input file", "r",argv[0],infilename);
  openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename);

  ibmpc = IBMCRT;
  ansi = ANSICRT;
  msets = 1;
  firstset = true;
  garbage = NULL;
  grbg = NULL;
  doinit();
  if (weights || justwts)
    openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename);
  if (trout)
    openfile(&outtree,OUTTREE,"output tree file", "w",argv[0],outtreename);
  for (ith = 1; ith <= msets; ith++) {
    if (msets > 1 && !justwts) {
      fprintf(outfile, "\nData set # %ld:\n\n", ith);
      if (progress)
        printf("\nData set # %ld:\n\n", ith);
    }
    doinput();
    if (ith == 1)
      firstset = false;
    for (jumb = 1; jumb <= njumble; jumb++)
      maketree();
    freerest();
  }
  FClose(infile);
  FClose(outfile);
  if (weights || justwts)
    FClose(weightfile);
  if (trout)
    FClose(outtree);
  if (usertree)
    FClose(intree);
#ifdef MAC
  fixmacfile(outfilename);
  fixmacfile(outtreename);
#endif
  if (progress)
    printf("\nDone.\n\n");
#ifdef WIN32
  phyRestoreConsoleAttributes();
#endif
  return 0;
}  /* Discrete character parsimony by uphill search */


/* File: phylip-3.697/src/penny.c */

#include "phylip.h"
#include "disc.h"
#include "wagner.h"

/* version 3.696.
   Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe.

   Copyright (c) 1993-2014, Joseph Felsenstein
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions are met:

   1.
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ #define maxtrees 1000 /* maximum number of trees to be printed out */ #define often 100 /* how often to notify how many trees examined */ #define many 1000 /* how many multiples of howoften before stop */ typedef long *treenumbers; typedef double *valptr; typedef long *placeptr; #ifndef OLDC /* function prototypes */ void getoptions(void); void allocrest(void); void doinit(void); void inputoptions(void); void doinput(void); void supplement(bitptr); void evaluate(node2 *); void addtraverse(node2 *,node2 *,node2 *,long *,long *,valptr,placeptr); void addit(long); void reroot(node2 *); void describe(void); void maketree(void); /* function prototypes */ #endif Char infilename[FNMLNGTH], outfilename[FNMLNGTH], outtreename[FNMLNGTH], weightfilename[FNMLNGTH], ancfilename[FNMLNGTH], mixfilename[FNMLNGTH]; node2 *root; long outgrno, rno, howmanny, howoften, col, msets, ith; /* outgrno indicates outgroup */ boolean weights, thresh, ancvar, questions, allsokal, allwagner, mixture, simple, trout, noroot, didreroot, outgropt, progress, treeprint, stepbox, ancseq, mulsets, firstset; boolean *ancone, *anczero, *ancone0, *anczero0, justwts; pointptr2 treenode; /* pointers to all nodes in tree */ double fracdone, fracinc; double threshold; double *threshwt; bitptr wagner, wagner0; boolean *added; Char *guess; steptr numsteps; long **bestorders, **bestrees; steptr numsone, numszero; gbit *garbage; long examined, mults; boolean firsttime, done, full; double like, bestyet; treenumbers current, order; long fullset; bitptr zeroanc, oneanc; bitptr suppsteps; void getoptions() { /* interactively set options */ long loopcount, loopcount2; Char ch, ch2; fprintf(outfile, "\nPenny algorithm, version %s\n",VERSION); fprintf(outfile, " branch-and-bound to find all"); fprintf(outfile, " most parsimonious trees\n\n"); howoften = often; howmanny = many; outgrno = 1; outgropt = false; simple = true; thresh = false; threshold = spp; trout = true; weights = false; justwts = false; ancvar = false; 
allsokal = false; allwagner = true; mixture = false; printdata = false; progress = true; treeprint = true; stepbox = false; ancseq = false; loopcount = 0; for(;;) { cleerhome(); printf("\nPenny algorithm, version %s\n",VERSION); printf(" branch-and-bound to find all most parsimonious trees\n\n"); printf("Settings for this run:\n"); printf(" X Use Mixed method? %s\n", mixture ? "Yes" : "No"); printf(" P Parsimony method? %s\n", (allwagner && !mixture) ? "Wagner" : (!(allwagner || mixture)) ? "Camin-Sokal" : "(methods in mixture)"); printf(" F How often to report, in trees:%5ld\n",howoften); printf(" H How many groups of%5ld trees:%6ld\n",howoften,howmanny); printf(" O Outgroup root?"); if (outgropt) printf(" Yes, at species number%3ld\n", outgrno); else printf(" No, use as outgroup species%3ld\n", outgrno); printf(" S Branch and bound is simple? %s\n", simple ? "Yes" : "No. reconsiders order of species"); printf(" T Use Threshold parsimony?"); if (thresh) printf(" Yes, count steps up to%4.1f per char.\n", threshold); else printf(" No, use ordinary parsimony\n"); printf(" A Use ancestral states in input file? %s\n", ancvar ? "Yes" : "No"); printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld %s\n", msets, (justwts ? "sets of weights" : "data sets")); else printf(" No\n"); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" 1 Print out the data at start of run %s\n", printdata ? "Yes" : "No"); printf(" 2 Print indications of progress of run %s\n", progress ? "Yes" : "No"); printf(" 3 Print out tree %s\n", treeprint ? "Yes" : "No"); printf(" 4 Print out steps in each character %s\n", stepbox ? "Yes" : "No"); printf(" 5 Print states at all nodes of tree %s\n", ancseq ? "Yes" : "No"); printf(" 6 Write out trees onto tree file? %s\n", trout ? 
"Yes" : "No"); if(weights && justwts){ printf( "WARNING: W option and Multiple Weights options are both on. "); printf( "The W menu option is unnecessary and has no additional effect. \n"); } printf("\nAre these settings correct?"); printf(" (type Y or the letter for one to change)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); if (ch == 'Y') break; if (strchr("WHFSOMPATX1234560",ch) != NULL){ switch (ch) { case 'X': mixture = !mixture; break; case 'P': allwagner = !allwagner; break; case 'A': ancvar = !ancvar; break; case 'H': inithowmany(&howmanny, howoften); break; case 'F': inithowoften(&howoften); break; case 'S': simple = !simple; break; case 'O': outgropt = !outgropt; if (outgropt) initoutgroup(&outgrno, spp); else outgrno = 1; break; case 'T': thresh = !thresh; if (thresh) initthreshold(&threshold); break; case 'W': weights = !weights; break; case 'M': mulsets = !mulsets; if (mulsets){ printf("Multiple data sets or multiple weights?"); loopcount2 = 0; do { printf(" (type D or W)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch2); getchar(); if (ch2 == '\n') ch2 = ' '; uppercase(&ch2); countup(&loopcount2, 10); } while ((ch2 != 'W') && (ch2 != 'D')); justwts = (ch2 == 'W'); if (justwts) justweights(&msets); else initdatasets(&msets); } break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress = !progress; break; case '3': treeprint = !treeprint; break; case '4': stepbox = !stepbox; break; case '5': ancseq = !ancseq; break; case '6': trout = !trout; break; } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } allsokal = (!allwagner && !mixture); } /* getoptions */ void allocrest() { long i; weight = (steptr)Malloc(chars*sizeof(steptr)); threshwt = (double *)Malloc(chars*sizeof(double)); bestorders = (long **)Malloc(maxtrees*sizeof(long *)); bestrees = (long 
**)Malloc(maxtrees*sizeof(long *));
  for (i = 1; i <= maxtrees; i++) {
    bestorders[i - 1] = (long *)Malloc(spp*sizeof(long));
    bestrees[i - 1] = (long *)Malloc(spp*sizeof(long));
  }
  /* steptr points to long, so each element of these arrays is sizeof(long) */
  numsteps = (steptr)Malloc(chars*sizeof(long));
  guess = (Char *)Malloc(chars*sizeof(Char));
  numszero = (steptr)Malloc(chars*sizeof(long));
  numsone = (steptr)Malloc(chars*sizeof(long));
  current = (treenumbers)Malloc(spp*sizeof(long));
  order = (treenumbers)Malloc(spp*sizeof(long));
  nayme = (naym *)Malloc(spp*sizeof(naym));
  added = (boolean *)Malloc(nonodes*sizeof(boolean));
  ancone = (boolean *)Malloc(chars*sizeof(boolean));
  anczero = (boolean *)Malloc(chars*sizeof(boolean));
  ancone0 = (boolean *)Malloc(chars*sizeof(boolean));
  anczero0 = (boolean *)Malloc(chars*sizeof(boolean));
  wagner = (bitptr)Malloc(words*sizeof(long));
  wagner0 = (bitptr)Malloc(words*sizeof(long));
  zeroanc = (bitptr)Malloc(words*sizeof(long));
  oneanc = (bitptr)Malloc(words*sizeof(long));
  suppsteps = (bitptr)Malloc(words*sizeof(long));
  extras = (steptr)Malloc(chars*sizeof(long));
}  /* allocrest */


void doinit()
{
  /* initializes variables */

  inputnumbers(&spp, &chars, &nonodes, 1);
  words = chars / bits + 1;
  getoptions();
  if (printdata)
    fprintf(outfile, "%2ld species, %3ld characters\n", spp, chars);
  alloctree2(&treenode);
  setuptree2(treenode);
  allocrest();
}  /* doinit */


void inputoptions()
{
  /* input the information on the options */
  long i;

  if(justwts){
    if(firstset){
      scan_eoln(infile);
      if (ancvar) {
        inputancestors(anczero0, ancone0);
      }
      if (mixture) {
        inputmixture(wagner0);
      }
    }
    for (i = 0; i < (chars); i++)
      weight[i] = 1;
    inputweights(chars, weight, &weights);
    for (i = 0; i < (words); i++) {
      if (mixture)
        wagner[i] = wagner0[i];
      else if (allsokal)
        wagner[i] = 0;
      else
        wagner[i] = (1L << (bits + 1)) - (1L << 1);
    }
  } else {
    if (!firstset) {
      samenumsp(&chars, ith);
    }
    scan_eoln(infile);
    for (i = 0; i < (chars); i++)
      weight[i] = 1;
    if (ancvar) {
      inputancestors(anczero0, ancone0);
    }
    if (mixture) {
      inputmixture(wagner0);
    }
    if
(weights) inputweights(chars, weight, &weights); for (i = 0; i < (words); i++) { if (mixture) wagner[i] = wagner0[i]; else if (allsokal) wagner[i] = 0; else wagner[i] = (1L << (bits + 1)) - (1L << 1); } } for (i = 0; i < (chars); i++) { if (!ancvar) { anczero[i] = true; ancone[i] = (((1L << (i % bits + 1)) & wagner[i / bits]) != 0); } else { anczero[i] = anczero0[i]; ancone[i] = ancone0[i]; } } noroot = true; questions = false; for (i = 0; i < (chars); i++) { if (weight[i] > 0) { noroot = (noroot && ancone[i] && anczero[i] && ((((1L << (i % bits + 1)) & wagner[i / bits]) != 0) || threshold <= 2.0)); } questions = (questions || (ancone[i] && anczero[i])); threshwt[i] = threshold * weight[i]; } } /* inputoptions */ void doinput() { /* reads the input data */ inputoptions(); if(!justwts || firstset) inputdata2(treenode); } /* doinput */ void supplement(bitptr suppsteps) { /* determine minimum number of steps more which will be added when rest of species are put in tree */ long i, j, k, l; long defone, defzero, a; k = 0; for (i = 0; i < (words); i++) { defone = 0; defzero = 0; a = 0; for (l = 1; l <= bits; l++) { k++; if (k <= chars) { if (!ancone[k - 1]) defzero = ((long)defzero) | (1L << l); if (!anczero[k - 1]) defone = ((long)defone) | (1L << l); } } for (j = 0; j < (spp); j++) { defone |= treenode[j]->empstte1[i] & (~treenode[j]->empstte0[i]); defzero |= treenode[j]->empstte0[i] & (~treenode[j]->empstte1[i]); if (added[j]) a |= defone & defzero; } suppsteps[i] = defone & defzero & (~a); } } /* supplement */ void evaluate(node2 *r) { /* Determines the number of steps needed for a tree. 
This is the minimum number needed to evolve chars on this tree */ long i, stepnum, smaller; double sum; sum = 0.0; for (i = 0; i < (chars); i++) { numszero[i] = 0; numsone[i] = 0; } supplement(suppsteps); for (i = 0; i < (words); i++) zeroanc[i] = fullset; full = true; postorder(r, fullset, full, wagner, zeroanc); cpostorder(r, full, zeroanc, numszero, numsone); count(r->fulstte1, zeroanc, numszero, numsone); count(suppsteps, zeroanc, numszero, numsone); for (i = 0; i < (words); i++) zeroanc[i] = 0; full = false; postorder(r, fullset, full, wagner, zeroanc); cpostorder(r, full, zeroanc, numszero, numsone); count(r->empstte0, zeroanc, numszero, numsone); count(suppsteps, zeroanc, numszero, numsone); for (i = 0; i < (chars); i++) { smaller = spp * weight[i]; numsteps[i] = smaller; if (anczero[i]) { numsteps[i] = numszero[i]; smaller = numszero[i]; } if (ancone[i] && numsone[i] < smaller) numsteps[i] = numsone[i]; stepnum = numsteps[i] + extras[i]; if (stepnum <= threshwt[i]) sum += stepnum; else sum += threshwt[i]; guess[i] = '?'; if (!ancone[i] || (anczero[i] && numszero[i] < numsone[i])) guess[i] = '0'; else if (!anczero[i] || (ancone[i] && numsone[i] < numszero[i])) guess[i] = '1'; } if (examined == 0 && mults == 0) bestyet = -1.0; like = sum; } /* evaluate */ void addtraverse(node2 *a, node2 *b, node2 *c, long *m, long *n, valptr valyew, placeptr place) { /* traverse all places to add b */ if (done) return; if ((*m) <= 2 || !(noroot && (a == root || a == root->next->back))) { add3(a, b, c, &root, treenode); (*n)++; evaluate(root); examined++; if (examined == howoften) { examined = 0; mults++; if (mults == howmanny) done = true; if (progress) { printf("%6ld", mults); if (bestyet >= 0) printf("%18.5f", bestyet); else printf(" - "); printf("%17ld%20.2f\n", nextree - 1, fracdone * 100); #ifdef WIN32 phyFillScreenColor(); #endif } } valyew[(*n) - 1] = like; place[(*n) - 1] = a->index; re_move3(&b, &c, &root, treenode); } if (!a->tip) { addtraverse(a->next->back, b, c, 
m,n,valyew,place); addtraverse(a->next->next->back, b, c, m,n,valyew,place); } } /* addtraverse */ void addit(long m) { /* adds the species one by one, recursively */ long n; valptr valyew; placeptr place; long i, j, n1, besttoadd = 0; valptr bestval; placeptr bestplace; double oldfrac, oldfdone, sum, bestsum; valyew = (valptr)Malloc(nonodes*sizeof(double)); bestval = (valptr)Malloc(nonodes*sizeof(double)); place = (placeptr)Malloc(nonodes*sizeof(long)); bestplace = (placeptr)Malloc(nonodes*sizeof(long)); if (simple && !firsttime) { n = 0; added[order[m - 1] - 1] = true; addtraverse(root, treenode[order[m - 1] - 1], treenode[spp + m - 2], &m,&n,valyew,place); besttoadd = order[m - 1]; memcpy(bestplace, place, nonodes*sizeof(long)); memcpy(bestval, valyew, nonodes*sizeof(double)); } else { bestsum = -1.0; for (i = 1; i <= (spp); i++) { if (!added[i - 1]) { n = 0; added[i - 1] = true; addtraverse(root, treenode[i - 1], treenode[spp + m - 2], &m, &n,valyew,place); added[i - 1] = false; sum = 0.0; for (j = 0; j < (n); j++) sum += valyew[j]; if (sum > bestsum) { bestsum = sum; besttoadd = i; memcpy(bestplace, place, nonodes*sizeof(long)); memcpy(bestval, valyew, nonodes*sizeof(double)); } } } } order[m - 1] = besttoadd; memcpy(place, bestplace, nonodes*sizeof(long)); memcpy(valyew, bestval, nonodes*sizeof(double)); shellsort(valyew, place, n); oldfrac = fracinc; oldfdone = fracdone; n1 = 0; for (i = 0; i < (n); i++) { if (valyew[i] <= bestyet || bestyet < 0.0) n1++; } if (n1 > 0) fracinc /= n1; for (i = 0; i < (n); i++) { if (valyew[i] <= bestyet || bestyet < 0.0) { current[m - 1] = place[i]; add3(treenode[place[i] - 1], treenode[besttoadd - 1], treenode[spp + m - 2], &root, treenode); added[besttoadd - 1] = true; if (m < spp) addit(m + 1); else { if (valyew[i] < bestyet || bestyet < 0.0) { nextree = 1; bestyet = valyew[i]; } if (nextree <= maxtrees) { memcpy(bestorders[nextree - 1], order, spp*sizeof(long)); memcpy(bestrees[nextree - 1], current, spp*sizeof(long)); } 
nextree++; firsttime = false; } re_move3(&treenode[besttoadd - 1], &treenode[spp + m - 2], &root, treenode); added[besttoadd - 1] = false; } fracdone += fracinc; } fracinc = oldfrac; fracdone = oldfdone; free(valyew); free(bestval); free(place); free(bestplace); } /* addit */ void reroot(node2 *outgroup) { /* reorients tree, putting outgroup in desired position. */ node2 *p, *q, *newbottom, *oldbottom; if (outgroup->back->index == root->index) return; newbottom = outgroup->back; p = treenode[newbottom->index - 1]->back; while (p->index != root->index) { oldbottom = treenode[p->index - 1]; treenode[p->index - 1] = p; p = oldbottom->back; } p = root->next; q = root->next->next; p->back->back = q->back; q->back->back = p->back; p->back = outgroup; q->back = outgroup->back; outgroup->back->back = root->next->next; outgroup->back = root->next; treenode[newbottom->index - 1] = newbottom; } /* reroot */ void describe() { /* prints ancestors, steps and table of numbers of steps in each character */ if (stepbox) { putc('\n', outfile); writesteps(weights, numsteps); } if (questions && (!noroot || didreroot)) guesstates(guess); if (ancseq) { hypstates(fullset, full, noroot, didreroot, root, wagner, zeroanc, oneanc, treenode, guess, garbage); putc('\n', outfile); } if (trout) { col = 0; treeout2(root, &col, root); } } /* describe */ void maketree() { /* tree construction recursively by branch and bound */ long i, j, k; node2 *dummy; fullset = (1L << (bits + 1)) - (1L << 1); if (progress) { printf("\nHow many\n"); printf("trees looked Approximate\n"); printf("at so far Length of How many percentage\n"); printf("(multiples shortest tree trees this long searched\n"); printf("of %4ld): found so far found so far so far\n", howoften); printf("---------- ------------ ------------ ------------\n"); #ifdef WIN32 phyFillScreenColor(); #endif } done = false; mults = 0; examined = 0; nextree = 1; root = treenode[0]; firsttime = true; for (i = 0; i < (spp); i++) added[i] = false; added[0] 
= true; order[0] = 1; k = 2; fracdone = 0.0; fracinc = 1.0; bestyet = -1.0; addit(k); if (done) { if (progress) { printf("Search broken off! Not guaranteed to\n"); printf(" have found the most parsimonious trees.\n"); } if (treeprint) { fprintf(outfile, "Search broken off! Not guaranteed to\n"); fprintf(outfile, " have found the most parsimonious\n"); fprintf(outfile, " trees, but here is what we found:\n"); } } if (treeprint) { fprintf(outfile, "\nrequires a total of %18.3f\n\n", bestyet); if (nextree == 2) fprintf(outfile, "One most parsimonious tree found:\n"); else fprintf(outfile, "%5ld trees in all found\n", nextree - 1); } if (nextree > maxtrees + 1) { if (treeprint) fprintf(outfile, "here are the first%4ld of them\n", (long)maxtrees); nextree = maxtrees + 1; } if (treeprint) putc('\n', outfile); for (i = 0; i < (spp); i++) added[i] = true; for (i = 0; i <= (nextree - 2); i++) { for (j = k; j <= (spp); j++) add3(treenode[bestrees[i][j - 1] - 1], treenode[bestorders[i][j - 1] - 1], treenode[spp + j - 2], &root, treenode); if (noroot) reroot(treenode[outgrno - 1]); didreroot = (outgropt && noroot); evaluate(root); printree(treeprint, noroot, didreroot, root); describe(); for (j = k - 1; j < (spp); j++) re_move3(&treenode[bestorders[i][j] - 1], &dummy, &root, treenode); } if (progress) { printf("\nOutput written to file \"%s\"\n\n", outfilename); if (trout) printf("Trees also written onto file \"%s\"\n\n", outtreename); } if (ancseq) freegarbage(&garbage); } /* maketree */ int main(int argc, Char *argv[]) { /* Penny's branch-and-bound method */ /* Reads in the number of species, number of characters, options and data. 
     Then finds all most parsimonious trees */
#ifdef MAC
  argc = 1;                /* macsetup("Penny","");                */
  argv[0] = "Penny";
#endif
  init(argc,argv);
  openfile(&infile,INFILE,"input file", "r",argv[0],infilename);
  openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename);
  ibmpc = IBMCRT;
  ansi = ANSICRT;
  mulsets = false;
  msets = 1;
  firstset = true;
  garbage = NULL;
  bits = 8*sizeof(long) - 1;
  doinit();
  if (weights || justwts)
    openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename);
  if (trout)
    openfile(&outtree,OUTTREE,"output tree file", "w",argv[0],outtreename);
  if(ancvar)
    openfile(&ancfile,ANCFILE,"ancestors file", "r",argv[0],ancfilename);
  if(mixture)
    openfile(&mixfile,MIXFILE,"mixture file", "r",argv[0],mixfilename);

  for (ith = 1; ith <= msets; ith++) {
    if(firstset){
      if (allsokal && !mixture)
        fprintf(outfile, "Camin-Sokal parsimony method\n\n");
      if (allwagner && !mixture)
        fprintf(outfile, "Wagner parsimony method\n\n");
    }
    doinput();
    if (msets > 1 && !justwts) {
      fprintf(outfile, "Data set # %ld:\n\n",ith);
      if (progress)
        printf("\nData set # %ld:\n",ith);
    }
    if (justwts){
      if(firstset && mixture && printdata)
        printmixture(outfile, wagner);
      fprintf(outfile, "Weights set # %ld:\n\n", ith);
      if (progress)
        printf("\nWeights set # %ld:\n\n", ith);
    } else if (mixture && printdata)
      printmixture(outfile, wagner);
    if (printdata){
      if (weights || justwts)
        printweights(outfile, 0, chars, weight, "Characters");
      if (ancvar)
        printancestors(outfile, anczero, ancone);
    }
    if (ith == 1)
      firstset = false;
    maketree();
  }
  FClose(infile);
  FClose(outfile);
  FClose(outtree);
#ifdef MAC
  fixmacfile(outfilename);
  fixmacfile(outtreename);
#endif
#ifdef WIN32
  phyRestoreConsoleAttributes();
#endif
  return 0;
}  /* Penny's branch-and-bound method */


/* File: phylip-3.697/src/phylip.c */

/* version 3.696.
   Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, Andrew Keeffe,
   and Dan Fineman.
Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/

#include <stdio.h>
#include <signal.h>
#include "phylip.h"

#ifdef WIN32
#include <windows.h>
/* for console code (clear screen, text color settings) */
CONSOLE_SCREEN_BUFFER_INFO savecsbi;
boolean savecsbi_valid = false;
HANDLE hConsoleOutput;

void phyClearScreen();
void phySaveConsoleAttributes();
void phySetConsoleAttributes();
void phyRestoreConsoleAttributes();
void phyFillScreenColor();
#endif /* WIN32 */

#ifndef OLDC
static void crash_handler(int signum);
#endif

/*
#if defined(OSX_CARBON) && defined(__MWERKS__)
boolean fixedpath = false;
#endif
*/

FILE *outfile, *infile, *intree, *intree2, *outtree,
     *weightfile, *catfile, *ancfile, *mixfile, *factfile;
long spp, words, bits;
boolean ibmpc, ansi, tranvsp;
naym *nayme;                     /* names of species */


static void crash_handler(int sig_num)
{ /* when we crash, let's print out something useful */
  printf("ERROR: ");
  switch(sig_num) {
#ifdef SIGSEGV
    case SIGSEGV:
      puts("This program has caused a Segmentation fault.");
      break;
#endif /* SIGSEGV */
#ifdef SIGFPE
    case SIGFPE:
      puts("This program has caused a Floating Point Exception");
      break;
#endif /* SIGFPE */
#ifdef SIGILL
    case SIGILL:
      puts("This program has attempted an illegal instruction");
      break;
#endif /* SIGILL */
#ifdef SIGPIPE
    case SIGPIPE:
      puts("This program tried to write to a broken pipe");
      break;
#endif /* SIGPIPE */
#ifdef SIGBUS
    case SIGBUS:
      puts("This program had a bus error");
      break;
#endif /* SIGBUS */
  }
  if (sig_num == SIGSEGV) {
    puts(
      " This may have been caused by an incorrectly formatted input file");
    puts( " or input tree file. 
You should check those files carefully.");
    puts(" If this seems to be a bug, please mail joe@gs.washington.edu");
  }
  else {
    puts(" Most likely, you have encountered a bug in the program.");
    puts(" Since this seems to be a bug, please mail joe@gs.washington.edu");
  }
  puts(" with the name of the program, your computer system type,");
  puts(" a full description of the problem, and with the input data file.");
  puts(" (which should be in the body of the message, not as an Attachment).");

#ifdef WIN32
  phyRestoreConsoleAttributes();
#endif
  abort();
}


void init(int argc, char** argv)
{ /* initialization routine for all programs
   * anything done at the beginning for every program should be done here */

  /* set up signal handler for
   * segfault, floating point exception, illegal instruction, bad pipe, bus error
   * there are more signals that can cause a crash, but these are the most common
   * even these aren't found on all machines. */
#ifdef SIGSEGV
  signal(SIGSEGV, crash_handler);
#endif /* SIGSEGV */
#ifdef SIGFPE
  signal(SIGFPE, crash_handler);
#endif /* SIGFPE */
#ifdef SIGILL
  signal(SIGILL, crash_handler);
#endif /* SIGILL */
#ifdef SIGPIPE
  signal(SIGPIPE, crash_handler);
#endif /* SIGPIPE */
#ifdef SIGBUS
  signal(SIGBUS, crash_handler);
#endif /* SIGBUS */

  /* Set default terminal characteristics */
  ibmpc = IBMCRT;
  ansi = ANSICRT;

  javarun = false;

  /* Clear the screen */
  cleerhome();

#ifdef WIN32
  /* Perform DOS console configuration */
  phySetConsoleAttributes();
  phyClearScreen();
#endif /* WIN32 */
}


void scan_eoln(FILE *f)
{ /* Eat everything up to EOF or newline, including newline */
  char ch;

  while (!eoff(f) && !eoln(f))
    gettc(f);
  if (!eoff(f))
    ch = gettc(f);
}


boolean eoff(FILE *f)
{ /* Return true iff next getc() is EOF */
  int ch;

  if (feof(f))
    return true;
  ch = getc(f);
  if (ch == EOF) {
    ungetc(ch, f);
    return true;
  }
  ungetc(ch, f);
  return false;
}  /*eoff*/


boolean eoln(FILE *f)
{ /* Return true iff next getc() is EOL or EOF */
  register int ch;

  ch = getc(f);
  if (ch == EOF)
return true; ungetc(ch, f); return ((ch == '\n') || (ch == '\r')); } /*eoln*/ int filexists(char *filename) { /* Return true iff file already exists */ FILE *fp; fp = fopen(filename,"r"); if (fp) { fclose(fp); return 1; } else return 0; } /*filexists*/ const char* get_command_name (const char *vektor) { /* returns the name of the program from vektor without the whole path */ char *last_slash; /* Point to the last slash... */ last_slash = strrchr (vektor, DELIMITER); if (last_slash) /* If there was a last slash, return the character after it */ return last_slash + 1; else /* If not, return the vector */ return vektor; } /* get_command_name */ void EOF_error() { /* Print a message and exit when EOF is reached prematurely. */ puts("\n\nERROR: Unexpected end-of-file.\n"); exxit(-1); } /* EOF_error */ void getstryng(char *fname) { /* read in a file name from stdin and take off newline if any */ char *end; fflush(stdout); fname = fgets(fname, FNMLNGTH, stdin); if ( fname == NULL ) EOF_error(); if ( (end = strpbrk(fname, "\n\r")) != NULL) *end = '\0'; } /* getstryng */ void countup(long *loopcount, long maxcount) { /* count how many times this loop has tried to read data, bail out if exceeds maxcount */ (*loopcount)++; if ((*loopcount) >= maxcount) { printf("\nERROR: Made %ld attempts to read input in loop. Aborting run.\n", *loopcount); exxit(-1); } } /* countup */ void openfile(FILE **fp,const char *filename,const char *filedesc, const char *mode,const char *application, char *perm) { /* open a file, testing whether it exists etc. 
*/ FILE *of; char file[FNMLNGTH]; char filemode[3]; char input[FNMLNGTH]; char ch; const char *progname_without_path; long loopcount, loopcount2; /* #if defined(OSX_CARBON) && defined(__MWERKS__) ProcessSerialNumber myProcess; FSRef bundleLocation; unsigned char bundlePath[FNMLNGTH]; if(!fixedpath){ // change path to the bundle location instead of root directory GetCurrentProcess(&myProcess); GetProcessBundleLocation(&myProcess, &bundleLocation); FSRefMakePath(&bundleLocation, bundlePath, FNMLNGTH); chdir((const char*)bundlePath); chdir(".."); // get out of the .app directory fixedpath = true; } #endif */ progname_without_path = get_command_name(application); strcpy(file, filename); strcpy(filemode, mode); loopcount = 0; while (1){ #if ! OVERWRITE_FILES if (filemode[0] == 'w' && filexists(file)){ printf("\n%s: the file \"%s\" that you wanted to\n", progname_without_path, file); printf(" use as %s already exists.\n", filedesc); printf(" Do you want to Replace it, Append to it,\n"); printf(" write to a new File, or Quit?\n"); loopcount2 = 0; do { printf(" (please type R, A, F, or Q) \n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); if ( fgets(input, sizeof(input), stdin) == NULL ) EOF_error(); ch = input[0]; uppercase(&ch); countup(&loopcount2, 10); } while (ch != 'A' && ch != 'R' && ch != 'F' && ch != 'Q'); if (ch == 'Q') exxit(-1); if (ch == 'A') { strcpy(filemode,"a"); continue; } else if (ch == 'F') { file[0] = '\0'; loopcount2 = 0; while (file[0] =='\0') { printf("Please enter a new file name> "); fflush(stdout); getstryng(file); countup(&loopcount2, 10); } strcpy(filemode,"w"); continue; } } #endif /* ! 
OVERWRITE_FILES */ of = fopen(file,filemode); if (of) break; else { switch (filemode[0]){ case 'r': printf("%s: can't find %s \"%s\"\n", progname_without_path, filedesc, file); file[0] = '\0'; loopcount2 = 0; while ( file[0] =='\0' ) { printf("Please enter a new file name> "); fflush(stdout); countup(&loopcount2, 10); getstryng(file); } break; case 'w': case 'a': printf("%s: can't write %s \"%s\"\n", progname_without_path, filedesc, file); file[0] = '\0'; loopcount2 = 0; while (file[0] =='\0') { printf("Please enter a new file name> "); fflush(stdout); countup(&loopcount2, 10); getstryng(file); } continue; default: printf("There is some error in the call of openfile. Unknown mode.\n"); exxit(-1); } } countup(&loopcount, 20); } *fp = of; if (perm != NULL) strcpy(perm,file); } /* openfile */ void cleerhome() { /* home cursor and clear screen, if possible */ #ifdef WIN32 if(ibmpc || ansi){ phyClearScreen(); } else { printf("\n\n"); } #else printf("%s", ((ibmpc || ansi) ? ("\033[2J\033[H") : "\n\n")); #endif } /* cleerhome */ double randum(longer seed) { /* random number generator -- slow but machine independent This is a multiplicative congruential 32-bit generator x(t+1) = 1664525 * x(t) mod 2^32, one that passes the Coveyou-Macpherson and Lehmer tests, see Knuth ACP vol. 2 We here implement it representing each integer in base-64 notation -- i.e. 
as an array of 6 six-bit chunks */ long i, j, k, sum; longer mult, newseed; double x; mult[0] = 13; /* these four statements set the multiplier */ mult[1] = 24; /* -- they are its "digits" in a base-64 */ mult[2] = 22; /* notation: 1664525 = 6*64^3+22*64^2 */ mult[3] = 6; /* +24*64+13 */ for (i = 0; i <= 5; i++) newseed[i] = 0; for (i = 0; i <= 5; i++) { /* do the multiplication piecewise */ sum = newseed[i]; k = i; if (i > 3) k = 3; for (j = 0; j <= k; j++) sum += mult[j] * seed[i - j]; newseed[i] = sum; for (j = i; j <= 4; j++) { newseed[j + 1] += newseed[j] / 64; newseed[j] &= 63; } } memcpy(seed, newseed, sizeof(longer)); /* new seed replaces old one */ seed[5] &= 3; /* from the new seed, get a floating point fraction */ x = 0.0; for (i = 0; i <= 5; i++) x = x / 64.0 + seed[i]; x /= 4.0; return x; } /* randum */ void randumize(longer seed, long *enterorder) { /* randomize input order of species -- randomly permute array enterorder */ long i, j, k; for (i = 0; i < spp; i++) { j = (long)(randum(seed) * (i+1)); k = enterorder[j]; enterorder[j] = enterorder[i]; enterorder[i] = k; } } /* randumize */ double normrand(longer seed) {/* standardized Normal random variate */ double x; x = randum(seed)+randum(seed)+randum(seed)+randum(seed) + randum(seed)+randum(seed)+randum(seed)+randum(seed) + randum(seed)+randum(seed)+randum(seed)+randum(seed)-6.0; return(x); } /* normrand */ long readlong(const char *prompt) { /* read a long */ long res, loopcount; char string[100]; loopcount = 0; do { printf("%s", prompt); fflush(stdout); getstryng(string); if (sscanf(string,"%ld",&res) == 1) break; countup(&loopcount, 10); } while (1); return res; } /* readlong */ void uppercase(Char *ch) { /* convert ch to upper case */ *ch = (islower (*ch) ? 
toupper(*ch) : (*ch)); } /* uppercase */ void initseed(long *inseed, long *inseed0, longer seed) { /* input random number seed */ long i, loopcount; assert(inseed); assert(inseed0); loopcount = 0; for (;;) { printf("\nRandom number seed (must be odd)?\n"); fflush(stdout); if (scanf("%ld%*[^\n]", inseed) == 1) { getchar(); if (*inseed > 0 && (*inseed & 0x1)) break; } countup(&loopcount, 10); } *inseed0 = *inseed; for (i = 0; i <= 5; i++) seed[i] = 0; i = 0; do { seed[i] = *inseed & 63; *inseed /= 64; i++; } while (*inseed != 0); } /*initseed*/ void initjumble(long *inseed, long *inseed0, longer seed, long *njumble) { /* input number of jumblings for jumble option */ long loopcount; initseed(inseed, inseed0, seed); loopcount = 0; for (;;) { printf("Number of times to jumble?\n"); fflush(stdout); if (scanf("%ld%*[^\n]", njumble) == 1) { getchar(); if (*njumble >= 1) break; } countup(&loopcount, 10); } } /*initjumble*/ void initoutgroup(long *outgrno, long spp) { /* input outgroup number */ long loopcount; loopcount = 0; for (;;) { printf("Type number of the outgroup:\n"); fflush(stdout); if (scanf("%ld%*[^\n]", outgrno) == 1) { getchar(); if (*outgrno >= 1 && *outgrno <= spp) break; else { printf("BAD OUTGROUP NUMBER: %ld\n", *outgrno); printf(" Must be in range 1 - %ld\n", spp); } } countup(&loopcount, 10); } } /*initoutgroup*/ void initthreshold(double *threshold) { /* input threshold for threshold parsimony option */ long loopcount; loopcount = 0; for (;;) { printf("What will be the threshold value?\n"); fflush(stdout); if (scanf("%lf%*[^\n]", threshold) == 1) { getchar(); if (*threshold >= 1.0) break; else printf("BAD THRESHOLD VALUE: it must be greater than 1\n"); } countup(&loopcount, 10); } *threshold = (long)(*threshold * 10.0 + 0.5) / 10.0; } /*initthreshold*/ void initcatn(long *categs) { /* initialize category number for rate categories */ long loopcount; loopcount = 0; *categs = 0; for (;;) { printf("Number of categories (1-%d)?\n", maxcategs); 
fflush(stdout); if (scanf("%ld%*[^\n]", categs) == 1) { getchar(); if (*categs >= 1 && *categs <= maxcategs) break; } countup(&loopcount, 10); } } /*initcatn*/ void initcategs(long categs, double *rate) { /* initialize category rates for HMM rates */ long i, loopcount, scanned; char line[100], rest[100]; boolean done; loopcount = 0; for (;;){ printf("Rate for each category? (use a space to separate)\n"); fflush(stdout); getstryng(line); done = true; for (i = 0; i < categs; i++){ scanned = sscanf(line,"%lf %[^\n]", &rate[i], rest); if ((scanned < 2 && i < (categs - 1)) || (scanned < 1 && i == (categs - 1))) { printf("Please enter exactly %ld values.\n", categs); done = false; break; } strcpy(line, rest); } if (done) break; countup(&loopcount, 100); } } /*initcategs*/ void initprobcat(long categs, double *probsum, double *probcat) { /* input probabilities of rate categories for HMM rates */ long i, loopcount, scanned; boolean done; char line[100], rest[100]; loopcount = 0; do { printf("Probability for each category?"); printf(" (use a space to separate)\n"); fflush(stdout); getstryng(line); done = true; for (i = 0; i < categs; i++) { scanned = sscanf(line, "%lf %[^\n]", &probcat[i], rest); if ((scanned < 2 && i < (categs - 1)) || (scanned < 1 && i == (categs - 1))) { done = false; printf("Please enter exactly %ld values.\n", categs); break; } strcpy(line, rest); } if (!done) { countup(&loopcount, 100); /* count failed attempts too */ continue; } *probsum = 0.0; for (i = 0; i < categs; i++) *probsum += probcat[i]; if (fabs(1.0 - (*probsum)) > 0.001) { done = false; printf("Probabilities must add up to"); printf(" 1.0, plus or minus 0.001.\n"); } countup(&loopcount, 100); } while (!done); } /*initprobcat*/ void lgr(long m, double b, raterootarray lgroot) { /* For use by initgammacat. Get roots of m-th Generalized Laguerre polynomial, given roots of (m-1)-th, these are to be stored in lgroot[m][] */ long i; double upper, lower, x, y; boolean dwn; /* is function declining in this interval?
*/ if (m == 1) { lgroot[1][1] = 1.0+b; } else { dwn = true; for (i=1; i<=m; i++) { if (i < m) { if (i == 1) lower = 0.0; else lower = lgroot[m-1][i-1]; upper = lgroot[m-1][i]; } else { /* i == m, must search above */ lower = lgroot[m-1][i-1]; x = lgroot[m-1][m-1]; do { x = 2.0*x; y = glaguerre(m, b, x); } while ((dwn && (y > 0.0)) || ((!dwn) && (y < 0.0))); upper = x; } while (upper-lower > 0.000000001) { x = (upper+lower)/2.0; if (glaguerre(m, b, x) > 0.0) { if (dwn) lower = x; else upper = x; } else { if (dwn) upper = x; else lower = x; } } lgroot[m][i] = (lower+upper)/2.0; dwn = !dwn; /* switch for next one */ } } } /* lgr */ double logfac (long n) { /* log(n!) values were calculated with Mathematica with a precision of 30 digits */ long i; double x; switch (n) { case 0: return 0.; case 1: return 0.; case 2: return 0.693147180559945309417232121458; case 3: return 1.791759469228055000812477358381; case 4: return 3.1780538303479456196469416013; case 5: return 4.78749174278204599424770093452; case 6: return 6.5792512120101009950601782929; case 7: return 8.52516136106541430016553103635; case 8: return 10.60460290274525022841722740072; case 9: return 12.80182748008146961120771787457; case 10: return 15.10441257307551529522570932925; case 11: return 17.50230784587388583928765290722; case 12: return 19.98721449566188614951736238706; default: x = 19.98721449566188614951736238706; for (i = 13; i <= n; i++) x += log(i); return x; } } /* logfac */ double glaguerre(long m, double b, double x) { /* Generalized Laguerre polynomial computed recursively. 
For use by initgammacat */ long i; double gln, glnm1, glnp1; /* L_n, L_(n-1), L_(n+1) */ if (m == 0) return 1.0; else { if (m == 1) return 1.0 + b - x; else { gln = 1.0+b-x; glnm1 = 1.0; for (i=2; i <= m; i++) { glnp1 = ((2*(i-1)+b+1.0-x)*gln - (i-1+b)*glnm1)/i; glnm1 = gln; gln = glnp1; } return gln; } } } /* glaguerre */ void initlaguerrecat(long categs, double alpha, double *rate, double *probcat) { /* calculate rates and probabilities to approximate Gamma distribution of rates with "categs" categories and shape parameter "alpha" using rates and weights from Generalized Laguerre quadrature */ long i; raterootarray lgroot; /* roots of GLaguerre polynomials */ double f, x, xi, y; alpha = alpha - 1.0; lgroot[1][1] = 1.0+alpha; for (i = 2; i <= categs; i++) lgr(i, alpha, lgroot); /* get roots for L^(a)_n */ /* here get weights */ /* Gamma weights are (1+a)(1+a/2) ... (1+a/n)*x_i/((n+1)^2 [L_{n+1}^a(x_i)]^2) */ f = 1; for (i = 1; i <= categs; i++) f *= (1.0+alpha/i); for (i = 1; i <= categs; i++) { xi = lgroot[categs][i]; y = glaguerre(categs+1, alpha, xi); x = f*xi/((categs+1)*(categs+1)*y*y); rate[i-1] = xi/(1.0+alpha); probcat[i-1] = x; } } /* initlaguerrecat */ double hermite(long n, double x) { /* calculates Hermite polynomial of degree n at the point x */ /* seems to be imprecise for n>13 -> root finder does not converge */ double h1 = 1.; double h2 = 2. * x; double xx = 2. * x; long i; if (n == 0) return 1.; /* H_0(x) = 1; the recurrence below starts from H_1 */ for (i = 1; i < n; i++) { xx = 2. * x * h2 - 2.
* (i) * h1; h1 = h2; h2 = xx; } return xx; } /* hermite */ void root_hermite(long n, double *hroot) { /* find roots of Hermite polynomials */ long z; long ii; long start; if (n % 2 == 0) { start = n/2; z = 1; } else { start = n/2 + 1; z=2; hroot[start-1] = 0.0; } for (ii = start; ii < n; ii++) { /* search only upwards */ hroot[ii] = halfroot(hermite, n, hroot[ii-1]+EPSILON, 1./n); hroot[start - z] = -hroot[ii]; z++; } } /* root_hermite */ double halfroot(double (*func)(long m, double x), long n, double startx, double delta) { /* searches from the bound (startx) in one direction only (by positive or negative delta, which gives the other bound as startx+delta); delta should be small. (*func) is a function with two arguments */ double xl; double xu; double xm = 0; double fu; double fl; double fm = 100000.; double gradient; boolean dwn = false; /* decide whether we search above or below startx, so that we do not trace back to the starting point, which most often will be the root from the previous calculation */ if (delta < 0) { xu = startx; xl = xu + delta; } else { xl = startx; xu = xl + delta; } delta = fabs(delta); fu = (*func)(n, xu); fl = (*func)(n, xl); gradient = (fl-fu)/(xl-xu); while(fabs(fm) > EPSILON) { /* is root outside of our bracket? */ if ((fu<0.0 && fl<0.0) || (fu>0.0 && fl > 0.0)) { xu += delta; fu = (*func)(n, xu); fl = (*func)(n, xl); gradient = (fl-fu)/(xl-xu); dwn = (gradient < 0.0) ? true : false; } else { xm = xl - fl / gradient; fm = (*func)(n, xm); if (dwn) { if (fm > 0.) { xl = xm; fl = fm; } else { xu = xm; fu = fm; } } else { if (fm > 0.) { xu = xm; fu = fm; } else { xl = xm; fl = fm; } } gradient = (fl-fu)/(xl-xu); } } return xm; } /* halfroot */ void hermite_weight(long n, double * hroot, double * weights) { /* calculate the weights for the hermite polynomial at the roots using formula from Abramowitz and Stegun chapter 25.4.46 p.890 */ long i; double hr2; double numerator; numerator = exp(0.6931471805599 * ( n-1.)
+ logfac(n)) / (n*n); for (i = 0; i < n; i++) { hr2 = hermite(n-1, hroot[i]); weights[i] = numerator / (hr2*hr2); } } /* hermite_weight */ void inithermitcat(long categs, double alpha, double *rate, double *probcat) { /* calculates rates and probabilities */ long i; double *hroot; double std; std = SQRT2 /sqrt(alpha); hroot = (double *) Malloc((categs+1) * sizeof(double)); root_hermite(categs, hroot); /* calculate roots */ hermite_weight(categs, hroot, probcat); /* set weights */ for (i=0; i<categs; i++) { /* set rates */ rate[i] = 1.0 + std * hroot[i]; } free(hroot); } /* inithermitcat */ void initgammacat (long categs, double alpha, double *rate, double *probcat) { /* calculate rates and probabilities to approximate Gamma distribution of rates with "categs" categories and shape parameter "alpha" using rates and weights from Generalized Laguerre quadrature or from Hermite quadrature */ if (alpha >= 100.0) inithermitcat(categs, alpha, rate, probcat); else initlaguerrecat(categs, alpha, rate, probcat); } /* initgammacat */ void inithowmany(long *howmanny, long howoften) {/* input how many cycles */ long loopcount; loopcount = 0; for (;;) { printf("How many cycles of %4ld trees?\n", howoften); fflush(stdout); if (scanf("%ld%*[^\n]", howmanny) == 1) { getchar(); if (*howmanny >= 1) break; } countup(&loopcount, 10); } } /*inithowmany*/ void inithowoften(long *howoften) { /* input how many trees per cycle */ long loopcount; loopcount = 0; for (;;) { printf("How many trees per cycle?\n"); fflush(stdout); if (scanf("%ld%*[^\n]", howoften) == 1) { getchar(); if (*howoften >= 1) break; } countup(&loopcount, 10); } } /*inithowoften*/ void initlambda(double *lambda) { /* input patch length parameter for autocorrelated HMM rates */ long loopcount; loopcount = 0; for (;;) { printf("Mean block length of sites having the same rate (greater than 1)?\n"); fflush(stdout); if (scanf("%lf%*[^\n]", lambda) == 1) { getchar(); if (*lambda > 1.0) break; } countup(&loopcount, 10); } *lambda = 1.0 / *lambda; } /* initlambda */ void initfreqs(double *freqa, double *freqc, double *freqg, double *freqt) { /* input frequencies of the four bases */ char input[100]; long scanned, loopcount; printf("Base frequencies for A, C, G, T/U (use blanks to separate)?\n"); loopcount = 0; do { fflush(stdout); getstryng(input); scanned = sscanf(input,"%lf%lf%lf%lf%*[^\n]", freqa, freqc, freqg, freqt); if (scanned ==
4) break; else printf("Please enter exactly 4 values.\n"); countup(&loopcount, 100); } while (1); } /* initfreqs */ void initratio(double *ttratio) { /* input transition/transversion ratio */ long loopcount; loopcount = 0; for (;;) { printf("Transition/transversion ratio?\n"); fflush(stdout); if (scanf("%lf%*[^\n]", ttratio) == 1) { getchar(); if (*ttratio >= 0.0) break; else printf("Transition/transversion ratio cannot be negative.\n"); } countup(&loopcount, 10); } } /* initratio */ void initpower(double *power) { /* input power; give up after repeated failed reads, as the other prompts do */ long loopcount; loopcount = 0; for (;;) { printf("New power?\n"); fflush(stdout); if (scanf("%lf%*[^\n]", power) == 1) { getchar(); break; } countup(&loopcount, 10); } } /*initpower*/ void initdatasets(long *datasets) { /* handle multi-data set option */ long loopcount; loopcount = 0; for (;;) { printf("How many data sets?\n"); fflush(stdout); if (scanf("%ld%*[^\n]", datasets) == 1) { getchar(); if (*datasets > 1) break; else printf("Bad data sets number: it must be greater than 1\n"); } countup(&loopcount, 10); } } /* initdatasets */ void justweights(long *datasets) { /* handle multi-data set option by weights */ long loopcount; loopcount = 0; for (;;) { printf("How many sets of weights?\n"); fflush(stdout); if (scanf("%ld%*[^\n]", datasets) == 1) { getchar(); if (*datasets >= 1) break; else printf("BAD NUMBER: it must be at least 1\n"); } countup(&loopcount, 10); } } /* justweights */ void initterminal(boolean *ibmpc, boolean *ansi) { /* handle terminal option */ if (*ibmpc) { *ibmpc = false; *ansi = true; } else if (*ansi) *ansi = false; else *ibmpc = true; } /*initterminal*/ void initnumlines(long *screenlines) { long loopcount; loopcount = 0; do { *screenlines = readlong("Number of lines on screen?\n"); countup(&loopcount, 10); } while (*screenlines <= 12); } /*initnumlines*/ void initbestrees(bestelm *bestrees, long maxtrees, boolean glob) { /* initializes either global or local field of each array in bestrees */ long i; if (glob) for (i = 0; i < maxtrees; i++) bestrees[i].gloreange = false; else
for (i = 0; i < maxtrees; i++) bestrees[i].locreange = false; } /* initbestrees */ void newline(FILE *filename, long i, long j, long k) { /* go to new line if i - 1 is a positive multiple of j, indent k spaces */ long m; if ((i - 1) % j != 0 || i <= 1) return; putc('\n', filename); for (m = 1; m <= k; m++) putc(' ', filename); } /* newline */ void inputnumbersold(long *spp, long *chars, long *nonodes, long n) { /* input the numbers of species and of characters */ if (fscanf(infile, "%ld%ld", spp, chars) != 2 || *spp <= 0 || *chars <= 0) { printf( "ERROR: Unable to read the number of species or characters in data set\n"); printf( "The input file is incorrect (perhaps it was not saved text only).\n"); exxit(-1); /* do not continue with unread or invalid counts */ } *nonodes = *spp * 2 - n; } /* inputnumbersold */ void inputnumbers(long *spp, long *chars, long *nonodes, long n) { /* Read numbers of species and characters from first line of a data set. * Return the results in *spp and *chars, respectively. Also returns * (*spp * 2 - n) in *nonodes */ if (fscanf(infile, "%ld%ld", spp, chars) != 2 || *spp <= 0 || *chars <= 0) { printf( "ERROR: Unable to read the number of species or characters in data set\n"); printf( "The input file is incorrect (perhaps it was not saved text only).\n"); exxit(-1); /* do not continue with unread or invalid counts */ } *nonodes = *spp * 2 - n; } /* inputnumbers */ void inputnumbers2(long *spp, long *nonodes, long n) { /* read species number */ if (fscanf(infile, "%ld", spp) != 1 || *spp <= 0) { printf("ERROR: Unable to read the number of species in data set\n"); printf( "The input file is incorrect (perhaps it was not saved text only).\n"); exxit(-1); /* do not continue with an unread or invalid count */ } fprintf(outfile, "\n%4ld Populations\n", *spp); *nonodes = *spp * 2 - n; } /* inputnumbers2 */ void inputnumbers3(long *spp, long *chars) { /* input the numbers of species and of characters */ if (fscanf(infile, "%ld%ld", spp, chars) != 2 || *spp <= 0 || *chars <= 0) { printf( "ERROR: Unable to read the number of species or characters in data set\n"); printf( "The input file is incorrect (perhaps it was not saved text only).\n");
exxit(-1); } } /* inputnumbers3 */ void samenumsp(long *chars, long ith) { /* check if spp is same as the first set in other data sets */ long cursp, curchs; if (eoln(infile)) scan_eoln(infile); if (fscanf(infile, "%ld%ld", &cursp, &curchs) == 2) { if (cursp != spp) { printf( "\n\nERROR: Inconsistent number of species in data set %ld\n\n", ith); exxit(-1); } } else { printf( "Unable to read number of species and sites from data set %ld\n\n", ith); exxit(-1); } *chars = curchs; } /* samenumsp */ void samenumsp2(long ith) { /* check if spp is same as the first set in other data sets */ long cursp; if (eoln(infile)) scan_eoln(infile); if (fscanf(infile, "%ld", &cursp) != 1) { printf("\n\nERROR: Unable to read number of species in data set %ld\n", ith); printf( "The input file is incorrect (perhaps it was not saved text only).\n"); exxit(-1); } if (cursp != spp) { printf( "\n\nERROR: Inconsistent number of species in data set %ld\n\n", ith); exxit(-1); } } /* samenumsp2 */ void readoptions(long *extranum, const char *options) { /* read option characters from input file */ Char ch; while (!(eoln(infile))) { ch = gettc(infile); uppercase(&ch); if (strchr(options, ch) != NULL) (* extranum)++; else if (!(ch == ' ' || ch == '\t')) { printf("BAD OPTION CHARACTER: %c\n", ch); exxit(-1); } } scan_eoln(infile); } /* readoptions */ void matchoptions(Char *ch, const char *options) { /* match option characters to those in auxiliary options line */ *ch = gettc(infile); uppercase(ch); if (strchr(options, *ch) == NULL) { printf("ERROR: Incorrect auxiliary options line"); printf(" which starts with %c\n", *ch); exxit(-1); } } /* matchoptions */ void inputweightsold(long chars, steptr weight, boolean *weights) { Char ch; int i; for (i = 1; i < nmlngth ; i++) getc(infile); for (i = 0; i < chars; i++) { do { if (eoln(infile)) scan_eoln(infile); ch = gettc(infile); if (ch == '\n') ch = ' '; } while (ch == ' '); weight[i] = 1; if (isdigit(ch)) weight[i] = ch - '0'; else if (isalpha(ch)) { 
uppercase(&ch); weight[i] = ch - 'A' + 10; } else { printf("\n\nERROR: Bad weight character: %c\n\n", ch); exxit(-1); } } scan_eoln(infile); *weights = true; } /*inputweightsold*/ void inputweights(long chars, steptr weight, boolean *weights) { /* input the character weights, 0-9 and A-Z for weights 0 - 35 */ Char ch; long i; for (i = 0; i < chars; i++) { do { if (eoln(weightfile)) scan_eoln(weightfile); ch = gettc(weightfile); if (ch == '\n') ch = ' '; } while (ch == ' '); weight[i] = 1; if (isdigit(ch)) weight[i] = ch - '0'; else if (isalpha(ch)) { uppercase(&ch); weight[i] = ch - 'A' + 10; } else { printf("\n\nERROR: Bad weight character: %c\n\n", ch); exxit(-1); } } scan_eoln(weightfile); *weights = true; } /* inputweights */ void inputweights2(long a, long b, long *weightsum, steptr weight, boolean *weights, const char *prog) { /* input the character weights, 0 or 1 */ Char ch; long i; *weightsum = 0; for (i = a; i < b; i++) { do { if (eoln(weightfile)) scan_eoln(weightfile); ch = gettc(weightfile); } while (ch == ' '); weight[i] = 1; if (ch == '0' || ch == '1') weight[i] = ch - '0'; else { printf("\n\nERROR: Bad weight character: %c -- ", ch); printf("weights in %s must be 0 or 1\n", prog); exxit(-1); } *weightsum += weight[i]; } *weights = true; scan_eoln(weightfile); } /* inputweights2 */ void printweights(FILE *filename, long inc, long chars, steptr weight, const char *letters) { /* print out the weights of sites */ long i, j; boolean letterweights; letterweights = false; for (i = 0; i < chars; i++) if (weight[i] > 9) letterweights = true; fprintf(filename, "\n %s are weighted as follows:", letters); if (letterweights) fprintf(filename, " (A = 10, B = 11, etc.)\n"); else putc('\n', filename); for (i = 0; i < chars; i++) { if (i % 60 == 0) { putc('\n', filename); for (j = 1; j <= nmlngth + 3; j++) putc(' ', filename); } if (weight[i+inc] < 10) fprintf(filename, "%ld", weight[i + inc]); else fprintf(filename, "%c", 'A'-10+(int)weight[i + inc]); if ((i+1) % 5 
== 0 && (i+1) % 60 != 0) putc(' ', filename); } fprintf(filename, "\n\n"); } /* printweights */ void inputcategs(long a, long b, steptr category, long categs, const char *prog) { /* input the categories, 1-9 */ Char ch; long i; for (i = a; i < b; i++) { do { if (eoln(catfile)) scan_eoln(catfile); ch = gettc(catfile); } while (ch == ' '); if ((ch >= '1') && (ch <= ('0'+categs))) category[i] = ch - '0'; else { printf("\n\nERROR: Bad category character: %c", ch); printf(" -- categories in %s are currently 1-%ld\n", prog, categs); exxit(-1); } } scan_eoln(catfile); } /* inputcategs */ void printcategs(FILE *filename, long chars, steptr category, const char *letters) { /* print out the sitewise categories */ long i, j; fprintf(filename, "\n %s are:\n", letters); for (i = 0; i < chars; i++) { if (i % 60 == 0) { putc('\n', filename); for (j = 1; j <= nmlngth + 3; j++) putc(' ', filename); } fprintf(filename, "%ld", category[i]); if ((i+1) % 10 == 0 && (i+1) % 60 != 0) putc(' ', filename); } fprintf(filename, "\n\n"); } /* printcategs */ void inputfactors(long chars, Char *factor, boolean *factors) { /* reads the factor symbols */ long i; for (i = 0; i < (chars); i++) { if (eoln(factfile)) scan_eoln(factfile); factor[i] = gettc(factfile); if (factor[i] == '\n') factor[i] = ' '; } scan_eoln(factfile); *factors = true; } /* inputfactors */ void printfactors(FILE *filename, long chars, Char *factor, const char *letters) { /* print out list of factor symbols */ long i; fprintf(filename, "Factors%s:\n\n", letters); for (i = 1; i <= nmlngth - 5; i++) putc(' ', filename); for (i = 1; i <= (chars); i++) { newline(filename, i, 55, nmlngth + 3); putc(factor[i - 1], filename); if (i % 5 == 0) putc(' ', filename); } putc('\n', filename); } /* printfactors */ void headings(long chars, const char *letters1, const char *letters2) { long i, j; putc('\n', outfile); j = nmlngth + (chars + (chars - 1) / 10) / 2 - 5; if (j < nmlngth - 1) j = nmlngth - 1; if (j > 37) j = 37; fprintf(outfile, 
"Name"); for (i = 1; i <= j; i++) putc(' ', outfile); fprintf(outfile, "%s\n", letters1); fprintf(outfile, "----"); for (i = 1; i <= j; i++) putc(' ', outfile); fprintf(outfile, "%s\n\n", letters2); } /* headings */ void initname(long i) { /* read in species name */ long j; for (j = 0; j < nmlngth; j++) { if (eoff(infile) | eoln(infile)){ printf("\n\nERROR: end-of-line or end-of-file"); printf(" in the middle of species name for species %ld\n\n", i+1); exxit(-1); } nayme[i][j] = gettc(infile); if ((nayme[i][j] == '(') || (nayme[i][j] == ')') || (nayme[i][j] == ':') || (nayme[i][j] == ',') || (nayme[i][j] == ';') || (nayme[i][j] == '[') || (nayme[i][j] == ']')) { printf("\nERROR: Species name may not contain characters ( ) : ; , [ ] \n"); printf(" In name of species number %ld there is character %c\n\n", i+1, nayme[i][j]); exxit(-1); } } } /* initname */ void findtree(boolean *found, long *pos, long nextree, long *place, bestelm *bestrees) { /* finds tree given by array place in array bestrees by binary search */ /* used by dnacomp, dnapars, dollop, mix, & protpars */ long i, lower, upper; boolean below, done; below = false; lower = 1; upper = nextree - 1; (*found) = false; while (!(*found) && lower <= upper) { (*pos) = (lower + upper) / 2; i = 3; done = false; while (!done) { done = (i > spp); if (!done) done = (place[i - 1] != bestrees[(*pos) - 1].btree[i - 1]); if (!done) i++; } (*found) = (i > spp); if (*found) break; below = (place[i - 1] < bestrees[(*pos )- 1].btree[i - 1]); if (below) upper = (*pos) - 1; else lower = (*pos) + 1; } if (!(*found) && !below) (*pos)++; } /* findtree */ void addtree(long pos, long *nextree, boolean collapse, long *place, bestelm *bestrees) { /* puts tree from array place in its proper position in array bestrees */ /* used by dnacomp, dnapars, dollop, mix, & protpars */ long i; for (i = *nextree - 1; i >= pos; i--){ memcpy(bestrees[i].btree, bestrees[i - 1].btree, spp * sizeof(long)); bestrees[i].gloreange = bestrees[i - 
1].gloreange; bestrees[i - 1].gloreange = false; bestrees[i].locreange = bestrees[i - 1].locreange; bestrees[i - 1].locreange = false; bestrees[i].collapse = bestrees[i - 1].collapse; } for (i = 0; i < spp; i++) bestrees[pos - 1].btree[i] = place[i]; bestrees[pos - 1].collapse = collapse; (*nextree)++; } /* addtree */ long findunrearranged(bestelm *bestrees, long nextree, boolean glob) { /* finds bestree with either global or local field false */ long i; if (glob) { for (i = 0; i < nextree - 1; i++) if (!bestrees[i].gloreange) return i; } else { for (i = 0; i < nextree - 1; i++) if (!bestrees[i].locreange) return i; } return -1; } /* findunrearranged */ boolean torearrange(bestelm *bestrees, long nextree) { /* sees if any best tree is yet to be rearranged */ if (findunrearranged(bestrees, nextree, true) >= 0) return true; else if (findunrearranged(bestrees, nextree, false) >= 0) return true; else return false; } /* torearrange */ void reducebestrees(bestelm *bestrees, long *nextree) { /* finds best trees with collapsible branches and deletes them */ long i, j; i = 0; j = *nextree - 2; do { /* test the index bounds first so bestrees[] is never read out of range */ while (i < *nextree - 1 && !bestrees[i].collapse) i++; while (j >= 0 && bestrees[j].collapse) j--; if (i < j) { memcpy(bestrees[i].btree, bestrees[j].btree, spp * sizeof(long)); bestrees[i].gloreange = bestrees[j].gloreange; bestrees[i].locreange = bestrees[j].locreange; bestrees[i].collapse = false; bestrees[j].collapse = true; } } while (i < j); *nextree = i + 1; } /* reducebestrees */ void shellsort(double *a, long *b, long n) { /* Shell sort keeping a, b in same order */ /* used by dnapenny, dolpenny, & penny */ long gap, i, j, itemp; double rtemp; gap = n / 2; while (gap > 0) { for (i = gap + 1; i <= n; i++) { j = i - gap; while (j > 0) { if (a[j - 1] > a[j + gap - 1]) { rtemp = a[j - 1]; a[j - 1] = a[j + gap - 1]; a[j + gap - 1] = rtemp; itemp = b[j - 1]; b[j - 1] = b[j + gap - 1]; b[j + gap - 1] = itemp; } j -= gap; } } gap /= 2; } } /* shellsort */ void getch(Char *c,
long *parens, FILE *treefile) { /* get next nonblank character */ do { if (eoln(treefile)) scan_eoln(treefile); (*c) = gettc(treefile); if ((*c) == '\n' || (*c) == '\t') (*c) = ' '; } while ( *c == ' ' && !eoff(treefile) ); if ((*c) == '(') (*parens)++; if ((*c) == ')') (*parens)--; } /* getch */ void getch2(Char *c, long *parens) { /* get next nonblank character */ do { if (eoln(intree)) scan_eoln(intree); *c = gettc(intree); if (*c == '\n' || *c == '\t') *c = ' '; } while (!(*c != ' ' || eoff(intree))); if (*c == '(') (*parens)++; if (*c == ')') (*parens)--; } /* getch2 */ void findch(Char c, Char *ch, long which) { /* scan forward until find character c */ boolean done; long dummy_parens; done = false; while (!done) { if (c == ',') { if (*ch == '(' || *ch == ')' || *ch == ';') { printf( "\n\nERROR in user tree %ld: unmatched parenthesis or missing comma\n\n", which); exxit(-1); } else if (*ch == ',') done = true; } else if (c == ')') { if (*ch == '(' || *ch == ',' || *ch == ';') { printf("\n\nERROR in user tree %ld: ", which); printf("unmatched parenthesis or non-bifurcated node\n\n"); exxit(-1); } else { if (*ch == ')') done = true; } } else if (c == ';') { if (*ch != ';') { printf("\n\nERROR in user tree %ld: ", which); printf("unmatched parenthesis or missing semicolon\n\n"); exxit(-1); } else done = true; } if (*ch != ')' && done) continue; getch(ch, &dummy_parens, intree); } } /* findch */ void findch2(Char c, long *lparens, long *rparens, Char *ch) { /* skip forward in user tree until find character c */ boolean done; long dummy_parens; done = false; while (!done) { if (c == ',') { if (*ch == '(' || *ch == ')' || *ch == ':' || *ch == ';') { printf("\n\nERROR in user tree: "); printf("unmatched parenthesis, missing comma"); printf(" or non-trifurcated base\n\n"); exxit(-1); } else if (*ch == ',') done = true; } else if (c == ')') { if (*ch == '(' || *ch == ',' || *ch == ':' || *ch == ';') { printf( "\n\nERROR in user tree: unmatched parenthesis or 
non-bifurcated node\n\n"); exxit(-1); } else if (*ch == ')') { (*rparens)++; if ((*lparens) > 0 && (*lparens) == (*rparens)) { if ((*lparens) == spp - 2) { getch(ch, &dummy_parens, intree); if (*ch != ';') { printf( "\n\nERROR in user tree: "); printf("unmatched parenthesis or missing semicolon\n\n"); exxit(-1); } } } done = true; } } if (*ch != ')' && done) continue; if (*ch == ')') getch(ch, &dummy_parens, intree); } } /* findch2 */ void processlength(double *valyew, double *divisor, Char *ch, boolean *lengthIsNegative, FILE *treefile, long *parens) { /* read a branch length from a treefile */ long digit, ordzero, exponent, exponentIsNegative; boolean pointread, hasExponent; ordzero = '0'; *lengthIsNegative = false; pointread = false; hasExponent = false; exponentIsNegative = -1; /* 3 states: -1 = unassigned, 1 = true, 0 = false */ exponent = 0; *valyew = 0.0; *divisor = 1.0; getch(ch, parens, treefile); if ('+' == *ch) getch(ch, parens, treefile); /* ignore leading '+', because "+1.2345" == "1.2345" */ else if ('-' == *ch) { *lengthIsNegative = true; getch(ch, parens, treefile); } digit = (long)(*ch - ordzero); while ( ((digit <= 9) && (digit >= 0)) || '.' == *ch || '-' == *ch || '+' == *ch || 'E' == *ch || 'e' == *ch) { if ('.' 
== *ch) { if (!pointread) pointread = true; else { printf("\n\nERROR: Branch length found with more than one \'.\' in it.\n\n"); exxit(-1); } } else if ('+' == *ch) { if (hasExponent && -1 == exponentIsNegative) exponentIsNegative = 0; /* 3 states: -1 = unassigned, 1 = true, 0 = false */ else { printf("\n\nERROR: Branch length found with \'+\' in an unexpected place.\n\n"); exxit(-1); } } else if ('-' == *ch) { if (hasExponent && -1 == exponentIsNegative) exponentIsNegative = 1; /* 3 states: -1 = unassigned, 1 = true, 0 = false */ else { printf("\n\nERROR: Branch length found with \'-\' in an unexpected place.\n\n"); exxit(-1); } } else if ('E' == *ch || 'e' == *ch) { if (!hasExponent) hasExponent = true; else { printf("\n\nERROR: Branch length found with more than one \'E\' in it.\n\n"); exxit(-1); } } else { if (!hasExponent) { *valyew = *valyew * 10.0 + digit; if (pointread) *divisor *= 10.0; } else exponent = 10*exponent + digit; } getch(ch, parens, treefile); digit = (long)(*ch - ordzero); } if (hasExponent) { /* an exponent with no sign (state -1) is positive, as in "1.5E3" */ if (exponentIsNegative == 1) *divisor *= pow(10.,(double)exponent); else *divisor /= pow(10.,(double)exponent); } if (*lengthIsNegative) *valyew = -(*valyew); } /* processlength */ void writename(long start, long n, long *enterorder) { /* write species name and number in entry order */ long i, j; for (i = start; i < start+n; i++) { printf(" %3ld. ", i+1); for (j = 0; j < nmlngth; j++) putchar(nayme[enterorder[i] - 1][j]); putchar('\n'); fflush(stdout); } } /* writename */ void memerror() { printf("Error allocating memory\n"); exxit(-1); } /* memerror */ void odd_malloc(long x) { /* error message if attempt to malloc too little or too much memory */ printf ("ERROR: a function asked for an inappropriate amount of memory:"); printf (" %ld bytes\n", x); printf (" This can mean one of two things:\n"); printf (" 1. The input file is incorrect"); printf (" (perhaps it was not saved as Text Only),\n"); printf (" 2.
There is a bug in the program.\n"); printf (" Please check your input file carefully.\n"); printf (" If it seems to be a bug, please mail joe (at) gs.washington.edu\n"); printf (" with the name of the program, your computer system type,\n"); printf (" a full description of the problem, and with the input data file.\n"); printf (" (which should be in the body of the message, not as an Attachment).\n"); /* abort() can be used to crash */ exxit(-1); } MALLOCRETURN *mymalloc(long x) { /* wrapper for malloc, allowing error message if too little, too much */ MALLOCRETURN *new_block; if ((x <= 0) || (x > TOO_MUCH_MEMORY)) odd_malloc(x); new_block = (MALLOCRETURN *)calloc(1, x); if (!new_block) { memerror(); return (MALLOCRETURN *) new_block; } else return (MALLOCRETURN *) new_block; } /* mymalloc */ void gnu(node **grbg, node **p) { /* this and the following are do-it-yourself garbage collectors. Make a new node or pull one off the garbage list */ if (*grbg != NULL) { *p = *grbg; *grbg = (*grbg)->next; } else *p = (node *)Malloc(sizeof(node)); (*p)->back = NULL; (*p)->next = NULL; (*p)->tip = false; (*p)->times_in_tree = 0.0; (*p)->r = 0.0; (*p)->theta = 0.0; (*p)->x = NULL; (*p)->protx = NULL; /* for the sake of proml */ } /* gnu */ void chuck(node **grbg, node *p) { /* collect garbage on p -- put it on front of garbage list */ p->back = NULL; p->next = *grbg; *grbg = p; } /* chuck */ void zeronumnuc(node *p, long endsite) { long i,j; for (i = 0; i < endsite; i++) for (j = (long)A; j <= (long)O; j++) p->numnuc[i][j] = 0; } /* zeronumnuc */ void zerodiscnumnuc(node *p, long endsite) { long i,j; for (i = 0; i < endsite; i++) for (j = (long)zero; j <= (long)seven; j++) p->discnumnuc[i][j] = 0; } /* zerodiscnumnuc */ void allocnontip(node *p, long *zeros, long endsite) { /* allocate an interior node */ /* used by dnacomp, dnapars, & dnapenny */ p->numsteps = (steptr)Malloc(endsite*sizeof(long)); p->oldnumsteps = (steptr)Malloc(endsite*sizeof(long)); p->base = 
(baseptr)Malloc(endsite*sizeof(long)); p->oldbase = (baseptr)Malloc(endsite*sizeof(long)); p->numnuc = (nucarray *)Malloc(endsite*sizeof(nucarray)); memcpy(p->base, zeros, endsite*sizeof(long)); memcpy(p->numsteps, zeros, endsite*sizeof(long)); memcpy(p->oldbase, zeros, endsite*sizeof(long)); memcpy(p->oldnumsteps, zeros, endsite*sizeof(long)); zeronumnuc(p, endsite); } /* allocnontip */ void allocdiscnontip(node *p, long *zeros, unsigned char *zeros2, long endsite) { /* allocate an interior node */ /* used by pars */ p->numsteps = (steptr)Malloc(endsite*sizeof(long)); p->oldnumsteps = (steptr)Malloc(endsite*sizeof(long)); p->discbase = (discbaseptr)Malloc(endsite*sizeof(unsigned char)); p->olddiscbase = (discbaseptr)Malloc(endsite*sizeof(unsigned char)); p->discnumnuc = (discnucarray *)Malloc(endsite*sizeof(discnucarray)); memcpy(p->discbase, zeros2, endsite*sizeof(unsigned char)); memcpy(p->numsteps, zeros, endsite*sizeof(long)); memcpy(p->olddiscbase, zeros2, endsite*sizeof(unsigned char)); memcpy(p->oldnumsteps, zeros, endsite*sizeof(long)); zerodiscnumnuc(p, endsite); } /* allocdiscnontip */ void allocnode(node **anode, long *zeros, long endsite) { /* allocate a node */ /* used by dnacomp, dnapars, & dnapenny */ *anode = (node *)Malloc(sizeof(node)); allocnontip(*anode, zeros, endsite); } /* allocnode */ void allocdiscnode(node **anode, long *zeros, unsigned char *zeros2, long endsite) { /* allocate a node */ /* used by pars */ *anode = (node *)Malloc(sizeof(node)); allocdiscnontip(*anode, zeros, zeros2, endsite); } /* allocdiscnode */ void gnutreenode(node **grbg, node **p, long i, long endsite, long *zeros) { /* this and the following are do-it-yourself garbage collectors. 
Make a new node or pull one off the garbage list */ if (*grbg != NULL) { *p = *grbg; *grbg = (*grbg)->next; memcpy((*p)->numsteps, zeros, endsite*sizeof(long)); memcpy((*p)->oldnumsteps, zeros, endsite*sizeof(long)); memcpy((*p)->base, zeros, endsite*sizeof(long)); memcpy((*p)->oldbase, zeros, endsite*sizeof(long)); zeronumnuc(*p, endsite); } else allocnode(p, zeros, endsite); (*p)->back = NULL; (*p)->next = NULL; (*p)->tip = false; (*p)->visited = false; (*p)->index = i; (*p)->numdesc = 0; (*p)->sumsteps = 0.0; } /* gnutreenode */ void gnudisctreenode(node **grbg, node **p, long i, long endsite, long *zeros, unsigned char *zeros2) { /* this and the following are do-it-yourself garbage collectors. Make a new node or pull one off the garbage list */ if (*grbg != NULL) { *p = *grbg; *grbg = (*grbg)->next; memcpy((*p)->numsteps, zeros, endsite*sizeof(long)); memcpy((*p)->oldnumsteps, zeros, endsite*sizeof(long)); memcpy((*p)->discbase, zeros2, endsite*sizeof(unsigned char)); memcpy((*p)->olddiscbase, zeros2, endsite*sizeof(unsigned char)); zerodiscnumnuc(*p, endsite); } else allocdiscnode(p, zeros, zeros2, endsite); (*p)->back = NULL; (*p)->next = NULL; (*p)->tip = false; (*p)->visited = false; (*p)->index = i; (*p)->numdesc = 0; (*p)->sumsteps = 0.0; } /* gnudisctreenode */ void setupnode(node *p, long i) { /* initialization of node pointers, variables */ p->next = NULL; p->back = NULL; p->times_in_tree = (double) i * 1.0; p->index = i; p->tip = false; } /* setupnode */ node *pnode(tree *t, node *p) { /* Get the "parent nodelet" of p's node group */ return t->nodep[p->index - 1]; } long count_sibs (node *p) { /* Count the number of nodes in a ring, return the total number of */ /* nodes excluding the one passed into the function (siblings) */ node *q; long return_int = 0; if (p->tip) { printf ("Error: the function count_sibs called on a tip. 
This is a bug.\n"); exxit (-1); } q = p->next; while (q != p) { if (q == NULL) { printf ("Error: a loop of nodes was not closed.\n"); exxit (-1); } else { return_int++; q = q->next; } } return return_int; } /* count_sibs */ void inittrav (node *p) { /* traverse to set pointers uninitialized on inserting */ long i, num_sibs; node *sib_ptr; if (p == NULL) return; if (p->tip) return; num_sibs = count_sibs (p); sib_ptr = p; for (i=0; i < num_sibs; i++) { sib_ptr = sib_ptr->next; sib_ptr->initialized = false; inittrav(sib_ptr->back); } } /* inittrav */ void commentskipper(FILE ***intree, long *bracket) { /* skip over comment bracket contents in reading tree */ char c; c = gettc(**intree); while (c != ']') { if(feof(**intree)) { printf("\n\nERROR: Unmatched comment brackets\n\n"); exxit(-1); } if(c == '[') { (*bracket)++; commentskipper(intree, bracket); } c = gettc(**intree); } (*bracket)--; } /* commentskipper */ long countcomma(FILE **treefile, long *comma) { /* Modified by Dan F. 11/10/96 */ /* countcomma rewritten so it passes back both lparen+comma to allocate nodep and a pointer to the comma variable. This allows the tree to know how many species exist, and the tips to be placed in the front of the nodep array */ /* The next line inserted so this function leaves the file pointing to where it found it, not just re-winding it. 
*/ /* long orig_position = ftell (*treefile); */ fpos_t orig_position; Char c; long lparen = 0; long bracket = 0; /* Save the file position */ if ( fgetpos(*treefile, &orig_position) != 0 ) { printf("\n\nERROR: Could not save file position!\n\n"); exxit(-1); } (*comma) = 0; for (;;){ c = getc(*treefile); if (feof(*treefile)) break; if (c == ';') break; if (c == ',') (*comma)++; if (c == '(') lparen++; if (c == '[') { bracket++; commentskipper(&treefile, &bracket); } } /* Don't just rewind, */ /* rewind (*treefile); */ /* Re-set to where it pointed when the function was called */ /* fseek (*treefile, orig_position, SEEK_SET); */ fsetpos(*treefile, &orig_position); return lparen + (*comma); } /*countcomma*/ long countsemic(FILE **treefile) { /* Used to determine the number of user trees. Return either a: the number of semicolons in the file outside comments or b: the first integer in the file */ Char c; long return_val, semic = 0; long bracket = 0; /* Eat all whitespace */ c = gettc(*treefile); while ((c == ' ') || (c == '\t') || (c == '\n')) { c = gettc(*treefile); } /* Then figure out if the first non-white character is a digit; if so, return it */ if (isdigit (c)) { ungetc(c, *treefile); if (fscanf((*treefile), "%ld", &return_val) != 1) { printf("Error reading number of trees in tree file.\n\n"); exxit(-1); } } else { /* Loop past all characters, count the number of semicolons outside of comments */ for (;;){ c = fgetc(*treefile); if (feof(*treefile)) break; if (c == ';') semic++; if (c == '[') { bracket++; commentskipper(&treefile, &bracket); } } return_val = semic; } rewind (*treefile); return return_val; } /* countsemic */ void hookup(node *p, node *q) { /* hook together two nodes */ assert(p != NULL); assert(q != NULL); p->back = q; q->back = p; } /* hookup */ void unhookup(node *p, node *q) { /* unhook two nodes. 
Not strictly required, but helps check assumptions */ assert(p != NULL); assert(q != NULL); assert(p->back != NULL); assert(q->back != NULL); assert(p->back == q); assert(q->back == p); p->back = NULL; q->back = NULL; } void link_trees(long local_nextnum, long nodenum, long local_nodenum, pointarray nodep) { if(local_nextnum == 0) hookup(nodep[nodenum], nodep[local_nodenum]); else if(local_nextnum == 1) hookup(nodep[nodenum], nodep[local_nodenum]->next); else if(local_nextnum == 2) hookup(nodep[nodenum], nodep[local_nodenum]->next->next); else printf("Error in Link_Trees()"); } /* link_trees() */ void allocate_nodep(pointarray *nodep, FILE **treefile, long *precalc_tips) { /* pre-compute space and allocate memory for nodep */ long numnodes; /* returns number commas & ( */ long numcom = 0; /* returns number commas */ numnodes = countcomma(treefile, &numcom) + 1; *nodep = (pointarray)Malloc(2*numnodes*sizeof(node *)); (*precalc_tips) = numcom + 1; /* this will be used in placing the tip nodes in the front region of nodep. Used for species check? */ } /* allocate_nodep -plc */ void malloc_pheno (node *p, long endsite, long rcategs) { /* Allocate the phenotype arrays; used by dnaml */ long i; p->x = (phenotype)Malloc(endsite*sizeof(ratelike)); p->underflows = (double *)Malloc(endsite * sizeof(double)); for (i = 0; i < endsite; i++) p->x[i] = (ratelike)Malloc(rcategs*sizeof(sitelike)); } /* malloc_pheno */ void malloc_ppheno (node *p,long endsite, long rcategs) { /* Allocate the phenotype arrays; used by proml */ long i; p->protx = (pphenotype)Malloc(endsite*sizeof(pratelike)); p->underflows = (double *)Malloc(endsite*sizeof(double)); for (i = 0; i < endsite; i++) p->protx[i] = (pratelike)Malloc(rcategs*sizeof(psitelike)); } /* malloc_ppheno */ long take_name_from_tree (Char *ch, Char *str, FILE *treefile) { /* This loop reads a name from treefile and stores it in *str. Returns the length of the name string. 
str must be at least MAXNCH bytes, but no effort is made to null-terminate the string. Underscores and newlines are converted to spaces. Characters beyond MAXNCH are discarded. */ long name_length = 0; do { if ((*ch) == '_') (*ch) = ' '; if ( name_length < MAXNCH ) str[name_length++] = (*ch); if (eoln(treefile)) scan_eoln(treefile); (*ch) = gettc(treefile); if (*ch == '\n') *ch = ' '; } while ( strchr(":,)[;", *ch) == NULL ); return name_length; } /* take_name_from_tree */ void match_names_to_data (Char *str, pointarray treenode, node **p, long spp) { /* This loop matches names taken from treefile to indexed names in the data file */ boolean found; long i, n; n = 1; do { found = true; for (i = 0; i < nmlngth; i++) { found = (found && ((str[i] == nayme[n - 1][i]) || (((nayme[n - 1][i] == '_') && (str[i] == ' ')) || ((nayme[n - 1][i] == ' ') && (str[i] == '\0'))))); } if (found) *p = treenode[n - 1]; else n++; } while (!(n > spp || found)); if (n > spp) { printf("\n\nERROR: Cannot find species: "); for (i = 0; (str[i] != '\0') && (i < MAXNCH); i++) putchar(str[i]); printf(" in data file\n\n"); exxit(-1); } } /* match_names_to_data */ void addelement(node **p, node *q, Char *ch, long *parens, FILE *treefile, pointarray treenode, boolean *goteof, boolean *first, pointarray nodep, long *nextnode, long *ntips, boolean *haslengths, node **grbg, initptr initnode, boolean unifok, long maxnodes) { /* Recursive procedure adds nodes to user-defined tree This is the main (new) tree-reading procedure */ node *pfirst; long i, len = 0, nodei = 0; boolean notlast; Char str[MAXNCH+1]; node *r; long furs = 0; if ((*ch) == '(') { (*nextnode)++; /* get ready to use new interior node */ nodei = *nextnode; /* do what needs to be done at bottom */ if ( maxnodes != -1 && nodei > maxnodes) { printf("ERROR in input tree file: Attempting to allocate too\n"); printf("many nodes. 
This is usually caused by a unifurcation.\n"); printf("To use this tree with this program use Retree to read\n"); printf("and write this tree.\n"); exxit(-1); } /* do what needs to be done at bottom */ (*initnode)(p, grbg, q, len, nodei, ntips, parens, bottom, treenode, nodep, str, ch, treefile); pfirst = (*p); notlast = true; while (notlast) { /* loop through immediate descendants */ furs++; (*initnode)(&(*p)->next, grbg, q, len, nodei, ntips, parens, nonbottom, treenode, nodep, str, ch, treefile); /* ... doing what is done before each */ r = (*p)->next; getch(ch, parens, treefile); /* look for next character */ /* handle blank names */ if((*ch) == ',' || (*ch) == ':'){ ungetc((*ch), treefile); *ch = 0; } else if((*ch)==')'){ ungetc((*ch), treefile); (*parens)++; *ch = 0; } addelement(&(*p)->next->back, (*p)->next, ch, parens, treefile, treenode, goteof, first, nodep, nextnode, ntips, haslengths, grbg, initnode, unifok, maxnodes); (*initnode)(&r, grbg, q, len, nodei, ntips, parens, hslength, treenode, nodep, str, ch, treefile); /* do what is done after each about length */ pfirst->numdesc++; /* increment number of descendants */ *p = r; /* advance p to r, the newest node in the ring */ if ((*ch) == ')') { notlast = false; do { getch(ch, parens, treefile); } while ((*ch) != ',' && (*ch) != ')' && (*ch) != '[' && (*ch) != ';' && (*ch) != ':'); } } if ( furs <= 1 && !unifok ) { printf("ERROR in input tree file: A Unifurcation was detected.\n"); printf("To use this tree with this program use retree to read and"); printf(" write this tree\n"); exxit(-1); } (*p)->next = pfirst; (*p) = pfirst; } else if ((*ch) != ')') { /* if it's a species name */ for (i = 0; i < MAXNCH+1; i++) /* fill string with nulls */ str[i] = '\0'; len = take_name_from_tree (ch, str, treefile); /* get the name */ if ((*ch) == ')') (*parens)--; /* decrement count of open parentheses */ (*initnode)(p, grbg, q, len, nodei, ntips, parens, tip, treenode, nodep, str, ch, treefile); /* do what needs to be done at a tip */ 
} else getch(ch, parens, treefile); if (q != NULL) hookup(q, (*p)); /* now hook up */ (*initnode)(p, grbg, q, len, nodei, ntips, parens, iter, treenode, nodep, str, ch, treefile); /* do what needs to be done to variable iter */ if ((*ch) == ':') (*initnode)(p, grbg, q, len, nodei, ntips, parens, length, treenode, nodep, str, ch, treefile); /* do what needs to be done with length */ else if ((*ch) != ';' && (*ch) != '[') (*initnode)(p, grbg, q, len, nodei, ntips, parens, hsnolength, treenode, nodep, str, ch, treefile); /* ... or what needs to be done when no length */ if ((*ch) == '[') (*initnode)(p, grbg, q, len, nodei, ntips, parens, treewt, treenode, nodep, str, ch, treefile); /* ... for processing a tree weight */ else if ((*ch) == ';') /* ... and at end of tree */ (*initnode)(p, grbg, q, len, nodei, ntips, parens, unittrwt, treenode, nodep, str, ch, treefile); } /* addelement */ void treeread (FILE *treefile, node **root, pointarray treenode, boolean *goteof, boolean *first, pointarray nodep, long *nextnode, boolean *haslengths, node **grbg, initptr initnode, boolean unifok, long maxnodes) { /* read in user-defined tree and set it up */ /* Eats blank lines and everything up to the first open paren, then * calls the recursive function addelement, which builds the * tree and calls back to initnode. */ char ch; long parens = 0; long ntips = 0; (*goteof) = false; (*nextnode) = spp; /* eat blank lines */ while (eoln(treefile) && !eoff(treefile)) scan_eoln(treefile); if (eoff(treefile)) { (*goteof) = true; return; } getch(&ch, &parens, treefile); while (ch != '(') { /* Eat everything in the file (i.e. 
digits, tabs) until you encounter an open-paren */ getch(&ch, &parens, treefile); } if (haslengths != NULL) *haslengths = true; addelement(root, NULL, &ch, &parens, treefile, treenode, goteof, first, nodep, nextnode, &ntips, haslengths, grbg, initnode, unifok, maxnodes); /* Eat blank lines and end of current line*/ do { scan_eoln(treefile); } while (eoln(treefile) && !eoff(treefile)); if (first) *first = false; if (parens != 0) { printf("\n\nERROR in tree file: unmatched parentheses\n\n"); exxit(-1); } } /* treeread */ void addelement2(node *q, Char *ch, long *parens, FILE *treefile, pointarray treenode, boolean lngths, double *trweight, boolean *goteof, long *nextnode, long *ntips, long no_species, boolean *haslengths, boolean unifok, long maxnodes) { /* recursive procedure adds nodes to user-defined tree -- old-style bifurcating-only version */ node *pfirst = NULL, *p; long i, len, current_loop_index; boolean notlast, minusread; Char str[MAXNCH]; double valyew, divisor; long furs = 0; if ((*ch) == '(') { current_loop_index = (*nextnode) + spp; (*nextnode)++; if ( maxnodes != -1 && current_loop_index > maxnodes) { printf("ERROR in intree file: Attempting to allocate too many nodes\n"); printf("This is usually caused by a unifurcation. 
To use this\n"); printf("intree with this program use retree to read and write\n"); printf("this tree.\n"); exxit(-1); } /* This is an assignment of an interior node */ p = treenode[current_loop_index]; pfirst = p; notlast = true; while (notlast) { furs++; /* This while loop goes through a circle (triad for bifurcations) of nodes */ p = p->next; /* added to ensure that non base nodes in loops have indices */ p->index = current_loop_index + 1; getch(ch, parens, treefile); addelement2(p, ch, parens, treefile, treenode, lngths, trweight, goteof, nextnode, ntips, no_species, haslengths, unifok, maxnodes); if ((*ch) == ')') { notlast = false; do { getch(ch, parens, treefile); } while ((*ch) != ',' && (*ch) != ')' && (*ch) != '[' && (*ch) != ';' && (*ch) != ':'); } } if ( furs <= 1 && !unifok ) { printf("ERROR in intree file: A Unifurcation was detected.\n"); printf("To use this intree with this program use retree to read and"); printf(" write this tree\n"); exxit(-1); } } else if ((*ch) != ')') { for (i = 0; i < MAXNCH; i++) str[i] = '\0'; len = take_name_from_tree (ch, str, treefile); match_names_to_data (str, treenode, &p, spp); pfirst = p; if ((*ch) == ')') (*parens)--; (*ntips)++; strncpy (p->nayme, str, len); } else getch(ch, parens, treefile); if ((*ch) == '[') { /* getting tree weight from last comment field */ if (!eoln(treefile)) { if (fscanf(treefile, "%lf", trweight) == 1) { getch(ch, parens, treefile); if (*ch != ']') { printf("\n\nERROR: Missing right square bracket\n\n"); exxit(-1); } else { getch(ch, parens, treefile); if (*ch != ';') { printf("\n\nERROR: Missing semicolon after square brackets\n\n"); exxit(-1); } } } else { printf("\n\nERROR: Expecting tree weight in last comment field.\n\n"); exxit(-1); } } } else if ((*ch) == ';') { (*trweight) = 1.0 ; if (!eoln(treefile)) printf("WARNING: tree weight set to 1.0\n"); } else if (haslengths != NULL) (*haslengths) = ((*haslengths) && q == NULL); if (q != NULL) hookup(q, pfirst); if ((*ch) == ':') { 
processlength(&valyew, &divisor, ch, &minusread, treefile, parens); if (q != NULL) { if (!minusread) q->oldlen = valyew / divisor; else q->oldlen = 0.0; if (lngths) { q->v = valyew / divisor; q->back->v = q->v; q->iter = false; q->back->iter = false; } } } } /* addelement2 */ void treeread2 (FILE *treefile, node **root, pointarray treenode, boolean lngths, double *trweight, boolean *goteof, boolean *haslengths, long *no_species, boolean unifok, long maxnodes) { /* read in user-defined tree and set it up -- old-style bifurcating-only version */ char ch; long parens = 0; long ntips = 0; long nextnode; (*goteof) = false; nextnode = 0; /* Eats all blank lines at start of file */ while (eoln(treefile) && !eoff(treefile)) scan_eoln(treefile); if (eoff(treefile)) { (*goteof) = true; return; } getch(&ch, &parens, treefile); while (ch != '(') { /* Eat everything in the file (i.e. digits, tabs) until you encounter an open-paren */ getch(&ch, &parens, treefile); } addelement2(NULL, &ch, &parens, treefile, treenode, lngths, trweight, goteof, &nextnode, &ntips, (*no_species), haslengths, unifok, maxnodes); (*root) = treenode[*no_species]; /*eat blank lines */ while (eoln(treefile) && !eoff(treefile)) scan_eoln(treefile); (*root)->oldlen = 0.0; if (parens != 0) { printf("\n\nERROR in tree file: unmatched parentheses\n\n"); exxit(-1); } } /* treeread2 */ void exxit(int exitcode) { /* Terminate the program with exit code exitcode. * On Windows, supplying a nonzero exitcode will print a message and wait * for the user to hit enter. */ #ifdef WIN32 phyRestoreConsoleAttributes(); #endif exit (exitcode); } /* exxit */ char gettc(FILE* file) { /* Return the next character in file. * If EOF is reached, print an error and die. * DOS ('\r\n') and Mac ('\r') newlines are returned as a single '\n'. 
*/ int ch; ch=getc(file); if ( ch == EOF ) EOF_error(); if ( ch == '\r' ) { ch = getc(file); if ( ch != '\n' ) ungetc(ch, file); ch = '\n'; } return ch; } /* gettc */ void unroot(tree *t, long nonodes) { /* used by fitch, restml and contml */ if (t->start->back == NULL) { if (t->start->next->back->tip) t->start = t->start->next->next->back; else t->start = t->start->next->back; } if (t->start->next->back == NULL) { if (t->start->back->tip) t->start = t->start->next->next->back; else t->start = t->start->back; } if (t->start->next->next->back == NULL) { if (t->start->back->tip) t->start = t->start->next->back; else t->start = t->start->back; } unroot_r(t->start,t->nodep,nonodes); unroot_r(t->start->back, t->nodep, nonodes); } void unroot_here(node* root, node** nodep, long nonodes) { node* tmpnode; double newl; /* used by unroot */ /* assumes bifurcation this is ok in the programs that use it */ newl = root->next->oldlen + root->next->next->oldlen; root->next->back->oldlen = newl; root->next->next->back->oldlen = newl; newl = root->next->v + root->next->next->v; root->next->back->v = newl; root->next->next->back->v = newl; root->next->back->back=root->next->next->back; root->next->next->back->back = root->next->back; while ( root->index != nonodes ) { tmpnode = nodep[ root->index ]; nodep[root->index] = root; root->index++; root->next->index++; root->next->next->index++; nodep[root->index - 2] = tmpnode; tmpnode->index--; tmpnode->next->index--; tmpnode->next->next->index--; } } void unroot_r(node* p, node** nodep, long nonodes) { /* used by unroot */ node *q; if ( p->tip) return; q = p->next; while ( q != p ) { if (q->back == NULL) unroot_here(q, nodep, nonodes); else unroot_r(q->back, nodep, nonodes); q = q->next; } } void clear_connections(tree *t, long nonodes) { long i; node *p; for ( i = 0 ; i < nonodes ; i++) { p = t->nodep[i]; if (p != NULL) { p->back = NULL; p->v = 0; for (p = p->next; p && p != t->nodep[i]; p = p->next) { p->next->back = NULL; p->next->v = 
0; } } } } #ifdef WIN32 void phySaveConsoleAttributes() { if ( GetConsoleScreenBufferInfo(hConsoleOutput, &savecsbi) ) savecsbi_valid = true; } /* PhySaveConsoleAttributes */ void phySetConsoleAttributes() { hConsoleOutput = GetStdHandle(STD_OUTPUT_HANDLE); if ( hConsoleOutput == INVALID_HANDLE_VALUE ) hConsoleOutput = NULL; if ( hConsoleOutput != NULL ) { phySaveConsoleAttributes(); SetConsoleTextAttribute(hConsoleOutput, BACKGROUND_GREEN | BACKGROUND_BLUE | BACKGROUND_INTENSITY); } } /* phySetConsoleAttributes */ void phyRestoreConsoleAttributes() { COORD coordScreen = { 0, 0 }; DWORD cCharsWritten; DWORD dwConSize; printf("Press enter to quit.\n"); fflush(stdout); getchar(); if ( savecsbi_valid ) { dwConSize = savecsbi.dwSize.X * savecsbi.dwSize.Y; SetConsoleTextAttribute(hConsoleOutput, savecsbi.wAttributes); FillConsoleOutputAttribute( hConsoleOutput, savecsbi.wAttributes, dwConSize, coordScreen, &cCharsWritten ); } } /* phyRestoreConsoleAttributes */ void phyFillScreenColor() { COORD coordScreen = { 0, 0 }; DWORD cCharsWritten; CONSOLE_SCREEN_BUFFER_INFO csbi; /* to get buffer info */ DWORD dwConSize; if ( GetConsoleScreenBufferInfo( hConsoleOutput, &csbi ) ) { dwConSize = csbi.dwSize.X * csbi.dwSize.Y; FillConsoleOutputAttribute( hConsoleOutput, csbi.wAttributes, dwConSize, coordScreen, &cCharsWritten ); } } /* phyFillScreenColor */ void phyClearScreen() { COORD coordScreen = { 0, 0 }; /* here's where we'll home the cursor */ DWORD cCharsWritten; CONSOLE_SCREEN_BUFFER_INFO csbi; /* to get buffer info */ DWORD dwConSize; /* number of character cells in the current buffer */ /* get the number of character cells in the current buffer */ if ( GetConsoleScreenBufferInfo(hConsoleOutput, &csbi) ) { dwConSize = csbi.dwSize.X * csbi.dwSize.Y; /* fill the entire screen with blanks */ FillConsoleOutputCharacter( hConsoleOutput, (TCHAR) ' ', dwConSize, coordScreen, &cCharsWritten ); /* get the current text attribute */ GetConsoleScreenBufferInfo( hConsoleOutput, &csbi 
); /* now set the buffer's attributes accordingly */ FillConsoleOutputAttribute( hConsoleOutput, csbi.wAttributes, dwConSize, coordScreen, &cCharsWritten ); /* put the cursor at (0, 0) */ SetConsoleCursorPosition( hConsoleOutput, coordScreen ); } } /* phyClearScreen */
#endif /* WIN32 */
/* These functions are temporarily used for translating the fixed-width space-padded nayme array to an array of null-terminated char *. */ char **stringnames_new(void) { /* Copy nayme array to null-terminated strings and return array of char *. Trailing spaces are stripped from each name. Returned array size is spp+1; last element is NULL. */ char **names; char *ch; long len, i; names = (char **)Malloc((spp+1) * sizeof(char *)); for ( i = 0; i < spp; i++ ) { len = strlen(nayme[i]); names[i] = (char *)Malloc((MAXNCH+1) * sizeof(char)); strncpy(names[i], nayme[i], MAXNCH); names[i][MAXNCH] = '\0'; /* Strip trailing spaces */ for ( ch = names[i] + MAXNCH - 1; *ch == ' ' || *ch == '\0'; ch-- ) *ch = '\0'; } names[spp] = NULL; return names; } void stringnames_delete(char **names) { /* Free a string array returned by stringnames_new() */ long i; assert( names != NULL ); for ( i = 0; i < spp; i++ ) { assert( names[i] != NULL ); free(names[i]); } free(names); } int fieldwidth_double(double val, unsigned int precision) { /* Printf a double to a temporary buffer with specified precision using %f and return its length. Precision must not be greater than 999,999 */ char format[10]; char buf[0x200]; /* TODO: What's the largest possible? */ if (precision > 999999) abort(); sprintf(format, "%%.%uf", precision); /* %.Nf */ /* snprintf() would be better, but is it available on all systems? */ return sprintf(buf, format, val); } void output_matrix_d(FILE *fp, double **matrix, unsigned long rows, unsigned long cols, char **row_head, char **col_head, int flags) { /* Print a matrix of double to file. 
Headings are given in row_head and col_head, either of which may be NULL to indicate that headings should not be printed. Otherwise, they must be null-terminated arrays of pointers to null-terminated character arrays. The macro OUTPUT_PRECISION defines the number of significant figures to print, and OUTPUT_TEXTWIDTH defines the maximum length of each line. Optional formatting is specified by flags argument, using macros MAT_* defined in phylip.h. */ unsigned *colwidth; /* [0..spp-1] min width of each column */ unsigned headwidth; /* min width of row header column */ unsigned long linelen; /* length of current printed line */ unsigned fw; unsigned long row, col; unsigned long i; unsigned long cstart, cend; unsigned long textwidth = OUTPUT_TEXTWIDTH; const unsigned int gutter = 1; boolean do_block; boolean lower_triangle; boolean border; boolean output_cols; boolean pad_row_head; if ( flags & MAT_NOHEAD ) col_head = NULL; if ( flags & MAT_NOBREAK ) textwidth = 0; do_block = (flags & MAT_BLOCK) && (textwidth > 0); lower_triangle = flags & MAT_LOWER; border = flags & MAT_BORDER; output_cols = flags & MAT_PCOLS; pad_row_head = flags & MAT_PADHEAD; /* Determine minimal width for row headers, if given */ headwidth = 0; if ( row_head != NULL ) { for (row = 0; row < rows; row++) { fw = strlen(row_head[row]); if ( headwidth < fw ) headwidth = fw; } } /* Enforce minimum of 10 ch for machine-readable output */ if ( (pad_row_head) && (headwidth < 10) ) headwidth = 10; /* Determine minimal width for each matrix col */ colwidth = (unsigned int *)Malloc(spp * sizeof(int)); for (col = 0; col < cols; col++) { if ( col_head != NULL ) colwidth[col] = strlen(col_head[col]); else colwidth[col] = 0; for (row = 0; row < rows; row++) { fw = fieldwidth_double(matrix[row][col], PRECISION); if ( colwidth[col] < fw ) colwidth[col] = fw; } } /*** Print the matrix ***/ /* Number of columns if requested */ if ( output_cols ) { fprintf(fp, "%5lu\n", cols); } /* Omit last column 
for lower triangle */ if ( lower_triangle ) cols--; /* Blocks */ cstart = cend = 0; while ( cend != cols ) { if ( do_block ) { linelen = headwidth; for ( col = cstart; col < cols; col++ ) { if ( linelen + colwidth[col] + gutter > textwidth ) { break; } linelen += colwidth[col] + gutter; } cend = col; /* Always print at least one, regardless of line len */ if ( cend == cstart ) cend++; } else { cend = cols; } /* Column headers */ if ( col_head != NULL ) { /* corner space */ for ( i = 0; i < headwidth; i++ ) putc(' ', fp); if ( border ) { for ( i = 0; i < gutter+1; i++ ) putc(' ', fp); } /* Names */ for ( col = cstart; col < cend; col++ ) { for ( i = 0; i < gutter; i++ ) putc(' ', fp); /* right justify */ fw = strlen(col_head[col]); for ( i = 0; i < colwidth[col] - fw; i++ ) putc(' ', fp); fputs(col_head[col], fp); } putc('\n', fp); } /* Top border */ if ( border ) { for ( i = 0; i < headwidth + gutter; i++ ) putc(' ', fp); putc('\\', fp); for ( col = cstart; col < cend; col++ ) { for ( i = 0; i < colwidth[col] + gutter; i++ ) putc('-', fp); } putc('\n', fp); } /* Rows */ for (row = 0; row < rows; row++) { /* Row header, if given */ if ( row_head != NULL ) { /* right-justify for non-machine-readable */ if ( !pad_row_head ) { for ( i = strlen(row_head[row]); i < headwidth; i++ ) putc(' ', fp); } fputs(row_head[row], fp); /* left-justify for machine-readable */ if ( pad_row_head ) { for ( i = strlen(row_head[row]); i < headwidth; i++ ) putc(' ', fp); } } linelen = headwidth; /* Left border */ if ( border ) { for ( i = 0; i < gutter; i++ ) putc(' ', fp); putc('|', fp); linelen += 2; } /* Row data */ for (col = cstart; col < cend; col++) { /* cols */ /* Stop after col == row for lower triangle */ if ( lower_triangle && col >= row ) break; /* Break line if going over max text width */ if ( !do_block && textwidth > 0 ) { if ( linelen + colwidth[col] > textwidth ) { putc('\n', fp); linelen = 0; } linelen += colwidth[col] + gutter; } for ( i = 0; i < gutter; i++ ) putc(' ', 
fp); /* Print the datum */ fprintf(fp, "%*.6f", colwidth[col], matrix[row][col]); } putc('\n', fp); } /* End of row */ if (col_head != NULL) putc('\n', fp); /* blank line */ cstart = cend; } /* End of block */ free(colwidth); } /* output_matrix_d */
/* phylip-3.697/src/phylip.h */
#ifndef _PHYLIP_H_
#define _PHYLIP_H_
/* version 3.697. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, Andrew Keeffe, Mike Palczewski, Doug Buxton and Dan Fineman. Copyright (c) 1980-2017, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ #define VERSION "3.697" /* Debugging options */ /* Define this to disable assertions */ #define NDEBUG /* Define this to enable debugging code */ /* #define DEBUG */ /* machine-specific stuff: based on a number of factors in the library stdlib.h, we will try to determine what kind of machine/compiler this program is being built on. However, it doesn't always succeed. However, if you have ANSI conforming C, it will probably work. We will try to figure out machine type based on defines in stdio, and compiler-defined things as well.: */ #include #include #ifdef WIN32 #include void phyClearScreen(void); void phySaveConsoleAttributes(void); void phySetConsoleAttributes(void); void phyRestoreConsoleAttributes(void); void phyFillScreenColor(void); #endif #ifdef GNUDOS #define DJGPP #define DOS #endif #ifdef THINK_C #define MAC #endif #ifdef __MWERKS__ #ifndef WIN32 #define MAC #endif #endif #ifdef __CMS_OPEN #define CMS #define EBCDIC true #define INFILE "infile data" #define OUTFILE "outfile data" #define FONTFILE "fontfile data" #define PLOTFILE "plotfile data" #define INTREE "intree data" #define INTREE2 "intree data 2" #define OUTTREE "outtree data" #define CATFILE "categories data" #define WEIGHTFILE "weights data" #define ANCFILE "ancestors data" #define MIXFILE "mixture data" #define FACTFILE "factors data" #else #define EBCDIC false #define INFILE "infile" #define OUTFILE "outfile" #define FONTFILE "fontfile" /* on unix this might be /usr/local/lib/fontfile */ #define PLOTFILE "plotfile" #define INTREE "intree" #define INTREE2 "intree2" #define OUTTREE "outtree" #define CATFILE "categories" #define WEIGHTFILE "weights" #define ANCFILE "ancestors" #define MIXFILE "mixture" #define FACTFILE "factors" #endif #ifdef L_ctermid /* try and detect for sysV or V7. 
*/ #define SYSTEM_FIVE #endif #ifdef sequent #define SYSTEM_FIVE #endif #ifndef SYSTEM_FIVE #include # if defined(_STDLIB_H_) || defined(_H_STDLIB) || defined(H_SCCSID) || defined(unix) # define UNIX # define MACHINE_TYPE "BSD Unix C" # endif #endif #ifdef __STDIO_LOADED #define VMS #define MACHINE_TYPE "VAX/VMS C" #endif #ifdef __WATCOMC__ #define QUICKC #define WATCOM #define DOS #include "graph.h" #endif /* watcom-c has graphics library calls that are almost identical to * * quick-c, so the "QUICKC" symbol name stays. */ #ifdef _QC #define MACHINE_TYPE "MS-DOS / Quick C" #define QUICKC #include "graph.h" #define DOS #endif #ifdef _DOS_MODE #define MACHINE_TYPE "MS-DOS /Microsoft C " #define DOS /* DOS is always defined if on a DOS machine */ #define MSC /* MSC is defined for microsoft C */ #endif #ifdef __MSDOS__ /* TURBO c compiler, ONLY (no other DOS C compilers) */ #define DOS #define TURBOC #include #include #endif #ifdef DJGPP /* DJ Delorie's original gnu C/C++ port */ #include #endif #ifndef MACHINE_TYPE #define MACHINE_TYPE "ANSI C" #endif #ifdef DOS #define MALLOCRETURN void #else #define MALLOCRETURN void #endif #ifdef VMS #define signed /* signed doesn't exist in VMS */ #endif /* default screen types */ /* if on a DOS but not a Windows system can use IBM PC screen controls */ #ifdef DOS #ifndef WIN32 #define IBMCRT true #define ANSICRT false #endif #endif /* if on a Mac cannot use screen controls */ #ifdef MAC #define IBMCRT false #define ANSICRT false #endif /* if on a Windows system can use IBM PC screen controls */ #ifdef WIN32 #define IBMCRT true #define ANSICRT false #endif /* otherwise, let's assume we are on a Linux or Unix system with ANSI terminal controls */ #ifndef MAC #ifndef DOS #ifndef WIN32 #define IBMCRT false #define ANSICRT true #endif #endif #endif #ifdef DJGPP #undef MALLOCRETURN #define MALLOCRETURN void #endif /* includes: */ #ifdef UNIX #include #else #include #endif #include #include #include /* directory delimiters */ #ifdef 
MAC #define DELIMITER ':' #else #ifdef WIN32 #define DELIMITER '\\' #else #define DELIMITER '/' #endif #endif #define FClose(file) if (file) fclose(file) ; file=NULL #define Malloc(x) mymalloc((long)x) typedef void *Anyptr; #define Signed signed #define Const const #define Volatile volatile #define Char char /* Characters (not bytes) */ #define Static static /* Private global funcs and vars */ #define Local static /* Nested functions */ typedef unsigned char boolean; #define true 1 #define false 0 /* Number of items per machine word in set. * Used in consensus programs and clique */ #define SETBITS 31 MALLOCRETURN *mymalloc(long); /*** UI behavior ***/ /* Set to 1 to not ask before overwriting files */ #define OVERWRITE_FILES 0 /*** Static memory parameters ***/ #define FNMLNGTH 200 /* length of array to store a file name */ #define nmlngth 10 /* number of characters in species name */ #define MAXNCH 20 /* must be greater than or equal to nmlngth */ #define maxcategs 9 /* maximum number of site types */ #define maxcategs2 11 /* maximum number of site types + 2 */ #define point "." #define pointe '.' #define down 2 #define MAXSHIMOTREES 100 /*** Maximum likelihood parameters ***/ /* Used in proml, promlk, dnaml, dnamlk, etc. */ #define UNDEFINED 1.0 /* undefined or invalid likelihood */ #define smoothings 4 /* number of passes through smoothing algorithm */ #define iterations 8 /* number of iterates for each branch */ #define epsilon 0.0001 /* small number used in makenewv */ #define EPSILON 0.00001 /* small number used in hermite root-finding */ #define initialv 0.1 /* starting branch length unless otherwise */ #define INSERT_MIN_TYME 0.0001 /* Minimum tyme between nodes during inserts */ #define over 60 /* maximum width all branches of tree on screen */ #define LIKE_EPSILON 1e-10 /* Estimate of round-off error in likelihood * calculations. 
*/ /*** Math constants ***/ #define SQRTPI 1.7724538509055160273 #define SQRT2 1.4142135623730950488 /*** Rearrangement parameters ***/ #define NLRSAVES 5 /* number of views that need to be saved during local * * rearrangement */ /*** Output options ***/ /* Number of significant figures to display in numeric output */ #define PRECISION 6 /* Maximum line length of matrix output - 0 for unlimited */ #define OUTPUT_TEXTWIDTH 78 /** output_matrix() flags **/ /* Block output: Matrices are vertically split into blocks that * fit within OUTPUT_TEXTWIDTH columns */ #define MAT_BLOCK 0x1 /* Lower triangle: Values on or above the diagonal are not printed */ #define MAT_LOWER 0x2 /* Print a border between headings and data */ #define MAT_BORDER 0x4 /* Do not print the column header */ #define MAT_NOHEAD 0x8 /* Output the number of columns before the matrix */ #define MAT_PCOLS 0x10 /* Do not enforce maximum line width */ #define MAT_NOBREAK 0x20 /* Pad row header with spaces to 10 char */ #define MAT_PADHEAD 0x40 /* Human-readable format. */ #define MAT_HUMAN MAT_BLOCK /* Machine-readable format. */ #define MAT_MACHINE (MAT_PCOLS | MAT_NOHEAD | MAT_PADHEAD) /* Lower-triangular format. */ #define MAT_LOWERTRI (MAT_LOWER | MAT_MACHINE) boolean javarun; typedef long *steptr; typedef long longer[6]; typedef char naym[MAXNCH]; typedef long *bitptr; typedef double raterootarray[maxcategs2][maxcategs2]; typedef struct bestelm { long *btree; boolean gloreange; boolean locreange; boolean collapse; } bestelm; extern FILE *infile, *outfile, *intree, *intree2, *outtree, *weightfile, *catfile, *ancfile, *mixfile, *factfile; extern long spp, words, bits; extern boolean ibmpc, ansi, tranvsp; extern naym *nayme; /* names of species */ boolean firstplotblock; // for debugging BMP output #define ebcdic EBCDIC typedef Char plotstring[MAXNCH]; /* Approx. 
1GB, used to test for memory request errors */ #define TOO_MUCH_MEMORY 1000000000 /* The below pre-processor commands define the type used to store group arrays. We can't use #elif for metrowerks, so we use cascaded if statements */ #include /* minimum double we feel safe with, anything less will be considered underflow */ #define MIN_DOUBLE 10e-100 /* K&R says that there should be a plus in front of the number, but no machine we've seen actually uses one; we'll include it just in case. */ #define MAX_32BITS 2147483647 #define MAX_32BITS_PLUS +2147483647 /* If ints are 4 bytes, use them */ #if INT_MAX == MAX_32BITS typedef int group_type; #else #if INT_MAX == MAX_32BITS_PLUS typedef int group_type; #else /* Else, if longs are 4 bytes, use them */ #if LONG_MAX == MAX_32BITS typedef long group_type; #else #if LONG_MAX == MAX_32BITS_PLUS typedef long group_type; /* Default to longs */ #else typedef long group_type; #endif #endif #endif #endif /* for many programs */ #define maxuser 1000 /* maximum number of user-defined trees */ typedef Char **sequence; typedef enum { A, C, G, T, O } bases; typedef enum { alanine, arginine, asparagine, aspartic, cysteine, glutamine, glutamic, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, valine } acids; /* for Pars */ typedef enum { zero = 0, one, two, three, four, five, six, seven } discbases; /* for Protpars */ typedef enum { ala, arg, asn, asp, cys, gln, glu, gly, his, ileu, leu, lys, met, phe, pro, ser1, ser2, thr, trp, tyr, val, del, stop, asx, glx, ser, unk, quest } aas; typedef double sitelike[(long)T - (long)A + 1]; /* used in dnaml, dnadist */ typedef double psitelike[(long)valine - (long)alanine + 1]; /* used in proml */ typedef long *baseptr; /* baseptr used in dnapars, dnacomp & dnapenny */ typedef long *baseptr2; /* baseptr used in dnamove */ typedef unsigned char *discbaseptr; /* discbaseptr used in pars */ typedef sitelike *ratelike; /* used 
in dnaml ... */ typedef psitelike *pratelike; /* used in proml */ typedef ratelike *phenotype; /* phenotype used in dnaml, dnamlk, dnadist */ typedef pratelike *pphenotype; /* phenotype used in proml */ typedef double *sitelike2; typedef sitelike2 *phenotype2; /* phenotype2 used in restml */ typedef double *phenotype3; /* for continuous char programs */ typedef double *vector; /* used in distance programs */ typedef long nucarray[(long)O - (long)A + 1]; typedef long discnucarray[(long)seven - (long)zero + 1]; typedef enum { nocollap, tocollap, undefined } collapstates; typedef enum { bottom, nonbottom, hslength, tip, iter, length, hsnolength, treewt, unittrwt } initops; typedef double **transmatrix; typedef transmatrix *transptr; /* transptr used in restml */ typedef long sitearray[3]; typedef sitearray *seqptr; /* seqptr used in protpars */ typedef struct node { struct node *next, *back; plotstring nayme; long naymlength, tipsabove, index; double times_in_tree; /* Previously known as cons_index */ double xcoord, ycoord; long long_xcoord, long_ycoord; /* for use in cons. 
*/ double oldlen, length, r, theta, oldtheta, width, depth, tipdist, lefttheta, righttheta; group_type *nodeset; /* used by accumulate -plc */ long ymin, ymax; /* used by printree -plc */ boolean haslength; /* haslength used in dnamlk */ boolean iter; /* iter used in dnaml, fitch & restml */ boolean initialized; /* initialized used in dnamlk & restml */ long branchnum; /* branchnum used in restml */ phenotype x; /* x used in dnaml, dnamlk, dnadist */ phenotype2 x2; /* x2 used in restml */ phenotype3 view; /* contml etc */ pphenotype protx; /* protx used in proml */ aas *seq; /* the sequence used in protpars */ seqptr siteset; /* temporary storage for aa's used in protpars*/ double v, deltav, ssq; /* ssq used only in contrast */ double bigv; /* bigv used in contml */ double tyme, oldtyme; /* used in dnamlk */ double t; /* time in kitsch */ boolean sametime; /* bookkeeps scrunched nodes in kitsch */ double weight; /* weight of node used by scrunch in kitsch */ boolean processed; /* used by evaluate in kitsch */ boolean deleted; /* true if node is deleted (retree) */ boolean hasname; /* true if tip has a name (retree) */ double beyond; /* distance beyond this node to most distant tip */ /* (retree) */ boolean deadend; /* true if no undeleted nodes beyond this node */ /* (retree) */ boolean onebranch; /* true if there is one undeleted node beyond */ /* this node (retree) */ struct node *onebranchnode; /* if there is, a pointer to that node (retree)*/ double onebranchlength; /* if there is, the distance from here to there*/ /* (retree) */ boolean onebranchhaslength; /* true if there is a valid combined length*/ /* from here to there (retree) */ collapstates collapse; /* used in dnapars & dnacomp */ boolean tip; boolean bottom; /* used in dnapars & dnacomp, disc char */ boolean visited; /* used in dnapars & dnacomp disc char */ baseptr base; /* the sequence in dnapars/comp/penny */ discbaseptr discbase; /* the sequence in pars */ baseptr2 base2; /* the sequence in 
dnamove */ baseptr oldbase; /* record previous sequence */ discbaseptr olddiscbase; /* record previous sequence */ long numdesc; /* number of immediate descendants */ nucarray *numnuc; /* bookkeeps number of nucleotides */ discnucarray *discnumnuc; /* bookkeeps number of nucleotides */ steptr numsteps; /* bookkeeps steps */ steptr oldnumsteps; /* record previous steps */ double sumsteps; /* bookkeeps sum of steps */ nucarray cumlengths; /* bookkeeps cummulative minimum lengths */ discnucarray disccumlengths; /* bookkeeps cummulative minimum lengths */ nucarray numreconst; /* bookkeeps number of reconstructions */ discnucarray discnumreconst; /* bookkeeps number of reconstructions */ vector d, w; /* for distance matrix programs */ double dist; /* dist used in fitch */ bitptr stateone, statezero; /* discrete char programs */ long maxpos; /* maxpos used in Clique */ Char state; /* state used in Dnamove, Dolmove & Move */ double* underflows; /* used to record underflow */ } node; typedef node **pointarray; /*** tree structure ***/ typedef struct tree { /* An array of pointers to nodes. Each tip node and ring of nodes has a * unique index starting from one. The nodep array contains pointers to each * one, starting from 0. In the case of internal nodes, the entries in nodep * point to the rootward node in the group. Since the trees are otherwise * entirely symmetrical, except at the root, this is the only way to resolve * parent, child, and sibling relationships. * * Indices in range [0, spp) point to tips, while indices [spp, nonodes) * point to fork nodes */ pointarray nodep; /* A pointer to the first node. Typically, root is used when the tree is rooted, * and points to an internal node with no back link. */ node *root; /* start is used when trees are unrooted. It points to an internal node whose * back link typically points to the outgroup leaf. 
*/ node *start; /* In maximum likelihood programs, the most recent evaluation is stored here */ double likelihood; /* Branch transition matrices for restml */ transptr trans; /* all transition matrices */ long *freetrans; /* an array of indexes of free matrices */ long transindex; /* index of last valid entry in freetrans[] */ } tree; typedef void (*initptr)(node **, node **, node *, long, long, long *, long *, initops, pointarray, pointarray, Char *, Char *, FILE *); #ifndef OLDC /* function prototypes */ void scan_eoln(FILE *); boolean eoff(FILE *); boolean eoln(FILE *); int filexists(char *); const char* get_command_name (const char *); void EOF_error(void); void getstryng(char *); void openfile(FILE **,const char *,const char *,const char *,const char *, char *); void cleerhome(void); void loopcount(long *, long); double randum(longer); void randumize(longer, long *); double normrand(longer); long readlong(const char *); void uppercase(Char *); void initseed(long *, long *, longer); void initjumble(long *, long *, longer, long *); void initoutgroup(long *, long); void initthreshold(double *); void initcatn(long *); void initcategs(long, double *); void initprobcat(long, double *, double *); double logfac (long); double halfroot(double (*func)(long , double), long, double, double); double hermite(long, double); void initlaguerrecat(long, double, double *, double *); void root_hermite(long, double *); void hermite_weight(long, double *, double *); void inithermitcat(long, double, double *, double *); void lgr(long, double, raterootarray); double glaguerre(long, double, double); void initgammacat(long, double, double *, double *); void inithowmany(long *, long); void inithowoften(long *); void initlambda(double *); void initfreqs(double *, double *, double *, double *); void initratio(double *); void initpower(double *); void initdatasets(long *); void justweights(long *); void initterminal(boolean *, boolean *); void initnumlines(long *); void 
initbestrees(bestelm *, long, boolean); void newline(FILE *, long, long, long); void inputnumbers(long *, long *, long *, long); void inputnumbersold(long *, long *, long *, long); void inputnumbers2(long *, long *, long n); void inputnumbers3(long *, long *); void samenumsp(long *, long); void samenumsp2(long); void readoptions(long *, const char *); void matchoptions(Char *, const char *); void inputweights(long, steptr, boolean *); void inputweightsold(long, steptr, boolean *); void inputweights2(long, long, long *, steptr, boolean *, const char *); void printweights(FILE *, long, long, steptr, const char *); void inputcategs(long, long, steptr, long, const char *); void printcategs(FILE *, long, steptr, const char *); void inputfactors(long, Char *, boolean *); void inputfactorsnew(long, Char *, boolean *); void printfactors(FILE *, long, Char *, const char *); void headings(long, const char *, const char *); void initname(long); void findtree(boolean *,long *,long,long *,bestelm *); void addtree(long,long *,boolean,long *,bestelm *); long findunrearranged(bestelm *, long, boolean); boolean torearrange(bestelm *, long); void reducebestrees(bestelm *, long *); void shellsort(double *, long *, long); void getch(Char *, long *, FILE *); void getch2(Char *, long *); void findch(Char, Char *, long); void findch2(Char, long *, long *, Char *); void findch3(Char, Char *, long, long); void processlength(double *,double *,Char *,boolean *,FILE *,long *); void writename(long, long, long *); void memerror(void); void odd_malloc(long); void gnu(node **, node **); void chuck(node **, node *); void zeronumnuc(node *, long); void zerodiscnumnuc(node *, long); void allocnontip(node *, long *, long); void allocdiscnontip(node *, long *, unsigned char *, long ); void allocnode(node **, long *, long); void allocdiscnode(node **, long *, unsigned char *, long ); void gnutreenode(node **, node **, long, long, long *); void gnudisctreenode(node **, node **, long , long, long *, 
unsigned char *); void setupnode(node *, long); node * pnode(tree *t, node *p); long count_sibs (node *); void inittrav (node *); void commentskipper(FILE ***, long *); long countcomma(FILE **, long *); long countsemic(FILE **); void hookup(node *, node *); void unhookup(node *, node *); void link_trees(long, long , long, pointarray); void allocate_nodep(pointarray *, FILE **, long *); void malloc_pheno(node *, long, long); void malloc_ppheno(node *, long, long); long take_name_from_tree (Char *, Char *, FILE *); void match_names_to_data (Char *, pointarray, node **, long); void addelement(node **, node *, Char *, long *, FILE *, pointarray, boolean *, boolean *, pointarray, long *, long *, boolean *, node **, initptr,boolean,long); void treeread (FILE *, node **, pointarray, boolean *, boolean *, pointarray, long *, boolean *, node **, initptr,boolean,long); void addelement2(node *, Char *, long *, FILE *, pointarray, boolean, double *, boolean *, long *, long *, long, boolean *,boolean, long); void treeread2 (FILE *, node **, pointarray, boolean, double *, boolean *, boolean *, long *,boolean,long); void exxit (int); void countup(long *loopcount, long maxcount); char gettc(FILE* file); void unroot_r(node* p,node ** nodep, long nonodes); void unroot(tree* t,long nonodes); void unroot_here(node* root, node** nodep, long nonodes); void clear_connections(tree *t, long nonodes); void init(int argc, char** argv); char **stringnames_new(void); void stringnames_delete(char **names); int fieldwidth_double(double val, unsigned int precision); void output_matrix_d(FILE *fp, double **matrix, unsigned long rows, unsigned long cols, char **row_head, char **col_head, int flags); void debugtree (tree *, FILE *); void debugtree2 (pointarray, long, FILE *); #endif /* OLDC */ #endif /* _PHYLIP_H_ */ phylip-3.697/src/phylip.html0000644004732000473200000001664712406201117015537 0ustar joefelsenst_g phylip

v3.6

PHYLIP programs and documentation

PHYLIP, the PHYLogeny Inference Package, consists of 35 programs. Each program has its own documentation file, in the form of a web page in HTML 3.2. There are also documentation web pages for each group of programs, and a main documentation file that is the basic introduction to the package. You should read the main documentation file before running any of the programs.

Below you will find a list of the programs and their documentation files. The name of each documentation file is a link that will take you to that file.

Introduction to PHYLIP

main documentation file

Molecular sequence methods

molecular sequence programs documentation file
protpars  protein parsimony documentation file
dnapars   DNA sequence parsimony documentation file
dnapenny  DNA parsimony branch and bound documentation file
dnamove   interactive DNA parsimony documentation file
dnacomp   DNA compatibility documentation file
dnaml     DNA maximum likelihood documentation file
dnamlk    DNA maximum likelihood with clock documentation file
proml     Protein sequence maximum likelihood documentation file
promlk    Protein sequence maximum likelihood with clock documentation file
dnainvar  DNA invariants documentation file
dnadist   DNA distance documentation file
protdist  Protein sequence distance documentation file
restdist  Restriction sites and fragments distances documentation file
restml    Restriction sites maximum likelihood documentation file
seqboot   Bootstrapping/Jackknifing documentation file

Distance matrix methods

Distance matrix programs documentation file
fitch     Fitch-Margoliash distance matrix method documentation file
kitsch    Fitch-Margoliash distance matrix with clock documentation file
neighbor  Neighbor-Joining and UPGMA method documentation file

Gene frequencies and continuous characters

Continuous characters and gene frequencies documentation file
contml    Maximum likelihood continuous characters and gene frequencies documentation file
contrast  Contrast method documentation file
gendist   Genetic distance documentation file

Discrete characters methods

Discrete characters methods documentation file
pars      Unordered multistate parsimony documentation file
mix       Mixed method parsimony documentation file
penny     Branch and bound mixed method parsimony documentation file
move      Interactive mixed method parsimony documentation file
dollop    Dollo and polymorphism parsimony documentation file
dolpenny  Dollo and polymorphism branch and bound parsimony documentation file
dolmove   Dollo and polymorphism interactive parsimony documentation file
clique    0/1 characters compatibility method documentation file
factor    Character recoding program documentation file

Tree drawing, consensus, tree editing, tree distances

Tree drawing programs documentation file
drawgram  Rooted tree drawing program documentation file
drawtree  Unrooted tree drawing program documentation file
  
consense  Consensus tree program documentation file
treedist  Tree distance program documentation file
retree    interactive tree rearrangement program documentation file

phylip-3.697/src/printree.c0000644004732000473200000001042112406201117015320 0ustar joefelsenst_g#include "printree.h" static void mlk_drawline(FILE *fp, tree *t, long i, double scale) { /* draws one row of the tree diagram by moving up tree */ node *p, *q, *r, *first =NULL, *last =NULL; long n, j; boolean extra, done; p = t->root; q = t->root; extra = false; if ((long)(p->ycoord) == i) { if (p->index - spp >= 10) fprintf(fp, "-%2ld", p->index - spp); else fprintf(fp, "--%ld", p->index - spp); extra = true; } else fprintf(fp, " "); do { if (!p->tip) { r = p->next; done = false; do { if (i >= r->back->ymin && i <= r->back->ymax) { q = r->back; done = true; } r = r->next; } while (!(done || r == p)); first = p->next->back; r = p->next; while (r->next != p) r = r->next; last = r->back; } done = (p == q); n = (long)(scale * ((long)(p->xcoord) - (long)(q->xcoord)) + 0.5); if (n < 3 && !q->tip) n = 3; if (extra) { n--; extra = false; } if ((long)(q->ycoord) == i && !done) { if (p->ycoord != q->ycoord) putc('+', fp); else putc('-', fp); if (!q->tip) { for (j = 1; j <= n - 2; j++) putc('-', fp); if (q->index - spp >= 10) fprintf(fp, "%2ld", q->index - spp); else fprintf(fp, "-%ld", q->index - spp); extra = true; } else { for (j = 1; j < n; j++) putc('-', fp); } } else if (!p->tip) { if ((long)(last->ycoord) > i && (long)(first->ycoord) < i && i != (long)(p->ycoord)) { putc('!', fp); for (j = 1; j < n; j++) putc(' ', fp); } else { for (j = 1; j <= n; j++) putc(' ', fp); } } else { for (j = 1; j <= n; j++) putc(' ', fp); } if (p != q) p = q; } while (!done); if ((long)(p->ycoord) == i && p->tip) { for (j = 0; j < nmlngth; j++) putc(nayme[p->index - 1][j], fp); } putc('\n', fp); } /* mlk_drawline */ static void mlk_coordinates(node *p, long *tipy) { /* establishes coordinates of nodes */ node *q, *first, *last, *pp1 =NULL, *pp2 =NULL; long num_sibs, p1, p2, i; if (p->tip) { p->xcoord = 0; p->ycoord = (*tipy); p->ymin = (*tipy); p->ymax = (*tipy); (*tipy) += down; return; } q 
= p->next; do { mlk_coordinates(q->back, tipy); q = q->next; } while (p != q); num_sibs = count_sibs(p); p1 = (long)((num_sibs+1)/2.0); p2 = (long)((num_sibs+2)/2.0); i = 1; q = p->next; first = q->back; do { if (i == p1) pp1 = q->back; if (i == p2) pp2 = q->back; last = q->back; q = q->next; i++; } while (q != p); p->xcoord = (long)(0.5 - over * p->tyme); p->ycoord = (pp1->ycoord + pp2->ycoord) / 2; p->ymin = first->ymin; p->ymax = last->ymax; } /* mlk_coordinates */ void mlk_printree(FILE *fp, tree *t) { /* prints out diagram of the tree */ long tipy; double scale; long i; node *p; assert(fp != NULL); putc('\n', fp); tipy = 1; mlk_coordinates(t->root, &tipy); p = t->root; while (!p->tip) p = p->next->back; scale = 1.0 / (long)(p->tyme - t->root->tyme + 1.000); putc('\n', fp); for (i = 1; i <= tipy - down; i++) mlk_drawline(fp, t, i, scale); putc('\n', fp); } /* dnamlk_printree */ static void describe_r(FILE *fp, tree *t, node *p, double fracchange) { long i, num_sibs; node *sib_ptr, *sib_back_ptr; double v; if (p == t->root) fprintf(fp, " root "); else fprintf(fp, "%4ld ", p->back->index - spp); if (p->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[p->index - 1][i], fp); } else fprintf(fp, "%4ld ", p->index - spp); if (p != t->root) { fprintf(fp, "%11.5f", fracchange * (p->tyme - t->root->tyme)); v = fracchange * (p->tyme - t->nodep[p->back->index - 1]->tyme); fprintf(fp, "%13.5f", v); } putc('\n', fp); if (!p->tip) { sib_ptr = p; num_sibs = count_sibs(p); for (i=0 ; i < num_sibs; i++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; describe_r(fp, t, sib_back_ptr, fracchange); } } } /* describe */ void mlk_describe(FILE *fp, tree *t, double fracchange) { describe_r(fp, t, t->root, fracchange); } phylip-3.697/src/printree.h0000644004732000473200000000022612406201117015327 0ustar joefelsenst_g#include #include "phylip.h" extern void mlk_printree(FILE *fp, tree *t); extern void mlk_describe(FILE *fp, tree *t, double fracchange); 
phylip-3.697/src/proml.c0000644004732000473200000031450012406201117014626 0ustar joefelsenst_g#include "phylip.h" #include "seq.h" /* version 3.696. Written by Joseph Felsenstein, Lucas Mix, Akiko Fuseki, Sean Lamont, Andrew Keeffe, Dan Fineman, and Patrick Colacurcio. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ typedef long vall[maxcategs]; typedef double contribarr[maxcategs]; #ifndef OLDC /* function prototypes */ void init_protmats(void); void getoptions(void); void makeprotfreqs(void); void allocrest(void); void doinit(void); void inputoptions(void); void input_protdata(long); void makeweights(void); void prot_makevalues(long, pointarray, long, long, sequence, steptr); void prot_inittable(void); void alloc_pmatrix(long); void getinput(void); void inittravtree(node *); void prot_nuview(node *); void prot_slopecurv(node *, double, double *, double *, double *); void makenewv(node *); void update(node *); void smooth(node *); void make_pmatrix(double **, double **, double **, long, double, double, double *, double **); double prot_evaluate(node *, boolean); void treevaluate(void); void promlcopy(tree *, tree *, long, long); void proml_re_move(node **, node **); void insert_(node *, node *, boolean); void addtraverse(node *, node *, boolean); void rearrange(node *, node *); void proml_coordinates(node *, double, long *, double *); void proml_printree(void); void sigma(node *, double *, double *, double *); void describe(node *); void prot_reconstr(node *, long); void rectrav(node *, long, long); void summarize(void); void initpromlnode(node **, node **, node *, long, long, long *, long *, initops, pointarray, pointarray, Char *, Char *, FILE *); void dnaml_treeout(node *); void buildnewtip(long, tree *); void buildsimpletree(tree *); void free_all_protx (long, pointarray); void maketree(void); void clean_up(void); void globrearrange(void); void proml_unroot(node* root, node** nodep, long nonodes) ; void reallocsites(void); void prot_freetable(void); void free_pmatrix(long sib); void alloclrsaves(void); void freelrsaves(void); void resetlrsaves(void); /* function prototypes */ #endif long rcategs; boolean haslengths; long oldendsite=0; Char infilename[100], outfilename[100], intreename[100], outtreename[100], catfilename[100], weightfilename[100]; double *rate, *rrate, 
*probcat; long nonodes2, sites, weightsum, categs, datasets, ith, njumble, jumb; long inseed, inseed0, parens; boolean global, jumble, weights, trout, usertree, inserting = false, ctgry, rctgry, auto_, hypstate, progress, mulsets, justwts, firstset, improve, smoothit, polishing, lngths, gama, invar, usepmb, usepam, usejtt; tree curtree, bestree, bestree2, priortree; node *qwhere, *grbg, *addwhere; double cv, alpha, lambda, invarfrac, bestyet; long *enterorder; steptr aliasweight; contribarr *contribution, like, nulike, clai; double **term, **slopeterm, **curveterm; longer seed; char *progname; char aachar[26]="ARNDCQEGHILKMFPSTWYVBZX?*-"; node **lrsaves; /* Local variables for maketree, propagated globally for c version: */ long k, nextsp, numtrees, maxwhich, mx, mx0, mx1, shimotrees; double dummy, maxlogl; boolean succeeded, smoothed; double **l0gf; double *l0gl; double **tbl; Char ch, ch2; long col; vall *mp; /* Variables introduced to allow for protein probability calculations */ long max_num_sibs; /* maximum number of siblings used in a */ /* nuview calculation. determines size */ /* final size of pmatrices */ double *eigmat; /* eig matrix variable */ double **probmat; /* prob matrix variable */ double ****dpmatrix; /* derivative of pmatrix */ double ****ddpmatrix; /* derivative of xpmatrix */ double *****pmatrices; /* matrix of probabilities of protien */ /* conversion. The 5 subscripts refer */ /* to sibs, rcategs, categs, final and */ /* initial states, respectively. 
*/ double freqaa[20]; /* amino acid frequencies */ /* this JTT matrix decomposition thanks to Elisabeth Tillier */ static double jtteigmat[] = {+0.00000000000000,-1.81721720738768,-1.87965834528616,-1.61403121885431, -1.53896608443751,-1.40486966367848,-1.30995061286931,-1.24668414819041, -1.17179756521289,-0.31033320987464,-0.34602837857034,-1.06031718484613, -0.99900602987105,-0.45576774888948,-0.86014403434677,-0.54569432735296, -0.76866956571861,-0.60593589295327,-0.65119724379348,-0.70249806480753}; static double jttprobmat[20][20] = {{+0.07686196156903,+0.05105697447152,+0.04254597872702,+0.05126897436552, +0.02027898986051,+0.04106097946952,+0.06181996909002,+0.07471396264303, +0.02298298850851,+0.05256897371552,+0.09111095444453,+0.05949797025102, +0.02341398829301,+0.04052997973502,+0.05053197473402,+0.06822496588753, +0.05851797074102,+0.01433599283201,+0.03230298384851,+0.06637396681302}, {-0.04445795120462,-0.01557336502860,-0.09314817363516,+0.04411372100382, -0.00511178725134,+0.00188472427522,-0.02176250428454,-0.01330231089224, +0.01004072641973,+0.02707838224285,-0.00785039050721,+0.02238829876349, +0.00257470703483,-0.00510311699563,-0.01727154263346,+0.20074235330882, -0.07236268502973,-0.00012690116016,-0.00215974664431,-0.01059243778174}, {+0.09480046389131,+0.00082658405814,+0.01530023104155,-0.00639909042723, +0.00160605602061,+0.00035896642912,+0.00199161318384,-0.00220482855717, -0.00112601328033,+0.14840201765438,-0.00344295714983,-0.00123976286718, -0.00439399942758,+0.00032478785709,-0.00104270266394,-0.02596605592109, -0.05645800566901,+0.00022319903170,-0.00022792271829,-0.16133258048606}, {-0.06924141195400,-0.01816245289173,-0.08104005811201,+0.08985697111009, +0.00279659017898,+0.01083740322821,-0.06449599336038,+0.01794514261221, +0.01036809141699,+0.04283504450449,+0.00634472273784,+0.02339134834111, -0.01748667848380,+0.00161859106290,+0.00622486432503,-0.05854130195643, 
+0.15083728660504,+0.00030733757661,-0.00143739522173,-0.05295810171941}, {-0.14637948915627,+0.02029296323583,+0.02615316895036,-0.10311538564943, -0.00183412744544,-0.02589124656591,+0.11073673851935,+0.00848581728407, +0.00106057791901,+0.05530240732939,-0.00031533506946,-0.03124002869407, -0.01533984125301,-0.00288717337278,+0.00272787410643,+0.06300929916280, +0.07920438311152,-0.00041335282410,-0.00011648873397,-0.03944076085434}, {-0.05558229086909,+0.08935293782491,+0.04869509588770,+0.04856877988810, -0.00253836047720,+0.07651693957635,-0.06342453535092,-0.00777376246014, -0.08570270266807,+0.01943016473512,-0.00599516526932,-0.09157595008575, -0.00397735155663,-0.00440093863690,-0.00232998056918,+0.02979967701162, -0.00477299485901,-0.00144011795333,+0.01795114942404,-0.00080059359232}, {+0.05807741644682,+0.14654292420341,-0.06724975334073,+0.02159062346633, -0.00339085518294,-0.06829036785575,+0.03520631903157,-0.02766062718318, +0.03485632707432,-0.02436836692465,-0.00397566003573,-0.10095488644404, +0.02456887654357,+0.00381764117077,-0.00906261340247,-0.01043058066362, +0.01651199513994,-0.00210417220821,-0.00872508520963,-0.01495915462580}, {+0.02564617106907,+0.02960554611436,-0.00052356748770,+0.00989267817318, -0.00044034172141,-0.02279910634723,-0.00363768356471,-0.01086345665971, +0.01229721799572,+0.02633650142592,+0.06282966783922,-0.00734486499924, -0.13863936313277,-0.00993891943390,-0.00655309682350,-0.00245191788287, -0.02431633805559,-0.00068554031525,-0.00121383858869,+0.06280025239509}, {+0.11362428251792,-0.02080375718488,-0.08802750967213,-0.06531316372189, -0.00166626058292,+0.06846081717224,+0.07007301248407,-0.01713112936632, -0.05900588794853,-0.04497159138485,+0.04222484636983,+0.00129043178508, -0.01550337251561,-0.01553102163852,-0.04363429852047,+0.01600063777880, +0.05787328925647,-0.00008265841118,+0.02870014572813,-0.02657681214523}, {+0.01840541226842,+0.00610159018805,+0.01368080422265,+0.02383751807012, 
-0.00923516894192,+0.01209943150832,+0.02906782189141,+0.01992384905334, +0.00197323568330,+0.00017531415423,-0.01796698381949,+0.01887083962858, -0.00063335886734,-0.02365277334702,+0.01209445088200,+0.01308086447947, +0.01286727242301,-0.11420358975688,-0.01886991700613,+0.00238338728588}, {-0.01100105031759,-0.04250695864938,-0.02554356700969,-0.05473632078607, +0.00725906469946,-0.03003724918191,-0.07051526125013,-0.06939439879112, -0.00285883056088,+0.05334304124753,+0.12839241846919,-0.05883473754222, +0.02424304967487,+0.09134510778469,-0.00226003347193,-0.01280041778462, -0.00207988305627,-0.02957493909199,+0.05290385686789,+0.05465710875015}, {-0.01421274522011,+0.02074863337778,-0.01006411985628,+0.03319995456446, -0.00005371699269,-0.12266046460835,+0.02419847062899,-0.00441168706583, -0.08299118738167,-0.00323230913482,+0.02954035119881,+0.09212856795583, +0.00718635627257,-0.02706936115539,+0.04473173279913,-0.01274357634785, -0.01395862740618,-0.00071538848681,+0.04767640012830,-0.00729728326990}, {-0.03797680968123,+0.01280286509478,-0.08614616553187,-0.01781049963160, +0.00674319990083,+0.04208667754694,+0.05991325707583,+0.03581015660092, -0.01529816709967,+0.06885987924922,-0.11719120476535,-0.00014333663810, +0.00074336784254,+0.02893416406249,+0.07466151360134,-0.08182016471377, -0.06581536577662,-0.00018195976501,+0.00167443595008,+0.09015415667825}, {+0.03577726799591,-0.02139253448219,-0.01137813538175,-0.01954939202830, -0.04028242801611,-0.01777500032351,-0.02106862264440,+0.00465199658293, -0.02824805812709,+0.06618860061778,+0.08437791757537,-0.02533125946051, +0.02806344654855,-0.06970805797879,+0.02328376968627,+0.00692992333282, +0.02751392122018,+0.01148722812804,-0.11130404325078,+0.07776346000559}, {-0.06014297925310,-0.00711674355952,-0.02424493472566,+0.00032464353156, +0.00321221847573,+0.03257969053884,+0.01072805771161,+0.06892027923996, +0.03326534127710,-0.01558838623875,+0.13794237677194,-0.04292623056646, 
+0.01375763233229,-0.11125153774789,+0.03510076081639,-0.04531670712549, -0.06170413486351,-0.00182023682123,+0.05979891871679,-0.02551802851059}, {-0.03515069991501,+0.02310847227710,+0.00474493548551,+0.02787717003457, -0.12038329679812,+0.03178473522077,+0.04445111601130,-0.05334957493090, +0.01290386678474,-0.00376064171612,+0.03996642737967,+0.04777677295520, +0.00233689200639,+0.03917715404594,-0.01755598277531,-0.03389088626433, -0.02180780263389,+0.00473402043911,+0.01964539477020,-0.01260807237680}, {-0.04120428254254,+0.00062717164978,-0.01688703578637,+0.01685776910152, +0.02102702093943,+0.01295781834163,+0.03541815979495,+0.03968150445315, -0.02073122710938,-0.06932247350110,+0.11696314241296,-0.00322523765776, -0.01280515661402,+0.08717664266126,+0.06297225078802,-0.01290501780488, -0.04693925076877,-0.00177653675449,-0.08407812137852,-0.08380714022487}, {+0.03138655228534,-0.09052573757196,+0.00874202219428,+0.06060593729292, -0.03426076652151,-0.04832468257386,+0.04735628794421,+0.14504653737383, -0.01709111334001,-0.00278794215381,-0.03513813820550,-0.11690294831883, -0.00836264902624,+0.03270980973180,-0.02587764129811,+0.01638786059073, +0.00485499822497,+0.00305477087025,+0.02295754527195,+0.00616929722958}, {-0.04898722042023,-0.01460879656586,+0.00508708857036,+0.07730497806331, +0.04252420017435,+0.00484232580349,+0.09861807969412,-0.05169447907187, -0.00917820907880,+0.03679081047330,+0.04998537112655,+0.00769330211980, +0.01805447683564,-0.00498723245027,-0.14148416183376,-0.05170281760262, -0.03230723310784,-0.00032890672639,-0.02363523071957,+0.03801365471627}, {-0.02047562162108,+0.06933781779590,-0.02101117884731,-0.06841945874842, -0.00860967572716,-0.00886650271590,-0.07185241332269,+0.16703684361030, -0.00635847581692,+0.00811478913823,+0.01847205842216,+0.06700967948643, +0.00596607376199,+0.02318239240593,-0.10552958537847,-0.01980199747773, -0.02003785382406,-0.00593392430159,-0.00965391033612,+0.00743094349652}}; /* this PMB 
matrix decomposition due to Elisabeth Tillier */ static double pmbeigmat[20] = {0.0000001586972220,-1.8416770496147100, -1.6025046986139100,-1.5801012515121300, -1.4987794099715900,-1.3520794233801900,-1.3003469390479700,-1.2439503327631300, -1.1962574080244200,-1.1383730501367500,-1.1153278910708000,-0.4934843510654760, -0.5419014550215590,-0.9657997830826700,-0.6276075673757390,-0.6675927795018510, -0.6932641383465870,-0.8897872681859630,-0.8382698977371710,-0.8074694642446040}; static double pmbprobmat[20][20] = {{0.0771762457248147,0.0531913844998640,0.0393445076407294,0.0466756566755510, 0.0286348361997465,0.0312327748383639,0.0505410248721427,0.0767106611472993, 0.0258916271688597,0.0673140562194124,0.0965705469252199,0.0515979465932174, 0.0250628079438675,0.0503492018628350,0.0399908189418273,0.0641898881894471, 0.0517539616710987,0.0143507440546115,0.0357994592438322,0.0736218495862984}, {0.0368263046116572,-0.0006728917107827,0.0008590805287740,-0.0002764255356960, 0.0020152937187455,0.0055743720652960,0.0003213317669367,0.0000449190281568, -0.0004226254397134,0.1805040629634510,-0.0272246813586204,0.0005904606533477, -0.0183743200073889,-0.0009194625608688,0.0008173657533167,-0.0262629806302238, 0.0265738757209787,0.0002176606241904,0.0021315644838566,-0.1823229927207580}, {-0.0194800075560895,0.0012068088610652,-0.0008803318319596,-0.0016044273960017, -0.0002938633803197,-0.0535796754602196,0.0155163896648621,-0.0015006360762140, 0.0021601372013703,0.0268513218744797,-0.1085292493742730,0.0149753083138452, 0.1346457366717310,-0.0009371698759829,0.0013501708044116,0.0346352293103622, -0.0276963770242276,0.0003643142783940,0.0002074817333067,-0.0174108903914110}, {0.0557839400850153,0.0023271577185437,0.0183481103396687,0.0023339480096311, 0.0002013267015151,-0.0227406863569852,0.0098644845475047,0.0064721276774396, 0.0001389408104210,-0.0473713878768274,-0.0086984445005797,0.0026913674934634, 
0.0283724052562196,0.0001063665179457,0.0027442574779383,-0.1875312134708470, 0.1279864877057640,0.0005103347834563,0.0003155113168637,0.0081451082759554}, {0.0037510125027265,0.0107095920636885,0.0147305410328404,-0.0112351252180332, -0.0001500408626446,-0.1523450933729730,0.0611532413339872,-0.0005496748939503, 0.0048714378736644,-0.0003826320053999,0.0552010244407311,0.0482555671001955, -0.0461664995115847,-0.0021165008617978,-0.0004574454232187,0.0233755883688949, -0.0035484915422384,0.0009090698422851,0.0013840637687758,-0.0073895139302231}, {-0.0111512564930024,0.1025460064723080,0.0396772456883791,-0.0298408501361294, -0.0001656742634733,-0.0079876311843289,0.0712644184507945,-0.0010780604625230, -0.0035880882043592,0.0021070399334252,0.0016716329894279,-0.1810123023850110, 0.0015141703608724,-0.0032700852781804,0.0035503782441679,0.0118634302028026, 0.0044561606458028,-0.0001576678495964,0.0023470722225751,-0.0027457045397157}, {0.1474525743949170,-0.0054432538500293,0.0853848892349828,-0.0137787746207348, -0.0008274830358513,0.0042248844582553,0.0019556229305563,-0.0164191435175148, -0.0024501858854849,0.0120908948084233,-0.0381456105972653,0.0101271614855119, -0.0061945941321859,0.0178841099895867,-0.0014577779202600,-0.0752120602555032, -0.1426985695849920,0.0002862275078983,-0.0081191734261838,0.0313401149422531}, {0.0542034611735289,-0.0078763926211829,0.0060433542506096,0.0033396210615510, 0.0013965072374079,0.0067798903832256,-0.0135291136622509,-0.0089982442731848, -0.0056744537593887,-0.0766524225176246,0.1881210263933930,-0.0065875518675173, 0.0416627569300375,-0.0953804133524747,-0.0012559228448735,0.0101622644292547, -0.0304742453119050,0.0011702318499737,0.0454733434783982,-0.1119239362388150}, {0.1069409037912470,0.0805064400880297,-0.1127352030714600,0.1001181253523260, -0.0021480427488769,-0.0332884841459003,-0.0679837575848452,-0.0043812841356657, 0.0153418716846395,-0.0079441315103188,-0.0121766182046363,-0.0381127991037620, 
-0.0036338726532673,0.0195324059593791,-0.0020165963699984,-0.0061222685010268, -0.0253761448771437,-0.0005246410999057,-0.0112205170502433,0.0052248485517237}, {-0.0325247648326262,0.0238753651653669,0.0203684886605797,0.0295666232678825, -0.0003946714764213,-0.0157242718469554,-0.0511737848084862,0.0084725632040180, -0.0167068828528921,0.0686962159427527,-0.0659702890616198,-0.0014289912494271, -0.0167000964093416,-0.1276689083678200,0.0036575057830967,-0.0205958145531018, 0.0000368919612829,0.0014413626622426,0.1064360941926030,0.0863372661517408}, {-0.0463777468104402,0.0394712148670596,0.1118686750747160,0.0440711686389031, -0.0026076286506751,-0.0268454015202516,-0.1464943067133240,-0.0137514051835380, -0.0094395514284145,-0.0144124844774228,0.0249103379323744,-0.0071832157138676, 0.0035592787728526,0.0415627419826693,0.0027040097365669,0.0337523666612066, 0.0316121324137152,-0.0011350177559026,-0.0349998884574440,-0.0302651879823361}, {0.0142360925194728,0.0413145623127025,0.0324976427846929,0.0580930922002398, -0.0586974207121084,0.0202001168873069,0.0492204086749069,0.1126593173463060, 0.0116620013776662,-0.0780333711712066,-0.1109786767320410,0.0407775100936731, -0.0205013161312652,-0.0653458585025237,0.0347351829703865,0.0304448983224773, 0.0068813748197884,-0.0189002309261882,-0.0334507528405279,-0.0668143558699485}, {-0.0131548829657936,0.0044244322828034,-0.0050639951827271,-0.0038668197633889, -0.1536822386530220,0.0026336969165336,0.0021585651200470,-0.0459233839062969, 0.0046854727140565,0.0393815434593599,0.0619554007991097,0.0027456299925622, 0.0117574347936383,0.0373018612990383,0.0024818527553328,-0.0133956606027299, -0.0020457128424105,0.0154178819990401,0.0246524142683911,0.0275363065682921}, {-0.1542307272455030,0.0364861558267547,-0.0090880407008181,0.0531673937889863, 0.0157585615170580,0.0029986538457297,0.0180194047699875,0.0652152443589317, 0.0266842840376180,0.0388457366405908,0.0856237634510719,0.0126955778952183, 
0.0099593861698250,-0.0013941794862563,0.0294065511237513,-0.1151906949298290, -0.0852991447389655,0.0028699120202636,-0.0332087026659522,0.0006811857297899}, {0.0281300736924501,-0.0584072081898638,-0.0178386569847853,-0.0536470338171487, -0.0186881656029960,-0.0240008730656106,-0.0541064820498883,0.2217137098936020, -0.0260500001542033,0.0234505236798375,0.0311127151218573,-0.0494139126682672, 0.0057093465049849,0.0124937286655911,-0.0298322975915689,0.0006520211333102, -0.0061018680727128,-0.0007081999479528,-0.0060523759094034,0.0215845995364623}, {0.0295321046399105,-0.0088296411830544,-0.0065057049917325,-0.0053478115612781, -0.0100646496794634,-0.0015473619084872,0.0008539960632865,-0.0376381933046211, -0.0328135588935604,0.0672161874239480,0.0667626853916552,-0.0026511651464901, 0.0140451514222062,-0.0544836996133137,0.0427485157912094,0.0097455780205802, 0.0177309072915667,-0.0828759701187452,-0.0729504795471370,0.0670731961252313}, {0.0082646581043963,-0.0319918630534466,-0.0188454445200422,-0.0374976353856606, 0.0037131290686848,-0.0132507796987883,-0.0306958830735725,-0.0044119395527308, -0.0140786756619672,-0.0180512599925078,-0.0208243802903953,-0.0232202769398931, -0.0063135878270273,0.0110442171178168,0.1824538048228460,-0.0006644614422758, -0.0069909097436659,0.0255407650654681,0.0099119399501151,-0.0140911517070698}, {0.0261344441524861,-0.0714454044548650,0.0159436926233439,0.0028462736216688, -0.0044572637889080,-0.0089474834434532,-0.0177570282144517,-0.0153693244094452, 0.1160919467206400,0.0304911481385036,0.0047047513411774,-0.0456535116423972, 0.0004491494948617,-0.0767108879444462,-0.0012688533741441,0.0192445965934123, 0.0202321954782039,0.0281039933233607,-0.0590403018490048,0.0364080426546883}, {0.0115826306265004,0.1340228176509380,-0.0236200652949049,-0.1284484655137340, -0.0004742338006503,0.0127617346949511,-0.0428560878860394,0.0060030732454125, 0.0089182609926781,0.0085353834972860,0.0048464809638033,0.0709740071429510, 
0.0029940462557054,-0.0483434904493132,-0.0071713680727884,-0.0036840391887209, 0.0031454003250096,0.0246243550241551,-0.0449551277644180,0.0111449232769393}, {0.0140356721886765,-0.0196518236826680,0.0030517022326582,0.0582672093364850, -0.0000973895685457,0.0021704767224292,0.0341806268602705,-0.0152035987563018, -0.0903198657739177,0.0259623214586925,0.0155832497882743,-0.0040543568451651, 0.0036477631918247,-0.0532892744763217,-0.0142569373662724,0.0104500681408622, 0.0103483945857315,0.0679534422398752,-0.0768068882938636,0.0280289727046158}} ; /* dcmut version of PAM model from www.ebi.ac.uk/goldman-srv/dayhoff/ */ static double pameigmat[] = {0,-1.93321786301018,-2.20904642493621,-1.74835983874903, -1.64854548332072,-1.54505559488222,-1.33859384676989,-1.29786201193594, -0.235548517495575,-0.266951066089808,-0.28965813670665,-1.10505826965282, -1.04323310568532,-0.430423720979904,-0.541719761016713,-0.879636093986914, -0.711249353378695,-0.725050487280602,-0.776855937389452,-0.808735559461343}; static double pamprobmat[20][20] ={ {0.08712695644, 0.04090397955, 0.04043197978, 0.04687197656, 0.03347398326, 0.03825498087, 0.04952997524, 0.08861195569, 0.03361898319, 0.03688598156, 0.08535695732, 0.08048095976, 0.01475299262, 0.03977198011, 0.05067997466, 0.06957696521, 0.05854197073, 0.01049399475, 0.02991598504, 0.06471796764}, {0.07991048383, 0.006888314018, 0.03857806206, 0.07947073194, 0.004895492884, 0.03815829405, -0.1087562465, 0.008691167141, -0.0140554828, 0.001306404001, -0.001888411299, -0.006921303342, 0.0007655604228, 0.001583298443, 0.006879590446, -0.171806883, 0.04890917949, 0.0006700432804, 0.0002276237277, -0.01350591875}, {-0.01641514483, -0.007233933239, -0.1377830621, 0.1163201333, -0.002305138017, 0.01557250366, -0.07455879489, -0.003225343503, 0.0140630487, 0.005112274204, 0.001405731862, 0.01975833782, -0.001348402973, -0.001085733262, -0.003880514478, 0.0851493313, -0.01163526615, -0.0001197903399, 0.002056153393, 0.0001536095643}, 
{0.009669278686, -0.006905863869, 0.101083544, 0.01179903104, -0.003780967591, 0.05845105878, -0.09138357299, -0.02850503638, -0.03233951408, 0.008708065876, -0.004700705411, -0.02053221579, 0.001165851398, -0.001366585849, -0.01317695074, 0.1199985703, -0.1146346193, -0.0005953021314, -0.0004297615194, 0.007475695618}, {0.1722243502, -0.003737582995, -0.02964873222, -0.02050116381, -0.0004530478465, -0.02460043205, 0.02280768412, -0.02127364909, 0.01570095258, 0.1027744285, -0.005330539586, 0.0179697651, -0.002904077286, -0.007068126663, -0.0142869583, -0.01444241844, -0.08218861544, 0.0002069181629, 0.001099671379, -0.1063484263}, {-0.1553433627, -0.001169168032, 0.02134785337, 0.0007602305436, 0.0001395330122, 0.03194992019, -0.01290252206, 0.03281720789, -0.01311103735, 0.1177254769, -0.008008783885, -0.02375317548, -0.002817809762, -0.008196682776, 0.01731267617, 0.01853526375, 0.08249908546, -2.788771776e-05, 0.001266182191, -0.09902299976}, {-0.03671080341, 0.0274168035, 0.04625877597, 0.07520706414, -0.0001833803619, -0.1207833161, -0.006415807779, -0.005465629648, 0.02778273972, 0.007589688485, -0.02945266034, -0.03797542064, 0.07044042052, -0.002018573865, 0.01845277071, 0.006901513991, -0.02430934639, -0.0005919635873, -0.001266962331, -0.01487591261}, {-0.03060317816, 0.01182361623, 0.04200270053, 0.05406235279, -0.0003920498815, -0.09159709348, -0.009602690652, -0.00382944418, 0.01761361993, 0.01605684317, 0.05198878008, 0.02198696949, -0.09308930025, -0.00102622863, 0.01477637127, 0.0009314065393, -0.01860959472, -0.0005964703968, -0.002694284083, 0.02079767439}, {0.0195976494, -0.005104484936, 0.007406728707, 0.01236244954, 0.0201446796, 0.007039564785, 0.01276942134, 0.02641595685, 0.002764624354, 0.001273314658, -0.01335316035, 0.01105658671, 2.148773499e-05, -0.02692205639, 0.0118684991, 0.01212624708, 0.01127770094, -0.09842754796, -0.01942336432, 0.007105703151}, {-0.01819461888, -0.01509348507, -0.01297636935, -0.01996453439, 0.1715705905, 
-0.01601550692, -0.02122706144, -0.02854628494, -0.009351082371, -0.001527995472, -0.010198224, -0.03609537551, -0.003153182095, 0.02395980501, -0.01378664626, -0.005992611421, -0.01176810875, 0.003132361603, 0.03018439539, -0.004956065656}, {-0.02733614784, -0.02258066705, -0.0153112506, -0.02475728664, -0.04480525045, -0.01526640341, -0.02438517425, -0.04836914601, -0.00635964824, 0.02263169831, 0.09794101931, -0.04004304158, 0.008464393478, 0.1185443142, -0.02239294163, -0.0281550321, -0.01453581604, -0.0246742804, 0.0879619849, 0.02342867605}, {0.06483718238, 0.1260012082, -0.006496013283, 0.009914915531, -0.004181603532, 0.0003493226286, 0.01408035752, -0.04881663016, -0.03431167356, -0.01768005602, 0.02362447761, -0.1482364784, -0.01289035619, -0.001778893279, -0.05240099752, 0.05536174567, 0.06782165352, -0.003548568717, 0.001125301173, -0.03277489363}, {0.06520296909, -0.0754802543, 0.03139281903, -0.03266449554, -0.004485188002, -0.03389072036, -0.06163274338, -0.06484769882, 0.05722658289, -0.02824079619, 0.01544837349, 0.03909752708, 0.002029218884, 0.003151939572, -0.05471208363, 0.07962008342, 0.125916047, 0.0008696184937, -0.01086027514, -0.05314092355}, {0.004543119081, 0.01935177735, 0.01905511007, 0.02682993409, -0.01199617967, 0.01426278655, 0.02472521255, 0.03864795501, 0.02166224804, -0.04754243479, -0.1921545477, 0.03621321546, -0.02120627881, 0.04928097895, 0.009396088815, 0.01748042052, -6.173742851e-05, -0.003168033098, 0.07723565812, -0.08255529309}, {0.06710378668, -0.09441410284, -0.004801776989, 0.008830272165, -0.01021645042, -0.02764365608, 0.004250361851, 0.1648777542, -0.037446109, 0.004541057635, -0.0296980702, -0.1532325189, -0.008940580901, 0.006998050812, 0.02338809379, 0.03175059182, 0.02033965512, 0.006388075608, 0.001762762044, 0.02616280361}, {0.01915943021, -0.05432967274, 0.01249342683, 0.06836622457, 0.002054462161, -0.01233535859, 0.07087282652, -0.08948637051, -0.1245896013, -0.02204522882, 0.03791481736, 0.06557467874, 
0.005529294156, -0.006296644235, 0.02144530752, 0.01664230081, 0.02647078439, 0.001737725271, 0.01414149877, -0.05331990116}, {0.0266659303, 0.0564142853, -0.0263767738, -0.08029726006, -0.006059357163, -0.06317558457, -0.0911894019, 0.05401487057, -0.08178072458, 0.01580699778, -0.05370550396, 0.09798653264, 0.003934944022, 0.01977291947, 0.0441198541, 0.02788220393, 0.03201877081, -0.00206161759, -0.005101423308, 0.03113033802}, {0.02980360751, -0.009513246268, -0.009543527165, -0.02190644172, -0.006146440672, 0.01207009085, -0.0126989156, -0.1378266418, 0.0275235217, 0.00551720592, -0.03104791544, -0.07111701247, -0.006081754489, -0.01337494521, 0.1783961085, 0.01453225059, 0.01938736048, 0.0004488631071, 0.0110844398, 0.02049339243}, {-0.01433508581, 0.01258858175, -0.004294252236, -0.007146532854, 0.009541628809, 0.008040155729, -0.006857781832, 0.05584120066, 0.007749418365, -0.05867835844, 0.08008131283, -0.004877854222, -0.0007128540743, 0.09489058424, 0.06421121962, 0.00271493526, -0.03229944773, -0.001732026038, -0.08053448316, -0.1241903609}, {-0.009854113227, 0.01294129929, -0.00593064392, -0.03016833115, -0.002018439732, -0.00792418722, -0.03372768732, 0.07828561288, 0.007722254639, -0.05067377561, 0.1191848621, 0.005059475202, 0.004762387166, -0.1029870175, 0.03537190114, 0.001089956203, -0.02139157573, -0.001015245062, 0.08400521847, -0.08273195059}}; void init_protmats() { long l; eigmat = (double *) Malloc (20 * sizeof(double)); for (l = 0; l <= 19; l++) if (usejtt) eigmat[l] = jtteigmat[l]; else { if (usepmb) eigmat[l] = pmbeigmat[l]; else eigmat[l] = pameigmat[l]; } probmat = (double **) Malloc (20 * sizeof(double *)); for (l = 0; l <= 19; l++) if (usejtt) probmat[l] = jttprobmat[l]; else { if (usepmb) probmat[l] = pmbprobmat[l]; else probmat[l] = pamprobmat[l]; } } /* init_protmats */ void getoptions() { /* interactively set options */ long i, loopcount, loopcount2; Char ch; boolean didchangecat, didchangercat; double probsum; fprintf(outfile, 
"\nAmino acid sequence Maximum Likelihood"); fprintf(outfile, " method, version %s\n\n",VERSION); putchar('\n'); ctgry = false; didchangecat = false; rctgry = false; didchangercat = false; categs = 1; rcategs = 1; auto_ = false; gama = false; global = false; hypstate = false; improve = false; invar = false; jumble = false; njumble = 1; lngths = false; lambda = 1.0; outgrno = 1; outgropt = false; trout = true; usertree = false; weights = false; printdata = false; progress = true; treeprint = true; usejtt = true; usepmb = false; usepam = false; interleaved = true; loopcount = 0; for (;;){ cleerhome(); printf("Amino acid sequence Maximum Likelihood"); printf(" method, version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" U Search for best tree? %s\n", (usertree ? "No, use user trees in input file" : "Yes")); if (usertree) { printf(" L Use lengths from user trees? %s\n", (lngths ? "Yes" : "No")); } printf(" P JTT, PMB or PAM probability model? %s\n", usejtt ? "Jones-Taylor-Thornton" : usepmb ? "Henikoff/Tillier PMB" : "Dayhoff PAM"); printf(" C One category of sites?"); if (!ctgry || categs == 1) printf(" Yes\n"); else printf(" %ld categories of sites\n", categs); printf(" R Rate variation among sites?"); if (!rctgry) printf(" constant rate of change\n"); else { if (gama) printf(" Gamma distributed rates\n"); else { if (invar) printf(" Gamma+Invariant sites\n"); else printf(" user-defined HMM of rates\n"); } printf(" A Rates at adjacent sites correlated?"); if (!auto_) printf(" No, they are independent\n"); else printf(" Yes, mean block length =%6.1f\n", 1.0 / lambda); } printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); if (!usertree) { printf(" S Speedier but rougher analysis? %s\n", (improve ? "No, not rough" : "Yes")); printf(" G Global rearrangements? %s\n", (global ? 
"Yes" : "No")); } if (!usertree) { printf(" J Randomize input order of sequences?"); if (jumble) printf(" Yes (seed =%8ld,%3ld times)\n", inseed0, njumble); else printf(" No. Use input order\n"); } printf(" O Outgroup root? %s%3ld\n", (outgropt ? "Yes, at sequence number" : "No, use as outgroup species"),outgrno); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld %s\n", datasets, (justwts ? "sets of weights" : "data sets")); else printf(" No\n"); printf(" I Input sequences interleaved? %s\n", (interleaved ? "Yes" : "No, sequential")); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", (ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)")); printf(" 1 Print out the data at start of run %s\n", (printdata ? "Yes" : "No")); printf(" 2 Print indications of progress of run %s\n", (progress ? "Yes" : "No")); printf(" 3 Print out tree %s\n", (treeprint ? "Yes" : "No")); printf(" 4 Write out trees onto tree file? %s\n", (trout ? "Yes" : "No")); printf(" 5 Reconstruct hypothetical sequences? %s\n", (hypstate ? 
"Yes" : "No")); printf("\n Y to accept these or type the letter for one to change\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); if (ch == 'Y') break; if (((!usertree) && (strchr("UPCRAWSGJOMI012345", ch) != NULL)) || (usertree && ((strchr("UPLCRAWSOMI012345", ch) != NULL)))){ switch (ch) { case 'C': ctgry = !ctgry; if (ctgry) { printf("\nSitewise user-assigned categories:\n\n"); initcatn(&categs); if (rate){ free(rate); } rate = (double *) Malloc(categs * sizeof(double)); didchangecat = true; initcategs(categs, rate); } break; case 'P': if (usejtt) { usejtt = false; usepmb = true; } else { if (usepmb) { usepmb = false; usepam = true; } else { usepam = false; usejtt = true; } } break; case 'R': if (!rctgry) { rctgry = true; gama = true; } else { if (gama) { gama = false; invar = true; } else { if (invar) invar = false; else rctgry = false; } } break; case 'A': auto_ = !auto_; if (auto_) initlambda(&lambda); break; case 'W': weights = !weights; break; case 'S': improve = !improve; break; case 'G': global = !global; break; case 'J': jumble = !jumble; if (jumble) initjumble(&inseed, &inseed0, seed, &njumble); else njumble = 1; break; case 'L': lngths = !lngths; break; case 'O': outgropt = !outgropt; if (outgropt) initoutgroup(&outgrno, spp); break; case 'U': usertree = !usertree; break; case 'M': mulsets = !mulsets; if (mulsets) { printf("Multiple data sets or multiple weights?"); loopcount2 = 0; do { printf(" (type D or W)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch2); getchar(); if (ch2 == '\n') ch2 = ' '; uppercase(&ch2); countup(&loopcount2, 10); } while ((ch2 != 'W') && (ch2 != 'D')); justwts = (ch2 == 'W'); if (justwts) justweights(&datasets); else initdatasets(&datasets); if (!jumble) { jumble = true; initjumble(&inseed, &inseed0, seed, &njumble); } } break; case 'I': interleaved = !interleaved; break; case '0': 
initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress = !progress; break; case '3': treeprint = !treeprint; break; case '4': trout = !trout; break; case '5': hypstate = !hypstate; break; } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } if (gama || invar) { loopcount = 0; do { printf( "\nCoefficient of variation of substitution rate among sites (must be positive)\n"); printf( " In gamma distribution parameters, this is 1/(square root of alpha)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%lf%*[^\n]", &cv); getchar(); countup(&loopcount, 10); } while (cv <= 0.0); alpha = 1.0 / (cv * cv); } if (!rctgry) auto_ = false; if (rctgry) { printf("\nRates in HMM"); if (invar) printf(" (including one for invariant sites)"); printf(":\n"); initcatn(&rcategs); if (probcat){ free(probcat); free(rrate); } probcat = (double *) Malloc(rcategs * sizeof(double)); rrate = (double *) Malloc(rcategs * sizeof(double)); didchangercat = true; if (gama) initgammacat(rcategs, alpha, rrate, probcat); else { if (invar) { loopcount = 0; do { printf("Fraction of invariant sites?\n"); fflush(stdout); scanf("%lf%*[^\n]", &invarfrac); getchar(); countup (&loopcount, 10); } while ((invarfrac <= 0.0) || (invarfrac >= 1.0)); initgammacat(rcategs-1, alpha, rrate, probcat); for (i = 0; i < rcategs-1; i++) probcat[i] = probcat[i]*(1.0-invarfrac); probcat[rcategs-1] = invarfrac; rrate[rcategs-1] = 0.0; } else { initcategs(rcategs, rrate); initprobcat(rcategs, &probsum, probcat); } } } if (!didchangercat){ rrate = (double *) Malloc(rcategs*sizeof(double)); probcat = (double *) Malloc(rcategs*sizeof(double)); rrate[0] = 1.0; probcat[0] = 1.0; } if (!didchangecat) { rate = (double *) Malloc(categs*sizeof(double)); rate[0] = 1.0; } init_protmats(); } /* getoptions */ void makeprotfreqs() { /* calculate amino acid frequencies based on eigmat */ long i, mineig; mineig = 0; for (i = 0; i <= 19; i++) if (fabs(eigmat[i]) < 
fabs(eigmat[mineig])) mineig = i; memcpy(freqaa, probmat[mineig], 20 * sizeof(double)); for (i = 0; i <= 19; i++) freqaa[i] = fabs(freqaa[i]); } /* makeprotfreqs */ void reallocsites() { long i; for (i = 0; i < spp; i++) free(y[i]); free(category); free(weight); free(alias); free(ally); free(location); free(aliasweight); for (i = 0; i < spp; i++) y[i] = (Char *) Malloc(sites*sizeof(Char)); category = (long *) Malloc(sites*sizeof(long)); weight = (long *) Malloc(sites*sizeof(long)); alias = (long *) Malloc(sites*sizeof(long)); ally = (long *) Malloc(sites*sizeof(long)); location = (long *) Malloc(sites*sizeof(long)); aliasweight = (long *) Malloc(sites*sizeof(long)); for (i = 0; i < sites; i++) category[i] = 1; for (i = 0; i < sites; i++) weight[i] = 1; /* makeweights(); */ } void allocrest() { /* more allocations */ long i; y = (Char **) Malloc(spp*sizeof(Char *)); for (i = 0; i < spp; i++) y[i] = (Char *) Malloc(sites*sizeof(Char)); nayme = (naym *) Malloc(spp*sizeof(naym)); enterorder = (long *) Malloc(spp*sizeof(long)); category = (long *) Malloc(sites*sizeof(long)); weight = (long *) Malloc(sites*sizeof(long)); alias = (long *) Malloc(sites*sizeof(long)); ally = (long *) Malloc(sites*sizeof(long)); location = (long *) Malloc(sites*sizeof(long)); aliasweight = (long *) Malloc(sites*sizeof(long)); } /* allocrest */ void doinit() { /* initializes variables */ inputnumbers(&spp, &sites, &nonodes2, 1); getoptions(); if (!usertree) nonodes2--; makeprotfreqs(); if (printdata) fprintf(outfile, "%2ld species, %3ld sites\n", spp, sites); alloctree(&curtree.nodep, nonodes2, usertree); /* printf("in doinit calling allocrest\n"); */ allocrest(); if (usertree) return; alloctree(&bestree.nodep, nonodes2, 0); alloctree(&priortree.nodep, nonodes2, 0); if (njumble <= 1) return; alloctree(&bestree2.nodep, nonodes2, 0); } /* doinit */ void inputoptions() { long i; if (!firstset) { samenumsp(&sites, ith); reallocsites(); } if (firstset) { for (i = 0; i < sites; i++) category[i] = 
1; for (i = 0; i < sites; i++) weight[i] = 1; } if (justwts || weights) inputweights(sites, weight, &weights); weightsum = 0; for (i = 0; i < sites; i++) weightsum += weight[i]; if ((ctgry && categs > 1) && (firstset || !justwts)) { inputcategs(0, sites, category, categs, "ProML"); if (printdata) printcategs(outfile, sites, category, "Site categories"); } if (weights && printdata) printweights(outfile, 0, sites, weight, "Sites"); fprintf(outfile, "%s model of amino acid change\n\n", (usejtt ? "Jones-Taylor-Thornton" : usepmb ? "Henikoff/Tillier PMB" : "Dayhoff PAM")); } /* inputoptions */ void input_protdata(long chars) { /* input the names and sequences for each species */ /* used by proml */ long i, j, k, l, basesread, basesnew; Char charstate; boolean allread, done; if (printdata) headings(chars, "Sequences", "---------"); basesread = 0; basesnew = 0; allread = false; while (!(allread)) { /* eat white space -- if the separator line has spaces on it*/ do { charstate = gettc(infile); } while (charstate == ' ' || charstate == '\t'); ungetc(charstate, infile); if (eoln(infile)) scan_eoln(infile); i = 1; while (i <= spp) { if ((interleaved && basesread == 0) || !interleaved) initname(i - 1); j = (interleaved) ? basesread : 0; done = false; while (!done && !eoff(infile)) { if (interleaved) done = true; while (j < chars && !(eoln(infile) || eoff(infile))) { charstate = gettc(infile); if (charstate == '\n' || charstate == '\t') charstate = ' '; if (charstate == ' ' || (charstate >= '0' && charstate <= '9')) continue; uppercase(&charstate); if ((strchr("ABCDEFGHIKLMNPQRSTVWXYZ*?-", charstate)) == NULL) { printf("ERROR: bad amino acid: %c at position %ld of species %ld\n", charstate, j+1, i); if (charstate == '.') { printf(" Periods (.) 
may not be used as gap characters.\n"); printf(" The correct gap character is (-)\n"); } exxit(-1); } j++; y[i - 1][j - 1] = charstate; } if (interleaved) continue; if (j < chars) scan_eoln(infile); else if (j == chars) done = true; } if (interleaved && i == 1) basesnew = j; scan_eoln(infile); if ((interleaved && j != basesnew) || (!interleaved && j != chars)) { printf("ERROR: SEQUENCES OUT OF ALIGNMENT AT POSITION %ld.\n", j); exxit(-1); } i++; } if (interleaved) { basesread = basesnew; allread = (basesread == chars); } else allread = (i > spp); } if (!printdata) return; for (i = 1; i <= ((chars - 1) / 60 + 1); i++) { for (j = 1; j <= spp; j++) { for (k = 0; k < nmlngth; k++) putc(nayme[j - 1][k], outfile); fprintf(outfile, " "); l = i * 60; if (l > chars) l = chars; for (k = (i - 1) * 60 + 1; k <= l; k++) { if (j > 1 && y[j - 1][k - 1] == y[0][k - 1]) charstate = '.'; else charstate = y[j - 1][k - 1]; putc(charstate, outfile); if (k % 10 == 0 && k % 60 != 0) putc(' ', outfile); } putc('\n', outfile); } putc('\n', outfile); } putc('\n', outfile); } /* input_protdata */ void makeweights() { /* make up weights vector to avoid duplicate computations */ long i; for (i = 1; i <= sites; i++) { alias[i - 1] = i; ally[i - 1] = i; aliasweight[i - 1] = weight[i - 1]; location[i - 1] = 0; } sitesort2 (sites, aliasweight); sitecombine2(sites, aliasweight); sitescrunch2(sites, 1, 2, aliasweight); endsite = 0; for (i = 1; i <= sites; i++) { if (ally[i - 1] == i) endsite++; } for (i = 1; i <= endsite; i++) { location[alias[i - 1] - 1] = i; } term = (double **) Malloc(endsite * sizeof(double *)); for (i = 0; i < endsite; i++) term[i] = (double *) Malloc(rcategs * sizeof(double)); slopeterm = (double **) Malloc(endsite * sizeof(double *)); for (i = 0; i < endsite; i++) slopeterm[i] = (double *) Malloc(rcategs * sizeof(double)); curveterm = (double **) Malloc(endsite * sizeof(double *)); for (i = 0; i < endsite; i++) curveterm[i] = (double *) Malloc(rcategs * sizeof(double)); mp = 
(vall *) Malloc(sites*sizeof(vall)); contribution = (contribarr *) Malloc(endsite*sizeof(contribarr)); } /* makeweights */ void prot_makevalues(long categs, pointarray treenode, long endsite, long spp, sequence y, steptr alias) { /* set up fractional likelihoods at tips */ /* a version of makevalues2 found in seq.c */ /* used by proml */ long i, j, k, l; long b; for (k = 0; k < endsite; k++) { j = alias[k]; for (i = 0; i < spp; i++) { for (l = 0; l < categs; l++) { memset(treenode[i]->protx[k][l], 0, sizeof(double)*20); switch (y[i][j - 1]) { case 'A': treenode[i]->protx[k][l][0] = 1.0; break; case 'R': treenode[i]->protx[k][l][(long)arginine - (long)alanine] = 1.0; break; case 'N': treenode[i]->protx[k][l][(long)asparagine - (long)alanine] = 1.0; break; case 'D': treenode[i]->protx[k][l][(long)aspartic - (long)alanine] = 1.0; break; case 'C': treenode[i]->protx[k][l][(long)cysteine - (long)alanine] = 1.0; break; case 'Q': treenode[i]->protx[k][l][(long)glutamine - (long)alanine] = 1.0; break; case 'E': treenode[i]->protx[k][l][(long)glutamic - (long)alanine] = 1.0; break; case 'G': treenode[i]->protx[k][l][(long)glycine - (long)alanine] = 1.0; break; case 'H': treenode[i]->protx[k][l][(long)histidine - (long)alanine] = 1.0; break; case 'I': treenode[i]->protx[k][l][(long)isoleucine - (long)alanine] = 1.0; break; case 'L': treenode[i]->protx[k][l][(long)leucine - (long)alanine] = 1.0; break; case 'K': treenode[i]->protx[k][l][(long)lysine - (long)alanine] = 1.0; break; case 'M': treenode[i]->protx[k][l][(long)methionine - (long)alanine] = 1.0; break; case 'F': treenode[i]->protx[k][l][(long)phenylalanine - (long)alanine] = 1.0; break; case 'P': treenode[i]->protx[k][l][(long)proline - (long)alanine] = 1.0; break; case 'S': treenode[i]->protx[k][l][(long)serine - (long)alanine] = 1.0; break; case 'T': treenode[i]->protx[k][l][(long)threonine - (long)alanine] = 1.0; break; case 'W': treenode[i]->protx[k][l][(long)tryptophan - (long)alanine] = 1.0; break; case 'Y': 
treenode[i]->protx[k][l][(long)tyrosine - (long)alanine] = 1.0; break; case 'V': treenode[i]->protx[k][l][(long)valine - (long)alanine] = 1.0; break; case 'B': treenode[i]->protx[k][l][(long)asparagine - (long)alanine] = 1.0; treenode[i]->protx[k][l][(long)aspartic - (long)alanine] = 1.0; break; case 'Z': treenode[i]->protx[k][l][(long)glutamine - (long)alanine] = 1.0; treenode[i]->protx[k][l][(long)glutamic - (long)alanine] = 1.0; break; case 'X': /* unknown aa */ for (b = 0; b <= 19; b++) treenode[i]->protx[k][l][b] = 1.0; break; case '?': /* unknown aa */ for (b = 0; b <= 19; b++) treenode[i]->protx[k][l][b] = 1.0; break; case '*': /* stop codon symbol */ for (b = 0; b <= 19; b++) treenode[i]->protx[k][l][b] = 1.0; break; case '-': /* deletion event-absent data or aa */ for (b = 0; b <= 19; b++) treenode[i]->protx[k][l][b] = 1.0; break; } } } } } /* prot_makevalues */ void free_pmatrix(long sib) { long j,k,l; for (j = 0; j < rcategs; j++) { for (k = 0; k < categs; k++) { for (l = 0; l < 20; l++) free(pmatrices[sib][j][k][l]); free(pmatrices[sib][j][k]); } free(pmatrices[sib][j]); } free(pmatrices[sib]); } void alloc_pmatrix(long sib) { /* Allocate memory for a new pmatrix. 
   Called iff num_sibs>max_num_sibs */
  long j, k, l;
  double ****temp_matrix;

  temp_matrix = (double ****) Malloc (rcategs * sizeof(double ***));
  for (j = 0; j < rcategs; j++) {
    temp_matrix[j] = (double ***) Malloc(categs * sizeof(double **));
    for (k = 0; k < categs; k++) {
      temp_matrix[j][k] = (double **) Malloc(20 * sizeof (double *));
      for (l = 0; l < 20; l++)
        temp_matrix[j][k][l] = (double *) Malloc(20 * sizeof(double));
    }
  }
  pmatrices[sib] = temp_matrix;
  max_num_sibs++;
}  /* alloc_pmatrix */


void prot_freetable()
{
  long i,j,k,l;

  for (j = 0; j < rcategs; j++) {
    for (k = 0; k < categs; k++) {
      for (l = 0; l < 20; l++)
        free(ddpmatrix[j][k][l]);
      free(ddpmatrix[j][k]);
    }
    free(ddpmatrix[j]);
  }
  free(ddpmatrix);

  for (j = 0; j < rcategs; j++) {
    for (k = 0; k < categs; k++) {
      for (l = 0; l < 20; l++)
        free(dpmatrix[j][k][l]);
      free(dpmatrix[j][k]);
    }
    free(dpmatrix[j]);
  }
  free(dpmatrix);

  for (j = 0; j < rcategs; j++)
    free(tbl[j]);
  free(tbl);

  for ( i = 0 ; i < max_num_sibs ; i++ )
    free_pmatrix(i);
  free(pmatrices);
}


void prot_inittable()
{
  /* Define a lookup table. Precompute values and print them out in tables */
  /* Allocate memory for the pmatrices, dpmatrices and ddpmatrices */
  long i, j, k, l;
  double sumrates;

  /* Allocate memory for pmatrices, the array of pointers to pmatrices */
  pmatrices = (double *****) Malloc ( spp * sizeof(double ****));

  /* Allocate memory for the first 2 pmatrices, the matrix of conversion */
  /* probabilities, but only once per run (i.e. not on the second jumble).
*/ alloc_pmatrix(0); alloc_pmatrix(1); /* Allocate memory for one dpmatrix, the first derivative matrix */ dpmatrix = (double ****) Malloc( rcategs * sizeof(double ***)); for (j = 0; j < rcategs; j++) { dpmatrix[j] = (double ***) Malloc( categs * sizeof(double **)); for (k = 0; k < categs; k++) { dpmatrix[j][k] = (double **) Malloc( 20 * sizeof(double *)); for (l = 0; l < 20; l++) dpmatrix[j][k][l] = (double *) Malloc( 20 * sizeof(double)); } } /* Allocate memory for one ddpmatrix, the second derivative matrix */ ddpmatrix = (double ****) Malloc( rcategs * sizeof(double ***)); for (j = 0; j < rcategs; j++) { ddpmatrix[j] = (double ***) Malloc( categs * sizeof(double **)); for (k = 0; k < categs; k++) { ddpmatrix[j][k] = (double **) Malloc( 20 * sizeof(double *)); for (l = 0; l < 20; l++) ddpmatrix[j][k][l] = (double *) Malloc( 20 * sizeof(double)); } } /* Allocate memory and assign values to tbl, the matrix of possible rates*/ tbl = (double **) Malloc( rcategs * sizeof(double *)); for (j = 0; j < rcategs; j++) tbl[j] = (double *) Malloc( categs * sizeof(double)); for (j = 0; j < rcategs; j++) for (k = 0; k < categs; k++) tbl[j][k] = rrate[j]*rate[k]; sumrates = 0.0; for (i = 0; i < endsite; i++) { for (j = 0; j < rcategs; j++) sumrates += aliasweight[i] * probcat[j] * tbl[j][category[alias[i] - 1] - 1]; } sumrates /= (double)sites; for (j = 0; j < rcategs; j++) for (k = 0; k < categs; k++) { tbl[j][k] /= sumrates; } if(jumb > 1) return; if (gama) { fprintf(outfile, "\nDiscrete approximation to gamma distributed rates\n"); fprintf(outfile, " Coefficient of variation of rates = %f (alpha = %f)\n", cv, alpha); } if (rcategs > 1) { fprintf(outfile, "\nStates in HMM Rate of change Probability\n\n"); for (i = 0; i < rcategs; i++) if (probcat[i] < 0.0001) fprintf(outfile, "%9ld%16.3f%20.6f\n", i+1, rrate[i], probcat[i]); else if (probcat[i] < 0.001) fprintf(outfile, "%9ld%16.3f%19.5f\n", i+1, rrate[i], probcat[i]); else if (probcat[i] < 0.01) fprintf(outfile, 
                "%9ld%16.3f%18.4f\n", i+1, rrate[i], probcat[i]);
      else
        fprintf(outfile, "%9ld%16.3f%17.3f\n", i+1, rrate[i], probcat[i]);
    putc('\n', outfile);
    if (auto_)
      fprintf(outfile,
        "Expected length of a patch of sites having the same rate = %8.3f\n",
        1/lambda);
    putc('\n', outfile);
  }
  if (categs > 1) {
    fprintf(outfile, "\nSite category Rate of change\n\n");
    for (k = 0; k < categs; k++)
      fprintf(outfile, "%9ld%16.3f\n", k+1, rate[k]);
  }
  /* was (categs >> 1): a right shift typed in place of a comparison */
  if ((rcategs > 1) || (categs > 1))
    fprintf(outfile, "\n\n");
}  /* prot_inittable */


void getinput()
{
  /* reads the input data */
  /* void debugtree(tree*, FILE*) */

  if (!justwts || firstset)
    inputoptions();
  if (!justwts || firstset)
    input_protdata(sites);
  if ( !firstset )
    freelrsaves();
  /* else */
  makeweights();
  alloclrsaves();
  setuptree2(&curtree);
  if (!usertree) {
    setuptree2(&bestree);
    setuptree2(&priortree);
    if (njumble > 1)
      setuptree2(&bestree2);
  }
  prot_allocx(nonodes2, rcategs, curtree.nodep, usertree);
  if (!usertree) {
    prot_allocx(nonodes2, rcategs, bestree.nodep, 0);
    prot_allocx(nonodes2, rcategs, priortree.nodep, 0);
    if (njumble > 1)
      prot_allocx(nonodes2, rcategs, bestree2.nodep, 0);
  }
  prot_makevalues(rcategs, curtree.nodep, endsite, spp, y, alias);
}  /* getinput */


void inittravtree(node *p)
{
  /* traverse tree to set initialized and v to initial values */
  node* q;

  p->initialized = false;
  p->back->initialized = false;
  if ( usertree && (!lngths || p->iter) ) {
    p->v = initialv;
    p->back->v = initialv;
  }

  if ( !p->tip ) {
    q = p->next;
    while ( q != p ) {
      inittravtree(q->back);
      q = q->next;
    }
  }
}  /* inittravtree */


void prot_nuview(node *p)
{
  long i, j, k, l, m, num_sibs, sib_index;
  node *sib_ptr, *sib_back_ptr;
  psitelike prot_xx, x2;
  double lw, prod7;
  double **pmat;
  double maxx;
  double correction;

  /* Figure out how many siblings the current node has */
  /* and be sure that pmatrices is large enough */
  num_sibs = count_sibs(p);
  for (i = 0; i < num_sibs; i++)
    if (pmatrices[i] == NULL)
      alloc_pmatrix(i);

  /* Recursive calls, should be called for all
children */ sib_ptr = p; for (i=0 ; i < num_sibs; i++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; if (!sib_back_ptr->tip && !sib_back_ptr->initialized) prot_nuview(sib_back_ptr); } /* Make pmatrices for all possible combinations of category, rcateg */ /* and sib */ sib_ptr = p; /* return to p */ for (sib_index=0; sib_index < num_sibs; sib_index++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; lw = sib_back_ptr->v; for (j = 0; j < rcategs; j++) for (k = 0; k < categs; k++) make_pmatrix(pmatrices[sib_index][j][k], NULL, NULL, 0, lw, tbl[j][k], eigmat, probmat); } for (i = 0; i < endsite; i++) { maxx = 0; correction = 0; k = category[alias[i]-1] - 1; for (j = 0; j < rcategs; j++) { /* initialize to 1 all values of prot_xx */ for (m = 0; m <= 19; m++) prot_xx[m] = 1; sib_ptr = p; /* return to p */ /* loop through all sibs and calculate likelihoods for all possible*/ /* amino acid combinations */ for (sib_index=0; sib_index < num_sibs; sib_index++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; if ( j == 0) correction += sib_back_ptr->underflows[i]; memcpy(x2, sib_back_ptr->protx[i][j], sizeof(psitelike)); pmat = pmatrices[sib_index][j][k]; for (m = 0; m <= 19; m++) { prod7 = 0; for (l = 0; l <= 19; l++) prod7 += (pmat[m][l] * x2[l]); prot_xx[m] *= prod7; if ( prot_xx[m] > maxx && sib_index == (num_sibs - 1)) maxx = prot_xx[m]; } } /* And the final point of this whole function: */ memcpy(p->protx[i][j], prot_xx, sizeof(psitelike)); } p->underflows[i] = 0; if ( maxx < MIN_DOUBLE ) fix_protx(p,i,maxx,rcategs); p->underflows[i] += correction; } p->initialized = true; } /* prot_nuview */ void prot_slopecurv(node *p,double y,double *like,double *slope,double *curve) { /* compute log likelihood, slope and curvature at node p */ long i, j, k, l, m, lai; double sum, sumc, sumterm, lterm, sumcs, sumcc, sum2, slope2, curve2; double frexm = 0; /* frexm = freqaa[m]*x1[m] */ /* frexml = frexm*x2[l] */ double prod4m, prod5m, prod6m; /* elements of 
prod4-5 for */ /* each m */ double **pmat, **dpmat, **ddpmat; /* local pointers to global*/ /* matrices */ double prod4, prod5, prod6; contribarr thelike, nulike, nuslope, nucurve, theslope, thecurve, clai, cslai, cclai; node *q; psitelike x1, x2; q = p->back; sum = 0.0; for (j = 0; j < rcategs; j++) { for (k = 0; k < categs; k++) { make_pmatrix(pmatrices[0][j][k], dpmatrix[j][k], ddpmatrix[j][k], 2, y, tbl[j][k], eigmat, probmat); } } for (i = 0; i < endsite; i++) { k = category[alias[i]-1] - 1; for (j = 0; j < rcategs; j++) { memcpy(x1, p->protx[i][j], sizeof(psitelike)); memcpy(x2, q->protx[i][j], sizeof(psitelike)); pmat = pmatrices[0][j][k]; dpmat = dpmatrix[j][k]; ddpmat = ddpmatrix[j][k]; prod4 = 0.0; prod5 = 0.0; prod6 = 0.0; for (m = 0; m <= 19; m++) { prod4m = 0.0; prod5m = 0.0; prod6m = 0.0; frexm = x1[m] * freqaa[m]; for (l = 0; l <= 19; l++) { prod4m += x2[l] * pmat[m][l]; prod5m += x2[l] * dpmat[m][l]; prod6m += x2[l] * ddpmat[m][l]; } prod4 += frexm * prod4m; prod5 += frexm * prod5m; prod6 += frexm * prod6m; } term[i][j] = prod4; slopeterm[i][j] = prod5; curveterm[i][j] = prod6; } sumterm = 0.0; for (j = 0; j < rcategs; j++) sumterm += probcat[j] * term[i][j]; if (sumterm <= 0.0) sumterm = 0.000000001; /* ? shouldn't get here ?? 
*/ lterm = log(sumterm) + p->underflows[i] + q->underflows[i]; for (j = 0; j < rcategs; j++) { term[i][j] = term[i][j] / sumterm; slopeterm[i][j] = slopeterm[i][j] / sumterm; curveterm[i][j] = curveterm[i][j] / sumterm; } sum += (aliasweight[i] * lterm); } for (i = 0; i < rcategs; i++) { thelike[i] = 1.0; theslope[i] = 0.0; thecurve[i] = 0.0; } for (i = 0; i < sites; i++) { sumc = 0.0; sumcs = 0.0; sumcc = 0.0; for (k = 0; k < rcategs; k++) { sumc += probcat[k] * thelike[k]; sumcs += probcat[k] * theslope[k]; sumcc += probcat[k] * thecurve[k]; } sumc *= lambda; sumcs *= lambda; sumcc *= lambda; if ((ally[i] > 0) && (location[ally[i]-1] > 0)) { lai = location[ally[i] - 1]; memcpy(clai, term[lai - 1], rcategs*sizeof(double)); memcpy(cslai, slopeterm[lai - 1], rcategs*sizeof(double)); memcpy(cclai, curveterm[lai - 1], rcategs*sizeof(double)); if (weight[i] > 1) { for (j = 0; j < rcategs; j++) { if (clai[j] > 0.0) clai[j] = exp(weight[i]*log(clai[j])); else clai[j] = 0.0; if (cslai[j] > 0.0) cslai[j] = exp(weight[i]*log(cslai[j])); else cslai[j] = 0.0; if (cclai[j] > 0.0) cclai[j] = exp(weight[i]*log(cclai[j])); else cclai[j] = 0.0; } } for (j = 0; j < rcategs; j++) { nulike[j] = ((1.0 - lambda) * thelike[j] + sumc) * clai[j]; nuslope[j] = ((1.0 - lambda) * theslope[j] + sumcs) * clai[j] + ((1.0 - lambda) * thelike[j] + sumc) * cslai[j]; nucurve[j] = ((1.0 - lambda) * thecurve[j] + sumcc) * clai[j] + 2.0 * ((1.0 - lambda) * theslope[j] + sumcs) * cslai[j] + ((1.0 - lambda) * thelike[j] + sumc) * cclai[j]; } } else { for (j = 0; j < rcategs; j++) { nulike[j] = ((1.0 - lambda) * thelike[j] + sumc); nuslope[j] = ((1.0 - lambda) * theslope[j] + sumcs); nucurve[j] = ((1.0 - lambda) * thecurve[j] + sumcc); } } memcpy(thelike, nulike, rcategs*sizeof(double)); memcpy(theslope, nuslope, rcategs*sizeof(double)); memcpy(thecurve, nucurve, rcategs*sizeof(double)); } sum2 = 0.0; slope2 = 0.0; curve2 = 0.0; for (i = 0; i < rcategs; i++) { sum2 += probcat[i] * thelike[i]; slope2 += 
probcat[i] * theslope[i]; curve2 += probcat[i] * thecurve[i]; } sum += log(sum2); (*like) = sum; (*slope) = slope2 / sum2; /* Expressed in terms of *slope to prevent overflow */ (*curve) = curve2 / sum2 - *slope * *slope; } /* prot_slopecurv */ void makenewv(node *p) { /* Newton-Raphson algorithm improvement of a branch length */ long it, ite; double y, yold=0, yorig, like, slope, curve, oldlike=0; boolean done, firsttime, better; node *q; q = p->back; y = p->v; yorig = y; done = false; firsttime = true; it = 1; ite = 0; while ((it < iterations) && (ite < 20) && (!done)) { prot_slopecurv(p, y, &like, &slope, &curve); better = false; if (firsttime) { /* if no older value of y to compare with */ yold = y; oldlike = like; firsttime = false; better = true; } else { if (like > oldlike) { /* update the value of yold if it was better */ yold = y; oldlike = like; better = true; it++; } } if (better) { y = y + slope/fabs(curve); /* Newton-Raphson, forced uphill-wards */ if (y < epsilon) y = epsilon; } else { if (fabs(y - yold) < epsilon) ite = 20; y = (y + (7 * yold)) / 8; /* retract 87% of way back */ } ite++; done = fabs(y-yold) < 0.1*epsilon; } smoothed = (fabs(yold-yorig) < epsilon) && (yorig > 1000.0*epsilon); p->v = yold; /* the last one that had better likelihood */ q->v = yold; curtree.likelihood = oldlike; } /* makenewv */ void update(node *p) { node *q; if (!p->tip && !p->initialized) prot_nuview(p); if (!p->back->tip && !p->back->initialized) prot_nuview(p->back); if ((!usertree) || (usertree && !lngths) || p->iter) { makenewv(p); if ( smoothit ) { inittrav(p); inittrav(p->back); } else if ( inserting && !p->tip ) { for ( q = p->next; q != p; q = q->next ) q->initialized = false; } } } /* update */ void smooth(node *p) { long i, num_sibs; node *sib_ptr; smoothed = false; update(p); if (p->tip) return; num_sibs = count_sibs(p); sib_ptr = p; for (i=0; i < num_sibs; i++) { sib_ptr = sib_ptr->next; if (polishing || (smoothit && !smoothed)) { smooth(sib_ptr->back); 
      p->initialized = false;
      sib_ptr->initialized = false;
    }
  }
}  /* smooth */


void make_pmatrix(double **matrix, double **dmat, double **ddmat,
                  long derivative, double lz, double rat,
                  double *eigmat, double **probmat)
{
  /* Computes the R matrix such that matrix[m][l] is the joint probability */
  /* of m and l.                                                            */
  /* Computes a P matrix such that matrix[m][l] is the conditional         */
  /* probability of l given m. This is accomplished by dividing all terms  */
  /* in the R matrix by freqaa[m], the frequency of m.                     */

  long k, l, m;                 /* (l) original character state */
                                /* (m) final character state */
                                /* (k) lambda counter */
  double p0, p1, p2, q;
  double elambdat[20], delambdat[20], ddelambdat[20];
                                /* exponential term for matrix */
                                /* and both derivative matrices */

  for (k = 0; k <= 19; k++) {
    elambdat[k] = exp(lz * rat * eigmat[k]);
    if(derivative != 0) {
      delambdat[k] = (elambdat[k] * rat * eigmat[k]);
      ddelambdat[k] = (delambdat[k] * rat * eigmat[k]);
    }
  }
  for (m = 0; m <= 19; m++) {
    for (l = 0; l <= 19; l++) {
      p0 = 0.0;
      p1 = 0.0;
      p2 = 0.0;
      for (k = 0; k <= 19; k++) {
        q = probmat[k][m] * probmat[k][l];
        p0 += (q * elambdat[k]);
        if(derivative !=0) {
          p1 += (q * delambdat[k]);
          p2 += (q * ddelambdat[k]);
        }
      }
      matrix[m][l] = p0 / freqaa[m];
      if(derivative != 0) {
        dmat[m][l] = p1 / freqaa[m];
        ddmat[m][l] = p2 / freqaa[m];
      }
    }
  }
}  /* make_pmatrix */


double prot_evaluate(node *p, boolean saveit)
{
  contribarr tterm;
  double sum, sum2, sumc, y, prod4, prodl, frexm, sumterm, lterm;
  double **pmat;
  long i, j, k, l, m, lai;
  node *q;
  psitelike x1, x2;

  sum = 0.0;
  q = p->back;
  y = p->v;
  for (j = 0; j < rcategs; j++)
    for (k = 0; k < categs; k++)
      make_pmatrix(pmatrices[0][j][k],NULL,NULL,0,y,tbl[j][k],eigmat,probmat);
  for (i = 0; i < endsite; i++) {
    k = category[alias[i]-1] - 1;
    for (j = 0; j < rcategs; j++) {
      memcpy(x1, p->protx[i][j], sizeof(psitelike));
      memcpy(x2, q->protx[i][j], sizeof(psitelike));
      prod4 = 0.0;
      pmat = pmatrices[0][j][k];
      for (m = 0; m <= 19; m++) {
        prodl = 0.0;
        for (l = 0; l <= 19; l++)
prodl += (pmat[m][l] * x2[l]); frexm = x1[m] * freqaa[m]; prod4 += (prodl * frexm); } tterm[j] = prod4; } sumterm = 0.0; for (j = 0; j < rcategs; j++) sumterm += probcat[j] * tterm[j]; if (sumterm < 0.0) sumterm = 0.00000001; /* ??? */ lterm = log(sumterm) + p->underflows[i] + q->underflows[i]; for (j = 0; j < rcategs; j++) clai[j] = tterm[j] / sumterm; memcpy(contribution[i], clai, rcategs*sizeof(double)); if (saveit && !auto_ && usertree && (which <= shimotrees)) l0gf[which - 1][i] = lterm; sum += aliasweight[i] * lterm; } for (j = 0; j < rcategs; j++) like[j] = 1.0; for (i = 0; i < sites; i++) { sumc = 0.0; for (k = 0; k < rcategs; k++) sumc += probcat[k] * like[k]; sumc *= lambda; if ((ally[i] > 0) && (location[ally[i]-1] > 0)) { lai = location[ally[i] - 1]; memcpy(clai, contribution[lai - 1], rcategs*sizeof(double)); for (j = 0; j < rcategs; j++) nulike[j] = ((1.0 - lambda) * like[j] + sumc) * clai[j]; } else { for (j = 0; j < rcategs; j++) nulike[j] = ((1.0 - lambda) * like[j] + sumc); } memcpy(like, nulike, rcategs*sizeof(double)); } sum2 = 0.0; for (i = 0; i < rcategs; i++) sum2 += probcat[i] * like[i]; sum += log(sum2); curtree.likelihood = sum; if (!saveit || auto_ || !usertree) return sum; if(which <= shimotrees) l0gl[which - 1] = sum; if (which == 1) { maxwhich = 1; maxlogl = sum; return sum; } if (sum > maxlogl) { maxwhich = which; maxlogl = sum; } return sum; } /* prot_evaluate */ void treevaluate() { /* evaluate a user tree */ long i; inittravtree(curtree.start); polishing = true; smoothit = true; for (i = 1; i <= smoothings * 4; i++) smooth (curtree.start); dummy = prot_evaluate(curtree.start, true); } /* treevaluate */ void promlcopy(tree *a, tree *b, long nonodes, long categs) { /* copy tree a to tree b */ long i, j=0; node *p, *q; for (i = 0; i < spp; i++) { prot_copynode(a->nodep[i], b->nodep[i], categs); if (a->nodep[i]->back) { if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]) b->nodep[i]->back = 
b->nodep[a->nodep[i]->back->index - 1]; else if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]->next ) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next; else b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next->next; } else b->nodep[i]->back = NULL; } for (i = spp; i < nonodes; i++) { p = a->nodep[i]; q = b->nodep[i]; for (j = 1; j <= 3; j++) { prot_copynode(p, q, categs); if (p->back) { if (p->back == a->nodep[p->back->index - 1]) q->back = b->nodep[p->back->index - 1]; else if (p->back == a->nodep[p->back->index - 1]->next) q->back = b->nodep[p->back->index - 1]->next; else q->back = b->nodep[p->back->index - 1]->next->next; } else q->back = NULL; p = p->next; q = q->next; } } b->likelihood = a->likelihood; b->start = a->start; /* start used in dnaml only */ b->root = a->root; /* root used in dnamlk only */ } /* promlcopy */ void proml_re_move(node **p, node **q) { /* remove p and record in q where it was */ long i; /* assumes bifurcation (OK) */ *q = (*p)->next->back; hookup(*q, (*p)->next->next->back); (*p)->next->back = NULL; (*p)->next->next->back = NULL; (*q)->v += (*q)->back->v; (*q)->back->v = (*q)->v; if ( smoothit ) { inittrav((*q)); inittrav((*q)->back); inittrav((*p)->back); } if ( smoothit ) { for ( i = 0 ; i < smoothings ; i++ ) { smooth(*q); smooth((*q)->back); } } else smooth(*q); } /* proml_re_move */ void insert_(node *p, node *q, boolean dooinit) { /* Insert q near p */ long i, j, num_sibs; node *r, *sib_ptr; r = p->next->next; hookup(r, q->back); hookup(p->next, q); q->v = 0.5 * q->v; q->back->v = q->v; r->v = q->v; r->back->v = r->v; p->initialized = false; if (dooinit) { inittrav(p); inittrav(q); inittrav(q->back); } i = 1; inserting = true; while (i <= smoothings) { smooth(p); if (!p->tip) { num_sibs = count_sibs(p); sib_ptr = p; for (j=0; j < num_sibs; j++) { smooth(sib_ptr->next->back); sib_ptr = sib_ptr->next; } } i++; } inserting = false; } /* insert_ */ void addtraverse(node *p, node *q, 
boolean contin) { /* try adding p at q, proceed recursively through tree */ long i, num_sibs; double like, vsave = 0; node *qback = NULL, *sib_ptr; if (!smoothit) { vsave = q->v; qback = q->back; } insert_(p, q, false); like = prot_evaluate(p, false); if (like > bestyet + LIKE_EPSILON || bestyet == UNDEFINED) { bestyet = like; if (smoothit) { addwhere = q; promlcopy(&curtree, &bestree, nonodes2, rcategs); } else qwhere = q; succeeded = true; } if (smoothit) promlcopy(&priortree, &curtree, nonodes2, rcategs); else { hookup (q, qback); q->v = vsave; q->back->v = vsave; curtree.likelihood = bestyet; } if (!q->tip && contin) { num_sibs = count_sibs(q); if (q == curtree.start) num_sibs++; sib_ptr = q; for (i=0; i < num_sibs; i++) { addtraverse(p, sib_ptr->next->back, contin); sib_ptr = sib_ptr->next; } } } /* addtraverse */ void globrearrange() { /* does global rearrangements */ tree globtree; tree oldtree; int i,j,k,l,num_sibs,num_sibs2; node *where,*sib_ptr,*sib_ptr2; double oldbestyet = curtree.likelihood; int success = false; alloctree(&globtree.nodep,nonodes2,0); alloctree(&oldtree.nodep,nonodes2,0); setuptree2(&globtree); setuptree2(&oldtree); prot_allocx(nonodes2, rcategs, globtree.nodep, 0); prot_allocx(nonodes2, rcategs, oldtree.nodep, 0); promlcopy(&curtree,&globtree,nonodes2,rcategs); promlcopy(&curtree,&oldtree,nonodes2,rcategs); bestyet = curtree.likelihood; for ( i = spp ; i < nonodes2 ; i++ ) { num_sibs = count_sibs(curtree.nodep[i]); sib_ptr = curtree.nodep[i]; if ( (i - spp) % (( nonodes2 / 72 ) + 1 ) == 0 ) putchar('.'); fflush(stdout); for ( j = 0 ; j <= num_sibs ; j++ ) { proml_re_move(&sib_ptr,&where); promlcopy(&curtree,&priortree,nonodes2,rcategs); qwhere = where; if (where->tip) { promlcopy(&oldtree,&curtree,nonodes2,rcategs); promlcopy(&oldtree,&bestree,nonodes2,rcategs); sib_ptr=sib_ptr->next; continue; } else num_sibs2 = count_sibs(where); sib_ptr2 = where; for ( k = 0 ; k < num_sibs2 ; k++ ) { addwhere = NULL; 
addtraverse(sib_ptr,sib_ptr2->back,true); if ( !smoothit ) { if (succeeded && qwhere != where && qwhere != where->back) { insert_(sib_ptr,qwhere,true); smoothit = true; for (l = 1; l<=smoothings; l++) { smooth (where); smooth (where->back); } smoothit = false; success = true; promlcopy(&curtree,&globtree,nonodes2,rcategs); promlcopy(&priortree,&curtree,nonodes2,rcategs); } } else if ( addwhere && where != addwhere && where->back != addwhere && bestyet > globtree.likelihood) { promlcopy(&bestree,&globtree,nonodes2,rcategs); success = true; } sib_ptr2 = sib_ptr2->next; } promlcopy(&oldtree,&curtree,nonodes2,rcategs); promlcopy(&oldtree,&bestree,nonodes2,rcategs); sib_ptr = sib_ptr->next; } } promlcopy(&globtree,&curtree,nonodes2,rcategs); promlcopy(&globtree,&bestree,nonodes2,rcategs); if (success && globtree.likelihood > oldbestyet) { succeeded = true; } else { succeeded = false; } bestyet = globtree.likelihood; prot_freex(nonodes2,oldtree.nodep); prot_freex(nonodes2,globtree.nodep); freetree2(globtree.nodep,nonodes2); freetree2(oldtree.nodep,nonodes2); } /* globrearrange */ void freelrsaves() { long i,j; for ( i = 0 ; i < NLRSAVES ; i++ ) { for (j = 0; j < oldendsite; j++) free(lrsaves[i]->protx[j]); free(lrsaves[i]->protx); free(lrsaves[i]->underflows); free(lrsaves[i]); } free(lrsaves); } /* freelrsaves */ void alloclrsaves() { long i,j; lrsaves = Malloc(NLRSAVES * sizeof(node*)); oldendsite = endsite; for ( i = 0 ; i < NLRSAVES ; i++ ) { lrsaves[i] = Malloc(sizeof(node)); lrsaves[i]->protx = Malloc(endsite*sizeof(ratelike)); lrsaves[i]->underflows = Malloc(endsite * sizeof (double)); for (j = 0; j < endsite; j++) lrsaves[i]->protx[j] = (pratelike)Malloc(rcategs*sizeof(psitelike)); } } /* alloclrsaves */ void rearrange(node *p, node *pp) { /* rearranges the tree locally moving pp around near p */ long i, num_sibs; node *q, *r, *sib_ptr; node *rnb, *rnnb; /* assumes bifurcations (OK) */ if (!p->tip && !p->back->tip) { curtree.likelihood = bestyet; if 
(p->back->next != pp) r = p->back->next; else r = p->back->next->next; if (!smoothit) { rnb = r->next->back; rnnb = r->next->next->back; prot_copynode(r,lrsaves[0],rcategs); prot_copynode(r->next,lrsaves[1],rcategs); prot_copynode(r->next->next,lrsaves[2],rcategs); prot_copynode(p->next,lrsaves[3],rcategs); prot_copynode(p->next->next,lrsaves[4],rcategs); } else promlcopy(&curtree, &bestree, nonodes2, rcategs); proml_re_move(&r, &q); if (smoothit) promlcopy(&curtree, &priortree, nonodes2, rcategs); else qwhere = q; num_sibs = count_sibs (p); sib_ptr = p; for (i=0; i < num_sibs; i++) { sib_ptr = sib_ptr->next; addtraverse(r, sib_ptr->back, false); } if (smoothit) promlcopy(&bestree, &curtree, nonodes2, rcategs); else { if (qwhere == q) { hookup(rnb,r->next); hookup(rnnb,r->next->next); prot_copynode(lrsaves[0],r,rcategs); prot_copynode(lrsaves[1],r->next,rcategs); prot_copynode(lrsaves[2],r->next->next,rcategs); prot_copynode(lrsaves[3],p->next,rcategs); prot_copynode(lrsaves[4],p->next->next,rcategs); rnb->v = r->next->v; rnnb->v = r->next->next->v; r->back->v = r->v; curtree.likelihood = bestyet; } else { insert_(r, qwhere, true); smoothit = true; for (i = 1; i<=smoothings; i++) { smooth(r); smooth(r->back); } smoothit = false; promlcopy(&curtree, &bestree, nonodes2, rcategs); } } } if (!p->tip) { num_sibs = count_sibs(p); if (p == curtree.start) num_sibs++; sib_ptr = p; for (i=0; i < num_sibs; i++) { sib_ptr = sib_ptr->next; rearrange(sib_ptr->back, p); } } } /* rearrange */ void proml_coordinates(node *p, double lengthsum, long *tipy, double *tipmax) { /* establishes coordinates of nodes */ node *q, *first, *last; double xx; if (p->tip) { p->xcoord = (long)(over * lengthsum + 0.5); p->ycoord = (*tipy); p->ymin = (*tipy); p->ymax = (*tipy); (*tipy) += down; if (lengthsum > (*tipmax)) (*tipmax) = lengthsum; return; } q = p->next; do { xx = q->v; if (xx > 100.0) xx = 100.0; proml_coordinates(q->back, lengthsum + xx, tipy,tipmax); q = q->next; } while ((p == 
curtree.start || p != q) && (p != curtree.start || p->next != q)); first = p->next->back; q = p; while (q->next != p) q = q->next; last = q->back; p->xcoord = (long)(over * lengthsum + 0.5); if (p == curtree.start) p->ycoord = p->next->next->back->ycoord; else p->ycoord = (first->ycoord + last->ycoord) / 2; p->ymin = first->ymin; p->ymax = last->ymax; } /* proml_coordinates */ void proml_printree() { /* prints out diagram of the tree2 */ long tipy; double scale, tipmax; long i; if (!treeprint) return; putc('\n', outfile); tipy = 1; tipmax = 0.0; proml_coordinates(curtree.start, 0.0, &tipy, &tipmax); scale = 1.0 / (long)(tipmax + 1.000); for (i = 1; i <= (tipy - down); i++) drawline2(i, scale, curtree); putc('\n', outfile); } /* proml_printree */ void sigma(node *p, double *sumlr, double *s1, double *s2) { /* compute standard deviation */ double tt, aa, like, slope, curv; prot_slopecurv(p, p->v, &like, &slope, &curv); tt = p->v; p->v = epsilon; p->back->v = epsilon; aa = prot_evaluate(p, false); p->v = tt; p->back->v = tt; (*sumlr) = prot_evaluate(p, false) - aa; if (curv < -epsilon) { (*s1) = p->v + (-slope - sqrt(slope * slope - 3.841 * curv)) / curv; (*s2) = p->v + (-slope + sqrt(slope * slope - 3.841 * curv)) / curv; } else { (*s1) = -1.0; (*s2) = -1.0; } } /* sigma */ void describe(node *p) { /* print out information for one branch */ long i, num_sibs; node *q, *sib_ptr; double sumlr, sigma1, sigma2; if (!p->tip && !p->initialized) prot_nuview(p); if (!p->back->tip && !p->back->initialized) prot_nuview(p->back); q = p->back; if (q->tip) { fprintf(outfile, " "); for (i = 0; i < nmlngth; i++) putc(nayme[q->index-1][i], outfile); fprintf(outfile, " "); } else fprintf(outfile, " %4ld ", q->index - spp); if (p->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[p->index-1][i], outfile); } else fprintf(outfile, "%4ld ", p->index - spp); fprintf(outfile, "%15.5f", q->v); if (!usertree || (usertree && !lngths) || p->iter) { sigma(q, &sumlr, &sigma1, &sigma2); if (sigma1 
<= sigma2) fprintf(outfile, " ( zero, infinity)"); else { fprintf(outfile, " ("); if (sigma2 <= 0.0) fprintf(outfile, " zero"); else fprintf(outfile, "%9.5f", sigma2); fprintf(outfile, ",%12.5f", sigma1); putc(')', outfile); } if (sumlr > 1.9205) fprintf(outfile, " *"); if (sumlr > 2.995) putc('*', outfile); } putc('\n', outfile); if (!p->tip) { num_sibs = count_sibs(p); sib_ptr = p; for (i=0; i < num_sibs; i++) { sib_ptr = sib_ptr->next; describe(sib_ptr->back); } } } /* describe */ void prot_reconstr(node *p, long n) { /* reconstruct and print out acid at site n+1 at node p */ long i, j, k, first, num_sibs = 0; double f, sum, xx[20]; node *q = NULL; if (p->tip) putc(y[p->index-1][n], outfile); else { num_sibs = count_sibs(p); if ((ally[n] == 0) || (location[ally[n]-1] == 0)) putc('.', outfile); else { j = location[ally[n]-1] - 1; sum = 0; for (i = 0; i <= 19; i++) { f = p->protx[j][mx-1][i]; if (!p->tip) { q = p; for (k = 0; k < num_sibs; k++) { q = q->next; f *= q->protx[j][mx-1][i]; } } f = sqrt(f); xx[i] = f * freqaa[i]; sum += xx[i]; } for (i = 0; i <= 19; i++) xx[i] /= sum; first = 0; for (i = 0; i <= 19; i++) if (xx[i] > xx[first]) first = i; if (xx[first] > 0.95) putc(aachar[first], outfile); else putc(tolower(aachar[first]), outfile); if (rctgry && rcategs > 1) mx = mp[n][mx - 1]; else mx = 1; } } } /* prot_reconstr */ void rectrav(node *p, long m, long n) { /* print out segment of reconstructed sequence for one branch */ long i; node *q; putc(' ', outfile); if (p->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[p->index-1][i], outfile); } else fprintf(outfile, "%4ld ", p->index - spp); fprintf(outfile, " "); mx = mx0; for (i = m; i <= n; i++) { if ((i % 10 == 0) && (i != m)) putc(' ', outfile); prot_reconstr(p, i); } putc('\n', outfile); if (!p->tip) for ( q = p->next; q != p; q = q->next ) rectrav(q->back, m, n); mx1 = mx; } /* rectrav */ void summarize() { /* print out branch length information and node numbers */ long i, j, mm, num_sibs; double mode, 
sum; double like[maxcategs],nulike[maxcategs]; double **marginal; node *sib_ptr; if (!treeprint) return; fprintf(outfile, "\nremember: "); if (outgropt) fprintf(outfile, "(although rooted by outgroup) "); fprintf(outfile, "this is an unrooted tree!\n\n"); fprintf(outfile, "Ln Likelihood = %11.5f\n", curtree.likelihood); fprintf(outfile, "\n Between And Length"); if (!(usertree && lngths && haslengths)) fprintf(outfile, " Approx. Confidence Limits"); fprintf(outfile, "\n"); fprintf(outfile, " ------- --- ------"); if (!(usertree && lngths && haslengths)) fprintf(outfile, " ------- ---------- ------"); fprintf(outfile, "\n\n"); for (i = spp; i < nonodes2; i++) { /* So this works with arbitrary multifurcations */ if (curtree.nodep[i]) { num_sibs = count_sibs (curtree.nodep[i]); sib_ptr = curtree.nodep[i]; for (j = 0; j < num_sibs; j++) { sib_ptr->initialized = false; sib_ptr = sib_ptr->next; } } } describe(curtree.start->back); /* So this works with arbitrary multifurcations */ num_sibs = count_sibs(curtree.start); sib_ptr = curtree.start; for (i=0; i < num_sibs; i++) { sib_ptr = sib_ptr->next; describe(sib_ptr->back); } fprintf(outfile, "\n"); if (!(usertree && lngths && haslengths)) { fprintf(outfile, " * = significantly positive, P < 0.05\n"); fprintf(outfile, " ** = significantly positive, P < 0.01\n\n"); } dummy = prot_evaluate(curtree.start, false); if (rctgry && rcategs > 1) { for (i = 0; i < rcategs; i++) like[i] = 1.0; for (i = sites - 1; i >= 0; i--) { sum = 0.0; for (j = 0; j < rcategs; j++) { nulike[j] = (1.0 - lambda + lambda * probcat[j]) * like[j]; mp[i][j] = j + 1; for (k = 1; k <= rcategs; k++) { if (k != j + 1) { if (lambda * probcat[k - 1] * like[k - 1] > nulike[j]) { nulike[j] = lambda * probcat[k - 1] * like[k - 1]; mp[i][j] = k; } } } if ((ally[i] > 0) && (location[ally[i]-1] > 0)) nulike[j] *= contribution[location[ally[i] - 1] - 1][j]; sum += nulike[j]; } for (j = 0; j < rcategs; j++) nulike[j] /= sum; memcpy(like, nulike, rcategs * 
sizeof(double)); } mode = 0.0; mx = 1; for (i = 1; i <= rcategs; i++) { if (probcat[i - 1] * like[i - 1] > mode) { mx = i; mode = probcat[i - 1] * like[i - 1]; } } mx0 = mx; fprintf(outfile, "Combination of categories that contributes the most to the likelihood:\n\n"); for (i = 1; i <= nmlngth + 3; i++) putc(' ', outfile); for (i = 1; i <= sites; i++) { fprintf(outfile, "%ld", mx); if (i % 10 == 0) putc(' ', outfile); if (i % 60 == 0 && i != sites) { putc('\n', outfile); for (j = 1; j <= nmlngth + 3; j++) putc(' ', outfile); } mx = mp[i - 1][mx - 1]; } fprintf(outfile, "\n\n"); marginal = (double **) Malloc(sites*sizeof(double *)); for (i = 0; i < sites; i++) marginal[i] = (double *) Malloc(rcategs*sizeof(double)); for (i = 0; i < rcategs; i++) like[i] = 1.0; for (i = sites - 1; i >= 0; i--) { sum = 0.0; for (j = 0; j < rcategs; j++) { nulike[j] = (1.0 - lambda + lambda * probcat[j]) * like[j]; for (k = 1; k <= rcategs; k++) { if (k != j + 1) nulike[j] += lambda * probcat[k - 1] * like[k - 1]; } if ((ally[i] > 0) && (location[ally[i]-1] > 0)) nulike[j] *= contribution[location[ally[i] - 1] - 1][j]; sum += nulike[j]; } for (j = 0; j < rcategs; j++) { nulike[j] /= sum; marginal[i][j] = nulike[j]; } memcpy(like, nulike, rcategs * sizeof(double)); } for (i = 0; i < rcategs; i++) like[i] = 1.0; for (i = 0; i < sites; i++) { sum = 0.0; for (j = 0; j < rcategs; j++) { nulike[j] = (1.0 - lambda + lambda * probcat[j]) * like[j]; for (k = 1; k <= rcategs; k++) { if (k != j + 1) nulike[j] += lambda * probcat[k - 1] * like[k - 1]; } marginal[i][j] *= like[j] * probcat[j]; sum += nulike[j]; } for (j = 0; j < rcategs; j++) nulike[j] /= sum; memcpy(like, nulike, rcategs * sizeof(double)); sum = 0.0; for (j = 0; j < rcategs; j++) sum += marginal[i][j]; for (j = 0; j < rcategs; j++) marginal[i][j] /= sum; } fprintf(outfile, "Most probable category at each site if > 0.95"); fprintf(outfile, " probability (\".\" otherwise)\n\n"); for (i = 1; i <= nmlngth + 3; i++) putc(' ', outfile); 
for (i = 0; i < sites; i++) { sum = 0.0; mm = 0; for (j = 0; j < rcategs; j++) if (marginal[i][j] > sum) { sum = marginal[i][j]; mm = j; } if (sum >= 0.95) fprintf(outfile, "%ld", mm+1); else putc('.', outfile); if ((i+1) % 60 == 0) { if (i != 0) { putc('\n', outfile); for (j = 1; j <= nmlngth + 3; j++) putc(' ', outfile); } } else if ((i+1) % 10 == 0) putc(' ', outfile); } putc('\n', outfile); for (i = 0; i < sites; i++) free(marginal[i]); free(marginal); } putc('\n', outfile); if (hypstate) { fprintf(outfile, "Probable sequences at interior nodes:\n\n"); fprintf(outfile, " node "); for (i = 0; (i < 13) && (i < ((sites + (sites-1)/10 - 39) / 2)); i++) putc(' ', outfile); fprintf(outfile, "Reconstructed sequence (caps if > 0.95)\n\n"); if (!rctgry || (rcategs == 1)) mx0 = 1; for (i = 0; i < sites; i += 60) { k = i + 59; if (k >= sites) k = sites - 1; rectrav(curtree.start, i, k); rectrav(curtree.start->back, i, k); putc('\n', outfile); mx0 = mx1; } } } /* summarize */ void initpromlnode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnu(grbg, p); (*p)->index = nodei; (*p)->tip = false; malloc_ppheno((*p), endsite, rcategs); nodep[(*p)->index - 1] = (*p); break; case nonbottom: gnu(grbg, p); malloc_ppheno(*p, endsite, rcategs); (*p)->index = nodei; break; case tip: match_names_to_data(str, nodep, p, spp); break; case iter: (*p)->initialized = false; (*p)->v = initialv; (*p)->iter = true; if ((*p)->back != NULL){ (*p)->back->iter = true; (*p)->back->v = initialv; (*p)->back->initialized = false; } break; case length: processlength(&valyew, &divisor, ch, &minusread, intree, parens); (*p)->v = valyew / divisor; (*p)->iter = false; if ((*p)->back != NULL) { (*p)->back->v = (*p)->v; (*p)->back->iter = false; } break; case hsnolength: haslengths 
= false; break; default: /* cases hslength, treewt, unittrwt */ break; /* should never occur */ } } /* initpromlnode */ void dnaml_treeout(node *p) { /* write out file with representation of final tree2 */ /* Only works for bifurcations! */ long i, n, w; Char c; double x; node *q; boolean inloop; if (p->tip) { n = 0; for (i = 1; i <= nmlngth; i++) { if (nayme[p->index-1][i - 1] != ' ') n = i; } for (i = 0; i < n; i++) { c = nayme[p->index-1][i]; if (c == ' ') c = '_'; putc(c, outtree); } col += n; } else { putc('(', outtree); col++; inloop = false; q = p->next; do { if (inloop) { putc(',', outtree); col++; if (col > 45) { putc('\n', outtree); col = 0; } } inloop = true; dnaml_treeout(q->back); q = q->next; } while ((p == curtree.start || p != q) && (p != curtree.start || p->next != q)); putc(')', outtree); col++; } x = p->v; if (x > 0.0) w = (long)(0.43429448222 * log(x)); else if (x == 0.0) w = 0; else w = (long)(0.43429448222 * log(-x)) + 1; if (w < 0) w = 0; if (p == curtree.start) fprintf(outtree, ";\n"); else { fprintf(outtree, ":%*.5f", (int)(w + 7), x); col += w + 8; } } /* dnaml_treeout */ void buildnewtip(long m, tree *tr) { node *p; p = tr->nodep[nextsp + spp - 3]; hookup(tr->nodep[m - 1], p); p->v = initialv; p->back->v = initialv; } /* buildnewtip */ void buildsimpletree(tree *tr) { hookup(tr->nodep[enterorder[0] - 1], tr->nodep[enterorder[1] - 1]); tr->nodep[enterorder[0] - 1]->v = 1.0; tr->nodep[enterorder[0] - 1]->back->v = 1.0; tr->nodep[enterorder[1] - 1]->v = 1.0; tr->nodep[enterorder[1] - 1]->back->v = 1.0; buildnewtip(enterorder[2], tr); insert_(tr->nodep[enterorder[2] - 1]->back, tr->nodep[enterorder[0] - 1], false); } /* buildsimpletree */ void free_all_protx (long nonodes, pointarray treenode) { /* used in proml */ long i, j, k; node *p; /* Zero thru spp are tips, */ for (i = 0; i < spp; i++) { for (j = 0; j < endsite; j++) free(treenode[i]->protx[j]); free(treenode[i]->protx); free(treenode[i]->underflows); } /* The rest are rings (i.e. 
triads) */ for (i = spp; i < nonodes; i++) { if (treenode[i] != NULL) { p = treenode[i]; do { for (k = 0; k < endsite; k++) free(p->protx[k]); free(p->protx); free(p->underflows); p = p->next; } while (p != treenode[i]); } } } /* free_all_protx */ void proml_unroot(node* root, node** nodep, long nonodes) { node *p,*r,*q; double newl; long i; long numsibs; numsibs = count_sibs(root); if ( numsibs > 2 ) { q = root; r = root; while (!(q->next == root)) q = q->next; q->next = root->next; for(i=0 ; i < endsite ; i++){ free(r->protx[i]); r->protx[i] = NULL; } free(r->protx); r->protx = NULL; chuck(&grbg, r); curtree.nodep[spp] = q; } else { /* Bifurcating root - remove entire root fork */ /* Join oldlen on each side of root */ newl = root->next->oldlen + root->next->next->oldlen; root->next->back->oldlen = newl; root->next->next->back->oldlen = newl; /* Join v on each side of root */ newl = root->next->v + root->next->next->v; root->next->back->v = newl; root->next->next->back->v = newl; /* Connect root's children */ root->next->back->back=root->next->next->back; root->next->next->back->back = root->next->back; /* Move nodep entries down one and set indices */ for ( i = spp; i < nonodes-1; i++ ) { p = nodep[i+1]; nodep[i] = p; nodep[i+1] = NULL; if ( nodep[i] == NULL ) /* This may happen in a multifurcating intree */ break; do { p->index = i+1; p = p->next; } while (p != nodep[i]); } /* Free protx arrays from old root */ for(i=0 ; i < endsite ; i++){ free(root->protx[i]); free(root->next->protx[i]); free(root->next->next->protx[i]); root->protx[i] = NULL; root->next->protx[i] = NULL; root->next->next->protx[i] = NULL; } free(root->protx); free(root->next->protx); free(root->next->next->protx); chuck(&grbg,root->next->next); chuck(&grbg,root->next); chuck(&grbg,root); } } /* proml_unroot */ void maketree() { long i, j; boolean dummy_first, goteof; pointarray dummy_treenode=NULL; long nextnode; node *root; prot_inittable(); if (usertree) { openfile(&intree,INTREE,"input 
tree file", "r",progname,intreename); numtrees = countsemic(&intree); if(numtrees > MAXSHIMOTREES) shimotrees = MAXSHIMOTREES; else shimotrees = numtrees; if (numtrees > 2) initseed(&inseed, &inseed0, seed); l0gl = (double *) Malloc(shimotrees * sizeof(double)); l0gf = (double **) Malloc(shimotrees * sizeof(double *)); for (i=0; i < shimotrees; ++i) l0gf[i] = (double *) Malloc(endsite * sizeof(double)); if (treeprint) { fprintf(outfile, "User-defined tree"); if (numtrees > 1) putc('s', outfile); fprintf(outfile, ":\n\n"); } which = 1; /* This taken out of tree read, used to be [spp-1], but referring to [0] produces output identical to what the pre-modified dnaml produced. */ while (which <= numtrees) { /* These initializations required each time through the loop since multiple trees require re-initialization */ haslengths = true; nextnode = 0; dummy_first = true; goteof = false; treeread(intree, &root, dummy_treenode, &goteof, &dummy_first, curtree.nodep, &nextnode, &haslengths, &grbg, initpromlnode,false,nonodes2); proml_unroot(root,curtree.nodep,nonodes2); if (goteof && (which <= numtrees)) { /* if we hit the end of the file prematurely */ printf ("\n"); printf ("ERROR: trees missing at end of file.\n"); printf ("\tExpected number of trees:\t\t%ld\n", numtrees); printf ("\tNumber of trees actually in file:\t%ld.\n\n", which - 1); exxit (-1); } curtree.start = curtree.nodep[0]->back; if ( outgropt ) curtree.start = curtree.nodep[outgrno - 1]->back; treevaluate(); proml_printree(); summarize(); if (trout) { col = 0; dnaml_treeout(curtree.start); } if(which < numtrees){ prot_freex_notip(nextnode, curtree.nodep); gdispose(curtree.start, &grbg, curtree.nodep); } else nonodes2 = nextnode; which++; } FClose(intree); putc('\n', outfile); if (!auto_ && numtrees > 1 && weightsum > 1 ) standev2(numtrees, maxwhich, 0, endsite-1, maxlogl, l0gl, l0gf, aliasweight, seed); } else { /* no user tree */ for (i = 1; i <= spp; i++) enterorder[i - 1] = i; if (jumble) randumize(seed, 
enterorder); if (progress) { printf("\nAdding species:\n"); writename(0, 3, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } nextsp = 3; polishing = false; smoothit = improve; buildsimpletree(&curtree); curtree.start = curtree.nodep[enterorder[0] - 1]->back; nextsp = 4; while (nextsp <= spp) { buildnewtip(enterorder[nextsp - 1], &curtree); bestyet = UNDEFINED; if (smoothit) promlcopy(&curtree, &priortree, nonodes2, rcategs); addtraverse(curtree.nodep[enterorder[nextsp - 1] - 1]->back, curtree.start, true); if (smoothit) promlcopy(&bestree, &curtree, nonodes2, rcategs); else { insert_(curtree.nodep[enterorder[nextsp - 1] - 1]->back, qwhere, true); smoothit = true; for (i = 1; i<=smoothings; i++) { smooth(curtree.start); smooth(curtree.start->back); } smoothit = false; promlcopy(&curtree, &bestree, nonodes2, rcategs); bestyet = curtree.likelihood; } if (progress) { writename(nextsp - 1, 1, enterorder); #ifdef WIN32 phyFillScreenColor(); #endif } if (global && nextsp == spp && progress) { printf("Doing global rearrangements\n"); printf(" !"); for (j = spp ; j < nonodes2 ; j++) if ( (j - spp) % (( nonodes2 / 72 ) + 1 ) == 0 ) putchar('-'); printf("!\n"); #ifdef WIN32 phyFillScreenColor(); #endif } succeeded = true; while (succeeded) { succeeded = false; if (global && nextsp == spp && progress) { printf(" "); fflush(stdout); } if (global && nextsp == spp) globrearrange(); else rearrange(curtree.start, curtree.start->back); if (global && nextsp == spp && progress) putchar('\n'); } nextsp++; } if (global && progress) { putchar('\n'); fflush(stdout); #ifdef WIN32 phyFillScreenColor(); #endif } promlcopy(&curtree, &bestree, nonodes2, rcategs); if (njumble > 1) { if (jumb == 1) promlcopy(&bestree, &bestree2, nonodes2, rcategs); else if (bestree2.likelihood < bestree.likelihood) promlcopy(&bestree, &bestree2, nonodes2, rcategs); } if (jumb == njumble) { if (njumble > 1) promlcopy(&bestree2, &curtree, nonodes2, rcategs); curtree.start = curtree.nodep[outgrno - 
1]->back; for (i = 0; i < nonodes2; i++) { if (i < spp) curtree.nodep[i]->initialized = false; else { curtree.nodep[i]->initialized = false; curtree.nodep[i]->next->initialized = false; curtree.nodep[i]->next->next->initialized = false; } } treevaluate(); proml_printree(); summarize(); if (trout) { col = 0; dnaml_treeout(curtree.start); } } } if (usertree) { free(l0gl); for (i=0; i < shimotrees; i++) free(l0gf[i]); free(l0gf); } prot_freetable(); if (jumb < njumble) return; /* printf("freeing stuff in maketree endsite: %li\n", endsite); */ free(contribution); free(mp); for (i=0; i < endsite; i++) free(term[i]); free(term); for (i=0; i < endsite; i++) free(slopeterm[i]); free(slopeterm); for (i=0; i < endsite; i++) free(curveterm[i]); free(curveterm); free_all_protx(nonodes2, curtree.nodep); if (!usertree) { free_all_protx(nonodes2, bestree.nodep); free_all_protx(nonodes2, priortree.nodep); if (njumble > 1) free_all_protx(nonodes2, bestree2.nodep); } if (progress) { printf("\nOutput written to file \"%s\"\n", outfilename); if (trout) printf("\nTree also written onto file \"%s\"\n", outtreename); } } /* maketree */ void clean_up() { /* Free and/or close stuff */ long i; /* printf("in clean_up spp: %li\n", spp); */ free (rrate); free (probcat); free (rate); /* Seems to require freeing every time... 
*/ for (i = 0; i < spp; i++) { free (y[i]); } free (y); free (nayme); free (enterorder); free (category); free (weight); free (alias); free (ally); free (location); free (aliasweight); free (probmat); free (eigmat); FClose(infile); FClose(outfile); FClose(outtree); #ifdef MAC fixmacfile(outfilename); fixmacfile(outtreename); #endif } /* clean_up */ int main(int argc, Char *argv[]) { /* Protein Maximum Likelihood */ #ifdef MAC argc = 1; /* macsetup("ProML",""); */ argv[0] = "ProML"; #endif init(argc,argv); progname = argv[0]; openfile(&infile,INFILE,"input file","r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file","w",argv[0],outfilename); mulsets = false; datasets = 1; firstset = true; ibmpc = IBMCRT; ansi = ANSICRT; grbg = NULL; doinit(); if (ctgry) openfile(&catfile,CATFILE,"categories file","r",argv[0],catfilename); if (weights || justwts) openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename); if (trout) openfile(&outtree,OUTTREE,"output tree file","w",argv[0],outtreename); for (ith = 1; ith <= datasets; ith++) { if (datasets > 1) { fprintf(outfile, "Data set # %ld:\n", ith); printf("\nData set # %ld:\n", ith); } getinput(); if (ith == 1) firstset = false; if (usertree) { max_num_sibs = 0; maketree(); } else { for (jumb = 1; jumb <= njumble; jumb++) { max_num_sibs = 0; maketree(); } } } clean_up(); printf("\nDone.\n\n"); #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } /* Protein Maximum Likelihood */ phylip-3.697/src/promlk.c0000644004732000473200000027471412406201117015015 0ustar joefelsenst_g/* PHYLIP version 3.696. Written by Joseph Felsenstein. Copyright (c) 1986-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. 
Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
*/

#include "phylip.h"
#include "seq.h"
#include "mlclock.h"
#include "printree.h"

#define epsilon         0.0001   /* used in makenewv, getthree, update */
#define over            60

typedef long vall[maxcategs];
typedef double contribarr[maxcategs];

#ifndef OLDC
/* function prototypes */
void   init_protmats(void);
void   getoptions(void);
void   makeprotfreqs(void);
void   allocrest(void);
void   doinit(void);
void   inputoptions(void);
void   input_protdata(long);
void   makeweights(void);
void   prot_makevalues(long, pointarray, long, long, sequence, steptr);
void   getinput(void);
void   prot_inittable(void);
void   alloc_pmatrix(long);
void   make_pmatrix(double **, double **, double **, long, double, double,
                    double *, double **);
boolean prot_nuview(node *p);
void   getthree(node *p, double thigh, double tlow);
void   update(node *);
void   smooth(node *);
void   promlk_add(node *, node *, node *, boolean);
void   promlk_re_move(node **, node **, boolean);
double prot_evaluate(node *);
void   tryadd(node *, node **, node **);
void   addpreorder(node *, node *, node *,
boolean); void restoradd(node *, node *, node *, double); void tryrearr(node *, boolean *); void repreorder(node *, boolean *); void rearrange(node **); void nodeinit(node *); void initrav(node *); void travinit(node *); void travsp(node *); void treevaluate(void); void prot_reconstr(node *, long); void rectrav(node *, long, long); void summarize(void); void promlk_treeout(node *); void initpromlnode(node **, node **, node *, long, long, long *, long *, initops, pointarray, pointarray, Char *, Char *, FILE *); void tymetrav(node *, double *); void free_all_protx(long, pointarray); void freegfcategs(void); void maketree(void); void clean_up(void); void reallocsites(void); void prot_freetable(void); void free_pmatrix(long sib); void invalidate_traverse(node *p); void invalidate_tyme(node *p); /* function prototypes */ #endif /* OLDC */ double **tbl; Char infilename[FNMLNGTH], outfilename[FNMLNGTH], intreename[FNMLNGTH], outtreename[FNMLNGTH], catfilename[FNMLNGTH], weightfilename[FNMLNGTH]; double *rrate; long sites, weightsum, categs, datasets, ith, njumble, jumb, numtrees, shimotrees; /* sites = number of sites in actual sequences numtrees = number of user-defined trees */ long inseed, inseed0, mx, mx0, mx1; boolean global, jumble, trout, usertree, weights, rctgry, ctgry, auto_, progress, mulsets, firstset, hypstate, smoothit, polishing, justwts, gama, invar, usejtt, usepmb, usepam; boolean lengthsopt = false; /* Use lengths in user tree option */ boolean lngths = false; /* Actually use lengths (depends on each input tree) */ tree curtree, bestree, bestree2; node *qwhere, *grbg; double sumrates, cv, alpha, lambda, lambda1, invarfrac; long *enterorder; steptr aliasweight; double *rate; longer seed; double *probcat; contribarr *contribution; char aachar[26]="ARNDCQEGHILKMFPSTWYVBZX?*-"; char *progname; long rcategs, nonodes2; boolean mnv_success = false; /* Local variables for maketree, propagated globally for C version: */ long k, maxwhich, col; double like, 
bestyet, tdelta, lnlike, slope, curv, maxlogl;
boolean lastsp, smoothed, succeeded;
double *l0gl;
double x[3], lnl[3];
double expon1i[maxcategs], expon1v[maxcategs],
       expon2i[maxcategs], expon2v[maxcategs];
node *there;
double **l0gf;
Char ch, ch2;
long **mp;

/* Variables introduced to allow for protein probability calculations */
long   max_num_sibs;      /* maximum number of siblings used in a      */
                          /* nuview calculation; determines the        */
                          /* final size of pmatrices                   */
double *eigmat;           /* eig matrix variable                       */
double **probmat;         /* prob matrix variable                      */
double ****dpmatrix;      /* first derivative of pmatrix               */
double ****ddpmatrix;     /* second derivative of pmatrix              */
double *****pmatrices;    /* matrix of probabilities of protein        */
                          /* conversion. The 5 subscripts refer        */
                          /* to sibs, rcategs, categs, final and       */
                          /* initial states, respectively.             */
double freqaa[20];        /* amino acid frequencies                    */

/* this jtt matrix decomposition due to Elisabeth Tillier */
static double jtteigmat[] =
{+0.00000000000000,-1.81721720738768,-1.87965834528616,-1.61403121885431,
 -1.53896608443751,-1.40486966367848,-1.30995061286931,-1.24668414819041,
 -1.17179756521289,-0.31033320987464,-0.34602837857034,-1.06031718484613,
 -0.99900602987105,-0.45576774888948,-0.86014403434677,-0.54569432735296,
 -0.76866956571861,-0.60593589295327,-0.65119724379348,-0.70249806480753};

static double jttprobmat[20][20] =
{{+0.07686196156903,+0.05105697447152,+0.04254597872702,+0.05126897436552,
  +0.02027898986051,+0.04106097946952,+0.06181996909002,+0.07471396264303,
  +0.02298298850851,+0.05256897371552,+0.09111095444453,+0.05949797025102,
  +0.02341398829301,+0.04052997973502,+0.05053197473402,+0.06822496588753,
  +0.05851797074102,+0.01433599283201,+0.03230298384851,+0.06637396681302},
 {-0.04445795120462,-0.01557336502860,-0.09314817363516,+0.04411372100382,
  -0.00511178725134,+0.00188472427522,-0.02176250428454,-0.01330231089224,
  +0.01004072641973,+0.02707838224285,-0.00785039050721,+0.02238829876349,
+0.00257470703483,-0.00510311699563,-0.01727154263346,+0.20074235330882, -0.07236268502973,-0.00012690116016,-0.00215974664431,-0.01059243778174}, {+0.09480046389131,+0.00082658405814,+0.01530023104155,-0.00639909042723, +0.00160605602061,+0.00035896642912,+0.00199161318384,-0.00220482855717, -0.00112601328033,+0.14840201765438,-0.00344295714983,-0.00123976286718, -0.00439399942758,+0.00032478785709,-0.00104270266394,-0.02596605592109, -0.05645800566901,+0.00022319903170,-0.00022792271829,-0.16133258048606}, {-0.06924141195400,-0.01816245289173,-0.08104005811201,+0.08985697111009, +0.00279659017898,+0.01083740322821,-0.06449599336038,+0.01794514261221, +0.01036809141699,+0.04283504450449,+0.00634472273784,+0.02339134834111, -0.01748667848380,+0.00161859106290,+0.00622486432503,-0.05854130195643, +0.15083728660504,+0.00030733757661,-0.00143739522173,-0.05295810171941}, {-0.14637948915627,+0.02029296323583,+0.02615316895036,-0.10311538564943, -0.00183412744544,-0.02589124656591,+0.11073673851935,+0.00848581728407, +0.00106057791901,+0.05530240732939,-0.00031533506946,-0.03124002869407, -0.01533984125301,-0.00288717337278,+0.00272787410643,+0.06300929916280, +0.07920438311152,-0.00041335282410,-0.00011648873397,-0.03944076085434}, {-0.05558229086909,+0.08935293782491,+0.04869509588770,+0.04856877988810, -0.00253836047720,+0.07651693957635,-0.06342453535092,-0.00777376246014, -0.08570270266807,+0.01943016473512,-0.00599516526932,-0.09157595008575, -0.00397735155663,-0.00440093863690,-0.00232998056918,+0.02979967701162, -0.00477299485901,-0.00144011795333,+0.01795114942404,-0.00080059359232}, {+0.05807741644682,+0.14654292420341,-0.06724975334073,+0.02159062346633, -0.00339085518294,-0.06829036785575,+0.03520631903157,-0.02766062718318, +0.03485632707432,-0.02436836692465,-0.00397566003573,-0.10095488644404, +0.02456887654357,+0.00381764117077,-0.00906261340247,-0.01043058066362, +0.01651199513994,-0.00210417220821,-0.00872508520963,-0.01495915462580}, 
{+0.02564617106907,+0.02960554611436,-0.00052356748770,+0.00989267817318, -0.00044034172141,-0.02279910634723,-0.00363768356471,-0.01086345665971, +0.01229721799572,+0.02633650142592,+0.06282966783922,-0.00734486499924, -0.13863936313277,-0.00993891943390,-0.00655309682350,-0.00245191788287, -0.02431633805559,-0.00068554031525,-0.00121383858869,+0.06280025239509}, {+0.11362428251792,-0.02080375718488,-0.08802750967213,-0.06531316372189, -0.00166626058292,+0.06846081717224,+0.07007301248407,-0.01713112936632, -0.05900588794853,-0.04497159138485,+0.04222484636983,+0.00129043178508, -0.01550337251561,-0.01553102163852,-0.04363429852047,+0.01600063777880, +0.05787328925647,-0.00008265841118,+0.02870014572813,-0.02657681214523}, {+0.01840541226842,+0.00610159018805,+0.01368080422265,+0.02383751807012, -0.00923516894192,+0.01209943150832,+0.02906782189141,+0.01992384905334, +0.00197323568330,+0.00017531415423,-0.01796698381949,+0.01887083962858, -0.00063335886734,-0.02365277334702,+0.01209445088200,+0.01308086447947, +0.01286727242301,-0.11420358975688,-0.01886991700613,+0.00238338728588}, {-0.01100105031759,-0.04250695864938,-0.02554356700969,-0.05473632078607, +0.00725906469946,-0.03003724918191,-0.07051526125013,-0.06939439879112, -0.00285883056088,+0.05334304124753,+0.12839241846919,-0.05883473754222, +0.02424304967487,+0.09134510778469,-0.00226003347193,-0.01280041778462, -0.00207988305627,-0.02957493909199,+0.05290385686789,+0.05465710875015}, {-0.01421274522011,+0.02074863337778,-0.01006411985628,+0.03319995456446, -0.00005371699269,-0.12266046460835,+0.02419847062899,-0.00441168706583, -0.08299118738167,-0.00323230913482,+0.02954035119881,+0.09212856795583, +0.00718635627257,-0.02706936115539,+0.04473173279913,-0.01274357634785, -0.01395862740618,-0.00071538848681,+0.04767640012830,-0.00729728326990}, {-0.03797680968123,+0.01280286509478,-0.08614616553187,-0.01781049963160, +0.00674319990083,+0.04208667754694,+0.05991325707583,+0.03581015660092, 
-0.01529816709967,+0.06885987924922,-0.11719120476535,-0.00014333663810, +0.00074336784254,+0.02893416406249,+0.07466151360134,-0.08182016471377, -0.06581536577662,-0.00018195976501,+0.00167443595008,+0.09015415667825}, {+0.03577726799591,-0.02139253448219,-0.01137813538175,-0.01954939202830, -0.04028242801611,-0.01777500032351,-0.02106862264440,+0.00465199658293, -0.02824805812709,+0.06618860061778,+0.08437791757537,-0.02533125946051, +0.02806344654855,-0.06970805797879,+0.02328376968627,+0.00692992333282, +0.02751392122018,+0.01148722812804,-0.11130404325078,+0.07776346000559}, {-0.06014297925310,-0.00711674355952,-0.02424493472566,+0.00032464353156, +0.00321221847573,+0.03257969053884,+0.01072805771161,+0.06892027923996, +0.03326534127710,-0.01558838623875,+0.13794237677194,-0.04292623056646, +0.01375763233229,-0.11125153774789,+0.03510076081639,-0.04531670712549, -0.06170413486351,-0.00182023682123,+0.05979891871679,-0.02551802851059}, {-0.03515069991501,+0.02310847227710,+0.00474493548551,+0.02787717003457, -0.12038329679812,+0.03178473522077,+0.04445111601130,-0.05334957493090, +0.01290386678474,-0.00376064171612,+0.03996642737967,+0.04777677295520, +0.00233689200639,+0.03917715404594,-0.01755598277531,-0.03389088626433, -0.02180780263389,+0.00473402043911,+0.01964539477020,-0.01260807237680}, {-0.04120428254254,+0.00062717164978,-0.01688703578637,+0.01685776910152, +0.02102702093943,+0.01295781834163,+0.03541815979495,+0.03968150445315, -0.02073122710938,-0.06932247350110,+0.11696314241296,-0.00322523765776, -0.01280515661402,+0.08717664266126,+0.06297225078802,-0.01290501780488, -0.04693925076877,-0.00177653675449,-0.08407812137852,-0.08380714022487}, {+0.03138655228534,-0.09052573757196,+0.00874202219428,+0.06060593729292, -0.03426076652151,-0.04832468257386,+0.04735628794421,+0.14504653737383, -0.01709111334001,-0.00278794215381,-0.03513813820550,-0.11690294831883, -0.00836264902624,+0.03270980973180,-0.02587764129811,+0.01638786059073, 
+0.00485499822497,+0.00305477087025,+0.02295754527195,+0.00616929722958}, {-0.04898722042023,-0.01460879656586,+0.00508708857036,+0.07730497806331, +0.04252420017435,+0.00484232580349,+0.09861807969412,-0.05169447907187, -0.00917820907880,+0.03679081047330,+0.04998537112655,+0.00769330211980, +0.01805447683564,-0.00498723245027,-0.14148416183376,-0.05170281760262, -0.03230723310784,-0.00032890672639,-0.02363523071957,+0.03801365471627}, {-0.02047562162108,+0.06933781779590,-0.02101117884731,-0.06841945874842, -0.00860967572716,-0.00886650271590,-0.07185241332269,+0.16703684361030, -0.00635847581692,+0.00811478913823,+0.01847205842216,+0.06700967948643, +0.00596607376199,+0.02318239240593,-0.10552958537847,-0.01980199747773, -0.02003785382406,-0.00593392430159,-0.00965391033612,+0.00743094349652}}; /* dcmut version of PAM model from www.ebi.ac.uk/goldman-srv/dayhoff/ */ static double pameigmat[] = {0,-1.93321786301018,-2.20904642493621,-1.74835983874903, -1.64854548332072,-1.54505559488222,-1.33859384676989,-1.29786201193594, -0.235548517495575,-0.266951066089808,-0.28965813670665,-1.10505826965282, -1.04323310568532,-0.430423720979904,-0.541719761016713,-0.879636093986914, -0.711249353378695,-0.725050487280602,-0.776855937389452,-0.808735559461343}; static double pamprobmat[20][20] ={ {0.08712695644, 0.04090397955, 0.04043197978, 0.04687197656, 0.03347398326, 0.03825498087, 0.04952997524, 0.08861195569, 0.03361898319, 0.03688598156, 0.08535695732, 0.08048095976, 0.01475299262, 0.03977198011, 0.05067997466, 0.06957696521, 0.05854197073, 0.01049399475, 0.02991598504, 0.06471796764}, {0.07991048383, 0.006888314018, 0.03857806206, 0.07947073194, 0.004895492884, 0.03815829405, -0.1087562465, 0.008691167141, -0.0140554828, 0.001306404001, -0.001888411299, -0.006921303342, 0.0007655604228, 0.001583298443, 0.006879590446, -0.171806883, 0.04890917949, 0.0006700432804, 0.0002276237277, -0.01350591875}, {-0.01641514483, -0.007233933239, -0.1377830621, 0.1163201333, 
-0.002305138017, 0.01557250366, -0.07455879489, -0.003225343503, 0.0140630487, 0.005112274204, 0.001405731862, 0.01975833782, -0.001348402973, -0.001085733262, -0.003880514478, 0.0851493313, -0.01163526615, -0.0001197903399, 0.002056153393, 0.0001536095643}, {0.009669278686, -0.006905863869, 0.101083544, 0.01179903104, -0.003780967591, 0.05845105878, -0.09138357299, -0.02850503638, -0.03233951408, 0.008708065876, -0.004700705411, -0.02053221579, 0.001165851398, -0.001366585849, -0.01317695074, 0.1199985703, -0.1146346193, -0.0005953021314, -0.0004297615194, 0.007475695618}, {0.1722243502, -0.003737582995, -0.02964873222, -0.02050116381, -0.0004530478465, -0.02460043205, 0.02280768412, -0.02127364909, 0.01570095258, 0.1027744285, -0.005330539586, 0.0179697651, -0.002904077286, -0.007068126663, -0.0142869583, -0.01444241844, -0.08218861544, 0.0002069181629, 0.001099671379, -0.1063484263}, {-0.1553433627, -0.001169168032, 0.02134785337, 0.0007602305436, 0.0001395330122, 0.03194992019, -0.01290252206, 0.03281720789, -0.01311103735, 0.1177254769, -0.008008783885, -0.02375317548, -0.002817809762, -0.008196682776, 0.01731267617, 0.01853526375, 0.08249908546, -2.788771776e-05, 0.001266182191, -0.09902299976}, {-0.03671080341, 0.0274168035, 0.04625877597, 0.07520706414, -0.0001833803619, -0.1207833161, -0.006415807779, -0.005465629648, 0.02778273972, 0.007589688485, -0.02945266034, -0.03797542064, 0.07044042052, -0.002018573865, 0.01845277071, 0.006901513991, -0.02430934639, -0.0005919635873, -0.001266962331, -0.01487591261}, {-0.03060317816, 0.01182361623, 0.04200270053, 0.05406235279, -0.0003920498815, -0.09159709348, -0.009602690652, -0.00382944418, 0.01761361993, 0.01605684317, 0.05198878008, 0.02198696949, -0.09308930025, -0.00102622863, 0.01477637127, 0.0009314065393, -0.01860959472, -0.0005964703968, -0.002694284083, 0.02079767439}, {0.0195976494, -0.005104484936, 0.007406728707, 0.01236244954, 0.0201446796, 0.007039564785, 0.01276942134, 0.02641595685, 
0.002764624354, 0.001273314658, -0.01335316035, 0.01105658671, 2.148773499e-05, -0.02692205639, 0.0118684991, 0.01212624708, 0.01127770094, -0.09842754796, -0.01942336432, 0.007105703151}, {-0.01819461888, -0.01509348507, -0.01297636935, -0.01996453439, 0.1715705905, -0.01601550692, -0.02122706144, -0.02854628494, -0.009351082371, -0.001527995472, -0.010198224, -0.03609537551, -0.003153182095, 0.02395980501, -0.01378664626, -0.005992611421, -0.01176810875, 0.003132361603, 0.03018439539, -0.004956065656}, {-0.02733614784, -0.02258066705, -0.0153112506, -0.02475728664, -0.04480525045, -0.01526640341, -0.02438517425, -0.04836914601, -0.00635964824, 0.02263169831, 0.09794101931, -0.04004304158, 0.008464393478, 0.1185443142, -0.02239294163, -0.0281550321, -0.01453581604, -0.0246742804, 0.0879619849, 0.02342867605}, {0.06483718238, 0.1260012082, -0.006496013283, 0.009914915531, -0.004181603532, 0.0003493226286, 0.01408035752, -0.04881663016, -0.03431167356, -0.01768005602, 0.02362447761, -0.1482364784, -0.01289035619, -0.001778893279, -0.05240099752, 0.05536174567, 0.06782165352, -0.003548568717, 0.001125301173, -0.03277489363}, {0.06520296909, -0.0754802543, 0.03139281903, -0.03266449554, -0.004485188002, -0.03389072036, -0.06163274338, -0.06484769882, 0.05722658289, -0.02824079619, 0.01544837349, 0.03909752708, 0.002029218884, 0.003151939572, -0.05471208363, 0.07962008342, 0.125916047, 0.0008696184937, -0.01086027514, -0.05314092355}, {0.004543119081, 0.01935177735, 0.01905511007, 0.02682993409, -0.01199617967, 0.01426278655, 0.02472521255, 0.03864795501, 0.02166224804, -0.04754243479, -0.1921545477, 0.03621321546, -0.02120627881, 0.04928097895, 0.009396088815, 0.01748042052, -6.173742851e-05, -0.003168033098, 0.07723565812, -0.08255529309}, {0.06710378668, -0.09441410284, -0.004801776989, 0.008830272165, -0.01021645042, -0.02764365608, 0.004250361851, 0.1648777542, -0.037446109, 0.004541057635, -0.0296980702, -0.1532325189, -0.008940580901, 0.006998050812, 
0.02338809379, 0.03175059182, 0.02033965512, 0.006388075608, 0.001762762044, 0.02616280361}, {0.01915943021, -0.05432967274, 0.01249342683, 0.06836622457, 0.002054462161, -0.01233535859, 0.07087282652, -0.08948637051, -0.1245896013, -0.02204522882, 0.03791481736, 0.06557467874, 0.005529294156, -0.006296644235, 0.02144530752, 0.01664230081, 0.02647078439, 0.001737725271, 0.01414149877, -0.05331990116}, {0.0266659303, 0.0564142853, -0.0263767738, -0.08029726006, -0.006059357163, -0.06317558457, -0.0911894019, 0.05401487057, -0.08178072458, 0.01580699778, -0.05370550396, 0.09798653264, 0.003934944022, 0.01977291947, 0.0441198541, 0.02788220393, 0.03201877081, -0.00206161759, -0.005101423308, 0.03113033802}, {0.02980360751, -0.009513246268, -0.009543527165, -0.02190644172, -0.006146440672, 0.01207009085, -0.0126989156, -0.1378266418, 0.0275235217, 0.00551720592, -0.03104791544, -0.07111701247, -0.006081754489, -0.01337494521, 0.1783961085, 0.01453225059, 0.01938736048, 0.0004488631071, 0.0110844398, 0.02049339243}, {-0.01433508581, 0.01258858175, -0.004294252236, -0.007146532854, 0.009541628809, 0.008040155729, -0.006857781832, 0.05584120066, 0.007749418365, -0.05867835844, 0.08008131283, -0.004877854222, -0.0007128540743, 0.09489058424, 0.06421121962, 0.00271493526, -0.03229944773, -0.001732026038, -0.08053448316, -0.1241903609}, {-0.009854113227, 0.01294129929, -0.00593064392, -0.03016833115, -0.002018439732, -0.00792418722, -0.03372768732, 0.07828561288, 0.007722254639, -0.05067377561, 0.1191848621, 0.005059475202, 0.004762387166, -0.1029870175, 0.03537190114, 0.001089956203, -0.02139157573, -0.001015245062, 0.08400521847, -0.08273195059}}; /* this pmb matrix decomposition due to Elisabeth Tillier */ static double pmbeigmat[20] = {0.0000001586972220,-1.8416770496147100, -1.6025046986139100,-1.5801012515121300, -1.4987794099715900,-1.3520794233801900,-1.3003469390479700,-1.2439503327631300, 
-1.1962574080244200,-1.1383730501367500,-1.1153278910708000,-0.4934843510654760, -0.5419014550215590,-0.9657997830826700,-0.6276075673757390,-0.6675927795018510, -0.6932641383465870,-0.8897872681859630,-0.8382698977371710,-0.8074694642446040}; static double pmbprobmat[20][20] = {{0.0771762457248147,0.0531913844998640,0.0393445076407294,0.0466756566755510, 0.0286348361997465,0.0312327748383639,0.0505410248721427,0.0767106611472993, 0.0258916271688597,0.0673140562194124,0.0965705469252199,0.0515979465932174, 0.0250628079438675,0.0503492018628350,0.0399908189418273,0.0641898881894471, 0.0517539616710987,0.0143507440546115,0.0357994592438322,0.0736218495862984}, {0.0368263046116572,-0.0006728917107827,0.0008590805287740,-0.0002764255356960, 0.0020152937187455,0.0055743720652960,0.0003213317669367,0.0000449190281568, -0.0004226254397134,0.1805040629634510,-0.0272246813586204,0.0005904606533477, -0.0183743200073889,-0.0009194625608688,0.0008173657533167,-0.0262629806302238, 0.0265738757209787,0.0002176606241904,0.0021315644838566,-0.1823229927207580}, {-0.0194800075560895,0.0012068088610652,-0.0008803318319596,-0.0016044273960017, -0.0002938633803197,-0.0535796754602196,0.0155163896648621,-0.0015006360762140, 0.0021601372013703,0.0268513218744797,-0.1085292493742730,0.0149753083138452, 0.1346457366717310,-0.0009371698759829,0.0013501708044116,0.0346352293103622, -0.0276963770242276,0.0003643142783940,0.0002074817333067,-0.0174108903914110}, {0.0557839400850153,0.0023271577185437,0.0183481103396687,0.0023339480096311, 0.0002013267015151,-0.0227406863569852,0.0098644845475047,0.0064721276774396, 0.0001389408104210,-0.0473713878768274,-0.0086984445005797,0.0026913674934634, 0.0283724052562196,0.0001063665179457,0.0027442574779383,-0.1875312134708470, 0.1279864877057640,0.0005103347834563,0.0003155113168637,0.0081451082759554}, {0.0037510125027265,0.0107095920636885,0.0147305410328404,-0.0112351252180332, 
-0.0001500408626446,-0.1523450933729730,0.0611532413339872,-0.0005496748939503, 0.0048714378736644,-0.0003826320053999,0.0552010244407311,0.0482555671001955, -0.0461664995115847,-0.0021165008617978,-0.0004574454232187,0.0233755883688949, -0.0035484915422384,0.0009090698422851,0.0013840637687758,-0.0073895139302231}, {-0.0111512564930024,0.1025460064723080,0.0396772456883791,-0.0298408501361294, -0.0001656742634733,-0.0079876311843289,0.0712644184507945,-0.0010780604625230, -0.0035880882043592,0.0021070399334252,0.0016716329894279,-0.1810123023850110, 0.0015141703608724,-0.0032700852781804,0.0035503782441679,0.0118634302028026, 0.0044561606458028,-0.0001576678495964,0.0023470722225751,-0.0027457045397157}, {0.1474525743949170,-0.0054432538500293,0.0853848892349828,-0.0137787746207348, -0.0008274830358513,0.0042248844582553,0.0019556229305563,-0.0164191435175148, -0.0024501858854849,0.0120908948084233,-0.0381456105972653,0.0101271614855119, -0.0061945941321859,0.0178841099895867,-0.0014577779202600,-0.0752120602555032, -0.1426985695849920,0.0002862275078983,-0.0081191734261838,0.0313401149422531}, {0.0542034611735289,-0.0078763926211829,0.0060433542506096,0.0033396210615510, 0.0013965072374079,0.0067798903832256,-0.0135291136622509,-0.0089982442731848, -0.0056744537593887,-0.0766524225176246,0.1881210263933930,-0.0065875518675173, 0.0416627569300375,-0.0953804133524747,-0.0012559228448735,0.0101622644292547, -0.0304742453119050,0.0011702318499737,0.0454733434783982,-0.1119239362388150}, {0.1069409037912470,0.0805064400880297,-0.1127352030714600,0.1001181253523260, -0.0021480427488769,-0.0332884841459003,-0.0679837575848452,-0.0043812841356657, 0.0153418716846395,-0.0079441315103188,-0.0121766182046363,-0.0381127991037620, -0.0036338726532673,0.0195324059593791,-0.0020165963699984,-0.0061222685010268, -0.0253761448771437,-0.0005246410999057,-0.0112205170502433,0.0052248485517237}, {-0.0325247648326262,0.0238753651653669,0.0203684886605797,0.0295666232678825, 
-0.0003946714764213,-0.0157242718469554,-0.0511737848084862,0.0084725632040180, -0.0167068828528921,0.0686962159427527,-0.0659702890616198,-0.0014289912494271, -0.0167000964093416,-0.1276689083678200,0.0036575057830967,-0.0205958145531018, 0.0000368919612829,0.0014413626622426,0.1064360941926030,0.0863372661517408}, {-0.0463777468104402,0.0394712148670596,0.1118686750747160,0.0440711686389031, -0.0026076286506751,-0.0268454015202516,-0.1464943067133240,-0.0137514051835380, -0.0094395514284145,-0.0144124844774228,0.0249103379323744,-0.0071832157138676, 0.0035592787728526,0.0415627419826693,0.0027040097365669,0.0337523666612066, 0.0316121324137152,-0.0011350177559026,-0.0349998884574440,-0.0302651879823361}, {0.0142360925194728,0.0413145623127025,0.0324976427846929,0.0580930922002398, -0.0586974207121084,0.0202001168873069,0.0492204086749069,0.1126593173463060, 0.0116620013776662,-0.0780333711712066,-0.1109786767320410,0.0407775100936731, -0.0205013161312652,-0.0653458585025237,0.0347351829703865,0.0304448983224773, 0.0068813748197884,-0.0189002309261882,-0.0334507528405279,-0.0668143558699485}, {-0.0131548829657936,0.0044244322828034,-0.0050639951827271,-0.0038668197633889, -0.1536822386530220,0.0026336969165336,0.0021585651200470,-0.0459233839062969, 0.0046854727140565,0.0393815434593599,0.0619554007991097,0.0027456299925622, 0.0117574347936383,0.0373018612990383,0.0024818527553328,-0.0133956606027299, -0.0020457128424105,0.0154178819990401,0.0246524142683911,0.0275363065682921}, {-0.1542307272455030,0.0364861558267547,-0.0090880407008181,0.0531673937889863, 0.0157585615170580,0.0029986538457297,0.0180194047699875,0.0652152443589317, 0.0266842840376180,0.0388457366405908,0.0856237634510719,0.0126955778952183, 0.0099593861698250,-0.0013941794862563,0.0294065511237513,-0.1151906949298290, -0.0852991447389655,0.0028699120202636,-0.0332087026659522,0.0006811857297899}, {0.0281300736924501,-0.0584072081898638,-0.0178386569847853,-0.0536470338171487, 
-0.0186881656029960,-0.0240008730656106,-0.0541064820498883,0.2217137098936020, -0.0260500001542033,0.0234505236798375,0.0311127151218573,-0.0494139126682672, 0.0057093465049849,0.0124937286655911,-0.0298322975915689,0.0006520211333102, -0.0061018680727128,-0.0007081999479528,-0.0060523759094034,0.0215845995364623}, {0.0295321046399105,-0.0088296411830544,-0.0065057049917325,-0.0053478115612781, -0.0100646496794634,-0.0015473619084872,0.0008539960632865,-0.0376381933046211, -0.0328135588935604,0.0672161874239480,0.0667626853916552,-0.0026511651464901, 0.0140451514222062,-0.0544836996133137,0.0427485157912094,0.0097455780205802, 0.0177309072915667,-0.0828759701187452,-0.0729504795471370,0.0670731961252313}, {0.0082646581043963,-0.0319918630534466,-0.0188454445200422,-0.0374976353856606, 0.0037131290686848,-0.0132507796987883,-0.0306958830735725,-0.0044119395527308, -0.0140786756619672,-0.0180512599925078,-0.0208243802903953,-0.0232202769398931, -0.0063135878270273,0.0110442171178168,0.1824538048228460,-0.0006644614422758, -0.0069909097436659,0.0255407650654681,0.0099119399501151,-0.0140911517070698}, {0.0261344441524861,-0.0714454044548650,0.0159436926233439,0.0028462736216688, -0.0044572637889080,-0.0089474834434532,-0.0177570282144517,-0.0153693244094452, 0.1160919467206400,0.0304911481385036,0.0047047513411774,-0.0456535116423972, 0.0004491494948617,-0.0767108879444462,-0.0012688533741441,0.0192445965934123, 0.0202321954782039,0.0281039933233607,-0.0590403018490048,0.0364080426546883}, {0.0115826306265004,0.1340228176509380,-0.0236200652949049,-0.1284484655137340, -0.0004742338006503,0.0127617346949511,-0.0428560878860394,0.0060030732454125, 0.0089182609926781,0.0085353834972860,0.0048464809638033,0.0709740071429510, 0.0029940462557054,-0.0483434904493132,-0.0071713680727884,-0.0036840391887209, 0.0031454003250096,0.0246243550241551,-0.0449551277644180,0.0111449232769393}, {0.0140356721886765,-0.0196518236826680,0.0030517022326582,0.0582672093364850, 
-0.0000973895685457,0.0021704767224292,0.0341806268602705,-0.0152035987563018, -0.0903198657739177,0.0259623214586925,0.0155832497882743,-0.0040543568451651, 0.0036477631918247,-0.0532892744763217,-0.0142569373662724,0.0104500681408622, 0.0103483945857315,0.0679534422398752,-0.0768068882938636,0.0280289727046158}} ; void init_protmats() { long l; eigmat = (double *) Malloc (20 * sizeof(double)); for (l = 0; l <= 19; l++) if (usejtt) eigmat[l] = jtteigmat[l]; /** changed from jtteigmat*100. **/ else { if (usepmb) eigmat[l] = pmbeigmat[l]; else eigmat[l] = pameigmat[l]; /** changed from pameigmat*100. **/ } probmat = (double **) Malloc (20 * sizeof(double *)); for (l = 0; l <= 19; l++) { if (usejtt) { probmat[l] = jttprobmat[l]; } else { if (usepmb) probmat[l] = pmbprobmat[l]; else probmat[l] = pamprobmat[l]; } } } /* init_protmats */ void getoptions() { /* interactively set options */ long i, loopcount, loopcount2; Char ch; boolean done; boolean didchangecat, didchangercat; double probsum; fprintf(outfile, "\nAmino acid sequence\n"); fprintf(outfile, " Maximum Likelihood method with molecular "); fprintf(outfile, "clock, version %s\n\n", VERSION); putchar('\n'); auto_ = false; ctgry = false; didchangecat = false; rctgry = false; didchangercat = false; categs = 1; rcategs = 1; gama = false; invar = false; global = false; hypstate = false; jumble = false; njumble = 1; lambda = 1.0; lambda1 = 0.0; lengthsopt = false; trout = true; usepam = false; usepmb = false; usejtt = true; usertree = false; weights = false; printdata = false; progress = true; treeprint = true; interleaved = true; loopcount = 0; do { cleerhome(); printf("\nAmino acid sequence\n"); printf(" Maximum Likelihood method with molecular clock, version %s\n\n", VERSION); printf("Settings for this run:\n"); printf(" U Search for best tree?"); if (usertree) printf(" No, use user trees in input file\n"); else printf(" Yes\n"); printf(" P JTT, PMB or PAM probability model? %s\n", usejtt ? 
"Jones-Taylor-Thornton" : usepmb ? "Henikoff/Tillier PMB" : "Dayhoff PAM"); if (usertree) { printf(" L Use lengths from user tree?"); if (lengthsopt) printf(" Yes\n"); else printf(" No\n"); } printf(" C One category of substitution rates?"); if (!ctgry) printf(" Yes\n"); else printf(" %ld categories\n", categs); printf(" R Rate variation among sites?"); if (!rctgry) printf(" constant rate\n"); else { if (gama) printf(" Gamma distributed rates\n"); else { if (invar) printf(" Gamma+Invariant sites\n"); else printf(" user-defined HMM of rates\n"); } printf(" A Rates at adjacent sites correlated?"); if (!auto_) printf(" No, they are independent\n"); else printf(" Yes, mean block length =%6.1f\n", 1.0 / lambda); } if (!usertree) { printf(" G Global rearrangements?"); if (global) printf(" Yes\n"); else printf(" No\n"); } printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); if (!usertree) { printf(" J Randomize input order of sequences?"); if (jumble) printf(" Yes (seed = %8ld, %3ld times)\n", inseed0, njumble); else printf(" No. Use input order\n"); } printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld %s\n", datasets, (justwts ? "sets of weights" : "data sets")); else printf(" No\n"); printf(" I Input sequences interleaved?"); if (interleaved) printf(" Yes\n"); else printf(" No, sequential\n"); printf(" 0 Terminal type (IBM PC, ANSI, none)?"); if (ibmpc) printf(" IBM PC\n"); if (ansi) printf(" ANSI\n"); if (!(ibmpc || ansi)) printf(" (none)\n"); printf(" 1 Print out the data at start of run"); if (printdata) printf(" Yes\n"); else printf(" No\n"); printf(" 2 Print indications of progress of run"); if (progress) printf(" Yes\n"); else printf(" No\n"); printf(" 3 Print out tree"); if (treeprint) printf(" Yes\n"); else printf(" No\n"); printf(" 4 Write out trees onto tree file?"); if (trout) printf(" Yes\n"); else printf(" No\n"); printf(" 5 Reconstruct hypothetical sequences? %s\n", (hypstate ? 
"Yes" : "No")); printf("\nAre these settings correct? " "(type Y or the letter for one to change)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); done = (ch == 'Y'); if (!done) { uppercase(&ch); if (((!usertree) && (strchr("UPCRJAFWGTMI012345", ch) != NULL)) || (usertree && ((strchr("UPCRAFWLTMI012345", ch) != NULL)))){ switch (ch) { case 'C': ctgry = !ctgry; if (ctgry) { printf("\nSitewise user-assigned categories:\n\n"); initcatn(&categs); if (rate){ free(rate); } rate = (double *) Malloc( categs * sizeof(double)); didchangecat = true; initcategs(categs, rate); } break; case 'P': if (usejtt) { usejtt = false; usepmb = true; } else { if (usepmb) { usepmb = false; usepam = true; } else { usepam = false; usejtt = true; } } break; case 'R': if (!rctgry) { rctgry = true; gama = true; } else { if (gama) { gama = false; invar = true; } else { if (invar) invar = false; else rctgry = false; } } break; case 'A': auto_ = !auto_; if (auto_) { initlambda(&lambda); lambda1 = 1.0 - lambda; } break; case 'G': global = !global; break; case 'W': weights = !weights; break; case 'J': jumble = !jumble; if (jumble) initjumble(&inseed, &inseed0, seed, &njumble); else njumble = 1; break; case 'L': lengthsopt = !lengthsopt; break; case 'U': usertree = !usertree; break; case 'M': mulsets = !mulsets; if (mulsets) { printf("Multiple data sets or multiple weights?"); loopcount2 = 0; do { printf(" (type D or W)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch2); getchar(); if (ch2 == '\n') ch2 = ' '; uppercase(&ch2); countup(&loopcount2, 10); } while ((ch2 != 'W') && (ch2 != 'D')); justwts = (ch2 == 'W'); if (justwts) justweights(&datasets); else initdatasets(&datasets); if (!usertree && !jumble) { jumble = true; initjumble(&inseed, &inseed0, seed, &njumble); } } break; case 'I': interleaved = !interleaved; break; case '0': initterminal(&ibmpc, &ansi); break; case 
          '1':
          printdata = !printdata;
          break;

        case '2':
          progress = !progress;
          break;

        case '3':
          treeprint = !treeprint;
          break;

        case '4':
          trout = !trout;
          break;

        case '5':
          hypstate = !hypstate;
          break;
        }
      } else
        printf("Not a possible option!\n");
    }
    countup(&loopcount, 100);
  } while (!done);
  if (gama || invar) {
    loopcount = 0;
    do {
      printf(
"\nCoefficient of variation of substitution rate among sites (must be positive)\n");
      printf(
  " In gamma distribution parameters, this is 1/(square root of alpha)\n");
#ifdef WIN32
      phyFillScreenColor();
#endif
      fflush(stdout);
      scanf("%lf%*[^\n]", &cv);
      getchar();
      countup(&loopcount, 10);
    } while (cv <= 0.0);
    alpha = 1.0 / (cv * cv);
  }
  if (!rctgry)
    auto_ = false;
  if (rctgry) {
    printf("\nRates in HMM");
    if (invar)
      printf(" (including one for invariant sites)");
    printf(":\n");
    initcatn(&rcategs);
    if (probcat){
      free(probcat);
      free(rrate);
    }
    probcat = (double *) Malloc(rcategs * sizeof(double));
    rrate = (double *) Malloc(rcategs * sizeof(double));
    didchangercat = true;
    if (gama)
      initgammacat(rcategs, alpha, rrate, probcat);
    else {
      if (invar) {
        loopcount = 0;
        do {
          printf("Fraction of invariant sites?\n");
          fflush(stdout);
          scanf("%lf%*[^\n]", &invarfrac);
          getchar();
          countup(&loopcount, 10);
        } while ((invarfrac <= 0.0) || (invarfrac >= 1.0));
        initgammacat(rcategs-1, alpha, rrate, probcat);
        for (i = 0; i < rcategs-1; i++)
          probcat[i] = probcat[i]*(1.0-invarfrac);
        probcat[rcategs-1] = invarfrac;
        rrate[rcategs-1] = 0.0;
      } else {
        initcategs(rcategs, rrate);
        initprobcat(rcategs, &probsum, probcat);
      }
    }
  }
  if (!didchangercat){
    rrate = Malloc( rcategs*sizeof(double));
    probcat = Malloc( rcategs*sizeof(double));
    rrate[0] = 1.0;
    probcat[0] = 1.0;
  }
  if (!didchangecat){
    rate = Malloc( categs*sizeof(double));
    rate[0] = 1.0;
  }
  init_protmats();
}  /* getoptions */


void makeprotfreqs()
{
  /* calculate amino acid frequencies based on eigmat */
  long i, mineig;

  mineig = 0;
  for (i = 0; i <= 19; i++)
    if (fabs(eigmat[i]) < fabs(eigmat[mineig]))
      mineig = i;
  memcpy(freqaa,
probmat[mineig], 20 * sizeof(double)); for (i = 0; i <= 19; i++) freqaa[i] = fabs(freqaa[i]); } /* makeprotfreqs */ void reallocsites() { long i; for (i = 0; i < spp; i++) { free(y[i]); } free(enterorder); free(weight); free(category); free(alias); free(aliasweight); free(ally); free(location); for (i = 0; i < spp; i++) y[i] = (char *)Malloc(sites * sizeof(char)); enterorder = (long *)Malloc(spp*sizeof(long)); weight = (long *)Malloc(sites*sizeof(long)); category = (long *)Malloc(sites*sizeof(long)); alias = (long *)Malloc(sites*sizeof(long)); aliasweight = (long *)Malloc(sites*sizeof(long)); ally = (long *)Malloc(sites*sizeof(long)); location = (long *)Malloc(sites*sizeof(long)); for (i = 0; i < sites; i++) category[i] = 1; for (i = 0; i < sites; i++) weight[i] = 1; /* makeweights(); */ } /* reallocsites */ void allocrest() { long i; y = (Char **)Malloc(spp*sizeof(Char *)); nayme = (naym *)Malloc(spp*sizeof(naym)); for (i = 0; i < spp; i++) y[i] = (char *)Malloc(sites * sizeof(char)); enterorder = (long *)Malloc(spp*sizeof(long)); weight = (long *)Malloc(sites*sizeof(long)); category = (long *)Malloc(sites*sizeof(long)); alias = (long *)Malloc(sites*sizeof(long)); aliasweight = (long *)Malloc(sites*sizeof(long)); ally = (long *)Malloc(sites*sizeof(long)); location = (long *)Malloc(sites*sizeof(long)); } /* allocrest */ void doinit() { /* initializes variables */ inputnumbers(&spp, &sites, &nonodes, 1); nonodes2 = nonodes; getoptions(); makeprotfreqs(); if (printdata) fprintf(outfile, "%2ld species, %3ld sites\n", spp, sites); alloctree(&curtree.nodep, nonodes, usertree); allocrest(); if (usertree) return; alloctree(&bestree.nodep, nonodes, 0); if (njumble <= 1) return; alloctree(&bestree2.nodep, nonodes, 0); } /* doinit */ void inputoptions() { long i; if (!firstset) { samenumsp(&sites, ith); reallocsites(); } if (firstset) { for (i = 0; i < sites; i++) category[i] = 1; for (i = 0; i < sites; i++) weight[i] = 1; } if (justwts || weights) inputweights(sites, 
weight, &weights); weightsum = 0; for (i = 0; i < sites; i++) weightsum += weight[i]; if ((ctgry && categs > 1) && (firstset || !justwts)) { inputcategs(0, sites, category, categs, "ProMLK"); if (printdata) printcategs(outfile, sites, category, "Site categories"); } if (weights && printdata) printweights(outfile, 0, sites, weight, "Sites"); fprintf(outfile, "%s model of amino acid change\n\n", (usejtt ? "Jones-Taylor-Thornton" : usepmb ? "Henikoff/Tillier PMB" : "Dayhoff PAM")); } /* inputoptions */ void input_protdata(long chars) { /* input the names and sequences for each species */ /* used by proml */ long i, j, k, l, basesread, basesnew; Char charstate; boolean allread, done; if (printdata) headings(chars, "Sequences", "---------"); basesread = 0; basesnew = 0; allread = false; while (!(allread)) { /* eat white space -- if the separator line has spaces on it*/ do { charstate = gettc(infile); } while (charstate == ' ' || charstate == '\t'); ungetc(charstate, infile); if (eoln(infile)) scan_eoln(infile); i = 1; while (i <= spp) { if ((interleaved && basesread == 0) || !interleaved) initname(i - 1); j = (interleaved) ? basesread : 0; done = false; while (!done && !eoff(infile)) { if (interleaved) done = true; while (j < chars && !(eoln(infile) || eoff(infile))) { charstate = gettc(infile); if (charstate == '\n' || charstate == '\t') charstate = ' '; if (charstate == ' ' || (charstate >= '0' && charstate <= '9')) continue; uppercase(&charstate); if ((strchr("ABCDEFGHIKLMNPQRSTVWXYZ*?-", charstate)) == NULL){ printf("ERROR: bad amino acid: %c at position %ld of species %ld\n", charstate, j, i); if (charstate == '.') { printf(" Periods (.) 
may not be used as gap characters.\n"); printf(" The correct gap character is (-)\n"); } exxit(-1); } j++; y[i - 1][j - 1] = charstate; } if (interleaved) continue; if (j < chars) scan_eoln(infile); else if (j == chars) done = true; } if (interleaved && i == 1) basesnew = j; scan_eoln(infile); if ((interleaved && j != basesnew) || (!interleaved && j != chars)) { printf("ERROR: SEQUENCES OUT OF ALIGNMENT AT POSITION %ld.\n", j); exxit(-1); } i++; } if (interleaved) { basesread = basesnew; allread = (basesread == chars); } else allread = (i > spp); } if (!printdata) return; for (i = 1; i <= ((chars - 1) / 60 + 1); i++) { for (j = 1; j <= spp; j++) { for (k = 0; k < nmlngth; k++) putc(nayme[j - 1][k], outfile); fprintf(outfile, " "); l = i * 60; if (l > chars) l = chars; for (k = (i - 1) * 60 + 1; k <= l; k++) { if (j > 1 && y[j - 1][k - 1] == y[0][k - 1]) charstate = '.'; else charstate = y[j - 1][k - 1]; putc(charstate, outfile); if (k % 10 == 0 && k % 60 != 0) putc(' ', outfile); } putc('\n', outfile); } putc('\n', outfile); } putc('\n', outfile); } /* input_protdata */ void makeweights() { /* make up weights vector to avoid duplicate computations */ long i; for (i = 1; i <= sites; i++) { alias[i - 1] = i; ally[i - 1] = 1; aliasweight[i - 1] = weight[i - 1]; location[i - 1] = 0; } sitesort2(sites, aliasweight); sitecombine2(sites, aliasweight); sitescrunch2(sites, 1, 2, aliasweight); endsite = 0; for (i = 1; i <= sites; i++) { if (ally[i - 1] == i) endsite++; } for (i = 1; i <= endsite; i++) { location[alias[i - 1] - 1] = i; } contribution = (contribarr *) Malloc( endsite*sizeof(contribarr)); } /* makeweights */ void prot_makevalues(long categs, pointarray treenode, long endsite, long spp, sequence y, steptr alias) { /* set up fractional likelihoods at tips */ /* a version of makevalues2 found in seq.c */ /* used by proml */ long i, j, k, l; long b; for (k = 0; k < endsite; k++) { j = alias[k]; for (i = 0; i < spp; i++) { for (l = 0; l < categs; l++) { 
memset(treenode[i]->protx[k][l], 0, sizeof(double)*20); switch (y[i][j - 1]) { case 'A': treenode[i]->protx[k][l][0] = 1.0; break; case 'R': treenode[i]->protx[k][l][(long)arginine - (long)alanine] = 1.0; break; case 'N': treenode[i]->protx[k][l][(long)asparagine - (long)alanine] = 1.0; break; case 'D': treenode[i]->protx[k][l][(long)aspartic - (long)alanine] = 1.0; break; case 'C': treenode[i]->protx[k][l][(long)cysteine - (long)alanine] = 1.0; break; case 'Q': treenode[i]->protx[k][l][(long)glutamine - (long)alanine] = 1.0; break; case 'E': treenode[i]->protx[k][l][(long)glutamic - (long)alanine] = 1.0; break; case 'G': treenode[i]->protx[k][l][(long)glycine - (long)alanine] = 1.0; break; case 'H': treenode[i]->protx[k][l][(long)histidine - (long)alanine] = 1.0; break; case 'I': treenode[i]->protx[k][l][(long)isoleucine - (long)alanine] = 1.0; break; case 'L': treenode[i]->protx[k][l][(long)leucine - (long)alanine] = 1.0; break; case 'K': treenode[i]->protx[k][l][(long)lysine - (long)alanine] = 1.0; break; case 'M': treenode[i]->protx[k][l][(long)methionine - (long)alanine] = 1.0; break; case 'F': treenode[i]->protx[k][l][(long)phenylalanine - (long)alanine] = 1.0; break; case 'P': treenode[i]->protx[k][l][(long)proline - (long)alanine] = 1.0; break; case 'S': treenode[i]->protx[k][l][(long)serine - (long)alanine] = 1.0; break; case 'T': treenode[i]->protx[k][l][(long)threonine - (long)alanine] = 1.0; break; case 'W': treenode[i]->protx[k][l][(long)tryptophan - (long)alanine] = 1.0; break; case 'Y': treenode[i]->protx[k][l][(long)tyrosine - (long)alanine] = 1.0; break; case 'V': treenode[i]->protx[k][l][(long)valine - (long)alanine] = 1.0; break; case 'B': treenode[i]->protx[k][l][(long)asparagine - (long)alanine] = 1.0; treenode[i]->protx[k][l][(long)aspartic - (long)alanine] = 1.0; break; case 'Z': treenode[i]->protx[k][l][(long)glutamine - (long)alanine] = 1.0; treenode[i]->protx[k][l][(long)glutamic - (long)alanine] = 1.0; break; case 'X': /* unknown aa */ 
for (b = 0; b <= 19; b++) treenode[i]->protx[k][l][b] = 1.0; break; case '?': /* unknown aa */ for (b = 0; b <= 19; b++) treenode[i]->protx[k][l][b] = 1.0; break; case '*': /* stop codon symbol */ for (b = 0; b <= 19; b++) treenode[i]->protx[k][l][b] = 1.0; break; case '-': /* deletion event-absent data or aa */ for (b = 0; b <= 19; b++) treenode[i]->protx[k][l][b] = 1.0; break; } } } } } /* prot_makevalues */ void getinput() { long grcategs; /* reads the input data */ if (!justwts || firstset) inputoptions(); if (!justwts || firstset) input_protdata(sites); makeweights(); setuptree2(&curtree); if (!usertree) { setuptree2(&bestree); if (njumble > 1) setuptree2(&bestree2); } /* printf("in getinput categs: %li rcategs: %li\n", categs, rcategs); */ /* if (!firstset) */ /* freegfcategs(); */ grcategs = (categs > rcategs) ? categs : rcategs; /* printf("calling prot_allocx curtree grcategs: %li\n", grcategs); */ prot_allocx(nonodes, grcategs, curtree.nodep, usertree); if (!usertree) { /* printf("calling prot_allocx bestree grcategs: %li\n", grcategs); */ prot_allocx(nonodes, grcategs, bestree.nodep, 0); if (njumble > 1){ /* printf("calling prot_allocx bestree2 grcategs: %li\n", grcategs); */ prot_allocx(nonodes, grcategs, bestree2.nodep, 0); } } prot_makevalues(rcategs, curtree.nodep, endsite, spp, y, alias); } /* getinput */ void prot_freetable(void) { long i,j,k,l; for (j = 0; j < rcategs; j++) { for (k = 0; k < categs; k++) { for (l = 0; l < 20; l++) free(ddpmatrix[j][k][l]); free(ddpmatrix[j][k]); } free(ddpmatrix[j]); } free(ddpmatrix); for (j = 0; j < rcategs; j++) { for (k = 0; k < categs; k++) { for (l = 0; l < 20; l++) free(dpmatrix[j][k][l]); free(dpmatrix[j][k]); } free(dpmatrix[j]); } free(dpmatrix); for (j = 0; j < rcategs; j++) free(tbl[j]); free(tbl); for ( i = 0 ; i < max_num_sibs ; i++ ) free_pmatrix(i); free(pmatrices); } /* prot_freetable */ void freegfcategs() { long i,j; for ( i = 0 ; i < spp ; i++ ) { for (j = 0; j < endsite; j++) 
      free(curtree.nodep[i]->protx[j]);
    free(curtree.nodep[i]->protx);
    free(curtree.nodep[i]->underflows);
    free(curtree.nodep[i]);
  }
  /* free(lrsaves); */
}  /* freegfcategs */


void prot_inittable()
{
  /* Define a lookup table. Precompute values and print them out in tables */
  /* Allocate memory for the pmatrices, dpmatrices and ddpmatrices */
  long i, j, k, l;
  double sumrates;

  /* Allocate memory for pmatrices, the array of pointers to pmatrices */
  pmatrices = (double *****) Malloc (spp * sizeof(double ****));

  /* Allocate memory for the first 2 pmatrices, the matrix of conversion */
  /* probabilities, but only once per run (i.e. not on the second jumble). */
  alloc_pmatrix(0);
  alloc_pmatrix(1);

  /* Allocate memory for one dpmatrix, the first derivative matrix */
  dpmatrix = (double ****) Malloc( rcategs * sizeof(double ***));
  for (j = 0; j < rcategs; j++) {
    dpmatrix[j] = (double ***) Malloc( categs * sizeof(double **));
    for (k = 0; k < categs; k++) {
      dpmatrix[j][k] = (double **) Malloc( 20 * sizeof(double *));
      for (l = 0; l < 20; l++)
        dpmatrix[j][k][l] = (double *) Malloc( 20 * sizeof(double));
    }
  }

  /* Allocate memory for one ddpmatrix, the second derivative matrix */
  ddpmatrix = (double ****) Malloc( rcategs * sizeof(double ***));
  for (j = 0; j < rcategs; j++) {
    ddpmatrix[j] = (double ***) Malloc( categs * sizeof(double **));
    for (k = 0; k < categs; k++) {
      ddpmatrix[j][k] = (double **) Malloc( 20 * sizeof(double *));
      for (l = 0; l < 20; l++)
        ddpmatrix[j][k][l] = (double *) Malloc( 20 * sizeof(double));
    }
  }

  /* Allocate memory and assign values to tbl, the matrix of possible rates */
  tbl = (double **) Malloc( rcategs * sizeof(double *));
  for (j = 0; j < rcategs; j++)
    tbl[j] = (double *) Malloc( categs * sizeof(double));

  for (j = 0; j < rcategs; j++)
    for (k = 0; k < categs; k++)
      tbl[j][k] = rrate[j]*rate[k];

  sumrates = 0.0;
  for (i = 0; i < endsite; i++) {
    for (j = 0; j < rcategs; j++)
      sumrates += aliasweight[i] * probcat[j]
        * tbl[j][category[alias[i] - 1] - 1];
  }
  sumrates /=
(double)sites; for (j = 0; j < rcategs; j++) for (k = 0; k < categs; k++) { tbl[j][k] /= sumrates; } if(jumb > 1) return; if (gama || invar) { fprintf(outfile, "\nDiscrete approximation to gamma distributed rates\n"); fprintf(outfile, " Coefficient of variation of rates = %f (alpha = %f)\n", cv, alpha); } if (rcategs > 1) { fprintf(outfile, "\nState in HMM Rate of change Probability\n\n"); for (i = 0; i < rcategs; i++) if (probcat[i] < 0.0001) fprintf(outfile, "%9ld%16.3f%20.6f\n", i+1, rrate[i], probcat[i]); else if (probcat[i] < 0.001) fprintf(outfile, "%9ld%16.3f%19.5f\n", i+1, rrate[i], probcat[i]); else if (probcat[i] < 0.01) fprintf(outfile, "%9ld%16.3f%18.4f\n", i+1, rrate[i], probcat[i]); else fprintf(outfile, "%9ld%16.3f%17.3f\n", i+1, rrate[i], probcat[i]); putc('\n', outfile); if (auto_) { fprintf(outfile, "Expected length of a patch of sites having the same rate = %8.3f\n", 1/lambda); putc('\n', outfile); } } if (categs > 1) { fprintf(outfile, "\nSite category Rate of change\n\n"); for (k = 0; k < categs; k++) fprintf(outfile, "%9ld%16.3f\n", k+1, rate[k]); fprintf(outfile, "\n\n"); } } /* prot_inittable */ void free_pmatrix(long sib) { long j,k,l; for (j = 0; j < rcategs; j++) { for (k = 0; k < categs; k++) { for (l = 0; l < 20; l++) free(pmatrices[sib][j][k][l]); free(pmatrices[sib][j][k]); } free(pmatrices[sib][j]); } free(pmatrices[sib]); } /* free_pmatrix */ void alloc_pmatrix(long sib) { /* Allocate memory for a new pmatrix. 
   Called iff num_sibs>max_num_sibs */
  long j, k, l;
  double ****temp_matrix;

  temp_matrix = (double ****) Malloc (rcategs * sizeof(double ***));
  for (j = 0; j < rcategs; j++) {
    temp_matrix[j] = (double ***) Malloc(categs * sizeof(double **));
    for (k = 0; k < categs; k++) {
      temp_matrix[j][k] = (double **) Malloc(20 * sizeof (double *));
      for (l = 0; l < 20; l++)
        temp_matrix[j][k][l] = (double *) Malloc(20 * sizeof(double));
    }
  }
  pmatrices[sib] = temp_matrix;
  max_num_sibs++;
}  /* alloc_pmatrix */


void make_pmatrix(double **matrix, double **dmat, double **ddmat,
                  long derivative, double lz, double rat,
                  double *eigmat, double **probmat)
{
  /* Computes the R matrix such that matrix[m][l] is the joint probability */
  /* of m and l.                                                           */
  /* Computes a P matrix such that matrix[m][l] is the conditional         */
  /* probability of l given m.  This is accomplished by dividing all terms */
  /* in the R matrix by freqaa[m], the frequency of m.                     */

  long k, l, m;                 /* (l) original character state */
                                /* (m) final    character state */
                                /* (k) lambda counter           */
  double p0, p1, p2, q;
  double elambdat[20], delambdat[20], ddelambdat[20];
                                /* exponential term for matrix  */
                                /* and both derivative matrices */

  for (k = 0; k <= 19; k++) {
    elambdat[k] = exp(lz * rat * eigmat[k]);
    if(derivative != 0) {
      delambdat[k] = (elambdat[k] * rat * eigmat[k]);
      ddelambdat[k] = (delambdat[k] * rat * eigmat[k]);
    }
  }
  for (m = 0; m <= 19; m++) {
    for (l = 0; l <= 19; l++) {
      p0 = 0.0;
      p1 = 0.0;
      p2 = 0.0;
      for (k = 0; k <= 19; k++) {
        q = probmat[k][m] * probmat[k][l];
        p0 += (q * elambdat[k]);
        if(derivative !=0) {
          p1 += (q * delambdat[k]);
          p2 += (q * ddelambdat[k]);
        }
      }
      matrix[m][l] = p0 / freqaa[m];
      if(derivative != 0) {
        dmat[m][l] = p1 / freqaa[m];
        ddmat[m][l] = p2 / freqaa[m];
      }
    }
  }
}  /* make_pmatrix */


boolean prot_nuview(node *p)
{
  /* Recursively update summary data for subtree rooted at p. Returns true if
   * view has changed.
*/ long i, j, k, l, num_sibs = 0, sib_index; long b, m; node *q; node *sib_ptr, *sib_back_ptr; psitelike prot_xx, x2; double prod7; double **pmat; double lw; double correction; double maxx; if ( p == NULL ) return false; if ( p->tip ) return false; /* Tips do not need to be initialized */ for (q = p->next; q != p; q = q->next) { num_sibs++; if ( q->back != NULL && !q->tip) { if ( prot_nuview(q->back) ) p->initialized = false; } } if ( p->initialized ) return false; /* Make sure pmatrices is large enough for all siblings */ for (i = 0; i < num_sibs; i++) if (pmatrices[i] == NULL) alloc_pmatrix(i); /* Make pmatrices for all possible combinations of category, rcateg */ /* and sib */ sib_ptr = p; /* return to p */ for (sib_index=0; sib_index < num_sibs; sib_index++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; if (sib_back_ptr != NULL) lw = fabs(p->tyme - sib_back_ptr->tyme); else lw = 0.0; for (j = 0; j < rcategs; j++) for (k = 0; k < categs; k++) make_pmatrix(pmatrices[sib_index][j][k], NULL, NULL, 0, lw, tbl[j][k], eigmat, probmat); } for (i = 0; i < endsite; i++) { correction = 0; maxx = 0; k = category[alias[i]-1] - 1; for (j = 0; j < rcategs; j++) { /* initialize to 1 all values of prot_xx */ for (m = 0; m <= 19; m++) prot_xx[m] = 1; sib_ptr = p; /* return to p */ /* loop through all sibs and calculate likelihoods for all possible*/ /* amino acid combinations */ for (sib_index=0; sib_index < num_sibs; sib_index++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; if (sib_back_ptr != NULL) { memcpy(x2, sib_back_ptr->protx[i][j], sizeof(psitelike)); if ( j == 0 ) correction += sib_back_ptr->underflows[i]; } else for (b = 0; b <= 19; b++) x2[b] = 1.0; pmat = pmatrices[sib_index][j][k]; for (m = 0; m <= 19; m++) { prod7 = 0; for (l = 0; l <= 19; l++) prod7 += (pmat[m][l] * x2[l]); prot_xx[m] *= prod7; if ( prot_xx[m] > maxx && sib_index == (num_sibs - 1 )) maxx = prot_xx[m]; } } /* And the final point of this whole function: */ 
memcpy(p->protx[i][j], prot_xx, sizeof(psitelike)); } p->underflows[i] = 0; if ( maxx < MIN_DOUBLE ) fix_protx(p,i,maxx,rcategs); p->underflows[i] += correction; } p->initialized = true; return true; } /* prot_nuview */ void update(node *p) { node *sib_ptr, *sib_back_ptr; long i, num_sibs; /* improve time and recompute views at a node */ if (p == NULL) return; if (p->back != NULL) { if (!p->back->tip && !p->back->initialized) prot_nuview(p->back); } sib_ptr = p; num_sibs = count_sibs(p); for (i=0 ; i < num_sibs; i++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; if (sib_back_ptr != NULL) { if (!sib_back_ptr->tip && !sib_back_ptr->initialized) prot_nuview(sib_back_ptr); } } if ( (!usertree) || (usertree && !lngths) ) { mnv_success = makenewv(p) || mnv_success; return; } prot_nuview(p); sib_ptr = p; num_sibs = count_sibs(p); for (i=0 ; i < num_sibs; i++) { sib_ptr = sib_ptr->next; prot_nuview(sib_ptr); } } /* update */ void smooth(node *p) { node *q; if (p == NULL) return; if (p->tip) return; /* optimize tyme here */ update(p); if (smoothit || polishing) { for (q = p->next; q != p; q = q->next) { if (!q->back->tip) { /* smooth subtrees */ smooth(q->back); /* optimize tyme again after each subtree */ update(p); } } } } /* smooth */ void promlk_add(node *below, node *newtip, node *newfork, boolean tempadd) { /* inserts the nodes newfork and its descendant, newtip, into the tree. 
*/ long i; boolean done; node *p; double newtyme; /* Get parent nodelets */ below = curtree.nodep[below->index - 1]; newfork = curtree.nodep[newfork->index-1]; newtip = curtree.nodep[newtip->index-1]; /* Join above node to newfork */ if (below->back != NULL) below->back->back = newfork; newfork->back = below->back; /* Join below to newfork->next->next */ below->back = newfork->next->next; newfork->next->next->back = below; /* Join newtip to newfork->next */ newfork->next->back = newtip; newtip->back = newfork->next; /* assign newfork minimum child tyme */ if (newtip->tyme < below->tyme) p = newtip; else p = below; newtyme = p->tyme; setnodetymes(newfork,newtyme); /* Move root if inserting there */ if (curtree.root == below) curtree.root = newfork; /* If not at root, set newfork tyme to average below/above */ if (newfork->back != NULL) { if (p->tyme > newfork->back->tyme) newtyme = (p->tyme + newfork->back->tyme) / 2.0; else newtyme = p->tyme - INSERT_MIN_TYME; setnodetymes(newfork, newtyme); /* Now move from p to root, setting parent tymes older than children * by at least INSERT_MIN_TYME */ do { p = curtree.nodep[p->back->index - 1]; done = (p == curtree.root); if (!done) done = (curtree.nodep[p->back->index - 1]->tyme < p->tyme - INSERT_MIN_TYME); if (!done) { setnodetymes(curtree.nodep[p->back->index - 1], p->tyme - INSERT_MIN_TYME); } } while (!done); } else { /* root == newfork */ /* make root 2x older */ setnodetymes(newfork, newfork->tyme - 2*INSERT_MIN_TYME); } /* This is needed to prevent negative lengths */ all_tymes_valid(curtree.root, 0.0, true); /* Invalidate views */ inittrav(newtip); inittrav(newtip->back); /* Adjust branch lengths throughout */ for ( i = 1; i < smoothings; i++ ) { smoothed = true; smooth(newfork); smooth(newfork->back); if ( smoothed ) break; } } /* promlk_add */ void promlk_re_move(node **item, node **fork, boolean tempadd) { /* removes nodes item and its ancestor, fork, from the tree. 
the new descendant of fork's ancestor is made to be fork's second descendant (other than item). Also returns pointers to the deleted nodes, item and fork */ node *p, *q; long i; if ((*item)->back == NULL) { *fork = NULL; return; } *item = curtree.nodep[(*item)->index-1]; *fork = curtree.nodep[(*item)->back->index - 1]; if (curtree.root == *fork) { if (*item == (*fork)->next->back) curtree.root = (*fork)->next->next->back; else curtree.root = (*fork)->next->back; } p = (*item)->back->next->back; q = (*item)->back->next->next->back; if (p != NULL) p->back = q; if (q != NULL) q->back = p; (*fork)->back = NULL; p = (*fork)->next; while (p != *fork) { p->back = NULL; p = p->next; } (*item)->back = NULL; inittrav(p); inittrav(q); if (tempadd) return; /* This is needed to prevent negative lengths */ all_tymes_valid(curtree.root, 0.0, true); i = 1; while (i <= smoothings) { smooth(q); if (smoothit) smooth(q->back); i++; } } /* promlk_re_move */ double prot_evaluate(node *p) { /* Evaluate and return the log likelihood of the current tree * as seen from the branch from p to p->back. If p is the root node, * the first child branch is used instead. Views are updated as needed. 
*/ contribarr tterm; static contribarr like, nulike, clai; double sum, sum2, sumc=0, y, prod4, prodl, frexm, sumterm, lterm; double **pmat; long i, j, k, l, m, lai; node *q, *r; psitelike x1, x2; sum = 0.0; if (p == curtree.root) { p = p->next; } r = p; q = p->back; prot_nuview (r); prot_nuview (q); y = fabs(r->tyme - q->tyme); for (j = 0; j < rcategs; j++) for (k = 0; k < categs; k++) make_pmatrix(pmatrices[0][j][k],NULL,NULL,0,y,tbl[j][k],eigmat,probmat); for (i = 0; i < endsite; i++) { k = category[alias[i]-1] - 1; for (j = 0; j < rcategs; j++) { memcpy(x1, r->protx[i][j], sizeof(psitelike)); memcpy(x2, q->protx[i][j], sizeof(psitelike)); prod4 = 0.0; pmat = pmatrices[0][j][k]; for (m = 0; m <= 19; m++) { prodl = 0.0; for (l = 0; l <= 19; l++) prodl += (pmat[m][l] * x2[l]); frexm = x1[m] * freqaa[m]; prod4 += (prodl * frexm); } tterm[j] = prod4; } sumterm = 0.0; for (j = 0; j < rcategs; j++) sumterm += probcat[j] * tterm[j]; if (sumterm < 0.0) sumterm = 0.00000001; /* ??? */ lterm = log(sumterm) + p->underflows[i] + q->underflows[i]; for (j = 0; j < rcategs; j++) clai[j] = tterm[j] / sumterm; memcpy(contribution[i], clai, rcategs * sizeof(double)); if (!auto_ && usertree && (which <= shimotrees)) l0gf[which - 1][i] = lterm; sum += aliasweight[i] * lterm; } if (auto_) { for (j = 0; j < rcategs; j++) like[j] = 1.0; for (i = 0; i < sites; i++) { sumc = 0.0; for (k = 0; k < rcategs; k++) sumc += probcat[k] * like[k]; sumc *= lambda; if ((ally[i] > 0) && (location[ally[i]-1] > 0)) { lai = location[ally[i] - 1]; memcpy(clai, contribution[lai - 1], rcategs*sizeof(double)); for (j = 0; j < rcategs; j++) nulike[j] = ((1.0 - lambda) * like[j] + sumc) * clai[j]; } else { for (j = 0; j < rcategs; j++) nulike[j] = ((1.0 - lambda) * like[j] + sumc); } memcpy(like, nulike, rcategs * sizeof(double)); } sum2 = 0.0; for (i = 0; i < rcategs; i++) sum2 += probcat[i] * like[i]; sum += log(sum2); } /* FIXME check sum for -inf or nan * (sometimes occurs with short branches) */ assert( 
sum - sum == 0.0 ); curtree.likelihood = sum; if (auto_ || !usertree) return sum; if(which <= shimotrees) l0gl[which - 1] = sum; if (which == 1) { maxwhich = 1; maxlogl = sum; return sum; } if (sum > maxlogl) { maxwhich = which; maxlogl = sum; } return sum; } /* prot_evaluate */ void tryadd(node *p, node **item, node **nufork) { /* temporarily adds one fork and one tip to the tree. if the location where they are added yields greater likelihood than other locations tested up to that time, then keeps that location as there */ long grcategs; grcategs = (categs > rcategs) ? categs : rcategs; promlk_add(p, *item, *nufork, true); like = prot_evaluate(p); if (lastsp) { if (like >= bestyet || bestyet == UNDEFINED) prot_copy_(&curtree, &bestree, nonodes, grcategs); } if (like > bestyet || bestyet == UNDEFINED) { bestyet = like; there = p; } promlk_re_move(item, nufork, true); } /* tryadd */ void addpreorder(node *p, node *item, node *nufork, boolean contin) { /* traverses a binary tree, calling function tryadd at a node before calling tryadd at its descendants */ if (p == NULL) return; tryadd(p, &item, &nufork); if ((!p->tip) && contin) { addpreorder(p->next->back, item, nufork, contin); addpreorder(p->next->next->back, item, nufork, contin); } } /* addpreorder */ void restoradd(node *below, node *newtip, node *newfork, double prevtyme) { /* restore "new" tip and fork to place "below". restore tymes */ /* assumes bifurcation */ hookup(newfork, below->back); hookup(newfork->next, below); hookup(newtip, newfork->next->next); curtree.nodep[newfork->index-1] = newfork; newfork->tyme = prevtyme; /* assumes bifurcations */ newfork->next->tyme = prevtyme; newfork->next->next->tyme = prevtyme; } /* restoradd */ void tryrearr(node *p, boolean *success) { /* evaluates one rearrangement of the tree. if the new tree has greater likelihood than the old one sets success = TRUE and keeps the new tree. 
otherwise, restores the old tree */ node *frombelow, *whereto, *forknode; double oldlike, prevtyme; boolean wasonleft; if (p == curtree.root) return; forknode = curtree.nodep[p->back->index - 1]; if (forknode == curtree.root) return; oldlike = bestyet; prevtyme = forknode->tyme; /* the following statement presumes bifurcating tree */ if (forknode->next->back == p) { frombelow = forknode->next->next->back; wasonleft = true; } else { frombelow = forknode->next->back; wasonleft = false; } whereto = curtree.nodep[forknode->back->index - 1]; promlk_re_move(&p, &forknode, true); promlk_add(whereto, p, forknode, true); like = prot_evaluate(p); if (like - oldlike > LIKE_EPSILON || oldlike == UNDEFINED) { (*success) = true; bestyet = like; } else { promlk_re_move(&p, &forknode, true); restoradd(frombelow, p, forknode, prevtyme); if (wasonleft && (forknode->next->next->back == p)) { hookup (forknode->next->back, forknode->next->next); hookup (forknode->next, p); } curtree.likelihood = oldlike; /* assumes bifurcation */ inittrav(forknode); inittrav(forknode->next); inittrav(forknode->next->next); } } /* tryrearr */ void repreorder(node *p, boolean *success) { /* traverses a binary tree, calling function tryrearr at a node before calling tryrearr at its descendants */ if (p == NULL) return; tryrearr(p, success); if (p->tip) return; /* assumes bifurcation */ if (!(*success)) repreorder(p->next->back, success); if (!(*success)) repreorder(p->next->next->back, success); } /* repreorder */ void rearrange(node **r) { /* traverses the tree (preorder), finding any local rearrangement which increases the likelihood. 
if traversal succeeds in increasing the tree's likelihood, function rearrange runs traversal again */ boolean success; success = true; while (success) { success = false; repreorder(*r, &success); } } /* rearrange */ void nodeinit(node *p) { /* set up times at one node */ node *sib_ptr, *sib_back_ptr; long i, num_sibs; double lowertyme; sib_ptr = p; num_sibs = count_sibs(p); /* lowertyme = lowest of children's times */ lowertyme = p->next->back->tyme; for (i=0 ; i < num_sibs; i++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; if (sib_back_ptr->tyme < lowertyme) lowertyme = sib_back_ptr->tyme; } p->tyme = lowertyme - 0.1; sib_ptr = p; for (i=0 ; i < num_sibs; i++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; sib_ptr->tyme = p->tyme; sib_back_ptr->v = sib_back_ptr->tyme - p->tyme; sib_ptr->v = sib_back_ptr->v; } } /* nodeinit */ void invalidate_traverse(node *p) { /* Invalidates p's view and all views looking toward p from p->back * on out. */ node *q; if (p == NULL) return; if (p->tip) return; p->initialized = false; q = p->back; if ( q == NULL ) return; if ( q->tip ) return; /* Call ourselves on p->back's sibs */ for ( q = q->next ; q != p->back ; q = q->next) { invalidate_traverse(q); } } /* invalidate_traverse */ void invalidate_tyme(node *p) { /* Must be called on a node after changing its tyme, and before calling * evaluate on any other node. 
*/ node *q; if ( p == NULL ) return; invalidate_traverse(p); if ( p->tip ) return; for ( q = p->next; q != p; q = q->next ) { invalidate_traverse(q); } } /* invalidate_tyme */ void initrav(node *p) { long i, num_sibs; node *sib_ptr, *sib_back_ptr; /* traverse to set up times throughout tree */ if (p->tip) return; sib_ptr = p; num_sibs = count_sibs(p); for (i=0 ; i < num_sibs; i++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; initrav(sib_back_ptr); } nodeinit(p); } /* initrav */ void travinit(node *p) { long i, num_sibs; node *sib_ptr, *sib_back_ptr; /* traverse to set up initial values */ if (p == NULL) return; if (p->tip) return; if (p->initialized) return; sib_ptr = p; num_sibs = count_sibs(p); for (i=0 ; i < num_sibs; i++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; travinit(sib_back_ptr); } prot_nuview(p); p->initialized = true; } /* travinit */ void travsp(node *p) { long i, num_sibs; node *sib_ptr, *sib_back_ptr; /* traverse to find tips */ if (p == curtree.root) travinit(p); if (p->tip) travinit(p->back); else { sib_ptr = p; num_sibs = count_sibs(p); for (i=0 ; i < num_sibs; i++) { sib_ptr = sib_ptr->next; sib_back_ptr = sib_ptr->back; travsp(sib_back_ptr); } } } /* travsp */ void treevaluate() { /* evaluate likelihood of tree, after iterating branch lengths */ long i, j, num_sibs; node *sib_ptr; polishing = true; smoothit = true; for (i = 0; i < spp; i++) curtree.nodep[i]->initialized = false; for (i = spp; i < nonodes; i++) { sib_ptr = curtree.nodep[i]; sib_ptr->initialized = false; num_sibs = count_sibs(sib_ptr); for (j=0 ; j < num_sibs; j++) { sib_ptr = sib_ptr->next; sib_ptr->initialized = false; } } if (!lngths) initrav(curtree.root); travsp(curtree.root); i = 0; do { mnv_success = false; smooth(curtree.root); i++; } while (mnv_success); prot_evaluate(curtree.root); } /* treevaluate */ void prot_reconstr(node *p, long n) { /* reconstruct and print out acid at site n+1 at node p */ long i, j, k, first, num_sibs = 0; double f, 
sum, xx[20]; node *q = NULL; if (p->tip) putc(y[p->index-1][n], outfile); else { num_sibs = count_sibs(p); if ((ally[n] == 0) || (location[ally[n]-1] == 0)) putc('.', outfile); else { j = location[ally[n]-1] - 1; sum = 0; for (i = 0; i <= 19; i++) { f = p->protx[j][mx-1][i]; if (!p->tip) { q = p; for (k = 0; k < num_sibs; k++) { q = q->next; f *= q->protx[j][mx-1][i]; } } f = sqrt(f); xx[i] = f * freqaa[i]; sum += xx[i]; } for (i = 0; i <= 19; i++) xx[i] /= sum; first = 0; for (i = 0; i <= 19; i++) if (xx[i] > xx[first]) first = i; if (xx[first] > 0.95) putc(aachar[first], outfile); else putc(tolower(aachar[first]), outfile); if (rctgry && rcategs > 1) mx = mp[n][mx - 1]; else mx = 1; } } } /* prot_reconstr */ void rectrav(node *p, long m, long n) { /* print out segment of reconstructed sequence for one branch */ long num_sibs, i; node *sib_ptr; putc(' ', outfile); if (p->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[p->index-1][i], outfile); } else fprintf(outfile, "%4ld ", p->index - spp); fprintf(outfile, " "); mx = mx0; for (i = m; i <= n; i++) { if ((i % 10 == 0) && (i != m)) putc(' ', outfile); prot_reconstr(p, i); } putc('\n', outfile); if (!p->tip) { num_sibs = count_sibs(p); sib_ptr = p; for (i = 0; i < num_sibs; i++) { sib_ptr = sib_ptr->next; rectrav(sib_ptr->back, m, n); } } mx1 = mx; } /* rectrav */ void summarize() { long i, j, mm; double mode, sum; double like[maxcategs], nulike[maxcategs]; double **marginal; mp = (long **)Malloc(sites * sizeof(long *)); for (i = 0; i <= sites-1; ++i) mp[i] = (long *)Malloc(sizeof(long)*rcategs); fprintf(outfile, "\nLn Likelihood = %11.5f\n\n", curtree.likelihood); fprintf(outfile, " Ancestor Node Node Height Length\n"); fprintf(outfile, " -------- ---- ---- ------ ------\n"); mlk_describe(outfile, &curtree, 1.0); putc('\n', outfile); if (rctgry && rcategs > 1) { for (i = 0; i < rcategs; i++) like[i] = 1.0; for (i = sites - 1; i >= 0; i--) { sum = 0.0; for (j = 0; j < rcategs; j++) { nulike[j] = (lambda1 + lambda * 
probcat[j]) * like[j]; mp[i][j] = j + 1; for (k = 1; k <= rcategs; k++) { if (k != j + 1) { if (lambda * probcat[k - 1] * like[k - 1] > nulike[j]) { nulike[j] = lambda * probcat[k - 1] * like[k - 1]; mp[i][j] = k; } } } if ((ally[i] > 0) && (location[ally[i]-1] > 0)) nulike[j] *= contribution[location[ally[i] - 1] - 1][j]; sum += nulike[j]; } for (j = 0; j < rcategs; j++) nulike[j] /= sum; memcpy(like, nulike, rcategs * sizeof(double)); } mode = 0.0; mx = 1; for (i = 1; i <= rcategs; i++) { if (probcat[i - 1] * like[i - 1] > mode) { mx = i; mode = probcat[i - 1] * like[i - 1]; } } mx0 = mx; fprintf(outfile, "Combination of categories that contributes the most to the likelihood:\n\n"); for (i = 1; i <= nmlngth + 3; i++) putc(' ', outfile); for (i = 1; i <= sites; i++) { fprintf(outfile, "%ld", mx); if (i % 10 == 0) putc(' ', outfile); if (i % 60 == 0 && i != sites) { putc('\n', outfile); for (j = 1; j <= nmlngth + 3; j++) putc(' ', outfile); } mx = mp[i - 1][mx - 1]; } fprintf(outfile, "\n\n"); marginal = (double **) Malloc( sites*sizeof(double *)); for (i = 0; i < sites; i++) marginal[i] = (double *) Malloc( rcategs*sizeof(double)); for (i = 0; i < rcategs; i++) like[i] = 1.0; for (i = sites - 1; i >= 0; i--) { sum = 0.0; for (j = 0; j < rcategs; j++) { nulike[j] = (lambda1 + lambda * probcat[j]) * like[j]; for (k = 1; k <= rcategs; k++) { if (k != j + 1) nulike[j] += lambda * probcat[k - 1] * like[k - 1]; } if ((ally[i] > 0) && (location[ally[i]-1] > 0)) nulike[j] *= contribution[location[ally[i] - 1] - 1][j]; sum += nulike[j]; } for (j = 0; j < rcategs; j++) { nulike[j] /= sum; marginal[i][j] = nulike[j]; } memcpy(like, nulike, rcategs * sizeof(double)); } for (i = 0; i < rcategs; i++) like[i] = 1.0; for (i = 0; i < sites; i++) { sum = 0.0; for (j = 0; j < rcategs; j++) { nulike[j] = (lambda1 + lambda * probcat[j]) * like[j]; for (k = 1; k <= rcategs; k++) { if (k != j + 1) nulike[j] += lambda * probcat[k - 1] * like[k - 1]; } marginal[i][j] *= like[j] * 
probcat[j]; sum += nulike[j]; } for (j = 0; j < rcategs; j++) nulike[j] /= sum; memcpy(like, nulike, rcategs * sizeof(double)); sum = 0.0; for (j = 0; j < rcategs; j++) sum += marginal[i][j]; for (j = 0; j < rcategs; j++) marginal[i][j] /= sum; } fprintf(outfile, "Most probable category at each site if > 0.95"); fprintf(outfile, " probability (\".\" otherwise)\n\n"); for (i = 1; i <= nmlngth + 3; i++) putc(' ', outfile); for (i = 0; i < sites; i++) { sum = 0.0; mm = 0; for (j = 0; j < rcategs; j++) if (marginal[i][j] > sum) { sum = marginal[i][j]; mm = j; } if (sum >= 0.95) fprintf(outfile, "%ld", mm+1); else putc('.', outfile); if ((i+1) % 60 == 0) { if (i != 0) { putc('\n', outfile); for (j = 1; j <= nmlngth + 3; j++) putc(' ', outfile); } } else if ((i+1) % 10 == 0) putc(' ', outfile); } putc('\n', outfile); for (i = 0; i < sites; i++) free(marginal[i]); free(marginal); } putc('\n', outfile); putc('\n', outfile); putc('\n', outfile); if (hypstate) { fprintf(outfile, "Probable sequences at interior nodes:\n\n"); fprintf(outfile, " node "); for (i = 0; (i < 13) && (i < ((sites + (sites-1)/10 - 39) / 2)); i++) putc(' ', outfile); fprintf(outfile, "Reconstructed sequence (caps if > 0.95)\n\n"); if (!rctgry || (rcategs == 1)) mx0 = 1; for (i = 0; i < sites; i += 60) { k = i + 59; if (k >= sites) k = sites - 1; rectrav(curtree.root, i, k); putc('\n', outfile); mx0 = mx1; } } for (i = 0; i <= sites-1; ++i) free(mp[i]); free(mp); } /* summarize */ void promlk_treeout(node *p) { /* write out file with representation of final tree */ node *sib_ptr; long i, n, w, num_sibs; Char c; double x; if (p->tip) { n = 0; for (i = 1; i <= nmlngth; i++) { if (nayme[p->index - 1][i - 1] != ' ') n = i; } for (i = 0; i < n; i++) { c = nayme[p->index - 1][i]; if (c == ' ') c = '_'; putc(c, outtree); } col += n; } else { sib_ptr = p; num_sibs = count_sibs(p); putc('(', outtree); col++; for (i=0; i < (num_sibs - 1); i++) { sib_ptr = sib_ptr->next; promlk_treeout(sib_ptr->back); putc(',', 
outtree); col++; if (col > 55) { putc('\n', outtree); col = 0; } } sib_ptr = sib_ptr->next; promlk_treeout(sib_ptr->back); putc(')', outtree); col++; } if (p == curtree.root) { fprintf(outtree, ";\n"); return; } x = (p->tyme - curtree.nodep[p->back->index - 1]->tyme); if (x > 0.0) w = (long)(0.4342944822 * log(x)); else if (x == 0.0) w = 0; else w = (long)(0.4342944822 * log(-x)) + 1; if (w < 0) w = 0; fprintf(outtree, ":%*.5f", (int)(w + 7), x); col += w + 8; } /* promlk_treeout */ void initpromlnode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnu(grbg, p); (*p)->index = nodei; (*p)->tip = false; malloc_ppheno((*p), endsite, rcategs); nodep[(*p)->index - 1] = (*p); break; case nonbottom: gnu(grbg, p); malloc_ppheno(*p, endsite, rcategs); (*p)->index = nodei; break; case tip: match_names_to_data(str, nodep, p, spp); break; case iter: (*p)->initialized = false; (*p)->v = initialv; (*p)->iter = true; if ((*p)->back != NULL) (*p)->back->iter = true; break; case length: processlength(&valyew, &divisor, ch, &minusread, intree, parens); (*p)->v = valyew / divisor; (*p)->iter = false; if ((*p)->back != NULL) { (*p)->back->v = (*p)->v; (*p)->back->iter = false; } break; case hsnolength: if (usertree && lngths) { printf("Warning: one or more lengths not defined in user tree number %ld.\n", which); printf("PROMLK will attempt to optimize all branch lengths.\n\n"); lngths = false; } break; case unittrwt: curtree.nodep[spp]->iter = false; break; default: /* cases hslength, treewt */ break; /* should never occur */ } } /* initpromlnode */ void tymetrav(node *p, double *x) { /* set up times of nodes */ node *sib_ptr, *q; long i, num_sibs; double xmax; xmax = 0.0; if (!p->tip) { sib_ptr = p; num_sibs = count_sibs(p); for (i=0; i < num_sibs; i++) { 
sib_ptr = sib_ptr->next; tymetrav(sib_ptr->back, x); if (xmax > (*x)) xmax = (*x); } } else (*x) = 0.0; p->tyme = xmax; if (!p->tip) { q = p; while (q->next != p) { q = q->next; q->tyme = p->tyme; } } (*x) = p->tyme - p->v; } /* tymetrav */ void free_all_protx (long nonodes, pointarray treenode) { /* used in proml */ long i, j, k; node *p; /* printf("in free_all_protx nonodes: %li spp: %li endsite: %li\n", nonodes, spp, endsite); */ /* Zero thru spp are tips, */ for (i = 0; i < spp; i++) { for (j = 0; j < endsite; j++) free(treenode[i]->protx[j]); free(treenode[i]->protx); free(treenode[i]->underflows); } /* The rest are rings (i.e. triads) */ for (i = spp; i < nonodes; i++) { if (treenode[i] != NULL) { p = treenode[i]; for (j = 1; j <= 3; j++) { for (k = 0; k < endsite; k++) free(p->protx[k]); free(p->protx); free(p->underflows); p = p->next; } } } } /* free_all_protx */ void maketree() { /* constructs a binary tree from the pointers in curtree.nodep, adds each node at location which yields highest likelihood then rearranges the tree for greatest likelihood */ long i, j; long numtrees = 0; double bestlike, gotlike, x; node *item, *nufork, *dummy, *q, *root=NULL; boolean dummy_haslengths, dummy_first, goteof; long max_nonodes; /* Maximum number of nodes required to * express all species in a bifurcating tree * */ long nextnode; long grcategs; pointarray dummy_treenode=NULL; grcategs = (categs > rcategs) ? 
    categs : rcategs;

  prot_inittable();

  if (!usertree) {
    for (i = 1; i <= spp; i++)
      enterorder[i - 1] = i;
    if (jumble)
      randumize(seed, enterorder);
    curtree.root = curtree.nodep[spp];
    curtree.root->back = NULL;
    for (i = 0; i < spp; i++)
      curtree.nodep[i]->back = NULL;
    for (i = spp; i < nonodes; i++) {
      q = curtree.nodep[i];
      q->back = NULL;
      while ((q = q->next) != curtree.nodep[i])
        q->back = NULL;
    }
    polishing = false;
    promlk_add(curtree.nodep[enterorder[0]-1], curtree.nodep[enterorder[1]-1],
               curtree.nodep[spp], false);
    if (progress) {
      printf("\nAdding species:\n");
      writename(0, 2, enterorder);
#ifdef WIN32
      phyFillScreenColor();
#endif
    }
    lastsp = false;
    smoothit = false;
    for (i = 3; i <= spp; i++) {
      bestree.likelihood = UNDEFINED;
      bestyet = UNDEFINED;
      there = curtree.root;
      item = curtree.nodep[enterorder[i - 1] - 1];
      nufork = curtree.nodep[spp + i - 2];
      lastsp = (i == spp);
      addpreorder(curtree.root, item, nufork, true);
      promlk_add(there, item, nufork, false);
      like = prot_evaluate(curtree.root);
      rearrange(&curtree.root);
      if (curtree.likelihood > bestree.likelihood) {
        prot_copy_(&curtree, &bestree, nonodes, grcategs);
      }
      if (progress) {
        writename(i - 1, 1, enterorder);
#ifdef WIN32
        phyFillScreenColor();
#endif
      }
      if (lastsp && global) {
        if (progress) {
          printf("Doing global rearrangements\n");
          printf(" !");
          for (j = 1; j <= nonodes; j++)
            if ( j % (( nonodes / 72 ) + 1 ) == 0 )
              putchar('-');
          printf("!\n");
        }
        bestlike = bestyet;
        do {
          if (progress)
            printf(" ");
          gotlike = bestlike;
          for (j = 0; j < nonodes; j++) {
            bestyet = UNDEFINED;
            item = curtree.nodep[j];
            if (item != curtree.root) {
              nufork = curtree.nodep[curtree.nodep[j]->back->index - 1];
              promlk_re_move(&item, &nufork, false);
              there = curtree.root;
              addpreorder(curtree.root, item, nufork, true);
              promlk_add(there, item, nufork, false);
            }
            if (progress) {
              if ( j % (( nonodes / 72 ) + 1 ) == 0 )
                putchar('.');
              fflush(stdout);
            }
          }
          if (progress)
            putchar('\n');
        } while (bestlike < gotlike);
      }
    }
    if (njumble > 1 && lastsp) {
      for (i = 0; i
< spp; i++ ) promlk_re_move(&curtree.nodep[i], &dummy, false); if (jumb == 1 || bestree2.likelihood < bestree.likelihood) prot_copy_(&bestree, &bestree2, nonodes, grcategs); } if (jumb == njumble) { if (njumble > 1) prot_copy_(&bestree2, &curtree, nonodes, grcategs); else prot_copy_(&bestree, &curtree, nonodes, grcategs); fprintf(outfile, "\n\n"); treevaluate(); curtree.likelihood = prot_evaluate(curtree.root); if (treeprint) mlk_printree(outfile, &curtree); summarize(); if (trout) { col = 0; promlk_treeout(curtree.root); } } } else { /* if ( usertree ) */ /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree, INTREE, "input tree file", "rb", progname, intreename); numtrees = countsemic(&intree); if(numtrees > MAXSHIMOTREES) shimotrees = MAXSHIMOTREES; else shimotrees = numtrees; if (numtrees > 2) initseed(&inseed, &inseed0, seed); l0gl = (double *)Malloc(shimotrees * sizeof(double)); l0gf = (double **)Malloc(shimotrees * sizeof(double *)); for (i=0; i < shimotrees; ++i) l0gf[i] = (double *)Malloc(endsite * sizeof(double)); if (treeprint) { fprintf(outfile, "User-defined tree"); if (numtrees > 1) putc('s', outfile); fprintf(outfile, ":\n\n"); } fprintf(outfile, "\n\n"); which = 1; max_nonodes = nonodes; while (which <= numtrees) { /* These initializations required each time through the loop since multiple trees require re-initialization */ dummy_haslengths = true; nextnode = 0; dummy_first = true; goteof = false; lngths = lengthsopt; nonodes = max_nonodes; treeread(intree, &root, dummy_treenode, &goteof, &dummy_first, curtree.nodep, &nextnode, &dummy_haslengths, &grbg, initpromlnode, false, nonodes); nonodes = nextnode; root = curtree.nodep[root->index - 1]; curtree.root = root; if (lngths) tymetrav(curtree.root, &x); if (goteof && (which <= numtrees)) { /* if we hit the end of the file prematurely */ printf ("\n"); printf ("ERROR: trees missing at end of file.\n"); printf ("\tExpected number of trees:\t\t%ld\n", numtrees); 
printf ("\tNumber of trees actually in file:\t%ld.\n\n", which - 1); exxit(-1); } curtree.start = curtree.nodep[0]->back; treevaluate(); if (treeprint) mlk_printree(outfile, &curtree); summarize(); if (trout) { col = 0; promlk_treeout(curtree.root); } if(which < numtrees){ prot_freex_notip(nonodes, curtree.nodep); gdispose(curtree.root, &grbg, curtree.nodep); } which++; } FClose(intree); if (!auto_ && numtrees > 1 && weightsum > 1 ) standev2(numtrees, maxwhich, 0, endsite, maxlogl, l0gl, l0gf, aliasweight, seed); } if (usertree) { free(l0gl); for (i=0; i < shimotrees; i++) free(l0gf[i]); free(l0gf); } prot_freetable(); if (jumb < njumble) return; free(contribution); /* printf("freeing nonodes2 curtree.nodep\n"); */ free_all_protx(nonodes2, curtree.nodep); if (!usertree) { /* printf("freeing nonodes2 bestree.nodep\n"); */ free_all_protx(nonodes2, bestree.nodep); if (njumble > 1){ /* printf("freeing nonodes2 bestree2.nodep\n"); */ free_all_protx(nonodes2, bestree2.nodep); } } if (progress) { printf("\n\nOutput written to file \"%s\"\n", outfilename); if (trout) printf("\nTree also written onto file \"%s\"\n", outtreename); } free(root); } /* maketree */ void clean_up() { /* Free and/or close stuff */ long i; free (rrate); free (probcat); free (rate); /* Seems to require freeing every time... */ for (i = 0; i < spp; i++) { free (y[i]); } free (y); free (nayme); free (enterorder); free (category); free (weight); free (alias); free (ally); free (location); free (aliasweight); free (probmat); free (eigmat); /* FIXME jumble should never be enabled with usertree * * also -- freetree2 was making memory leak. Since that's * broken and we're currently not bothering to free our * other trees, it makes more sense to me to not free * bestree2. We should fix all of them properly. 
Doing * that will require a freetree variant that specifically * handles nodes made with prot_allocx if (!usertree && njumble > 1) freetree2(bestree2.nodep, nonodes2); */ FClose(infile); FClose(outfile); FClose(outtree); #ifdef MAC fixmacfile(outfilename); fixmacfile(outtreename); #endif } /* clean_up */ int main(int argc, Char *argv[]) { /* Protein Maximum Likelihood with molecular clock */ /* Initialize mlclock.c */ mlclock_init(&curtree, &prot_evaluate); #ifdef MAC argc = 1; /* macsetup("Promlk", ""); */ argv[0] = "Promlk"; #endif init(argc,argv); progname = argv[0]; openfile(&infile, INFILE, "input file", "r", argv[0], infilename); openfile(&outfile, OUTFILE, "output file", "w", argv[0], outfilename); ibmpc = IBMCRT; ansi = ANSICRT; datasets = 1; mulsets = false; firstset = true; doinit(); /* Open output tree, categories, and weights files if needed */ if ( trout ) openfile(&outtree, OUTTREE, "output tree file", "w", argv[0], outtreename); if ( ctgry ) openfile(&catfile, CATFILE, "categories file", "r", argv[0], catfilename); if ( weights || justwts ) openfile(&weightfile, WEIGHTFILE, "weights file", "r", argv[0], weightfilename); /* Data set loop */ for ( ith = 1; ith <= datasets; ith++ ) { if ( datasets > 1 ) { fprintf(outfile, "Data set # %ld:\n\n", ith); if ( progress ) printf("\nData set # %ld:\n", ith); } getinput(); if ( ith == 1 ) firstset = false; /* Jumble loop */ if (usertree) { max_num_sibs = 0; maketree(); } else for ( jumb = 1; jumb <= njumble; jumb++ ) { max_num_sibs = 0; maketree(); } } clean_up(); printf("\nDone.\n\n"); #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } /* Protein Maximum Likelihood with molecular clock */ /* phylip-3.697/src/protdist.c */ #include "phylip.h" #include "seq.h" /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. 
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ #define protepsilon .00001 typedef long *steparray; typedef enum { universal, ciliate, mito, vertmito, flymito, yeastmito } codetype; typedef enum { chemical, hall, george } cattype; typedef double matrix[20][20]; #ifndef OLDC /* function prototypes */ void protdist_uppercase(Char *); void protdist_inputnumbers(void); void getoptions(void); void transition(void); void doinit(void); void printcategories(void); void inputoptions(void); void protdist_inputdata(void); void doinput(void); void code(void); void protdist_cats(void); void maketrans(void); void givens(matrix, long, long, long, double, double, boolean); void coeffs(double, double, double *, double *, double); void tridiag(matrix, long, double); void shiftqr(matrix, long, double); void qreigen(matrix, long); void pmbeigen(void); void pameigen(void); void jtteigen(void); void predict(long, long, long); void makedists(void); void reallocchars(void); /* function prototypes */ #endif long chars, datasets, ith, ctgry, categs; /* spp = number of species chars = number of positions in actual sequences */ double freqa, freqc, freqg, freqt, cvi, invarfrac, ttratio, xi, xv, ease, fracchange; boolean weights, justwts, progress, mulsets, gama, invar, basesequal, usepmb, usejtt, usepam, kimura, similarity, firstset; codetype whichcode; cattype whichcat; steptr oldweight; double rate[maxcategs]; aas **gnode; aas trans[4][4][4]; double pie[20]; long cat[(long)ser - (long)ala + 1], numaa[(long)ser - (long)ala + 1]; double eig[20]; matrix prob, eigvecs; double **d; char infilename[100], outfilename[100], catfilename[100], weightfilename[100]; /* Local variables for makedists, propagated globally for c version: */ double tt, p, dp, d2p, q, elambdat; /* this jtt matrix decomposition due to Elisabeth Tillier */ static double jtteigs[] = {+0.00000000000000,-1.81721720738768,-1.87965834528616,-1.61403121885431, -1.53896608443751,-1.40486966367848,-1.30995061286931,-1.24668414819041, 
-1.17179756521289,-0.31033320987464,-0.34602837857034,-1.06031718484613, -0.99900602987105,-0.45576774888948,-0.86014403434677,-0.54569432735296, -0.76866956571861,-0.60593589295327,-0.65119724379348,-0.70249806480753}; static double jttprobs[20][20] = {{+0.07686196156903,+0.05105697447152,+0.04254597872702,+0.05126897436552, +0.02027898986051,+0.04106097946952,+0.06181996909002,+0.07471396264303, +0.02298298850851,+0.05256897371552,+0.09111095444453,+0.05949797025102, +0.02341398829301,+0.04052997973502,+0.05053197473402,+0.06822496588753, +0.05851797074102,+0.01433599283201,+0.03230298384851,+0.06637396681302}, {-0.04445795120462,-0.01557336502860,-0.09314817363516,+0.04411372100382, -0.00511178725134,+0.00188472427522,-0.02176250428454,-0.01330231089224, +0.01004072641973,+0.02707838224285,-0.00785039050721,+0.02238829876349, +0.00257470703483,-0.00510311699563,-0.01727154263346,+0.20074235330882, -0.07236268502973,-0.00012690116016,-0.00215974664431,-0.01059243778174}, {+0.09480046389131,+0.00082658405814,+0.01530023104155,-0.00639909042723, +0.00160605602061,+0.00035896642912,+0.00199161318384,-0.00220482855717, -0.00112601328033,+0.14840201765438,-0.00344295714983,-0.00123976286718, -0.00439399942758,+0.00032478785709,-0.00104270266394,-0.02596605592109, -0.05645800566901,+0.00022319903170,-0.00022792271829,-0.16133258048606}, {-0.06924141195400,-0.01816245289173,-0.08104005811201,+0.08985697111009, +0.00279659017898,+0.01083740322821,-0.06449599336038,+0.01794514261221, +0.01036809141699,+0.04283504450449,+0.00634472273784,+0.02339134834111, -0.01748667848380,+0.00161859106290,+0.00622486432503,-0.05854130195643, +0.15083728660504,+0.00030733757661,-0.00143739522173,-0.05295810171941}, {-0.14637948915627,+0.02029296323583,+0.02615316895036,-0.10311538564943, -0.00183412744544,-0.02589124656591,+0.11073673851935,+0.00848581728407, +0.00106057791901,+0.05530240732939,-0.00031533506946,-0.03124002869407, 
-0.01533984125301,-0.00288717337278,+0.00272787410643,+0.06300929916280, +0.07920438311152,-0.00041335282410,-0.00011648873397,-0.03944076085434}, {-0.05558229086909,+0.08935293782491,+0.04869509588770,+0.04856877988810, -0.00253836047720,+0.07651693957635,-0.06342453535092,-0.00777376246014, -0.08570270266807,+0.01943016473512,-0.00599516526932,-0.09157595008575, -0.00397735155663,-0.00440093863690,-0.00232998056918,+0.02979967701162, -0.00477299485901,-0.00144011795333,+0.01795114942404,-0.00080059359232}, {+0.05807741644682,+0.14654292420341,-0.06724975334073,+0.02159062346633, -0.00339085518294,-0.06829036785575,+0.03520631903157,-0.02766062718318, +0.03485632707432,-0.02436836692465,-0.00397566003573,-0.10095488644404, +0.02456887654357,+0.00381764117077,-0.00906261340247,-0.01043058066362, +0.01651199513994,-0.00210417220821,-0.00872508520963,-0.01495915462580}, {+0.02564617106907,+0.02960554611436,-0.00052356748770,+0.00989267817318, -0.00044034172141,-0.02279910634723,-0.00363768356471,-0.01086345665971, +0.01229721799572,+0.02633650142592,+0.06282966783922,-0.00734486499924, -0.13863936313277,-0.00993891943390,-0.00655309682350,-0.00245191788287, -0.02431633805559,-0.00068554031525,-0.00121383858869,+0.06280025239509}, {+0.11362428251792,-0.02080375718488,-0.08802750967213,-0.06531316372189, -0.00166626058292,+0.06846081717224,+0.07007301248407,-0.01713112936632, -0.05900588794853,-0.04497159138485,+0.04222484636983,+0.00129043178508, -0.01550337251561,-0.01553102163852,-0.04363429852047,+0.01600063777880, +0.05787328925647,-0.00008265841118,+0.02870014572813,-0.02657681214523}, {+0.01840541226842,+0.00610159018805,+0.01368080422265,+0.02383751807012, -0.00923516894192,+0.01209943150832,+0.02906782189141,+0.01992384905334, +0.00197323568330,+0.00017531415423,-0.01796698381949,+0.01887083962858, -0.00063335886734,-0.02365277334702,+0.01209445088200,+0.01308086447947, +0.01286727242301,-0.11420358975688,-0.01886991700613,+0.00238338728588}, 
{-0.01100105031759,-0.04250695864938,-0.02554356700969,-0.05473632078607, +0.00725906469946,-0.03003724918191,-0.07051526125013,-0.06939439879112, -0.00285883056088,+0.05334304124753,+0.12839241846919,-0.05883473754222, +0.02424304967487,+0.09134510778469,-0.00226003347193,-0.01280041778462, -0.00207988305627,-0.02957493909199,+0.05290385686789,+0.05465710875015}, {-0.01421274522011,+0.02074863337778,-0.01006411985628,+0.03319995456446, -0.00005371699269,-0.12266046460835,+0.02419847062899,-0.00441168706583, -0.08299118738167,-0.00323230913482,+0.02954035119881,+0.09212856795583, +0.00718635627257,-0.02706936115539,+0.04473173279913,-0.01274357634785, -0.01395862740618,-0.00071538848681,+0.04767640012830,-0.00729728326990}, {-0.03797680968123,+0.01280286509478,-0.08614616553187,-0.01781049963160, +0.00674319990083,+0.04208667754694,+0.05991325707583,+0.03581015660092, -0.01529816709967,+0.06885987924922,-0.11719120476535,-0.00014333663810, +0.00074336784254,+0.02893416406249,+0.07466151360134,-0.08182016471377, -0.06581536577662,-0.00018195976501,+0.00167443595008,+0.09015415667825}, {+0.03577726799591,-0.02139253448219,-0.01137813538175,-0.01954939202830, -0.04028242801611,-0.01777500032351,-0.02106862264440,+0.00465199658293, -0.02824805812709,+0.06618860061778,+0.08437791757537,-0.02533125946051, +0.02806344654855,-0.06970805797879,+0.02328376968627,+0.00692992333282, +0.02751392122018,+0.01148722812804,-0.11130404325078,+0.07776346000559}, {-0.06014297925310,-0.00711674355952,-0.02424493472566,+0.00032464353156, +0.00321221847573,+0.03257969053884,+0.01072805771161,+0.06892027923996, +0.03326534127710,-0.01558838623875,+0.13794237677194,-0.04292623056646, +0.01375763233229,-0.11125153774789,+0.03510076081639,-0.04531670712549, -0.06170413486351,-0.00182023682123,+0.05979891871679,-0.02551802851059}, {-0.03515069991501,+0.02310847227710,+0.00474493548551,+0.02787717003457, -0.12038329679812,+0.03178473522077,+0.04445111601130,-0.05334957493090, 
+0.01290386678474,-0.00376064171612,+0.03996642737967,+0.04777677295520, +0.00233689200639,+0.03917715404594,-0.01755598277531,-0.03389088626433, -0.02180780263389,+0.00473402043911,+0.01964539477020,-0.01260807237680}, {-0.04120428254254,+0.00062717164978,-0.01688703578637,+0.01685776910152, +0.02102702093943,+0.01295781834163,+0.03541815979495,+0.03968150445315, -0.02073122710938,-0.06932247350110,+0.11696314241296,-0.00322523765776, -0.01280515661402,+0.08717664266126,+0.06297225078802,-0.01290501780488, -0.04693925076877,-0.00177653675449,-0.08407812137852,-0.08380714022487}, {+0.03138655228534,-0.09052573757196,+0.00874202219428,+0.06060593729292, -0.03426076652151,-0.04832468257386,+0.04735628794421,+0.14504653737383, -0.01709111334001,-0.00278794215381,-0.03513813820550,-0.11690294831883, -0.00836264902624,+0.03270980973180,-0.02587764129811,+0.01638786059073, +0.00485499822497,+0.00305477087025,+0.02295754527195,+0.00616929722958}, {-0.04898722042023,-0.01460879656586,+0.00508708857036,+0.07730497806331, +0.04252420017435,+0.00484232580349,+0.09861807969412,-0.05169447907187, -0.00917820907880,+0.03679081047330,+0.04998537112655,+0.00769330211980, +0.01805447683564,-0.00498723245027,-0.14148416183376,-0.05170281760262, -0.03230723310784,-0.00032890672639,-0.02363523071957,+0.03801365471627}, {-0.02047562162108,+0.06933781779590,-0.02101117884731,-0.06841945874842, -0.00860967572716,-0.00886650271590,-0.07185241332269,+0.16703684361030, -0.00635847581692,+0.00811478913823,+0.01847205842216,+0.06700967948643, +0.00596607376199,+0.02318239240593,-0.10552958537847,-0.01980199747773, -0.02003785382406,-0.00593392430159,-0.00965391033612,+0.00743094349652}}; /* PMB matrix decomposition courtesy of Elisabeth Tillier */ static double pmbeigs[] = {0.0000001586972220,-1.8416770496147100, -1.6025046986139100,-1.5801012515121300, -1.4987794099715900,-1.3520794233801900,-1.3003469390479700,-1.2439503327631300, 
-1.1962574080244200,-1.1383730501367500,-1.1153278910708000,-0.4934843510654760, -0.5419014550215590,-0.9657997830826700,-0.6276075673757390,-0.6675927795018510, -0.6932641383465870,-0.8897872681859630,-0.8382698977371710,-0.8074694642446040}; static double pmbprobs[20][20] = {{0.0771762457248147,0.0531913844998640,0.0393445076407294,0.0466756566755510, 0.0286348361997465,0.0312327748383639,0.0505410248721427,0.0767106611472993, 0.0258916271688597,0.0673140562194124,0.0965705469252199,0.0515979465932174, 0.0250628079438675,0.0503492018628350,0.0399908189418273,0.0641898881894471, 0.0517539616710987,0.0143507440546115,0.0357994592438322,0.0736218495862984}, {0.0368263046116572,-0.0006728917107827,0.0008590805287740,-0.0002764255356960, 0.0020152937187455,0.0055743720652960,0.0003213317669367,0.0000449190281568, -0.0004226254397134,0.1805040629634510,-0.0272246813586204,0.0005904606533477, -0.0183743200073889,-0.0009194625608688,0.0008173657533167,-0.0262629806302238, 0.0265738757209787,0.0002176606241904,0.0021315644838566,-0.1823229927207580}, {-0.0194800075560895,0.0012068088610652,-0.0008803318319596,-0.0016044273960017, -0.0002938633803197,-0.0535796754602196,0.0155163896648621,-0.0015006360762140, 0.0021601372013703,0.0268513218744797,-0.1085292493742730,0.0149753083138452, 0.1346457366717310,-0.0009371698759829,0.0013501708044116,0.0346352293103622, -0.0276963770242276,0.0003643142783940,0.0002074817333067,-0.0174108903914110}, {0.0557839400850153,0.0023271577185437,0.0183481103396687,0.0023339480096311, 0.0002013267015151,-0.0227406863569852,0.0098644845475047,0.0064721276774396, 0.0001389408104210,-0.0473713878768274,-0.0086984445005797,0.0026913674934634, 0.0283724052562196,0.0001063665179457,0.0027442574779383,-0.1875312134708470, 0.1279864877057640,0.0005103347834563,0.0003155113168637,0.0081451082759554}, {0.0037510125027265,0.0107095920636885,0.0147305410328404,-0.0112351252180332, 
-0.0001500408626446,-0.1523450933729730,0.0611532413339872,-0.0005496748939503, 0.0048714378736644,-0.0003826320053999,0.0552010244407311,0.0482555671001955, -0.0461664995115847,-0.0021165008617978,-0.0004574454232187,0.0233755883688949, -0.0035484915422384,0.0009090698422851,0.0013840637687758,-0.0073895139302231}, {-0.0111512564930024,0.1025460064723080,0.0396772456883791,-0.0298408501361294, -0.0001656742634733,-0.0079876311843289,0.0712644184507945,-0.0010780604625230, -0.0035880882043592,0.0021070399334252,0.0016716329894279,-0.1810123023850110, 0.0015141703608724,-0.0032700852781804,0.0035503782441679,0.0118634302028026, 0.0044561606458028,-0.0001576678495964,0.0023470722225751,-0.0027457045397157}, {0.1474525743949170,-0.0054432538500293,0.0853848892349828,-0.0137787746207348, -0.0008274830358513,0.0042248844582553,0.0019556229305563,-0.0164191435175148, -0.0024501858854849,0.0120908948084233,-0.0381456105972653,0.0101271614855119, -0.0061945941321859,0.0178841099895867,-0.0014577779202600,-0.0752120602555032, -0.1426985695849920,0.0002862275078983,-0.0081191734261838,0.0313401149422531}, {0.0542034611735289,-0.0078763926211829,0.0060433542506096,0.0033396210615510, 0.0013965072374079,0.0067798903832256,-0.0135291136622509,-0.0089982442731848, -0.0056744537593887,-0.0766524225176246,0.1881210263933930,-0.0065875518675173, 0.0416627569300375,-0.0953804133524747,-0.0012559228448735,0.0101622644292547, -0.0304742453119050,0.0011702318499737,0.0454733434783982,-0.1119239362388150}, {0.1069409037912470,0.0805064400880297,-0.1127352030714600,0.1001181253523260, -0.0021480427488769,-0.0332884841459003,-0.0679837575848452,-0.0043812841356657, 0.0153418716846395,-0.0079441315103188,-0.0121766182046363,-0.0381127991037620, -0.0036338726532673,0.0195324059593791,-0.0020165963699984,-0.0061222685010268, -0.0253761448771437,-0.0005246410999057,-0.0112205170502433,0.0052248485517237}, {-0.0325247648326262,0.0238753651653669,0.0203684886605797,0.0295666232678825, 
-0.0003946714764213,-0.0157242718469554,-0.0511737848084862,0.0084725632040180, -0.0167068828528921,0.0686962159427527,-0.0659702890616198,-0.0014289912494271, -0.0167000964093416,-0.1276689083678200,0.0036575057830967,-0.0205958145531018, 0.0000368919612829,0.0014413626622426,0.1064360941926030,0.0863372661517408}, {-0.0463777468104402,0.0394712148670596,0.1118686750747160,0.0440711686389031, -0.0026076286506751,-0.0268454015202516,-0.1464943067133240,-0.0137514051835380, -0.0094395514284145,-0.0144124844774228,0.0249103379323744,-0.0071832157138676, 0.0035592787728526,0.0415627419826693,0.0027040097365669,0.0337523666612066, 0.0316121324137152,-0.0011350177559026,-0.0349998884574440,-0.0302651879823361}, {0.0142360925194728,0.0413145623127025,0.0324976427846929,0.0580930922002398, -0.0586974207121084,0.0202001168873069,0.0492204086749069,0.1126593173463060, 0.0116620013776662,-0.0780333711712066,-0.1109786767320410,0.0407775100936731, -0.0205013161312652,-0.0653458585025237,0.0347351829703865,0.0304448983224773, 0.0068813748197884,-0.0189002309261882,-0.0334507528405279,-0.0668143558699485}, {-0.0131548829657936,0.0044244322828034,-0.0050639951827271,-0.0038668197633889, -0.1536822386530220,0.0026336969165336,0.0021585651200470,-0.0459233839062969, 0.0046854727140565,0.0393815434593599,0.0619554007991097,0.0027456299925622, 0.0117574347936383,0.0373018612990383,0.0024818527553328,-0.0133956606027299, -0.0020457128424105,0.0154178819990401,0.0246524142683911,0.0275363065682921}, {-0.1542307272455030,0.0364861558267547,-0.0090880407008181,0.0531673937889863, 0.0157585615170580,0.0029986538457297,0.0180194047699875,0.0652152443589317, 0.0266842840376180,0.0388457366405908,0.0856237634510719,0.0126955778952183, 0.0099593861698250,-0.0013941794862563,0.0294065511237513,-0.1151906949298290, -0.0852991447389655,0.0028699120202636,-0.0332087026659522,0.0006811857297899}, {0.0281300736924501,-0.0584072081898638,-0.0178386569847853,-0.0536470338171487, 
-0.0186881656029960,-0.0240008730656106,-0.0541064820498883,0.2217137098936020, -0.0260500001542033,0.0234505236798375,0.0311127151218573,-0.0494139126682672, 0.0057093465049849,0.0124937286655911,-0.0298322975915689,0.0006520211333102, -0.0061018680727128,-0.0007081999479528,-0.0060523759094034,0.0215845995364623}, {0.0295321046399105,-0.0088296411830544,-0.0065057049917325,-0.0053478115612781, -0.0100646496794634,-0.0015473619084872,0.0008539960632865,-0.0376381933046211, -0.0328135588935604,0.0672161874239480,0.0667626853916552,-0.0026511651464901, 0.0140451514222062,-0.0544836996133137,0.0427485157912094,0.0097455780205802, 0.0177309072915667,-0.0828759701187452,-0.0729504795471370,0.0670731961252313}, {0.0082646581043963,-0.0319918630534466,-0.0188454445200422,-0.0374976353856606, 0.0037131290686848,-0.0132507796987883,-0.0306958830735725,-0.0044119395527308, -0.0140786756619672,-0.0180512599925078,-0.0208243802903953,-0.0232202769398931, -0.0063135878270273,0.0110442171178168,0.1824538048228460,-0.0006644614422758, -0.0069909097436659,0.0255407650654681,0.0099119399501151,-0.0140911517070698}, {0.0261344441524861,-0.0714454044548650,0.0159436926233439,0.0028462736216688, -0.0044572637889080,-0.0089474834434532,-0.0177570282144517,-0.0153693244094452, 0.1160919467206400,0.0304911481385036,0.0047047513411774,-0.0456535116423972, 0.0004491494948617,-0.0767108879444462,-0.0012688533741441,0.0192445965934123, 0.0202321954782039,0.0281039933233607,-0.0590403018490048,0.0364080426546883}, {0.0115826306265004,0.1340228176509380,-0.0236200652949049,-0.1284484655137340, -0.0004742338006503,0.0127617346949511,-0.0428560878860394,0.0060030732454125, 0.0089182609926781,0.0085353834972860,0.0048464809638033,0.0709740071429510, 0.0029940462557054,-0.0483434904493132,-0.0071713680727884,-0.0036840391887209, 0.0031454003250096,0.0246243550241551,-0.0449551277644180,0.0111449232769393}, {0.0140356721886765,-0.0196518236826680,0.0030517022326582,0.0582672093364850, 
-0.0000973895685457,0.0021704767224292,0.0341806268602705,-0.0152035987563018, -0.0903198657739177,0.0259623214586925,0.0155832497882743,-0.0040543568451651, 0.0036477631918247,-0.0532892744763217,-0.0142569373662724,0.0104500681408622, 0.0103483945857315,0.0679534422398752,-0.0768068882938636,0.0280289727046158}} ; /* dcmut version of PAM model from www.ebi.ac.uk/goldman-srv/dayhoff/ */ static double pameigs[] = {0,-1.93321786301018,-2.20904642493621,-1.74835983874903, -1.64854548332072,-1.54505559488222,-1.33859384676989,-1.29786201193594, -0.235548517495575,-0.266951066089808,-0.28965813670665,-1.10505826965282, -1.04323310568532,-0.430423720979904,-0.541719761016713,-0.879636093986914, -0.711249353378695,-0.725050487280602,-0.776855937389452,-0.808735559461343}; static double pamprobs[20][20] ={ {0.08712695644, 0.04090397955, 0.04043197978, 0.04687197656, 0.03347398326, 0.03825498087, 0.04952997524, 0.08861195569, 0.03361898319, 0.03688598156, 0.08535695732, 0.08048095976, 0.01475299262, 0.03977198011, 0.05067997466, 0.06957696521, 0.05854197073, 0.01049399475, 0.02991598504, 0.06471796764}, {0.07991048383, 0.006888314018, 0.03857806206, 0.07947073194, 0.004895492884, 0.03815829405, -0.1087562465, 0.008691167141, -0.0140554828, 0.001306404001, -0.001888411299, -0.006921303342, 0.0007655604228, 0.001583298443, 0.006879590446, -0.171806883, 0.04890917949, 0.0006700432804, 0.0002276237277, -0.01350591875}, {-0.01641514483, -0.007233933239, -0.1377830621, 0.1163201333, -0.002305138017, 0.01557250366, -0.07455879489, -0.003225343503, 0.0140630487, 0.005112274204, 0.001405731862, 0.01975833782, -0.001348402973, -0.001085733262, -0.003880514478, 0.0851493313, -0.01163526615, -0.0001197903399, 0.002056153393, 0.0001536095643}, {0.009669278686, -0.006905863869, 0.101083544, 0.01179903104, -0.003780967591, 0.05845105878, -0.09138357299, -0.02850503638, -0.03233951408, 0.008708065876, -0.004700705411, -0.02053221579, 0.001165851398, -0.001366585849, -0.01317695074, 
0.1199985703, -0.1146346193, -0.0005953021314, -0.0004297615194, 0.007475695618}, {0.1722243502, -0.003737582995, -0.02964873222, -0.02050116381, -0.0004530478465, -0.02460043205, 0.02280768412, -0.02127364909, 0.01570095258, 0.1027744285, -0.005330539586, 0.0179697651, -0.002904077286, -0.007068126663, -0.0142869583, -0.01444241844, -0.08218861544, 0.0002069181629, 0.001099671379, -0.1063484263}, {-0.1553433627, -0.001169168032, 0.02134785337, 0.0007602305436, 0.0001395330122, 0.03194992019, -0.01290252206, 0.03281720789, -0.01311103735, 0.1177254769, -0.008008783885, -0.02375317548, -0.002817809762, -0.008196682776, 0.01731267617, 0.01853526375, 0.08249908546, -2.788771776e-05, 0.001266182191, -0.09902299976}, {-0.03671080341, 0.0274168035, 0.04625877597, 0.07520706414, -0.0001833803619, -0.1207833161, -0.006415807779, -0.005465629648, 0.02778273972, 0.007589688485, -0.02945266034, -0.03797542064, 0.07044042052, -0.002018573865, 0.01845277071, 0.006901513991, -0.02430934639, -0.0005919635873, -0.001266962331, -0.01487591261}, {-0.03060317816, 0.01182361623, 0.04200270053, 0.05406235279, -0.0003920498815, -0.09159709348, -0.009602690652, -0.00382944418, 0.01761361993, 0.01605684317, 0.05198878008, 0.02198696949, -0.09308930025, -0.00102622863, 0.01477637127, 0.0009314065393, -0.01860959472, -0.0005964703968, -0.002694284083, 0.02079767439}, {0.0195976494, -0.005104484936, 0.007406728707, 0.01236244954, 0.0201446796, 0.007039564785, 0.01276942134, 0.02641595685, 0.002764624354, 0.001273314658, -0.01335316035, 0.01105658671, 2.148773499e-05, -0.02692205639, 0.0118684991, 0.01212624708, 0.01127770094, -0.09842754796, -0.01942336432, 0.007105703151}, {-0.01819461888, -0.01509348507, -0.01297636935, -0.01996453439, 0.1715705905, -0.01601550692, -0.02122706144, -0.02854628494, -0.009351082371, -0.001527995472, -0.010198224, -0.03609537551, -0.003153182095, 0.02395980501, -0.01378664626, -0.005992611421, -0.01176810875, 0.003132361603, 0.03018439539, -0.004956065656}, 
{-0.02733614784, -0.02258066705, -0.0153112506, -0.02475728664, -0.04480525045, -0.01526640341, -0.02438517425, -0.04836914601, -0.00635964824, 0.02263169831, 0.09794101931, -0.04004304158, 0.008464393478, 0.1185443142, -0.02239294163, -0.0281550321, -0.01453581604, -0.0246742804, 0.0879619849, 0.02342867605}, {0.06483718238, 0.1260012082, -0.006496013283, 0.009914915531, -0.004181603532, 0.0003493226286, 0.01408035752, -0.04881663016, -0.03431167356, -0.01768005602, 0.02362447761, -0.1482364784, -0.01289035619, -0.001778893279, -0.05240099752, 0.05536174567, 0.06782165352, -0.003548568717, 0.001125301173, -0.03277489363}, {0.06520296909, -0.0754802543, 0.03139281903, -0.03266449554, -0.004485188002, -0.03389072036, -0.06163274338, -0.06484769882, 0.05722658289, -0.02824079619, 0.01544837349, 0.03909752708, 0.002029218884, 0.003151939572, -0.05471208363, 0.07962008342, 0.125916047, 0.0008696184937, -0.01086027514, -0.05314092355}, {0.004543119081, 0.01935177735, 0.01905511007, 0.02682993409, -0.01199617967, 0.01426278655, 0.02472521255, 0.03864795501, 0.02166224804, -0.04754243479, -0.1921545477, 0.03621321546, -0.02120627881, 0.04928097895, 0.009396088815, 0.01748042052, -6.173742851e-05, -0.003168033098, 0.07723565812, -0.08255529309}, {0.06710378668, -0.09441410284, -0.004801776989, 0.008830272165, -0.01021645042, -0.02764365608, 0.004250361851, 0.1648777542, -0.037446109, 0.004541057635, -0.0296980702, -0.1532325189, -0.008940580901, 0.006998050812, 0.02338809379, 0.03175059182, 0.02033965512, 0.006388075608, 0.001762762044, 0.02616280361}, {0.01915943021, -0.05432967274, 0.01249342683, 0.06836622457, 0.002054462161, -0.01233535859, 0.07087282652, -0.08948637051, -0.1245896013, -0.02204522882, 0.03791481736, 0.06557467874, 0.005529294156, -0.006296644235, 0.02144530752, 0.01664230081, 0.02647078439, 0.001737725271, 0.01414149877, -0.05331990116}, {0.0266659303, 0.0564142853, -0.0263767738, -0.08029726006, -0.006059357163, -0.06317558457, -0.0911894019, 
0.05401487057, -0.08178072458, 0.01580699778, -0.05370550396, 0.09798653264, 0.003934944022, 0.01977291947, 0.0441198541, 0.02788220393, 0.03201877081, -0.00206161759, -0.005101423308, 0.03113033802}, {0.02980360751, -0.009513246268, -0.009543527165, -0.02190644172, -0.006146440672, 0.01207009085, -0.0126989156, -0.1378266418, 0.0275235217, 0.00551720592, -0.03104791544, -0.07111701247, -0.006081754489, -0.01337494521, 0.1783961085, 0.01453225059, 0.01938736048, 0.0004488631071, 0.0110844398, 0.02049339243}, {-0.01433508581, 0.01258858175, -0.004294252236, -0.007146532854, 0.009541628809, 0.008040155729, -0.006857781832, 0.05584120066, 0.007749418365, -0.05867835844, 0.08008131283, -0.004877854222, -0.0007128540743, 0.09489058424, 0.06421121962, 0.00271493526, -0.03229944773, -0.001732026038, -0.08053448316, -0.1241903609}, {-0.009854113227, 0.01294129929, -0.00593064392, -0.03016833115, -0.002018439732, -0.00792418722, -0.03372768732, 0.07828561288, 0.007722254639, -0.05067377561, 0.1191848621, 0.005059475202, 0.004762387166, -0.1029870175, 0.03537190114, 0.001089956203, -0.02139157573, -0.001015245062, 0.08400521847, -0.08273195059}}; void protdist_uppercase(Char *ch) { (*ch) = (isupper(*ch) ? 
(*ch) : toupper(*ch)); } /* protdist_uppercase */ void protdist_inputnumbers() { /* input the numbers of species and of characters */ long i; fscanf(infile, "%ld%ld", &spp, &chars); if (printdata) fprintf(outfile, "%2ld species, %3ld positions\n\n", spp, chars); gnode = (aas **)Malloc(spp * sizeof(aas *)); if (firstset) { for (i = 0; i < spp; i++) gnode[i] = (aas *)Malloc(chars * sizeof(aas )); } weight = (steparray)Malloc(chars*sizeof(long)); oldweight = (steparray)Malloc(chars*sizeof(long)); category = (steparray)Malloc(chars*sizeof(long)); d = (double **)Malloc(spp*sizeof(double *)); nayme = (naym *)Malloc(spp*sizeof(naym)); for (i = 0; i < spp; ++i) d[i] = (double *)Malloc(spp*sizeof(double)); } /* protdist_inputnumbers */ void getoptions() { /* interactively set options */ long loopcount, loopcount2; Char ch, ch2; Char in[100]; boolean done; if (printdata) fprintf(outfile, "\nProtein distance algorithm, version %s\n\n",VERSION); putchar('\n'); weights = false; printdata = false; progress = true; interleaved = true; similarity = false; ttratio = 2.0; whichcode = universal; whichcat = george; basesequal = true; freqa = 0.25; freqc = 0.25; freqg = 0.25; freqt = 0.25; usejtt = true; usepmb = false; usepam = false; kimura = false; gama = false; invar = false; invarfrac = 0.0; ease = 0.457; loopcount = 0; do { cleerhome(); printf("\nProtein distance algorithm, version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" P Use JTT, PMB, PAM, Kimura, categories model? %s\n", usejtt ? "Jones-Taylor-Thornton matrix" : usepmb ? "Henikoff/Tillier PMB matrix" : usepam ? "Dayhoff PAM matrix" : kimura ? "Kimura formula" : similarity ? 
"Similarity table" : "Categories model"); if (!kimura && !similarity) { printf(" G Gamma distribution of rates among positions?"); if (gama) printf(" Yes\n"); else { if (invar) printf(" Gamma+Invariant\n"); else printf(" No\n"); } } printf(" C One category of substitution rates?"); if (!ctgry || categs == 1) printf(" Yes\n"); else printf(" %ld categories\n", categs); printf(" W Use weights for positions?"); if (weights) printf(" Yes\n"); else printf(" No\n"); if (!(usejtt || usepmb || usepam || kimura || similarity)) { printf(" U Use which genetic code? %s\n", (whichcode == universal) ? "Universal" : (whichcode == ciliate) ? "Ciliate" : (whichcode == mito) ? "Universal mitochondrial" : (whichcode == vertmito) ? "Vertebrate mitochondrial" : (whichcode == flymito) ? "Fly mitochondrial" : (whichcode == yeastmito) ? "Yeast mitochondrial" : ""); printf(" A Which categorization of amino acids? %s\n", (whichcat == chemical) ? "Chemical" : (whichcat == george) ? "George/Hunt/Barker" : "Hall"); printf(" E Prob change category (1.0=easy):%8.4f\n",ease); printf(" T Transition/transversion ratio:%7.3f\n",ttratio); printf(" F Base Frequencies:"); if (basesequal) printf(" Equal\n"); else printf("%7.3f%6.3f%6.3f%6.3f\n", freqa, freqc, freqg, freqt); } printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld %s\n", datasets, (justwts ? "sets of weights" : "data sets")); else printf(" No\n"); printf(" I Input sequences interleaved? %s\n", (interleaved ? "Yes" : "No, sequential")); printf(" 0 Terminal type (IBM PC, ANSI)? %s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" 1 Print out the data at start of run %s\n", (printdata ? "Yes" : "No")); printf(" 2 Print indications of progress of run %s\n", progress ? "Yes" : "No"); printf("\nAre these settings correct? 
(type Y or the letter for one to change)\n"); in[0] = '\0'; getstryng(in); ch=in[0]; if (ch == '\n') ch = ' '; protdist_uppercase(&ch); done = (ch == 'Y'); if (!done) { if (((strchr("CPGMWI120",ch) != NULL) && (usejtt || usepmb || usepam)) || ((strchr("CPMWI120",ch) != NULL) && (kimura || similarity)) || ((strchr("CUAPGETFMWI120",ch) != NULL) && (! (usejtt || usepmb || usepam || kimura || similarity)))) { switch (ch) { case 'U': printf("Which genetic code?\n"); printf(" type for\n\n"); printf(" U Universal\n"); printf(" M Mitochondrial\n"); printf(" V Vertebrate mitochondrial\n"); printf(" F Fly mitochondrial\n"); printf(" Y Yeast mitochondrial\n\n"); loopcount2 = 0; do { printf("type U, M, V, F, or Y\n"); fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; protdist_uppercase(&ch); countup(&loopcount2, 10); } while (ch != 'U' && ch != 'M' && ch != 'V' && ch != 'F' && ch != 'Y'); switch (ch) { case 'U': whichcode = universal; break; case 'M': whichcode = mito; break; case 'V': whichcode = vertmito; break; case 'F': whichcode = flymito; break; case 'Y': whichcode = yeastmito; break; } break; case 'A': printf( "Which of these categorizations of amino acids do you want to use:\n\n"); printf( " all have groups: (Glu Gln Asp Asn), (Lys Arg His), (Phe Tyr Trp)\n"); printf(" plus:\n"); printf("George/Hunt/Barker:"); printf(" (Cys), (Met Val Leu Ileu), (Gly Ala Ser Thr Pro)\n"); printf("Chemical: "); printf(" (Cys Met), (Val Leu Ileu Gly Ala Ser Thr), (Pro)\n"); printf("Hall: "); printf(" (Cys), (Met Val Leu Ileu), (Gly Ala Ser Thr), (Pro)\n\n"); printf("Which do you want to use (type C, H, or G)\n"); loopcount2 = 0; do { fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; protdist_uppercase(&ch); countup(&loopcount2, 10); } while (ch != 'C' && ch != 'H' && ch != 'G'); switch (ch) { case 'C': whichcat = chemical; break; case 'H': whichcat = hall; break; case 'G': whichcat = george; break; } break; case 'C': ctgry = !ctgry; 
if (ctgry) { initcatn(&categs); initcategs(categs, rate); } break; case 'W': weights = !weights; break; case 'P': if (usejtt) { usejtt = false; usepmb = true; } else { if (usepmb) { usepmb = false; usepam = true; } else { if (usepam) { usepam = false; kimura = true; } else { if (kimura) { kimura = false; similarity = true; } else { if (similarity) similarity = false; else usejtt = true; } } } } break; case 'G': if (!(gama || invar)) gama = true; else { if (gama) { gama = false; invar = true; } else { if (invar) invar = false; } } break; case 'E': printf("Ease of changing category of amino acid?\n"); loopcount2 = 0; do { printf(" (1.0 if no difficulty of changing,\n"); printf(" less if less easy. Can't be negative)\n"); fflush(stdout); scanf("%lf%*[^\n]", &ease); getchar(); countup(&loopcount2, 10); } while (ease > 1.0 || ease < 0.0); break; case 'T': loopcount2 = 0; do { printf("Transition/transversion ratio?\n"); fflush(stdout); scanf("%lf%*[^\n]", &ttratio); getchar(); countup(&loopcount2, 10); } while (ttratio < 0.0); break; case 'F': loopcount2 = 0; do { basesequal = false; printf("Frequencies of bases A,C,G,T ?\n"); fflush(stdout); scanf("%lf%lf%lf%lf%*[^\n]", &freqa, &freqc, &freqg, &freqt); getchar(); if (fabs(freqa + freqc + freqg + freqt - 1.0) >= 1.0e-3) printf("FREQUENCIES MUST SUM TO 1\n"); countup(&loopcount2, 10); } while (fabs(freqa + freqc + freqg + freqt - 1.0) >= 1.0e-3); break; case 'M': mulsets = !mulsets; if (mulsets) { printf("Multiple data sets or multiple weights?"); loopcount2 = 0; do { printf(" (type D or W)\n"); fflush(stdout); scanf("%c%*[^\n]", &ch2); getchar(); if (ch2 == '\n') ch2 = ' '; uppercase(&ch2); countup(&loopcount2, 10); } while ((ch2 != 'W') && (ch2 != 'D')); justwts = (ch2 == 'W'); if (justwts) justweights(&datasets); else initdatasets(&datasets); } break; case 'I': interleaved = !interleaved; break; case '0': if (ibmpc) { ibmpc = false; ansi = true; } else if (ansi) ansi = false; else ibmpc = true; break; case '1':
printdata = !printdata; break; case '2': progress = !progress; break; } } else { if (strchr("CUAPGETFMWI120",ch) == NULL) printf("Not a possible option!\n"); else printf("That option not allowed with these settings\n"); printf("\nPress Enter or Return key to continue\n"); fflush(stdout); getchar(); } } countup(&loopcount, 100); } while (!done); if (gama || invar) { loopcount = 0; do { printf( "\nCoefficient of variation of substitution rate among positions (must be positive)\n"); printf( " In gamma distribution parameters, this is 1/(square root of alpha)\n"); fflush(stdout); scanf("%lf%*[^\n]", &cvi); getchar(); countup(&loopcount, 10); } while (cvi <= 0.0); cvi = 1.0 / (cvi * cvi); } if (invar) { loopcount = 0; do { printf("Fraction of invariant positions?\n"); fflush(stdout); scanf("%lf%*[^\n]", &invarfrac); getchar(); countup (&loopcount, 10); } while ((invarfrac <= 0.0) || (invarfrac >= 1.0)); } } /* getoptions */ void transition() { /* calculations related to transition-transversion ratio */ double aa, bb, freqr, freqy, freqgr, freqty; freqr = freqa + freqg; freqy = freqc + freqt; freqgr = freqg / freqr; freqty = freqt / freqy; aa = ttratio * freqr * freqy - freqa * freqg - freqc * freqt; bb = freqa * freqgr + freqc * freqty; xi = aa / (aa + bb); xv = 1.0 - xi; if (xi <= 0.0 && xi >= -epsilon) xi = 0.0; if (xi < 0.0){ printf("THIS TRANSITION-TRANSVERSION RATIO IS IMPOSSIBLE WITH"); printf(" THESE BASE FREQUENCIES\n"); exxit(-1);} } /* transition */ void doinit() { /* initializes variables */ protdist_inputnumbers(); getoptions(); transition(); } /* doinit*/ void printcategories() { /* print out list of categories of positions */ long i, j; fprintf(outfile, "Rate categories\n\n"); for (i = 1; i <= nmlngth + 3; i++) putc(' ', outfile); for (i = 1; i <= chars; i++) { fprintf(outfile, "%ld", category[i - 1]); if (i % 60 == 0) { putc('\n', outfile); for (j = 1; j <= nmlngth + 3; j++) putc(' ', outfile); } else if (i % 10 == 0) putc(' ', outfile); } 
fprintf(outfile, "\n\n"); } /* printcategories */ void reallocchars(void) { int i; free(weight); free(oldweight); free(category); for (i = 0; i < spp; i++) { free(gnode[i]); gnode[i] = (aas *)Malloc(chars * sizeof(aas )); } weight = (steparray)Malloc(chars*sizeof(long)); oldweight = (steparray)Malloc(chars*sizeof(long)); category = (steparray)Malloc(chars*sizeof(long)); } void inputoptions() { /* input the information on the options */ long i; if (!firstset && !justwts) { samenumsp(&chars, ith); reallocchars(); } if (firstset || !justwts) { for (i = 0; i < chars; i++) { category[i] = 1; oldweight[i] = 1; weight[i] = 1; } } /* if (!justwts && weights) {*/ if (justwts || weights) inputweights(chars, oldweight, &weights); if (printdata) putc('\n', outfile); if (usejtt && printdata) fprintf(outfile, " Jones-Taylor-Thornton model distance\n"); if (usepmb && printdata) fprintf(outfile, " Henikoff/Tillier PMB model distance\n"); if (usepam && printdata) fprintf(outfile, " Dayhoff PAM model distance\n"); if (kimura && printdata) fprintf(outfile, " Kimura protein distance\n"); if (!(usejtt || usepmb || usepam || kimura || similarity) && printdata) fprintf(outfile, " Categories model distance\n"); if (similarity) fprintf(outfile, " \n Table of similarity between sequences\n"); if ((ctgry && categs > 1) && (firstset || !justwts)) { inputcategs(0, chars, category, categs, "ProtDist"); if (printdata) printcategs(outfile, chars, category, "Position categories"); } else if (printdata && (categs > 1)) { fprintf(outfile, "\nPosition category Rate of change\n\n"); for (i = 1; i <= categs; i++) fprintf(outfile, "%15ld%13.3f\n", i, rate[i - 1]); putc('\n', outfile); printcategories(); } if (weights && printdata) printweights(outfile, 0, chars, oldweight, "Positions"); } /* inputoptions */ void protdist_inputdata() { /* input the names and sequences for each species */ long i, j, k, l, aasread=0, aasnew=0; Char charstate; boolean allread, done; aas aa=0; /* temporary amino acid for 
input */ if (progress) putchar('\n'); j = nmlngth + (chars + (chars - 1) / 10) / 2 - 5; if (j < nmlngth - 1) j = nmlngth - 1; if (j > 37) j = 37; if (printdata) { fprintf(outfile, "\nName"); for (i = 1; i <= j; i++) putc(' ', outfile); fprintf(outfile, "Sequences\n"); fprintf(outfile, "----"); for (i = 1; i <= j; i++) putc(' ', outfile); fprintf(outfile, "---------\n\n"); } aasread = 0; allread = false; while (!(allread)) { /* eat white space -- if the separator line has spaces on it*/ do { charstate = gettc(infile); } while (charstate == ' ' || charstate == '\t'); ungetc(charstate, infile); if (eoln(infile)) scan_eoln(infile); i = 1; while (i <= spp) { if ((interleaved && aasread == 0) || !interleaved) initname(i-1); if (interleaved) j = aasread; else j = 0; done = false; while (((!done) && (!(eoln(infile) || eoff(infile))))) { if (interleaved) done = true; while (((j < chars) & (!(eoln(infile) | eoff(infile))))) { charstate = gettc(infile); if (charstate == '\n' || charstate == '\t') charstate = ' '; if (charstate == ' ' || (charstate >= '0' && charstate <= '9')) continue; protdist_uppercase(&charstate); if ((!isalpha(charstate) && charstate != '.' && charstate != '?' && charstate != '-' && charstate != '*') || charstate == 'J' || charstate == 'O' || charstate == 'U' || charstate == '.') { printf("ERROR -- bad amino acid: %c at position %ld of species %3ld\n", charstate, j, i); if (charstate == '.') { printf(" Periods (.) 
may not be used as gap characters.\n"); printf(" The correct gap character is (-)\n"); } exxit(-1); } j++; switch (charstate) { case 'A': aa = ala; break; case 'B': aa = asx; break; case 'C': aa = cys; break; case 'D': aa = asp; break; case 'E': aa = glu; break; case 'F': aa = phe; break; case 'G': aa = gly; break; case 'H': aa = his; break; case 'I': aa = ileu; break; case 'K': aa = lys; break; case 'L': aa = leu; break; case 'M': aa = met; break; case 'N': aa = asn; break; case 'P': aa = pro; break; case 'Q': aa = gln; break; case 'R': aa = arg; break; case 'S': aa = ser; break; case 'T': aa = thr; break; case 'V': aa = val; break; case 'W': aa = trp; break; case 'X': aa = unk; break; case 'Y': aa = tyr; break; case 'Z': aa = glx; break; case '*': aa = stop; break; case '?': aa = quest; break; case '-': aa = del; break; } gnode[i - 1][j - 1] = aa; } if (interleaved) continue; if (j < chars) scan_eoln(infile); else if (j == chars) done = true; } if (interleaved && i == 1) aasnew = j; scan_eoln(infile); if ((interleaved && j != aasnew) || ((!interleaved) && j != chars)){ printf("ERROR: SEQUENCES OUT OF ALIGNMENT\n"); exxit(-1);} i++; } if (interleaved) { aasread = aasnew; allread = (aasread == chars); } else allread = (i > spp); } if ( printdata) { for (i = 1; i <= ((chars - 1) / 60 + 1); i++) { for (j = 1; j <= spp; j++) { for (k = 0; k < nmlngth; k++) putc(nayme[j - 1][k], outfile); fprintf(outfile, " "); l = i * 60; if (l > chars) l = chars; for (k = (i - 1) * 60 + 1; k <= l; k++) { if (j > 1 && gnode[j - 1][k - 1] == gnode[0][k - 1]) charstate = '.'; else { switch (gnode[j - 1][k - 1]) { case ala: charstate = 'A'; break; case asx: charstate = 'B'; break; case cys: charstate = 'C'; break; case asp: charstate = 'D'; break; case glu: charstate = 'E'; break; case phe: charstate = 'F'; break; case gly: charstate = 'G'; break; case his: charstate = 'H'; break; case ileu: charstate = 'I'; break; case lys: charstate = 'K'; break; case leu: charstate = 'L'; break; case 
met: charstate = 'M'; break; case asn: charstate = 'N'; break; case pro: charstate = 'P'; break; case gln: charstate = 'Q'; break; case arg: charstate = 'R'; break; case ser: charstate = 'S'; break; case thr: charstate = 'T'; break; case val: charstate = 'V'; break; case trp: charstate = 'W'; break; case tyr: charstate = 'Y'; break; case glx: charstate = 'Z'; break; case del: charstate = '-'; break; case stop: charstate = '*'; break; case unk: charstate = 'X'; break; case quest: charstate = '?'; break; default: /*cases ser1 and ser2 cannot occur*/ break; } } putc(charstate, outfile); if (k % 10 == 0 && k % 60 != 0) putc(' ', outfile); } putc('\n', outfile); } putc('\n', outfile); } putc('\n', outfile); } if (printdata) putc('\n', outfile); } /* protdist_inputdata */ void doinput() { /* reads the input data */ long i; double sumrates, weightsum; inputoptions(); if(!justwts || firstset) protdist_inputdata(); if (!ctgry) { categs = 1; rate[0] = 1.0; } weightsum = 0; for (i = 0; i < chars; i++) weightsum += oldweight[i]; sumrates = 0.0; for (i = 0; i < chars; i++) sumrates += oldweight[i] * rate[category[i] - 1]; for (i = 0; i < categs; i++) rate[i] *= weightsum / sumrates; } /* doinput */ void code() { /* make up table of the code 1 = u, 2 = c, 3 = a, 4 = g */ long n; aas b; trans[0][0][0] = phe; trans[0][0][1] = phe; trans[0][0][2] = leu; trans[0][0][3] = leu; trans[0][1][0] = ser; trans[0][1][1] = ser; trans[0][1][2] = ser; trans[0][1][3] = ser; trans[0][2][0] = tyr; trans[0][2][1] = tyr; trans[0][2][2] = stop; trans[0][2][3] = stop; trans[0][3][0] = cys; trans[0][3][1] = cys; trans[0][3][2] = stop; trans[0][3][3] = trp; trans[1][0][0] = leu; trans[1][0][1] = leu; trans[1][0][2] = leu; trans[1][0][3] = leu; trans[1][1][0] = pro; trans[1][1][1] = pro; trans[1][1][2] = pro; trans[1][1][3] = pro; trans[1][2][0] = his; trans[1][2][1] = his; trans[1][2][2] = gln; trans[1][2][3] = gln; trans[1][3][0] = arg; trans[1][3][1] = arg; trans[1][3][2] = arg; trans[1][3][3] = arg; 
trans[2][0][0] = ileu; trans[2][0][1] = ileu; trans[2][0][2] = ileu; trans[2][0][3] = met; trans[2][1][0] = thr; trans[2][1][1] = thr; trans[2][1][2] = thr; trans[2][1][3] = thr; trans[2][2][0] = asn; trans[2][2][1] = asn; trans[2][2][2] = lys; trans[2][2][3] = lys; trans[2][3][0] = ser; trans[2][3][1] = ser; trans[2][3][2] = arg; trans[2][3][3] = arg; trans[3][0][0] = val; trans[3][0][1] = val; trans[3][0][2] = val; trans[3][0][3] = val; trans[3][1][0] = ala; trans[3][1][1] = ala; trans[3][1][2] = ala; trans[3][1][3] = ala; trans[3][2][0] = asp; trans[3][2][1] = asp; trans[3][2][2] = glu; trans[3][2][3] = glu; trans[3][3][0] = gly; trans[3][3][1] = gly; trans[3][3][2] = gly; trans[3][3][3] = gly; if (whichcode == mito) trans[0][3][2] = trp; if (whichcode == vertmito) { trans[0][3][2] = trp; trans[2][3][2] = stop; trans[2][3][3] = stop; trans[2][0][2] = met; } if (whichcode == flymito) { trans[0][3][2] = trp; trans[2][0][2] = met; trans[2][3][2] = ser; } if (whichcode == yeastmito) { trans[0][3][2] = trp; trans[1][0][2] = thr; trans[2][0][2] = met; } n = 0; for (b = ala; (long)b <= (long)val; b = (aas)((long)b + 1)) { if (b != ser2) { n++; numaa[(long)b - (long)ala] = n; } } numaa[(long)ser - (long)ala] = (long)ser1 - (long)(ala) + 1; } /* code */ void protdist_cats() { /* define categories of amino acids */ aas b; /* fundamental subgroups */ cat[0] = 1; /* for alanine */ cat[(long)cys - (long)ala] = 1; cat[(long)met - (long)ala] = 2; cat[(long)val - (long)ala] = 3; cat[(long)leu - (long)ala] = 3; cat[(long)ileu - (long)ala] = 3; cat[(long)gly - (long)ala] = 4; cat[0] = 4; cat[(long)ser - (long)ala] = 4; cat[(long)thr - (long)ala] = 4; cat[(long)pro - (long)ala] = 5; cat[(long)phe - (long)ala] = 6; cat[(long)tyr - (long)ala] = 6; cat[(long)trp - (long)ala] = 6; cat[(long)glu - (long)ala] = 7; cat[(long)gln - (long)ala] = 7; cat[(long)asp - (long)ala] = 7; cat[(long)asn - (long)ala] = 7; cat[(long)lys - (long)ala] = 8; cat[(long)arg - (long)ala] = 8; cat[(long)his - 
(long)ala] = 8; if (whichcat == george) { /* George, Hunt and Barker: sulfhydryl, small hydrophobic, small hydrophilic, aromatic, acid/acid-amide/hydrophilic, basic */ for (b = ala; (long)b <= (long)val; b = (aas)((long)b + 1)) { if (cat[(long)b - (long)ala] == 3) cat[(long)b - (long)ala] = 2; if (cat[(long)b - (long)ala] == 5) cat[(long)b - (long)ala] = 4; } } if (whichcat == chemical) { /* Conn and Stumpf: monoamino, aliphatic, heterocyclic, aromatic, dicarboxylic, basic */ for (b = ala; (long)b <= (long)val; b = (aas)((long)b + 1)) { if (cat[(long)b - (long)ala] == 2) cat[(long)b - (long)ala] = 1; if (cat[(long)b - (long)ala] == 4) cat[(long)b - (long)ala] = 3; } } /* Ben Hall's personal opinion */ if (whichcat != hall) return; for (b = ala; (long)b <= (long)val; b = (aas)((long)b + 1)) { if (cat[(long)b - (long)ala] == 3) cat[(long)b - (long)ala] = 2; } } /* protdist_cats */ void maketrans() { /* Make up transition probability matrix from code and category tables */ long i, j, k, m, n, s, nb1, nb2; double x, sum; long sub[3], newsub[3]; double f[4], g[4]; aas b1, b2; double TEMP, TEMP1, TEMP2, TEMP3; for (i = 0; i <= 19; i++) { pie[i] = 0.0; for (j = 0; j <= 19; j++) prob[i][j] = 0.0; } f[0] = freqt; f[1] = freqc; f[2] = freqa; f[3] = freqg; g[0] = freqc + freqt; g[1] = freqc + freqt; g[2] = freqa + freqg; g[3] = freqa + freqg; TEMP = f[0]; TEMP1 = f[1]; TEMP2 = f[2]; TEMP3 = f[3]; fracchange = xi * (2 * f[0] * f[1] / g[0] + 2 * f[2] * f[3] / g[2]) + xv * (1 - TEMP * TEMP - TEMP1 * TEMP1 - TEMP2 * TEMP2 - TEMP3 * TEMP3); sum = 0.0; for (i = 0; i <= 3; i++) { for (j = 0; j <= 3; j++) { for (k = 0; k <= 3; k++) { if (trans[i][j][k] != stop) sum += f[i] * f[j] * f[k]; } } } for (i = 0; i <= 3; i++) { sub[0] = i + 1; for (j = 0; j <= 3; j++) { sub[1] = j + 1; for (k = 0; k <= 3; k++) { sub[2] = k + 1; b1 = trans[i][j][k]; for (m = 0; m <= 2; m++) { s = sub[m]; for (n = 1; n <= 4; n++) { memcpy(newsub, sub, sizeof(long) * 3L); newsub[m] = n; x = f[i] * f[j] * f[k] / 
(3.0 * sum); if (((s == 1 || s == 2) && (n == 3 || n == 4)) || ((n == 1 || n == 2) && (s == 3 || s == 4))) x *= xv * f[n - 1]; else x *= xi * f[n - 1] / g[n - 1] + xv * f[n - 1]; b2 = trans[newsub[0] - 1][newsub[1] - 1][newsub[2] - 1]; if (b1 != stop) { nb1 = numaa[(long)b1 - (long)ala]; pie[nb1 - 1] += x; if (b2 != stop) { nb2 = numaa[(long)b2 - (long)ala]; if (cat[(long)b1 - (long)ala] != cat[(long)b2 - (long)ala]) { prob[nb1 - 1][nb2 - 1] += x * ease; prob[nb1 - 1][nb1 - 1] += x * (1.0 - ease); } else prob[nb1 - 1][nb2 - 1] += x; } else prob[nb1 - 1][nb1 - 1] += x; } } } } } } for (i = 0; i <= 19; i++) prob[i][i] -= pie[i]; for (i = 0; i <= 19; i++) { for (j = 0; j <= 19; j++) prob[i][j] /= sqrt(pie[i] * pie[j]); } /* computes pi^(1/2)*B*pi^(-1/2) */ } /* maketrans */ void givens(double (*a)[20], long i, long j, long n, double ctheta, double stheta, boolean left) { /* Givens transform at i,j for 1..n with angle theta */ long k; double d; for (k = 0; k < n; k++) { if (left) { d = ctheta * a[i - 1][k] + stheta * a[j - 1][k]; a[j - 1][k] = ctheta * a[j - 1][k] - stheta * a[i - 1][k]; a[i - 1][k] = d; } else { d = ctheta * a[k][i - 1] + stheta * a[k][j - 1]; a[k][j - 1] = ctheta * a[k][j - 1] - stheta * a[k][i - 1]; a[k][i - 1] = d; } } } /* givens */ void coeffs(double x, double y, double *c, double *s, double accuracy) { /* compute cosine and sine of theta */ double root; root = sqrt(x * x + y * y); if (root < accuracy) { *c = 1.0; *s = 0.0; } else { *c = x / root; *s = y / root; } } /* coeffs */ void tridiag(double (*a)[20], long n, double accuracy) { /* Givens tridiagonalization */ long i, j; double s, c; for (i = 2; i < n; i++) { for (j = i + 1; j <= n; j++) { coeffs(a[i - 2][i - 1], a[i - 2][j - 1], &c, &s,accuracy); givens(a, i, j, n, c, s, true); givens(a, i, j, n, c, s, false); givens(eigvecs, i, j, n, c, s, true); } } } /* tridiag */ void shiftqr(double (*a)[20], long n, double accuracy) { /* QR eigenvalue-finder */ long i, j; double approx, s, c, d, TEMP, 
TEMP1; for (i = n; i >= 2; i--) { do { TEMP = a[i - 2][i - 2] - a[i - 1][i - 1]; TEMP1 = a[i - 1][i - 2]; d = sqrt(TEMP * TEMP + TEMP1 * TEMP1); approx = a[i - 2][i - 2] + a[i - 1][i - 1]; if (a[i - 1][i - 1] < a[i - 2][i - 2]) approx = (approx - d) / 2.0; else approx = (approx + d) / 2.0; for (j = 0; j < i; j++) a[j][j] -= approx; for (j = 1; j < i; j++) { coeffs(a[j - 1][j - 1], a[j][j - 1], &c, &s, accuracy); givens(a, j, j + 1, i, c, s, true); givens(a, j, j + 1, i, c, s, false); givens(eigvecs, j, j + 1, n, c, s, true); } for (j = 0; j < i; j++) a[j][j] += approx; } while (fabs(a[i - 1][i - 2]) > accuracy); } } /* shiftqr */ void qreigen(double (*prob)[20], long n) { /* QR eigenvector/eigenvalue method for symmetric matrix */ double accuracy; long i, j; accuracy = 1.0e-6; for (i = 0; i < n; i++) { for (j = 0; j < n; j++) eigvecs[i][j] = 0.0; eigvecs[i][i] = 1.0; } tridiag(prob, n, accuracy); shiftqr(prob, n, accuracy); for (i = 0; i < n; i++) eig[i] = prob[i][i]; for (i = 0; i <= 19; i++) { for (j = 0; j <= 19; j++) prob[i][j] = sqrt(pie[j]) * eigvecs[i][j]; } /* prob[i][j] is the value of U' times pi^(1/2) */ } /* qreigen */ void jtteigen() { /* eigenanalysis for JTT matrix, precomputed */ memcpy(prob,jttprobs,sizeof(jttprobs)); memcpy(eig,jtteigs,sizeof(jtteigs)); fracchange = 1.0; /** changed from 0.01 **/ } /* jtteigen */ void pmbeigen() { /* eigenanalysis for PMB matrix, precomputed */ memcpy(prob,pmbprobs,sizeof(pmbprobs)); memcpy(eig,pmbeigs,sizeof(pmbeigs)); fracchange = 1.0; } /* pmbeigen */ void pameigen() { /* eigenanalysis for PAM matrix, precomputed */ memcpy(prob,pamprobs,sizeof(pamprobs)); memcpy(eig,pameigs,sizeof(pameigs)); fracchange = 1.0; /** changed from 0.01 **/ } /* pameigen */ void predict(long nb1, long nb2, long cat) { /* make contribution to prediction of this aa pair */ long m; double TEMP; for (m = 0; m <= 19; m++) { if (gama || invar) elambdat = exp(-cvi*log(1.0-rate[cat-1]*tt*(eig[m]/(1.0-invarfrac))/cvi)); else elambdat = 
exp(rate[cat-1]*tt * eig[m]); q = prob[m][nb1 - 1] * prob[m][nb2 - 1] * elambdat; p += q; if (!gama && !invar) dp += rate[cat-1]*eig[m] * q; else dp += (rate[cat-1]*eig[m]/(1.0-rate[cat-1]*tt*(eig[m]/(1.0-invarfrac))/cvi)) * q; TEMP = eig[m]; if (!gama && !invar) d2p += TEMP * TEMP * q; else d2p += (rate[cat-1]*rate[cat-1]*eig[m]*eig[m]*(1.0+1.0/cvi)/ ((1.0-rate[cat-1]*tt*eig[m]/cvi) *(1.0-rate[cat-1]*tt*eig[m]/cvi))) * q; } if (nb1 == nb2) { p *= (1.0 - invarfrac); p += invarfrac; } dp *= (1.0 - invarfrac); d2p *= (1.0 - invarfrac); } /* predict */ void makedists() { /* compute the distances */ long i, j, k, m, n, itterations, nb1, nb2, cat; double delta, lnlike, slope, curv; boolean neginfinity, inf, overlap; aas b1, b2; if (!(printdata || similarity)) fprintf(outfile, "%5ld\n", spp); if (progress) printf("Computing distances:\n"); for (i = 1; i <= spp; i++) { if (progress) printf(" "); if (progress) { for (j = 0; j < nmlngth; j++) putchar(nayme[i - 1][j]); } if (progress) { printf(" "); fflush(stdout); } if (similarity) d[i-1][i-1] = 1.0; else d[i-1][i-1] = 0.0; for (j = 0; j <= i - 2; j++) { if (!(kimura || similarity)) { if (usejtt || usepmb || usepam) tt = 0.1/fracchange; else tt = 1.0; delta = tt / 2.0; itterations = 0; inf = false; do { lnlike = 0.0; slope = 0.0; curv = 0.0; neginfinity = false; overlap = false; for (k = 0; k < chars; k++) { if (oldweight[k] > 0) { cat = category[k]; b1 = gnode[i - 1][k]; b2 = gnode[j][k]; if (b1 != stop && b1 != del && b1 != quest && b1 != unk && b2 != stop && b2 != del && b2 != quest && b2 != unk) { overlap = true; p = 0.0; dp = 0.0; d2p = 0.0; nb1 = numaa[(long)b1 - (long)ala]; nb2 = numaa[(long)b2 - (long)ala]; if (b1 != asx && b1 != glx && b2 != asx && b2 != glx) predict(nb1, nb2, cat); else { if (b1 == asx) { if (b2 == asx) { predict(3L, 3L, cat); predict(3L, 4L, cat); predict(4L, 3L, cat); predict(4L, 4L, cat); } else { if (b2 == glx) { predict(3L, 6L, cat); predict(3L, 7L, cat); predict(4L, 6L, cat); predict(4L, 7L, 
cat); } else { predict(3L, nb2, cat); predict(4L, nb2, cat); } } } else { if (b1 == glx) { if (b2 == asx) { predict(6L, 3L, cat); predict(6L, 4L, cat); predict(7L, 3L, cat); predict(7L, 4L, cat); } else { if (b2 == glx) { predict(6L, 6L, cat); predict(6L, 7L, cat); predict(7L, 6L, cat); predict(7L, 7L, cat); } else { predict(6L, nb2, cat); predict(7L, nb2, cat); } } } else { if (b2 == asx) { predict(nb1, 3L, cat); predict(nb1, 4L, cat); predict(nb1, 3L, cat); predict(nb1, 4L, cat); } else if (b2 == glx) { predict(nb1, 6L, cat); predict(nb1, 7L, cat); predict(nb1, 6L, cat); predict(nb1, 7L, cat); } } } } if (p <= 0.0) neginfinity = true; else { lnlike += oldweight[k]*log(p); slope += oldweight[k]*dp / p; curv += oldweight[k]*(d2p / p - dp * dp / (p * p)); } } } } itterations++; if (!overlap){ printf("\nWARNING: NO OVERLAP BETWEEN SEQUENCES %ld AND %ld; -1.0 WAS WRITTEN\n", i, j+1); tt = -1.0/fracchange; itterations = 20; inf = true; } else if (!neginfinity) { if (curv < 0.0) { tt -= slope / curv; if (tt > 10000.0) { printf("\nWARNING: INFINITE DISTANCE BETWEEN SPECIES %ld AND %ld; -1.0 WAS WRITTEN\n", i, j+1); tt = -1.0/fracchange; inf = true; itterations = 20; } } else { if ((slope > 0.0 && delta < 0.0) || (slope < 0.0 && delta > 0.0)) delta /= -2; tt += delta; } } else { delta /= -2; tt += delta; } if (tt < protepsilon && !inf) tt = protepsilon; } while (itterations != 20); } else { m = 0; n = 0; for (k = 0; k < chars; k++) { b1 = gnode[i - 1][k]; b2 = gnode[j][k]; if ((((long)b1 <= (long)val) || ((long)b1 == (long)ser)) && (((long)b2 <= (long)val) || ((long)b2 == (long)ser))) { if (b1 == b2) m++; n++; } } p = 1 - (double)m / n; if (kimura) { dp = 1.0 - p - 0.2 * p * p; if (dp < 0.0) { printf( "\nDISTANCE BETWEEN SEQUENCES %3ld AND %3ld IS TOO LARGE FOR KIMURA FORMULA\n", i, j + 1); tt = -1.0; } else tt = -log(dp); } else { /* if similarity */ tt = 1.0 - p; } } d[i - 1][j] = fracchange * tt; d[j][i - 1] = d[i - 1][j]; if (progress) { putchar('.'); fflush(stdout); 
      }
    }
    if (progress) {
      putchar('\n');
      fflush(stdout);
    }
  }
  if (!similarity) {
    for (i = 0; i < spp; i++) {
      for (j = 0; j < nmlngth; j++)
        putc(nayme[i][j], outfile);
      k = spp;
      for (j = 1; j <= k; j++) {
        if (d[i][j-1] < 100.0)
          fprintf(outfile, "%10.6f", d[i][j-1]);
        else if (d[i][j-1] < 1000.0)
          fprintf(outfile, " %10.6f", d[i][j-1]);
        else
          fprintf(outfile, " %11.6f", d[i][j-1]);
        if ((j + 1) % 7 == 0 && j < k)
          putc('\n', outfile);
      }
      putc('\n', outfile);
    }
  } else {
    for (i = 0; i < spp; i += 6) {
      if ((i+6) < spp)
        n = i+6;
      else
        n = spp;
      fprintf(outfile, " ");
      for (j = i; j < n ; j++) {
        for (k = 0; k < (nmlngth-2); k++)
          putc(nayme[j][k], outfile);
        putc(' ', outfile);
        putc(' ', outfile);
      }
      putc('\n', outfile);
      for (j = 0; j < spp; j++) {
        for (k = 0; k < nmlngth; k++)
          putc(nayme[j][k], outfile);
        if ((i+6) < spp)
          n = i+6;
        else
          n = spp;
        for (k = i; k < n ; k++)
          if (d[j][k] < 100.0)
            fprintf(outfile, "%10.6f", d[j][k]);
          else if (d[j][k] < 1000.0)
            fprintf(outfile, " %10.6f", d[j][k]);
          else
            fprintf(outfile, " %11.6f", d[j][k]);
        putc('\n', outfile);
      }
      putc('\n', outfile);
    }
  }
  if (progress)
    printf("\nOutput written to file \"%s\"\n\n", outfilename);
}  /* makedists */

int main(int argc, Char *argv[])
{  /* ML Protein distances by PMB, JTT, PAM or categories model */
#ifdef MAC
  argc = 1;  /* macsetup("Protdist",""); */
  argv[0] = "Protdist";
#endif
  init(argc, argv);
  openfile(&infile,INFILE,"input file","r",argv[0],infilename);
  openfile(&outfile,OUTFILE,"output file","w",argv[0],outfilename);
  ibmpc = IBMCRT;
  ansi = ANSICRT;
  mulsets = false;
  datasets = 1;
  firstset = true;
  doinit();
  if (!(kimura || similarity))
    code();
  if (!(usejtt || usepmb || usepam || kimura || similarity)) {
    protdist_cats();
    maketrans();
    qreigen(prob, 20L);
  } else {
    if (kimura || similarity)
      fracchange = 1.0;
    else {
      if (usejtt)
        jtteigen();
      else {
        if (usepmb)
          pmbeigen();
        else
          pameigen();
      }
    }
  }
  if (ctgry)
    openfile(&catfile,CATFILE,"categories file","r",argv[0],catfilename);
  if (weights || justwts)
    openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename);
  for (ith = 1; ith <= datasets; ith++) {
    doinput();
    if (ith == 1)
      firstset = false;
    if ((datasets > 1) && progress)
      printf("\nData set # %ld:\n\n", ith);
    makedists();
  }
  FClose(outfile);
  FClose(infile);
#ifdef MAC
  fixmacfile(outfilename);
#endif
  printf("Done.\n\n");
#ifdef WIN32
  phyRestoreConsoleAttributes();
#endif
  return 0;
}  /* Protein distances */

phylip-3.697/src/protpars.c0000644004732000473200000015067212406201117015357 0ustar joefelsenst_g
#include "phylip.h"
#include "seq.h"

/* version 3.696.
   Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe.

   Copyright (c) 1993-2014, Joseph Felsenstein. All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions are met:

   1. Redistributions of source code must retain the above copyright notice,
      this list of conditions and the following disclaimer.

   2. Redistributions in binary form must reproduce the above copyright notice,
      this list of conditions and the following disclaimer in the documentation
      and/or other materials provided with the distribution.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
   AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
   SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE.
*/

#define maxtrees 100 /* maximum number of tied trees stored */

typedef enum {
  universal, ciliate, mito, vertmito, flymito, yeastmito
} codetype;

/* nodes will form a binary tree */

typedef struct gseq {
  seqptr seq;
  struct gseq *next;
} gseq;

#ifndef OLDC
/* function prototypes */
void protgnu(gseq **);
void protchuck(gseq *);
void code(void);
void setup(void);
void getoptions(void);
void protalloctree(void);
void allocrest(void);
void doinit(void);
void protinputdata(void);
void protmakevalues(void);
void doinput(void);
void protfillin(node *, node *, node *);
void protpreorder(node *);
void protadd(node *, node *, node *);
void protre_move(node **, node **);
void evaluate(node *);
void protpostorder(node *);
void protreroot(node *);
void protsavetraverse(node *, long *, boolean *);
void protsavetree(long *, boolean *);
void tryadd(node *, node **, node **);
void addpreorder(node *, node *, node *);
void tryrearr(node *, boolean *);
void repreorder(node *, boolean *);
void rearrange(node **);
void protgetch(Char *);
void protaddelement(node **, long *, long *, boolean *);
void prottreeread(void);
void protancestset(long *, long *, long *, long *, long *);
void prothyprint(long , long , boolean *, node *, boolean *, boolean *);
void prothyptrav(node *, sitearray *, long, long, long *, boolean *, sitearray);
void prothypstates(long *);
void describe(void);
void maketree(void);
void reallocnode(node* p);
void reallocchars(void);
/* function prototypes */
#endif

Char infilename[FNMLNGTH], outfilename[FNMLNGTH], intreename[FNMLNGTH],
     outtreename[FNMLNGTH], weightfilename[FNMLNGTH];
node *root;
long chars, col, msets, ith, njumble, jumb;
/* chars = number of sites in actual sequences */
long inseed, inseed0;
boolean jumble, usertree, weights, thresh, trout, progress, stepbox,
        justwts, ancseq, mulsets, firstset;
codetype whichcode;
long fullset, fulldel;
pointarray treenode; /* pointers to all nodes in tree */
double threshold;
steptr threshwt;
longer seed;
long
*enterorder; sitearray translate[(long)quest - (long)ala + 1]; aas trans[4][4][4]; long **fsteps; bestelm *bestrees; boolean dummy; gseq *garbage; node *temp, *temp1; Char ch; aas tmpa; char *progname; /* Local variables for maketree, propagated globally for c version: */ long minwhich; double like, bestyet, bestlike, minsteps, bstlike2; boolean lastrearr, recompute; node *there; double nsteps[maxuser]; long *place; boolean *names; void protgnu(gseq **p) { /* this and the following are do-it-yourself garbage collectors. Make a new node or pull one off the garbage list */ if (garbage != NULL) { *p = garbage; free((*p)->seq); (*p)->seq = (seqptr)Malloc(chars*sizeof(sitearray)); garbage = garbage->next; } else { *p = (gseq *)Malloc(sizeof(gseq)); (*p)->seq = (seqptr)Malloc(chars*sizeof(sitearray)); } (*p)->next = NULL; } /* protgnu */ void protchuck(gseq *p) { /* collect garbage on p -- put it on front of garbage list */ p->next = garbage; garbage = p; } /* protchuck */ void code() { /* make up table of the code 1 = u, 2 = c, 3 = a, 4 = g */ trans[0][0][0] = phe; trans[0][0][1] = phe; trans[0][0][2] = leu; trans[0][0][3] = leu; trans[0][1][0] = ser1; trans[0][1][1] = ser1; trans[0][1][2] = ser1; trans[0][1][3] = ser1; trans[0][2][0] = tyr; trans[0][2][1] = tyr; trans[0][2][2] = stop; trans[0][2][3] = stop; trans[0][3][0] = cys; trans[0][3][1] = cys; trans[0][3][2] = stop; trans[0][3][3] = trp; trans[1][0][0] = leu; trans[1][0][1] = leu; trans[1][0][2] = leu; trans[1][0][3] = leu; trans[1][1][0] = pro; trans[1][1][1] = pro; trans[1][1][2] = pro; trans[1][1][3] = pro; trans[1][2][0] = his; trans[1][2][1] = his; trans[1][2][2] = gln; trans[1][2][3] = gln; trans[1][3][0] = arg; trans[1][3][1] = arg; trans[1][3][2] = arg; trans[1][3][3] = arg; trans[2][0][0] = ileu; trans[2][0][1] = ileu; trans[2][0][2] = ileu; trans[2][0][3] = met; trans[2][1][0] = thr; trans[2][1][1] = thr; trans[2][1][2] = thr; trans[2][1][3] = thr; trans[2][2][0] = asn; trans[2][2][1] = asn; 
trans[2][2][2] = lys; trans[2][2][3] = lys; trans[2][3][0] = ser2; trans[2][3][1] = ser2; trans[2][3][2] = arg; trans[2][3][3] = arg; trans[3][0][0] = val; trans[3][0][1] = val; trans[3][0][2] = val; trans[3][0][3] = val; trans[3][1][0] = ala; trans[3][1][1] = ala; trans[3][1][2] = ala; trans[3][1][3] = ala; trans[3][2][0] = asp; trans[3][2][1] = asp; trans[3][2][2] = glu; trans[3][2][3] = glu; trans[3][3][0] = gly; trans[3][3][1] = gly; trans[3][3][2] = gly; trans[3][3][3] = gly; if (whichcode == mito) trans[0][3][2] = trp; if (whichcode == vertmito) { trans[0][3][2] = trp; trans[2][3][2] = stop; trans[2][3][3] = stop; trans[2][0][2] = met; } if (whichcode == flymito) { trans[0][3][2] = trp; trans[2][0][2] = met; trans[2][3][2] = ser2; } if (whichcode == yeastmito) { trans[0][3][2] = trp; trans[1][0][2] = thr; trans[2][0][2] = met; } } /* code */ void setup() { /* set up set table to get aasets from aas */ aas a, b; long i, j, k, l, s; for (a = ala; (long)a <= (long)stop; a = (aas)((long)a + 1)) { translate[(long)a - (long)ala][0] = 1L << ((long)a); translate[(long)a - (long)ala][1] = 1L << ((long)a); } for (i = 0; i <= 3; i++) { for (j = 0; j <= 3; j++) { for (k = 0; k <= 3; k++) { for (l = 0; l <= 3; l++) { translate[(long)trans[i][j][k]][1] |= (1L << (long)trans[l][j][k]); translate[(long)trans[i][j][k]][1] |= (1L << (long)trans[i][l][k]); translate[(long)trans[i][j][k]][1] |= (1L << (long)trans[i][j][l]); } } } } translate[(long)del - (long)ala][1] = 1L << ((long)del); fulldel = (1L << ((long)stop + 1)) - (1L << ((long)ala)); fullset = fulldel & (~(1L << ((long)del))); translate[(long)asx - (long)ala][0] = (1L << ((long)asn)) | (1L << ((long)asp)); translate[(long)glx - (long)ala][0] = (1L << ((long)gln)) | (1L << ((long)glu)); translate[(long)ser - (long)ala][0] = (1L << ((long)ser1)) | (1L << ((long)ser2)); translate[(long)unk - (long)ala][0] = fullset; translate[(long)quest - (long)ala][0] = fulldel; translate[(long)asx - (long)ala][1] = translate[(long)asn 
- (long)ala][1] | translate[(long)asp - (long)ala][1]; translate[(long)glx - (long)ala][1] = translate[(long)gln - (long)ala][1] | translate[(long)glu - (long)ala][1]; translate[(long)ser - (long)ala][1] = translate[(long)ser1 - (long)ala][1] | translate[(long)ser2 - (long)ala][1]; translate[(long)unk - (long)ala][1] = fullset; translate[(long)quest - (long)ala][1] = fulldel; for (a = ala; (long)a <= (long)quest; a = (aas)((long)a + 1)) { s = 0; for (b = ala; (long)b <= (long)stop; b = (aas)((long)b + 1)) { if (((1L << ((long)b)) & translate[(long)a - (long)ala][1]) != 0) s |= translate[(long)b - (long)ala][1]; } translate[(long)a - (long)ala][2] = s; } } /* setup */ void getoptions() { /* interactively set options */ long loopcount, loopcount2; Char ch, ch2; fprintf(outfile, "\nProtein parsimony algorithm, version %s\n\n",VERSION); putchar('\n'); jumble = false; njumble = 1; outgrno = 1; outgropt = false; thresh = false; trout = true; usertree = false; weights = false; whichcode = universal; printdata = false; progress = true; treeprint = true; stepbox = false; ancseq = false; dotdiff = true; interleaved = true; loopcount = 0; for (;;) { cleerhome(); printf("\nProtein parsimony algorithm, version %s\n\n",VERSION); printf("Setting for this run:\n"); printf(" U Search for best tree? %s\n", (usertree ? "No, use user trees in input file" : "Yes")); if (!usertree) { printf(" J Randomize input order of sequences?"); if (jumble) printf(" Yes (seed =%8ld,%3ld times)\n", inseed0, njumble); else printf(" No. Use input order\n"); } printf(" O Outgroup root?"); if (outgropt) printf(" Yes, at sequence number%3ld\n", outgrno); else printf(" No, use as outgroup species%3ld\n", outgrno); printf(" T Use Threshold parsimony?"); if (thresh) printf(" Yes, count steps up to%4.1f per site\n", threshold); else printf(" No, use ordinary parsimony\n"); printf(" C Use which genetic code? %s\n", (whichcode == universal) ? "Universal" : (whichcode == ciliate) ? 
"Ciliate" : (whichcode == mito) ? "Universal mitochondrial" : (whichcode == vertmito) ? "Vertebrate mitochondrial" : (whichcode == flymito) ? "Fly mitochondrial" : (whichcode == yeastmito) ? "Yeast mitochondrial" : ""); printf(" W Sites weighted? %s\n", (weights ? "Yes" : "No")); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld %s\n", msets, (justwts ? "sets of weights" : "data sets")); else printf(" No\n"); printf(" I Input sequences interleaved? %s\n", (interleaved ? "Yes" : "No, sequential")); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", (ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)")); printf(" 1 Print out the data at start of run %s\n", (printdata ? "Yes" : "No")); printf(" 2 Print indications of progress of run %s\n", (progress ? "Yes" : "No")); printf(" 3 Print out tree %s\n", (treeprint ? "Yes" : "No")); printf(" 4 Print out steps in each site %s\n", (stepbox ? "Yes" : "No")); printf(" 5 Print sequences at all nodes of tree %s\n", (ancseq ? "Yes" : "No")); if (ancseq || printdata) printf(" . Use dot-differencing to display them %s\n", dotdiff ? "Yes" : "No"); printf(" 6 Write out trees onto tree file? %s\n", (trout ? "Yes" : "No")); if(weights && justwts){ printf( "WARNING: W option and Multiple Weights options are both on. "); printf( "The W menu option is unnecessary and has no additional effect. \n"); } printf( "\nAre these settings correct? 
(type Y or the letter for one to change)\n");
    fflush(stdout);
    scanf("%c%*[^\n]", &ch);
    getchar();
    uppercase(&ch);
    if (ch == 'Y')
      break;
    if (((!usertree) && (strchr("WCJOTUMI12345.60", ch) != NULL))
        || (usertree && ((strchr("WCOTUMI12345.60", ch) != NULL)))){
      switch (ch) {

      case 'J':
        jumble = !jumble;
        if (jumble)
          initjumble(&inseed, &inseed0, seed, &njumble);
        else njumble = 1;
        break;

      case 'W':
        weights = !weights;
        break;

      case 'O':
        outgropt = !outgropt;
        if (outgropt)
          initoutgroup(&outgrno, spp);
        else outgrno = 1;
        break;

      case 'T':
        thresh = !thresh;
        if (thresh)
          initthreshold(&threshold);
        break;

      case 'C':
        printf("\nWhich genetic code?\n");
        printf(" type for\n\n");
        printf(" U Universal\n");
        printf(" M Mitochondrial\n");
        printf(" V Vertebrate mitochondrial\n");
        printf(" F Fly mitochondrial\n");
        printf(" Y Yeast mitochondrial\n\n");
        loopcount2 = 0;
        do {
          printf("type U, M, V, F, or Y\n");
          fflush(stdout);
          scanf("%c%*[^\n]", &ch);
          getchar();
          if (ch == '\n')
            ch = ' ';
          uppercase(&ch);
          countup(&loopcount2, 10);
        } while (ch != 'U' && ch != 'M' && ch != 'V' && ch != 'F' && ch != 'Y');
        switch (ch) {

        case 'U':
          whichcode = universal;
          break;

        case 'M':
          whichcode = mito;
          break;

        case 'V':
          whichcode = vertmito;
          break;

        case 'F':
          whichcode = flymito;
          break;

        case 'Y':
          whichcode = yeastmito;
          break;
        }
        break;

      case 'M':
        mulsets = !mulsets;
        if (mulsets){
          printf("Multiple data sets or multiple weights?");
          loopcount2 = 0;
          do {
            printf(" (type D or W)\n");
#ifdef WIN32
            phyFillScreenColor();
#endif
            fflush(stdout);
            scanf("%c%*[^\n]", &ch2);
            getchar();
            if (ch2 == '\n')
              ch2 = ' ';
            uppercase(&ch2);
            countup(&loopcount2, 10);
          } while ((ch2 != 'W') && (ch2 != 'D'));
          justwts = (ch2 == 'W');
          if (justwts)
            justweights(&msets);
          else
            initdatasets(&msets);
          if (!jumble) {
            jumble = true;
            initjumble(&inseed, &inseed0, seed, &njumble);
          }
        }
        break;

      case 'I':
        interleaved = !interleaved;
        break;

      case 'U':
        usertree = !usertree;
        break;

      case '0':
        initterminal(&ibmpc, &ansi);
        break;

      case '1':
        printdata = !printdata;
        break;

      case
'2':
        progress = !progress;
        break;

      case '3':
        treeprint = !treeprint;
        break;

      case '4':
        stepbox = !stepbox;
        break;

      case '5':
        ancseq = !ancseq;
        break;

      case '.':
        dotdiff = !dotdiff;
        break;

      case '6':
        trout = !trout;
        break;
      }
    } else
      printf("Not a possible option!\n");
    countup(&loopcount, 100);
  }
}  /* getoptions */


void protalloctree()
{ /* allocate treenode dynamically */
  long i, j;
  node *p, *q;

  treenode = (pointarray)Malloc(nonodes*sizeof(node *));
  /* tips: one node record each */
  for (i = 0; i < (spp); i++) {
    treenode[i] = (node *)Malloc(sizeof(node));
    treenode[i]->numsteps = (steptr)Malloc(chars*sizeof(long));
    treenode[i]->siteset = (seqptr)Malloc(chars*sizeof(sitearray));
    treenode[i]->seq = (aas *)Malloc(chars*sizeof(aas));
  }
  /* interior forks: a ring of three node records linked through next */
  for (i = spp; i < (nonodes); i++) {
    q = NULL;
    for (j = 1; j <= 3; j++) {
      p = (node *)Malloc(sizeof(node));
      p->numsteps = (steptr)Malloc(chars*sizeof(long));
      p->siteset = (seqptr)Malloc(chars*sizeof(sitearray));
      p->seq = (aas *)Malloc(chars*sizeof(aas));
      p->next = q;
      q = p;
    }
    p->next->next->next = p;
    treenode[i] = p;
  }
}  /* protalloctree */


void reallocnode(node* p)
{ /* free and reallocate the per-site arrays of one node */
  free(p->numsteps);
  free(p->siteset);
  free(p->seq);
  p->numsteps = (steptr)Malloc(chars*sizeof(long));
  p->siteset = (seqptr)Malloc(chars*sizeof(sitearray));
  p->seq = (aas *)Malloc(chars*sizeof(aas));
}  /* reallocnode */


void reallocchars(void)
{ /* reallocates variables that are dependent on the number of chars
   * do we need to reallocate the garbage list too?
*/ long i; node *p; if (usertree) for (i = 0; i < maxuser; i++) { free(fsteps[i]); fsteps[i] = (long *)Malloc(chars*sizeof(long)); } for (i = 0; i < nonodes; i++) { reallocnode(treenode[i]); if (i >= spp) { p=treenode[i]->next; while (p != treenode[i]) { reallocnode(p); p = p->next; } } } free(weight); free(threshwt); free(temp->numsteps); free(temp->siteset); free(temp->seq); free(temp1->numsteps); free(temp1->siteset); free(temp1->seq); weight = (steptr)Malloc(chars*sizeof(long)); threshwt = (steptr)Malloc(chars*sizeof(long)); temp->numsteps = (steptr)Malloc(chars*sizeof(long)); temp->siteset = (seqptr)Malloc(chars*sizeof(sitearray)); temp->seq = (aas *)Malloc(chars*sizeof(aas)); temp1->numsteps = (steptr)Malloc(chars*sizeof(long)); temp1->siteset = (seqptr)Malloc(chars*sizeof(sitearray)); temp1->seq = (aas *)Malloc(chars*sizeof(aas)); } void allocrest() { /* allocate remaining global arrays and variables dynamically */ long i; if (usertree) { fsteps = (long **)Malloc(maxuser*sizeof(long *)); for (i = 0; i < maxuser; i++) fsteps[i] = (long *)Malloc(chars*sizeof(long)); } bestrees = (bestelm *)Malloc(maxtrees*sizeof(bestelm)); for (i = 1; i <= maxtrees; i++) bestrees[i - 1].btree = (long *)Malloc(spp*sizeof(long)); nayme = (naym *)Malloc(spp*sizeof(naym)); enterorder = (long *)Malloc(spp*sizeof(long)); place = (long *)Malloc(nonodes*sizeof(long)); weight = (steptr)Malloc(chars*sizeof(long)); threshwt = (steptr)Malloc(chars*sizeof(long)); temp = (node *)Malloc(sizeof(node)); temp->numsteps = (steptr)Malloc(chars*sizeof(long)); temp->siteset = (seqptr)Malloc(chars*sizeof(sitearray)); temp->seq = (aas *)Malloc(chars*sizeof(aas)); temp1 = (node *)Malloc(sizeof(node)); temp1->numsteps = (steptr)Malloc(chars*sizeof(long)); temp1->siteset = (seqptr)Malloc(chars*sizeof(sitearray)); temp1->seq = (aas *)Malloc(chars*sizeof(aas)); } /* allocrest */ void doinit() { /* initializes variables */ inputnumbers(&spp, &chars, &nonodes, 1); getoptions(); if (printdata) 
fprintf(outfile, "%2ld species, %3ld sites\n\n", spp, chars); protalloctree(); allocrest(); } /* doinit*/ void protinputdata() { /* input the names and sequences for each species */ long i, j, k, l, aasread, aasnew = 0; Char charstate; boolean allread, done; aas aa; /* temporary amino acid for input */ if (printdata) headings(chars, "Sequences", "---------"); aasread = 0; allread = false; while (!(allread)) { /* eat white space -- if the separator line has spaces on it*/ do { charstate = gettc(infile); } while (charstate == ' ' || charstate == '\t'); ungetc(charstate, infile); if (eoln(infile)) { scan_eoln(infile); } i = 1; while (i <= spp) { if ((interleaved && aasread == 0) || !interleaved) initname(i - 1); j = interleaved ? aasread : 0; done = false; while (!done && !eoff(infile)) { if (interleaved) done = true; while (j < chars && !(eoln(infile) || eoff(infile))) { charstate = gettc(infile); if (charstate == '\n' || charstate == '\t') charstate = ' '; if (charstate == ' ' || (charstate >= '0' && charstate <= '9')) continue; uppercase(&charstate); if ((!isalpha(charstate) && charstate != '?' && charstate != '-' && charstate != '*') || charstate == 'J' || charstate == 'O' || charstate == 'U') { printf("WARNING -- BAD AMINO ACID:%c",charstate); printf(" AT POSITION%5ld OF SPECIES %3ld\n",j,i); exxit(-1); } j++; aa = (charstate == 'A') ? ala : (charstate == 'B') ? asx : (charstate == 'C') ? cys : (charstate == 'D') ? asp : (charstate == 'E') ? glu : (charstate == 'F') ? phe : (charstate == 'G') ? gly : aa; aa = (charstate == 'H') ? his : (charstate == 'I') ? ileu : (charstate == 'K') ? lys : (charstate == 'L') ? leu : (charstate == 'M') ? met : (charstate == 'N') ? asn : (charstate == 'P') ? pro : (charstate == 'Q') ? gln : (charstate == 'R') ? arg : aa; aa = (charstate == 'S') ? ser : (charstate == 'T') ? thr : (charstate == 'V') ? val : (charstate == 'W') ? trp : (charstate == 'X') ? unk : (charstate == 'Y') ? tyr : (charstate == 'Z') ? glx : (charstate == '*') ? 
stop : (charstate == '?') ? quest: (charstate == '-') ? del : aa; treenode[i - 1]->seq[j - 1] = aa; memcpy(treenode[i - 1]->siteset[j - 1], translate[(long)aa - (long)ala], sizeof(sitearray)); } if (interleaved) continue; if (j < chars) scan_eoln(infile); else if (j == chars) done = true; } if (interleaved && i == 1) aasnew = j; scan_eoln(infile); if ((interleaved && j != aasnew) || ((!interleaved) && j != chars)){ printf("ERROR: SEQUENCES OUT OF ALIGNMENT\n"); exxit(-1);} i++; } if (interleaved) { aasread = aasnew; allread = (aasread == chars); } else allread = (i > spp); } if (printdata) { for (i = 1; i <= ((chars - 1) / 60 + 1); i++) { for (j = 1; j <= (spp); j++) { for (k = 0; k < nmlngth; k++) putc(nayme[j - 1][k], outfile); fprintf(outfile, " "); l = i * 60; if (l > chars) l = chars; for (k = (i - 1) * 60 + 1; k <= l; k++) { if (j > 1 && treenode[j - 1]->seq[k - 1] == treenode[0]->seq[k - 1]) charstate = '.'; else { tmpa = treenode[j-1]->seq[k-1]; charstate = (tmpa == ala) ? 'A' : (tmpa == asx) ? 'B' : (tmpa == cys) ? 'C' : (tmpa == asp) ? 'D' : (tmpa == glu) ? 'E' : (tmpa == phe) ? 'F' : (tmpa == gly) ? 'G' : (tmpa == his) ? 'H' : (tmpa ==ileu) ? 'I' : (tmpa == lys) ? 'K' : (tmpa == leu) ? 'L' : charstate; charstate = (tmpa == met) ? 'M' : (tmpa == asn) ? 'N' : (tmpa == pro) ? 'P' : (tmpa == gln) ? 'Q' : (tmpa == arg) ? 'R' : (tmpa == ser) ? 'S' : (tmpa ==ser1) ? 'S' : (tmpa ==ser2) ? 'S' : charstate; charstate = (tmpa == thr) ? 'T' : (tmpa == val) ? 'V' : (tmpa == trp) ? 'W' : (tmpa == unk) ? 'X' : (tmpa == tyr) ? 'Y' : (tmpa == glx) ? 'Z' : (tmpa == del) ? '-' : (tmpa ==stop) ? '*' : (tmpa==quest) ? '?' 
: charstate; } putc(charstate, outfile); if (k % 10 == 0 && k % 60 != 0) putc(' ', outfile); } putc('\n', outfile); } putc('\n', outfile); } putc('\n', outfile); } putc('\n', outfile); } /* protinputdata */ void protmakevalues() { /* set up fractional likelihoods at tips */ long i, j; node *p; for (i = 1; i <= nonodes; i++) { treenode[i - 1]->back = NULL; treenode[i - 1]->tip = (i <= spp); treenode[i - 1]->index = i; for (j = 0; j < (chars); j++) treenode[i - 1]->numsteps[j] = 0; if (i > spp) { p = treenode[i - 1]->next; while (p != treenode[i - 1]) { p->back = NULL; p->tip = false; p->index = i; for (j = 0; j < (chars); j++) p->numsteps[j] = 0; p = p->next; } } } } /* protmakevalues */ void doinput() { /* reads the input data */ long i; if (justwts) { if (firstset) protinputdata(); for (i = 0; i < chars; i++) weight[i] = 1; inputweights(chars, weight, &weights); if (justwts) { fprintf(outfile, "\n\nWeights set # %ld:\n\n", ith); if (progress) printf("\nWeights set # %ld:\n\n", ith); } if (printdata) printweights(outfile, 0, chars, weight, "Sites"); } else { if (!firstset){ samenumsp(&chars, ith); reallocchars(); } for (i = 0; i < chars; i++) weight[i] = 1; if (weights) { inputweights(chars, weight, &weights); } if (weights) printweights(outfile, 0, chars, weight, "Sites"); protinputdata(); } if(!thresh) threshold = spp * 3.0; for(i = 0 ; i < (chars) ; i++){ weight[i]*=10; threshwt[i] = (long)(threshold * weight[i] + 0.5); } protmakevalues(); } /* doinput */ void protfillin(node *p, node *left, node *rt) { /* sets up for each node in the tree the aa set for site m at that point and counts the changes. 
The program spends much of its time in this function */
  boolean counted, done;
  aas aa;
  long s = 0;
  sitearray ls, rs, qs;
  long i, j, m, n;

  for (m = 0; m < chars; m++) {
    if (left != NULL)
      memcpy(ls, left->siteset[m], sizeof(sitearray));
    if (rt != NULL)
      memcpy(rs, rt->siteset[m], sizeof(sitearray));
    if (left == NULL) {
      n = rt->numsteps[m];
      memcpy(qs, rs, sizeof(sitearray));
    }
    else if (rt == NULL) {
      n = left->numsteps[m];
      memcpy(qs, ls, sizeof(sitearray));
    }
    else {
      n = left->numsteps[m] + rt->numsteps[m];
      if ((ls[0] == rs[0]) && (ls[1] == rs[1]) && (ls[2] == rs[2])) {
        qs[0] = ls[0];
        qs[1] = ls[1];
        qs[2] = ls[2];
      }
      else {
        counted = false;
        for (i = 0; (!counted) && (i <= 3); i++) {
          switch (i) {

          case 0:
            s = ls[0] & rs[0];
            break;

          case 1:
            s = (ls[0] & rs[1]) | (ls[1] & rs[0]);
            break;

          case 2:
            s = (ls[0] & rs[2]) | (ls[1] & rs[1]) | (ls[2] & rs[0]);
            break;

          case 3:
            s = ls[0] | (ls[1] & rs[2]) | (ls[2] & rs[1]) | rs[0];
            break;
          }
          if (s != 0) {
            qs[0] = s;
            counted = true;
          }
          else
            n += weight[m];
        }
        /* i is now one more than the number of steps that were needed */
        switch (i) {

        case 1:
          qs[1] = qs[0] | (ls[0] & rs[1]) | (ls[1] & rs[0]);
          qs[2] = qs[1] | (ls[0] & rs[2]) | (ls[1] & rs[1]) | (ls[2] & rs[0]);
          break;

        case 2:
          qs[1] = qs[0] | (ls[0] & rs[2]) | (ls[1] & rs[1]) | (ls[2] & rs[0]);
          qs[2] = qs[1] | ls[0] | (ls[1] & rs[2]) | (ls[2] & rs[1]) | rs[0];
          break;

        case 3:
          qs[1] = qs[0] | ls[0] | (ls[1] & rs[2]) | (ls[2] & rs[1]) | rs[0];
          qs[2] = qs[1] | ls[1] | (ls[2] & rs[2]) | rs[1];
          break;

        case 4:
          qs[1] = qs[0] | ls[1] | (ls[2] & rs[2]) | rs[1];
          qs[2] = qs[1] | ls[2] | rs[2];
          break;
        }
        for (aa = ala; (long)aa <= (long)stop; aa = (aas)((long)aa + 1)) {
          done = false;
          for (i = 0; (!done) && (i <= 1); i++) {
            if (((1L << ((long)aa)) & qs[i]) != 0) {
              for (j = i+1; j <= 2; j++)
                qs[j] |= translate[(long)aa - (long)ala][j-i];
              done = true;
            }
          }
        }
      }
    }
    p->numsteps[m] = n;
    memcpy(p->siteset[m], qs, sizeof(sitearray));
  }
}  /* protfillin */


void protpreorder(node *p)
{
  /* recompute number of steps in preorder taking both ancestral and
     descendant steps into account */
if (p != NULL && !p->tip) { protfillin (p->next, p->next->next->back, p->back); protfillin (p->next->next, p->back, p->next->back); protpreorder (p->next->back); protpreorder (p->next->next->back); } } /* protpreorder */ void protadd(node *below, node *newtip, node *newfork) { /* inserts the nodes newfork and its left descendant, newtip, to the tree. below becomes newfork's right descendant */ if (below != treenode[below->index - 1]) below = treenode[below->index - 1]; if (below->back != NULL) below->back->back = newfork; newfork->back = below->back; below->back = newfork->next->next; newfork->next->next->back = below; newfork->next->back = newtip; newtip->back = newfork->next; if (root == below) root = newfork; root->back = NULL; if (recompute) { protfillin (newfork, newfork->next->back, newfork->next->next->back); protpreorder(newfork); if (newfork != root) protpreorder(newfork->back); } } /* protadd */ void protre_move(node **item, node **fork) { /* removes nodes item and its ancestor, fork, from the tree. the new descendant of fork's ancestor is made to be fork's second descendant (other than item). Also returns pointers to the deleted nodes, item and fork */ node *p, *q, *other; if ((*item)->back == NULL) { *fork = NULL; return; } *fork = treenode[(*item)->back->index - 1]; if ((*item) == (*fork)->next->back) other = (*fork)->next->next->back; else other = (*fork)->next->back; if (root == *fork) root = other; p = (*item)->back->next->back; q = (*item)->back->next->next->back; if (p != NULL) p->back = q; if (q != NULL) q->back = p; (*fork)->back = NULL; p = (*fork)->next; do { p->back = NULL; p = p->next; } while (p != (*fork)); (*item)->back = NULL; if (recompute) { protpreorder(other); if (other != root) protpreorder(other->back); } } /* protre_move */ void evaluate(node *r) { /* determines the number of steps needed for a tree. 
this is the minimum number of steps needed to evolve sequences on this tree */ long i, steps, term; double sum; sum = 0.0; for (i = 0; i < (chars); i++) { steps = r->numsteps[i]; if (steps <= threshwt[i]) term = steps; else term = threshwt[i]; sum += term; if (usertree && which <= maxuser) fsteps[which - 1][i] = term; } if (usertree && which <= maxuser) { nsteps[which - 1] = sum; if (which == 1) { minwhich = 1; minsteps = sum; } else if (sum < minsteps) { minwhich = which; minsteps = sum; } } like = -sum; } /* evaluate */ void protpostorder(node *p) { /* traverses a binary tree, calling PROCEDURE fillin at a node's descendants before calling fillin at the node */ if (p->tip) return; protpostorder(p->next->back); protpostorder(p->next->next->back); protfillin(p, p->next->back, p->next->next->back); } /* protpostorder */ void protreroot(node *outgroup) { /* reorients tree, putting outgroup in desired position. */ node *p, *q; if (outgroup->back->index == root->index) return; p = root->next; q = root->next->next; p->back->back = q->back; q->back->back = p->back; p->back = outgroup; q->back = outgroup->back; outgroup->back->back = q; outgroup->back = p; } /* protreroot */ void protsavetraverse(node *p, long *pos, boolean *found) { /* sets BOOLEANs that indicate which way is down */ p->bottom = true; if (p->tip) return; p->next->bottom = false; protsavetraverse(p->next->back, pos,found); p->next->next->bottom = false; protsavetraverse(p->next->next->back, pos,found); } /* protsavetraverse */ void protsavetree(long *pos, boolean *found) { /* record in place where each species has to be added to reconstruct this tree */ long i, j; node *p; boolean done; protreroot(treenode[outgrno - 1]); protsavetraverse(root, pos,found); for (i = 0; i < (nonodes); i++) place[i] = 0; place[root->index - 1] = 1; for (i = 1; i <= (spp); i++) { p = treenode[i - 1]; while (place[p->index - 1] == 0) { place[p->index - 1] = i; while (!p->bottom) p = p->next; p = p->back; } if (i > 1) { place[i 
- 1] = place[p->index - 1]; j = place[p->index - 1]; done = false; while (!done) { place[p->index - 1] = spp + i - 1; while (!p->bottom) p = p->next; p = p->back; done = (p == NULL); if (!done) done = (place[p->index - 1] != j); } } } } /* protsavetree */ void tryadd(node *p, node **item, node **nufork) { /* temporarily adds one fork and one tip to the tree. if the location where they are added yields greater "likelihood" than other locations tested up to that time, then keeps that location as there */ long pos; boolean found; node *rute, *q; if (p == root) protfillin(temp, *item, p); else { protfillin(temp1, *item, p); protfillin(temp, temp1, p->back); } evaluate(temp); if (lastrearr) { if (like < bestlike) { if ((*item) == (*nufork)->next->next->back) { q = (*nufork)->next; (*nufork)->next = (*nufork)->next->next; (*nufork)->next->next = q; q->next = (*nufork); } } else if (like >= bstlike2) { recompute = false; protadd(p, (*item), (*nufork)); rute = root->next->back; protsavetree(&pos,&found); protreroot(rute); if (like > bstlike2) { bestlike = bstlike2 = like; pos = 1; nextree = 1; addtree(pos, &nextree, dummy, place, bestrees); } else { pos = 0; findtree(&found, &pos, nextree, place, bestrees); if (!found) { if (nextree <= maxtrees) addtree(pos, &nextree, dummy, place, bestrees); } } protre_move (item, nufork); recompute = true; } } if (like >= bestyet) { bestyet = like; there = p; } } /* tryadd */ void addpreorder(node *p, node *item, node *nufork) { /* traverses a binary tree, calling PROCEDURE tryadd at a node before calling tryadd at its descendants */ if (p == NULL) return; tryadd(p, &item,&nufork); if (!p->tip) { addpreorder(p->next->back, item, nufork); addpreorder(p->next->next->back, item, nufork); } } /* addpreorder */ void tryrearr(node *p, boolean *success) { /* evaluates one rearrangement of the tree. if the new tree has greater "likelihood" than the old one sets success := TRUE and keeps the new tree. 
otherwise, restores the old tree */ node *frombelow, *whereto, *forknode, *q; double oldlike; if (p->back == NULL) return; forknode = treenode[p->back->index - 1]; if (forknode->back == NULL) return; oldlike = bestyet; if (p->back->next->next == forknode) frombelow = forknode->next->next->back; else frombelow = forknode->next->back; whereto = treenode[forknode->back->index - 1]; if (whereto->next->back == forknode) q = whereto->next->next->back; else q = whereto->next->back; protfillin(temp1, frombelow, q); protfillin(temp, temp1, p); protfillin(temp1, temp, whereto->back); evaluate(temp1); if (like - oldlike < LIKE_EPSILON) { if (p == forknode->next->next->back) { q = forknode->next; forknode->next = forknode->next->next; forknode->next->next = q; q->next = forknode; } } else { recompute = false; protre_move(&p, &forknode); protfillin(whereto, whereto->next->back, whereto->next->next->back); recompute = true; protadd(whereto, p, forknode); *success = true; bestyet = like; } } /* tryrearr */ void repreorder(node *p, boolean *success) { /* traverses a binary tree, calling PROCEDURE tryrearr at a node before calling tryrearr at its descendants */ if (p == NULL) return; tryrearr(p,success); if (!p->tip) { repreorder(p->next->back,success); repreorder(p->next->next->back,success); } } /* repreorder */ void rearrange(node **r) { /* traverses the tree (preorder), finding any local rearrangement which decreases the number of steps. 
if traversal succeeds in increasing the tree's "likelihood", PROCEDURE rearrange runs traversal again */ boolean success = true; while (success) { success = false; repreorder(*r, &success); } } /* rearrange */ void protgetch(Char *c) { /* get next nonblank character */ do { if (eoln(intree)) scan_eoln(intree); *c = gettc(intree); if (*c == '\n' || *c == '\t') *c = ' '; } while (!(*c != ' ' || eoff(intree))); } /* protgetch */ void protaddelement(node **p,long *nextnode,long *lparens,boolean *names) { /* recursive procedure adds nodes to user-defined tree */ node *q; long i, n; boolean found; Char str[nmlngth]; protgetch(&ch); if (ch == '(' ) { if ((*lparens) >= spp - 1) { printf("\nERROR IN USER TREE: TOO MANY LEFT PARENTHESES\n"); exxit(-1); } (*nextnode)++; (*lparens)++; q = treenode[(*nextnode) - 1]; protaddelement(&q->next->back, nextnode,lparens,names); q->next->back->back = q->next; findch(',', &ch, which); protaddelement(&q->next->next->back, nextnode,lparens,names); q->next->next->back->back = q->next->next; findch(')', &ch, which); *p = q; return; } for (i = 0; i < nmlngth; i++) str[i] = ' '; n = 1; do { if (ch == '_') ch = ' '; str[n - 1] = ch; if (eoln(intree)) scan_eoln(intree); ch = gettc(intree); n++; } while (ch != ',' && ch != ')' && ch != ':' && n <= nmlngth); n = 1; do { found = true; for (i = 0; i < nmlngth; i++) found = (found && ((str[i] == nayme[n - 1][i]) || ((nayme[n - 1][i] == '_') && (str[i] == ' ')))); if (found) { if (names[n - 1] == false) { *p = treenode[n - 1]; names[n - 1] = true; } else { printf("\nERROR IN USER TREE: DUPLICATE NAME FOUND -- "); for (i = 0; i < nmlngth; i++) putchar(nayme[n - 1][i]); putchar('\n'); exxit(-1); } } else n++; } while (!(n > spp || found)); if (n <= spp) return; printf("CANNOT FIND SPECIES: "); for (i = 0; i < nmlngth; i++) putchar(str[i]); putchar('\n'); } /* protaddelement */ void prottreeread() { /* read in user-defined tree and set it up */ long nextnode, lparens, i; root = treenode[spp]; nextnode = 
spp; root->back = NULL; names = (boolean *)Malloc(spp*sizeof(boolean)); for (i = 0; i < (spp); i++) names[i] = false; lparens = 0; protaddelement(&root, &nextnode,&lparens,names); if (ch == '[') { do ch = gettc(intree); while (ch != ']'); ch = gettc(intree); } findch(';', &ch, which); scan_eoln(intree); free(names); } /* prottreeread */ void protancestset(long *a, long *b, long *c, long *d, long *k) { /* sets up the aa set array. */ aas aa; long s, sa, sb; long i, j, m, n; boolean counted; counted = false; *k = 0; for (i = 0; i <= 5; i++) { if (*k < 3) { s = 0; if (i > 3) n = i - 3; else n = 0; for (j = n; j <= (i - n); j++) { if (j < 3) sa = a[j]; else sa = fullset; for (m = n; m <= (i - j - n); m++) { if (m < 3) sb = sa & b[m]; else sb = sa; if (i - j - m < 3) sb &= c[i - j - m]; s |= sb; } } if (counted || s != 0) { d[*k] = s; (*k)++; counted = true; } } } for (i = 0; i <= 1; i++) { for (aa = ala; (long)aa <= (long)stop; aa = (aas)((long)aa + 1)) { if (((1L << ((long)aa)) & d[i]) != 0) { for (j = i + 1; j <= 2; j++) d[j] |= translate[(long)aa - (long)ala][j - i]; } } } } /* protancestset */ void prothyprint(long b1, long b2, boolean *bottom, node *r, boolean *nonzero, boolean *maybe) { /* print out states in sites b1 through b2 at node */ long i; boolean dot; Char ch = 0; aas aa; if (*bottom) { if (!outgropt) fprintf(outfile, " "); else fprintf(outfile, "root "); } else fprintf(outfile, "%3ld ", r->back->index - spp); if (r->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[r->index - 1][i], outfile); } else fprintf(outfile, "%4ld ", r->index - spp); if (*bottom) fprintf(outfile, " "); else if (*nonzero) fprintf(outfile, " yes "); else if (*maybe) fprintf(outfile, " maybe "); else fprintf(outfile, " no "); for (i = b1 - 1; i < b2; i++) { aa = r->seq[i]; switch (aa) { case ala: ch = 'A'; break; case asx: ch = 'B'; break; case cys: ch = 'C'; break; case asp: ch = 'D'; break; case glu: ch = 'E'; break; case phe: ch = 'F'; break; case gly: ch = 'G'; break; case his: 
ch = 'H';
      break;

    case ileu:
      ch = 'I';
      break;

    case lys:
      ch = 'K';
      break;

    case leu:
      ch = 'L';
      break;

    case met:
      ch = 'M';
      break;

    case asn:
      ch = 'N';
      break;

    case pro:
      ch = 'P';
      break;

    case gln:
      ch = 'Q';
      break;

    case arg:
      ch = 'R';
      break;

    case ser:
      ch = 'S';
      break;

    case ser1:
      ch = 'S';
      break;

    case ser2:
      ch = 'S';
      break;

    case thr:
      ch = 'T';
      break;

    case trp:
      ch = 'W';
      break;

    case tyr:
      ch = 'Y';
      break;

    case val:
      ch = 'V';
      break;

    case glx:
      ch = 'Z';
      break;

    case del:
      ch = '-';
      break;

    case stop:
      ch = '*';
      break;

    case unk:
      ch = 'X';
      break;

    case quest:
      ch = '?';
      break;
    }
    if (!(*bottom) && dotdiff)
      dot = (r->siteset[i][0] == treenode[r->back->index - 1]->siteset[i][0]
             || ((r->siteset[i][0] &
                  (~((1L << ((long)ser1)) | (1L << ((long)ser2)) |
                     (1L << ((long)ser))))) == 0 &&
                 (treenode[r->back->index - 1]->siteset[i][0] &
                  (~((1L << ((long)ser1)) | (1L << ((long)ser2)) |
                     (1L << ((long)ser))))) == 0));
    else
      dot = false;
    if (dot)
      putc('.', outfile);
    else
      putc(ch, outfile);
    if ((i + 1) % 10 == 0)
      putc(' ', outfile);
  }
  putc('\n', outfile);
}  /* prothyprint */


void prothyptrav(node *r, sitearray *hypset, long b1, long b2, long *k,
                 boolean *bottom, sitearray nothing)
{
  boolean maybe, nonzero;
  long i;
  aas aa;
  long anc = 0, hset;
  gseq *ancset, *temparray;

  protgnu(&ancset);
  protgnu(&temparray);
  maybe = false;
  nonzero = false;
  for (i = b1 - 1; i < b2; i++) {
    if (!r->tip) {
      protancestset(hypset[i], r->next->back->siteset[i],
                    r->next->next->back->siteset[i], temparray->seq[i], k);
      memcpy(r->siteset[i], temparray->seq[i], sizeof(sitearray));
    }
    if (!(*bottom))
      anc = treenode[r->back->index - 1]->siteset[i][0];
    if (!r->tip) {
      hset = r->siteset[i][0];
      r->seq[i] = quest;
      for (aa = ala; (long)aa <= (long)stop; aa = (aas)((long)aa + 1)) {
        if (hset == 1L << ((long)aa))
          r->seq[i] = aa;
      }
      if (hset == ((1L << ((long)asn)) | (1L << ((long)asp))))
        r->seq[i] = asx;
      /* glx (Z) is Gln or Glu, matching the set built in setup() */
      if (hset == ((1L << ((long)gln)) | (1L << ((long)glu))))
        r->seq[i] = glx;
      if (hset == ((1L << ((long)ser1)) | (1L << ((long)ser2))))
        r->seq[i] =
ser; if (hset == fullset) r->seq[i] = unk; } nonzero = (nonzero || (r->siteset[i][0] & anc) == 0); maybe = (maybe || r->siteset[i][0] != anc); } prothyprint(b1, b2,bottom,r,&nonzero,&maybe); *bottom = false; if (!r->tip) { memcpy(temparray->seq, r->next->back->siteset, chars*sizeof(sitearray)); for (i = b1 - 1; i < b2; i++) protancestset(hypset[i], r->next->next->back->siteset[i], nothing, ancset->seq[i], k); prothyptrav(r->next->back, ancset->seq, b1, b2,k,bottom,nothing ); for (i = b1 - 1; i < b2; i++) protancestset(hypset[i], temparray->seq[i], nothing, ancset->seq[i],k); prothyptrav(r->next->next->back, ancset->seq, b1, b2, k,bottom,nothing); } protchuck(temparray); protchuck(ancset); } /* prothyptrav */ void prothypstates(long *k) { /* fill in and describe states at interior nodes */ boolean bottom; sitearray nothing; long i, n; seqptr hypset; fprintf(outfile, "\nFrom To Any Steps? State at upper node\n"); fprintf(outfile, " "); fprintf(outfile, "( . means same as in the node below it on tree)\n\n"); memcpy(nothing, translate[(long)quest - (long)ala], sizeof(sitearray)); hypset = (seqptr)Malloc(chars*sizeof(sitearray)); for (i = 0; i < (chars); i++) memcpy(hypset[i], nothing, sizeof(sitearray)); bottom = true; for (i = 1; i <= ((chars - 1) / 40 + 1); i++) { putc('\n', outfile); n = i * 40; if (n > chars) n = chars; bottom = true; prothyptrav(root, hypset, i * 40 - 39, n, k,&bottom,nothing); } free(hypset); } /* prothypstates */ void describe() { /* prints ancestors, steps and table of numbers of steps in each site */ long i,j,k; if (treeprint) fprintf(outfile, "\nrequires a total of %10.3f\n", like / -10); if (stepbox) { putc('\n', outfile); if (weights) fprintf(outfile, "weighted "); fprintf(outfile, "steps in each position:\n"); fprintf(outfile, " "); for (i = 0; i <= 9; i++) fprintf(outfile, "%4ld", i); fprintf(outfile, "\n *-----------------------------------------\n"); for (i = 0; i <= (chars / 10); i++) { fprintf(outfile, "%5ld", i * 10); putc('!', 
outfile); for (j = 0; j <= 9; j++) { k = i * 10 + j; if (k == 0 || k > chars) fprintf(outfile, " "); else fprintf(outfile, "%4ld", root->numsteps[k - 1] / 10); } putc('\n', outfile); } } if (ancseq) { prothypstates(&k); putc('\n', outfile); } putc('\n', outfile); if (trout) { col = 0; treeout(root, nextree, &col, root); } } /* describe */ void maketree() { /* constructs a binary tree from the pointers in treenode. adds each node at location which yields highest "likelihood" then rearranges the tree for greatest "likelihood" */ long i, j, numtrees; double gotlike; node *item, *nufork, *dummy; if (!usertree) { for (i = 1; i <= (spp); i++) enterorder[i - 1] = i; if (jumble) randumize(seed, enterorder); root = treenode[enterorder[0] - 1]; recompute = true; protadd(treenode[enterorder[0] - 1], treenode[enterorder[1] - 1], treenode[spp]); if (progress) { printf("\nAdding species:\n"); writename(0, 2, enterorder); } lastrearr = false; for (i = 3; i <= (spp); i++) { bestyet = -30.0*spp*chars; there = root; item = treenode[enterorder[i - 1] - 1]; nufork = treenode[spp + i - 2]; addpreorder(root, item, nufork); protadd(there, item, nufork); like = bestyet; rearrange(&root); if (progress) writename(i - 1, 1, enterorder); lastrearr = (i == spp); if (lastrearr) { if (progress) { printf("\nDoing global rearrangements\n"); printf(" !"); for (j = 1; j <= nonodes; j++) if ( j % (( nonodes / 72 ) + 1 ) == 0 ) putchar('-'); printf("!\n"); } bestlike = bestyet; if (jumb == 1) { bstlike2 = bestlike = -30.0*spp*chars; nextree = 1; } do { if (progress) printf(" "); gotlike = bestlike; for (j = 0; j < (nonodes); j++) { bestyet = -30.0*spp*chars; item = treenode[j]; if (item != root) { nufork = treenode[treenode[j]->back->index - 1]; protre_move(&item, &nufork); there = root; addpreorder(root, item, nufork); protadd(there, item, nufork); } if (progress) { if ( j % (( nonodes / 72 ) + 1 ) == 0 ) putchar('.'); fflush(stdout); } } if (progress) putchar('\n'); } while (bestlike > gotlike); } } 
if (progress) putchar('\n'); for (i = spp - 1; i >= 1; i--) protre_move(&treenode[i], &dummy); if (jumb == njumble) { if (treeprint) { putc('\n', outfile); if (nextree == 2) fprintf(outfile, "One most parsimonious tree found:\n"); else fprintf(outfile, "%6ld trees in all found\n", nextree - 1); } if (nextree > maxtrees + 1) { if (treeprint) fprintf(outfile, "here are the first%4ld of them\n", (long)maxtrees); nextree = maxtrees + 1; } if (treeprint) putc('\n', outfile); recompute = false; for (i = 0; i <= (nextree - 2); i++) { root = treenode[0]; protadd(treenode[0], treenode[1], treenode[spp]); for (j = 3; j <= (spp); j++) protadd(treenode[bestrees[i].btree[j - 1] - 1], treenode[j - 1], treenode[spp + j - 2]); protreroot(treenode[outgrno - 1]); protpostorder(root); evaluate(root); printree(root, 1.0); describe(); for (j = 1; j < (spp); j++) protre_move(&treenode[j], &dummy); } } } else { /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree,INTREE,"input tree file", "rb",progname,intreename); numtrees = countsemic(&intree); if (numtrees > 2) initseed(&inseed, &inseed0, seed); if (treeprint) { fprintf(outfile, "User-defined tree"); if (numtrees > 1) putc('s', outfile); fprintf(outfile, ":\n\n\n\n"); } which = 1; while (which <= numtrees) { prottreeread(); if (outgropt) protreroot(treenode[outgrno - 1]); protpostorder(root); evaluate(root); printree(root, 1.0); describe(); which++; } printf("\n"); FClose(intree); putc('\n', outfile); if (numtrees > 1 && chars > 1 ) standev(chars, numtrees, minwhich, minsteps, nsteps, fsteps, seed); } if (jumb == njumble && progress) { printf("Output written to file \"%s\"\n", outfilename); if (trout) printf("\nTrees also written onto file \"%s\"\n", outtreename); } } /* maketree */ int main(int argc, Char *argv[]) { /* Protein parsimony by uphill search */
#ifdef MAC
argc = 1; /* macsetup("Protpars",""); */ argv[0] = "Protpars";
#endif
init(argc,argv); progname = argv[0];
openfile(&infile,INFILE,"input file", "r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file", "w",argv[0],outfilename); ibmpc = IBMCRT; ansi = ANSICRT; garbage = NULL; mulsets = false; msets = 1; firstset = true; code(); setup(); doinit(); if (weights || justwts) openfile(&weightfile,WEIGHTFILE,"weights file","r",argv[0],weightfilename); if (trout) openfile(&outtree,OUTTREE,"output tree file", "w",argv[0],outtreename); for (ith = 1; ith <= msets; ith++) { doinput(); if (ith == 1) firstset = false; if (msets > 1 && !justwts) { fprintf(outfile, "Data set # %ld:\n\n",ith); if (progress) printf("Data set # %ld:\n\n",ith); } for (jumb = 1; jumb <= njumble; jumb++) maketree(); } FClose(infile); FClose(outfile); FClose(outtree);
#ifdef MAC
fixmacfile(outfilename); fixmacfile(outtreename);
#endif
printf("\nDone.\n\n");
#ifdef WIN32
phyRestoreConsoleAttributes();
#endif
return 0; } /* Protein parsimony by uphill search */
/* phylip-3.697/src/restdist.c */
#include "phylip.h"
#include "seq.h"
/* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */
#define initialv 0.1 /* starting value of branch length */
#define iterationsr 20 /* how many Newton iterations per distance */
#ifndef OLDC
/* function prototypes */ void restdist_inputnumbers(void); void getoptions(void); void allocrest(void); void doinit(void); void inputoptions(void); void restdist_inputdata(void); void restdist_sitesort(void); void restdist_sitecombine(void); void makeweights(void); void makev(long, long, double *); void makedists(void); void writedists(void); void getinput(void); void reallocsites(void); /* function prototypes */
#endif
Char infilename[FNMLNGTH], outfilename[FNMLNGTH]; long sites, weightsum, datasets, ith; boolean restsites, neili, gama, weights, lower, progress, mulsets, firstset; double ttratio, fracchange, cvi, sitelength, xi, xv; double **d; steptr aliasweight; char *progname; Char ch; void restdist_inputnumbers() { /* read and print out numbers of species and sites */ fscanf(infile, "%ld%ld", &spp, &sites); } /* restdist_inputnumbers */ void getoptions() { /* interactively set options */ long loopcount, loopcount2; Char ch; putchar('\n'); sitelength = 6.0; neili = false; gama = false; cvi = 0.0; weights = false; lower = false; printdata = false; progress = true; restsites = true; interleaved = true; ttratio = 2.0; loopcount = 0; for (;;) { cleerhome(); printf("\nRestriction site or fragment distances, "); printf("version %s\n\n",VERSION); printf("Settings for this run:\n");
printf(" R Restriction sites or fragments? %s\n", (restsites ? "Sites" : "Fragments")); printf(" N Original or modified Nei/Li model? %s\n", (neili ? "Original" : "Modified")); if (!neili) { printf(" G Gamma distribution of rates among sites?"); if (!gama) printf(" No\n"); else printf(" Yes\n"); printf(" T Transition/transversion ratio? %f\n", ttratio); } printf(" S Site length? %4.1f\n",sitelength); printf(" L Form of distance matrix? %s\n", (lower ? "Lower-triangular" : "Square")); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld sets\n", datasets); else printf(" No\n"); printf(" I Input sequences interleaved? %s\n", (interleaved ? "Yes" : "No, sequential")); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" 1 Print out the data at start of run? %s\n", (printdata ? "Yes" : "No")); printf(" 2 Print indications of progress of run? %s\n", (progress ? "Yes" : "No")); printf("\n Y to accept these or type the letter for one to change\n"); fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); if (ch == 'Y') break; if (strchr("RDNGTSLMI012",ch) != NULL){ switch (ch) { case 'R': restsites = !restsites; break; case 'G': if (!neili) gama = !gama; break; case 'N': neili = !neili; break; case 'T': if (!neili) initratio(&ttratio); break; case 'S': loopcount2 = 0; do { printf("New Sitelength?\n"); fflush(stdout); scanf("%lf%*[^\n]", &sitelength); getchar(); if (sitelength < 1.0) printf("BAD RESTRICTION SITE LENGTH: %f\n", sitelength); countup(&loopcount2, 10); } while (sitelength < 1.0); break; case 'L': lower = !lower; break; case 'M': mulsets = !mulsets; if (mulsets) initdatasets(&datasets); break; case 'I': interleaved = !interleaved; break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress = !progress; break; } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } if (gama) { loopcount = 0; do { printf(
"\nCoefficient of variation of substitution rate among sites (must be positive)?\n"); fflush(stdout); scanf("%lf%*[^\n]", &cvi); getchar(); countup(&loopcount, 100); } while (cvi <= 0.0); cvi = 1.0 / (cvi * cvi); printf("\n"); } xi = (ttratio - 0.5)/(ttratio + 0.5); xv = 1.0 - xi; fracchange = xi*0.5 + xv*0.75; } /* getoptions */ void reallocsites() { long i; for (i = 0; i < spp; i++){ free(y[i]); y[i] = (Char *)Malloc(sites*sizeof(Char)); } free(weight); free(alias); free(aliasweight); weight = (steptr)Malloc((sites+1)*sizeof(long)); alias = (steptr)Malloc((sites+1)*sizeof(long)); aliasweight = (steptr)Malloc((sites+1)*sizeof(long)); makeweights(); } void allocrest() { long i; y = (Char **)Malloc(spp*sizeof(Char *)); for (i = 0; i < spp; i++) y[i] = (Char *)Malloc(sites*sizeof(Char)); nayme = (naym *)Malloc(spp*sizeof(naym)); weight = (steptr)Malloc((sites+1)*sizeof(long)); alias = (steptr)Malloc((sites+1)*sizeof(long)); aliasweight = (steptr)Malloc((sites+1)*sizeof(long)); d = (double **)Malloc(spp*sizeof(double *)); for (i = 0; i < spp; i++) d[i] = (double*)Malloc(spp*sizeof(double)); } /* allocrest */ void doinit() { /* initializes variables */ restdist_inputnumbers(); getoptions(); if (printdata) fprintf(outfile, "\n %4ld Species, %4ld Sites\n", spp, sites); allocrest(); } /* doinit */ void inputoptions() { /* read the options information */ Char ch; long i, extranum, cursp, curst; if (!firstset) { if (eoln(infile)) scan_eoln(infile); fscanf(infile, "%ld%ld", &cursp, &curst); if (cursp != spp) { printf("\nERROR: INCONSISTENT NUMBER OF SPECIES IN DATA SET %4ld\n", ith); exxit(-1); } sites = curst; reallocsites(); } for (i = 1; i <= sites; i++) weight[i] = 1; weightsum = sites; extranum = 0; fscanf(infile, "%*[ 0-9]"); readoptions(&extranum, "W"); for (i = 1; i <= extranum; i++) { matchoptions(&ch, "W"); inputweights2(1, sites+1, &weightsum, weight, &weights, "RESTDIST"); } } /* inputoptions */ void restdist_inputdata() { /* read the species and sites data */ 
long i, j, k, l, sitesread = 0, sitesnew = 0; Char ch; boolean allread, done; if (printdata) putc('\n', outfile); j = nmlngth + (sites + (sites - 1) / 10) / 2 - 5; if (j < nmlngth - 1) j = nmlngth - 1; if (j > 39) j = 39; if (printdata) { fprintf(outfile, "Name"); for (i = 1; i <= j; i++) putc(' ', outfile); fprintf(outfile, "Sites\n"); fprintf(outfile, "----"); for (i = 1; i <= j; i++) putc(' ', outfile); fprintf(outfile, "-----\n\n"); } sitesread = 0; allread = false; while (!(allread)) { /* eat white space -- if the separator line has spaces on it*/ do { ch = gettc(infile); } while (ch == ' ' || ch == '\t'); ungetc(ch, infile); if (eoln(infile)) scan_eoln(infile); i = 1; while (i <= spp ) { if ((interleaved && sitesread == 0) || !interleaved) initname(i - 1); if (interleaved) j = sitesread; else j = 0; done = false; while (!done && !eoff(infile)) { if (interleaved) done = true; while (j < sites && !(eoln(infile) || eoff(infile))) { ch = gettc(infile); if (ch == '\n' || ch == '\t') ch = ' '; if (ch == ' ') continue; uppercase(&ch); if (ch != '1' && ch != '0' && ch != '+' && ch != '-' && ch != '?') { printf(" ERROR -- Bad symbol %c",ch); printf(" at position %ld of species %ld\n", j+1, i); exxit(-1); } if (ch == '1') ch = '+'; if (ch == '0') ch = '-'; j++; y[i - 1][j - 1] = ch; } if (interleaved) continue; if (j < sites) scan_eoln(infile); else if (j == sites) done = true; } if (interleaved && i == 1) sitesnew = j; scan_eoln(infile); if ((interleaved && j != sitesnew ) || ((!interleaved) && j != sites)){ printf("ERROR: SEQUENCES OUT OF ALIGNMENT\n"); exxit(-1);} i++; } if (interleaved) { sitesread = sitesnew; allread = (sitesread == sites); } else allread = (i > spp); } if (printdata) { for (i = 1; i <= ((sites - 1) / 60 + 1); i++) { for (j = 0; j < spp; j++) { for (k = 0; k < nmlngth; k++) putc(nayme[j][k], outfile); fprintf(outfile, " "); l = i * 60; if (l > sites) l = sites; for (k = (i - 1) * 60 + 1; k <= l; k++) { putc(y[j][k - 1], outfile); if (k % 10 == 0 
&& k % 60 != 0) putc(' ', outfile); } putc('\n', outfile); } putc('\n', outfile); } putc('\n', outfile); } } /* restdist_inputdata */ void restdist_sitesort() { /* Shell sort keeping alias, aliasweight in same order */ long gap, i, j, jj, jg, k, itemp; boolean flip, tied; gap = sites / 2; while (gap > 0) { for (i = gap + 1; i <= sites; i++) { j = i - gap; flip = true; while (j > 0 && flip) { jj = alias[j]; jg = alias[j + gap]; flip = false; tied = true; k = 1; while (k <= spp && tied) { flip = (y[k - 1][jj - 1] > y[k - 1][jg - 1]); tied = (tied && y[k - 1][jj - 1] == y[k - 1][jg - 1]); k++; } if (tied) { aliasweight[j] += aliasweight[j + gap]; aliasweight[j + gap] = 0; } if (!flip) break; itemp = alias[j]; alias[j] = alias[j + gap]; alias[j + gap] = itemp; itemp = aliasweight[j]; aliasweight[j] = aliasweight[j + gap]; aliasweight[j + gap] = itemp; j -= gap; } } gap /= 2; } } /* restdist_sitesort */ void restdist_sitecombine() { /* combine sites that have identical patterns */ long i, j, k; boolean tied; i = 1; while (i < sites) { j = i + 1; tied = true; while (j <= sites && tied) { k = 1; while (k <= spp && tied) { tied = (tied && y[k - 1][alias[i] - 1] == y[k - 1][alias[j] - 1]); k++; } if (tied && aliasweight[j] > 0) { aliasweight[i] += aliasweight[j]; aliasweight[j] = 0; alias[j] = alias[i]; } j++; } i = j - 1; } } /* restdist_sitecombine */ void makeweights() { /* make up weights vector to avoid duplicate computations */ long i; for (i = 1; i <= sites; i++) { alias[i] = i; aliasweight[i] = weight[i]; } restdist_sitesort(); restdist_sitecombine(); sitescrunch2(sites + 1, 2, 3, aliasweight); for (i = 1; i <= sites; i++) { weight[i] = aliasweight[i]; if (weight[i] > 0) endsite = i; } weight[0] = 1; } /* makeweights */ void makev(long m, long n, double *v) { /* compute one distance */ long i, ii, it, numerator, denominator; double f, g=0, h, p1, p2, p3, q1, pp, tt, delta, vv; numerator = 0; denominator = 0; for (i = 0; i < endsite; i++) { ii = alias[i + 1]; if 
((y[m-1][ii-1] == '+') || (y[n-1][ii-1] == '+')) { denominator += weight[i + 1]; if ((y[m-1][ii-1] == '+') && (y[n-1][ii-1] == '+')) { numerator += weight[i + 1]; } } } f = 2*numerator/(double)(denominator+numerator); if (restsites) { if (exp(-sitelength*1.38629436) > f) { printf("\nERROR: Infinite distance between "); printf(" species %3ld and %3ld\n", m, n); exxit(-1); } } if (!restsites) { if (!neili) { f = (sqrt(f*(f+8.0))-f)/2.0; } else { g = initialv; delta = g; it = 0; while (fabs(delta) > 0.00002 && it < iterationsr) { it++; h = g; g = exp(0.25*log(f * (3-2*g))); delta = g - h; } } } if ((!restsites) && neili) vv = - (2.0/sitelength) * log(g); else { if (neili && restsites) { pp = exp(log(f)/(2*sitelength)); vv = -(3.0/2.0)*log((4.0/3.0)*pp - (1.0/3.0)); } else { pp = exp(log(f)/sitelength); delta = initialv; tt = delta; it = 0; while (fabs(delta) > 0.000001 && it < iterationsr) { it++; if (gama) { p1 = exp(-cvi * log(1 + tt / cvi)); p2 = exp(-cvi * log(1 + xv * tt / cvi)) - exp(-cvi * log(1 + tt / cvi)); p3 = 1.0 - exp(-cvi * log(1 + xv * tt / cvi)); } else { p1 = exp(-tt); p2 = exp(-xv * tt) - exp(-tt); p3 = 1.0 - exp(-xv * tt); } q1 = p1 + p2 / 2.0 + p3 / 4.0; g = q1 - pp; if (g < 0.0) delta = fabs(delta) / -2.0; else delta = fabs(delta); tt += delta; } vv = fracchange * tt; } } *v = fabs(vv); } /* makev */ void makedists() { /* compute distance matrix */ long i, j; double v; if (progress) printf("Distances calculated for species\n"); for (i = 0; i < spp; i++) d[i][i] = 0.0; for (i = 1; i < spp; i++) { if (progress) { printf(" "); for (j = 0; j < nmlngth; j++) putchar(nayme[i - 1][j]); printf(" "); } for (j = i + 1; j <= spp; j++) { makev(i, j, &v); d[i - 1][j - 1] = v; d[j - 1][i - 1] = v; if (progress) putchar('.'); } if (progress) putchar('\n'); } if (progress) { printf(" "); for (j = 0; j < nmlngth; j++) putchar(nayme[spp - 1][j]); putchar('\n'); } } /* makedists */ void writedists() { /* write out distances */ long i, j, k; if (!printdata) 
fprintf(outfile, "%5ld\n", spp); for (i = 0; i < spp; i++) { for (j = 0; j < nmlngth; j++) putc(nayme[i][j], outfile); if (lower) k = i; else k = spp; for (j = 1; j <= k; j++) { if (d[i][j-1] < 100.0) fprintf(outfile, "%10.6f", d[i][j-1]); else if (d[i][j-1] < 1000.0) fprintf(outfile, " %10.6f", d[i][j-1]); else fprintf(outfile, " %11.6f", d[i][j-1]); if ((j + 1) % 7 == 0 && j < k) putc('\n', outfile); } putc('\n', outfile); } if (progress) printf("\nDistances written to file \"%s\"\n\n", outfilename); } /* writedists */ void getinput() { /* reads the input data */ inputoptions(); restdist_inputdata(); makeweights(); } /* getinput */ int main(int argc, Char *argv[]) { /* distances from restriction sites or fragments */
#ifdef MAC
argc = 1; /* macsetup("Restdist",""); */ argv[0] = "Restdist";
#endif
init(argc,argv); progname = argv[0]; openfile(&infile,INFILE,"input data file","r",argv[0],infilename); openfile(&outfile,OUTFILE,"output file","w",argv[0],outfilename); ibmpc = IBMCRT; ansi = ANSICRT; mulsets = false; datasets = 1; firstset = true; doinit(); for (ith = 1; ith <= datasets; ith++) { getinput(); if (ith == 1) firstset = false; if (datasets > 1 && progress) printf("\nData set # %ld:\n\n",ith); makedists(); writedists(); } FClose(infile); FClose(outfile);
#ifdef MAC
fixmacfile(outfilename);
#endif
printf("Done.\n\n");
#ifdef WIN32
phyRestoreConsoleAttributes();
#endif
return 0; } /* distances from restriction sites or fragments */
/* phylip-3.697/src/restml.c */
/* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */
#include <assert.h>
#include "phylip.h"
#include "seq.h"
#define initialv 0.1 /* starting value of branch length */
#define over 60 /* maximum width of a tree on screen */
#ifndef OLDC
/* function prototypes */ void restml_inputnumbers(void); void getoptions(void); void allocrest(void); void setuppie(void); void doinit(void); void inputoptions(void); void restml_inputdata(void); void restml_sitesort(void); void restml_sitecombine(void); void makeweights(void); void restml_makevalues(void); void getinput(void); void copymatrix(transmatrix, transmatrix); void maketrans(double, boolean); void branchtrans(long, double); double evaluate(tree *, node *); boolean nuview(node *); void makenewv(node *); void update(node *); void smooth(node *); void insert_(node *p, node *); void restml_re_move(node **, node **); void restml_copynode(node *, node *); void restml_copy_(tree *, tree *); void buildnewtip(long , tree *); void buildsimpletree(tree *); void addtraverse(node *, node *, boolean); void rearrange(node *, node *); void
restml_coordinates(node *, double, long *,double *, double *); void restml_fprintree(FILE *fp); void restml_printree(void); double sigma(node *, double *); void fdescribe(FILE *, node *); void summarize(void); void restml_treeout(node *); static phenotype2 restml_pheno_new(long endsite, long sitelength); /* static void restml_pheno_delete(phenotype2 x2); */ void initrestmlnode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree); static void restml_unroot(node* root, node** nodep, long nonodes); void inittravtree(tree* t,node *); static void adjust_lengths_r(node *p); void treevaluate(void); void maketree(void); void globrearrange(void); void adjust_lengths(tree *); double adjusted_v(double v); sitelike2 init_sitelike(long sitelength); void free_sitelike(sitelike2 sl); void copy_sitelike(sitelike2 dest, sitelike2 src,long sitelength); void reallocsites(void); static void set_branchnum(node *p, long branchnum); void alloctrans(tree *t, long nonodes, long sitelength); long get_trans(tree* t); void free_trans(tree* t, long trans); void free_all_trans(tree* t); void alloclrsaves(void); void freelrsaves(void); void resetlrsaves(void); void cleanup(void); void allocx2(long nonodes, long sitelength, pointarray, boolean usertree); void freex2(long nonodes, pointarray treenode); void freetrans(tree * t, long nonodes,long sitelength); /* function prototypes */
#endif
Char infilename[FNMLNGTH]; Char outfilename[FNMLNGTH]; Char intreename[FNMLNGTH]; Char outtreename[FNMLNGTH]; long nonodes2, sites, enzymes, weightsum, sitelength, datasets, ith, njumble, jumb=0; long inseed, inseed0; /* User options */ boolean global; /* Perform global rearrangements? */ boolean jumble; /* Randomize input order? */ boolean lengths; /* Use lengths from user tree? */ boolean weights; boolean trout; /* Write tree to outtree?
*/ boolean trunc8; boolean usertree; /* Input user tree? Or search. */ boolean progress; /* Display progress */ boolean mulsets; /* Use multiple data sets */ /* Runtime state */ boolean firstset; boolean improve; boolean smoothit; boolean inserting = false; double bestyet; tree curtree, priortree, bestree, bestree2; longer seed; long *enterorder; steptr aliasweight; char *progname; node *qwhere,*addwhere; /* local rearrangements need to save views. created globally so that reallocation of the same variable is unnecessary */ node **lrsaves; /* Local variables for maketree, propagated globally for C version: */ long nextsp, numtrees, maxwhich, col, shimotrees; double maxlogl; boolean succeeded, smoothed;
#define NTEMPMATS 7
transmatrix *tempmatrix, tempslope, tempcurve; sitelike2 pie; double *l0gl; double **l0gf; Char ch; /* variables added to keep treeread2() happy */ boolean goteof; double trweight; node *grbg = NULL; static void set_branchnum(node *p, long branchnum) { assert(p != NULL); assert(branchnum > 0); p->branchnum = branchnum; } void allocx2(long nonodes, long sitelength, pointarray treenode, boolean usertree) { /* its complement is freex2(nonodes,treenode) */ long i, j, k, l; node *p; for (i = 0; i < spp; i++) { treenode[i]->x2 = (phenotype2)Malloc((endsite+1)*sizeof(sitelike2)); for ( j = 0 ; j < endsite + 1 ; j++ ) treenode[i]->x2[j] = (double *)Malloc((sitelength + 1) * sizeof(double)); } if (!usertree) { for (i = spp; i < nonodes; i++) { p = treenode[i]; for (j = 1; j <= 3; j++) { p->x2 = (phenotype2)Malloc((endsite+1)*sizeof(sitelike2)); for (k = 0; k < endsite + 1; k++) { p->x2[k] = (double *)Malloc((sitelength + 1) * sizeof(double)); for (l = 0; l < sitelength; l++) p->x2[k][l] = 1.0; } p = p->next; } } } } /* allocx2 */ void freex2(long nonodes, pointarray treenode) { long i, j, k; node *p; for (i = 0; i < spp; i++) { for (j = 0; j < endsite + 1; j++) { free(treenode[i]->x2[j]); } free(treenode[i]->x2); treenode[i]->x2 = NULL; } for (i = spp; i <
nonodes; i++) { p = treenode[i]; if (p != NULL) { for (j = 1; j <= 3; j++) { for (k = 0; k < endsite + 1; k++) { free(p->x2[k]); } free(p->x2); p->x2 = NULL; p = p->next; } } } } /* freex2 */ void alloctrans(tree *t, long nonodes, long sitelength) { /* its complement is freetrans(tree*,nonodes, sitelength) */ long i, j; t->trans = (transptr)Malloc(nonodes*sizeof(transmatrix)); for (i = 0; i < nonodes; ++i){ t->trans[i] = (transmatrix)Malloc((sitelength + 1) * sizeof(double *)); for (j = 0;j < sitelength + 1; ++j) t->trans[i][j] = (double *)Malloc((sitelength + 1) * sizeof(double)); } t->freetrans = Malloc(nonodes* sizeof(long)); for ( i = 0; i < nonodes; i++ ) t->freetrans[i] = i+1; t->transindex = nonodes - 1; } /* alloctrans */ void freetrans(tree * t, long nonodes,long sitelength) { long i ,j; for ( i = 0 ; i < nonodes ; i++ ) { for ( j = 0 ; j < sitelength + 1; j++) { free ((t->trans)[i][j]); } free ((t->trans)[i]); } free(t->trans); free(t->freetrans); } long get_trans(tree* t) { long ret; assert(t->transindex >= 0); ret = t->freetrans[t->transindex]; t->transindex--; return ret; } void free_trans(tree* t, long trans) { long i;
/* FIXME This is a temporary workaround and probably slows things down a bit.
 * During rearrangements, this function is sometimes called more than once on
 * already freed nodes, causing the freetrans array to overrun other data.
*/ for ( i = 0 ; i < t->transindex; i++ ) { if ( t->freetrans[i] == trans ) { return; } } /* end of temporary fix */ t->transindex++; t->freetrans[t->transindex] = trans; } void free_all_trans(tree* t) { long i; for ( i = 0; i < nonodes2; i++ ) t->freetrans[i] = i; t->transindex = nonodes2 - 1; } sitelike2 init_sitelike(long sitelength) { return Malloc((sitelength+1) * sizeof(double)); } void free_sitelike(sitelike2 sl) { free(sl); } void copy_sitelike(sitelike2 dest, sitelike2 src,long sitelength) { memcpy(dest,src,(sitelength+1)*sizeof(double)); } void restml_inputnumbers() { /* read and print out numbers of species and sites */ fscanf(infile, "%ld%ld%ld", &spp, &sites, &enzymes); nonodes2 = spp * 2 - 1; } /* restml_inputnumbers */ void getoptions() { /* interactively set options */ long loopcount, loopcount2; Char ch; fprintf(outfile, "\nRestriction site Maximum Likelihood"); fprintf(outfile, " method, version %s\n\n",VERSION); putchar('\n'); sitelength = 6; trunc8 = true; global = false; improve = false; jumble = false; njumble = 1; lengths = false; outgrno = 1; outgropt = false; trout = true; usertree = false; weights = false; printdata = false; progress = true; treeprint = true; interleaved = true; loopcount = 0; for (;;) { cleerhome(); printf("\nRestriction site Maximum Likelihood"); printf(" method, version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" U Search for best tree? %s\n", (usertree ? "No, use user trees in input file" : "Yes")); if (usertree) { printf(" N Use lengths from user trees? %s\n", (lengths ? "Yes" : "No")); } printf(" A Are all sites detected? %s\n", (trunc8 ? "No" : "Yes")); if (!usertree) { printf(" S Speedier but rougher analysis? %s\n", (improve ? "No, not rough" : "Yes")); printf(" G Global rearrangements? %s\n", (global ? "Yes" : "No")); printf(" J Randomize input order of sequences? "); if (jumble) printf("Yes (seed =%8ld,%3ld times)\n", inseed0, njumble); else printf("No. Use input order\n"); }
printf(" L Site length?%3ld\n",sitelength); printf(" O Outgroup root? %s%3ld\n", (outgropt ? "Yes, at sequence number" : "No, use as outgroup species"),outgrno); printf(" M Analyze multiple data sets?"); if (mulsets) printf(" Yes, %2ld sets\n", datasets); else printf(" No\n"); printf(" I Input sequences interleaved? %s\n", (interleaved ? "Yes" : "No, sequential")); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" 1 Print out the data at start of run %s\n", (printdata ? "Yes" : "No")); printf(" 2 Print indications of progress of run %s\n", (progress ? "Yes" : "No")); printf(" 3 Print out tree %s\n", (treeprint ? "Yes" : "No")); printf(" 4 Write out trees onto tree file? %s\n", (trout ? "Yes" : "No")); printf("\n Y to accept these or type the letter for one to change\n"); fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; uppercase(&ch); if (ch == 'Y') break; if (((!usertree) && (strchr("UNASGJLOTMI01234", ch) != NULL)) || (usertree && ((strchr("UNASLOTMI01234", ch) != NULL)))){ switch (ch) { case 'A': trunc8 = !trunc8; break; case 'S': improve = !improve; break; case 'G': global = !global; break; case 'J': jumble = !jumble; if (jumble) initjumble(&inseed, &inseed0, seed, &njumble); else njumble = 1; break; case 'L': loopcount2 = 0; do { printf("New Sitelength?\n"); fflush(stdout); scanf("%ld%*[^\n]", &sitelength); getchar(); if (sitelength < 1) printf("BAD RESTRICTION SITE LENGTH: %ld\n", sitelength); countup(&loopcount2, 10); } while (sitelength < 1); break; case 'N': lengths = !lengths; break; case 'O': outgropt = !outgropt; if (outgropt) initoutgroup(&outgrno, spp); else outgrno = 1; break; case 'U': usertree = !usertree; break; case 'M': mulsets = !mulsets; if (mulsets) initdatasets(&datasets); break; case 'I': interleaved = !interleaved; break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '2': progress =
!progress; break; case '3': treeprint = !treeprint; break; case '4': trout = !trout; break; } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } } /* getoptions */ void reallocsites() { long i; for (i = 0; i < spp; i++) { free(y[i]); y[i] = (Char *)Malloc(sites*sizeof(Char)); } free(weight); free(alias); free(aliasweight); weight = (steptr)Malloc((sites+1)*sizeof(long)); alias = (steptr)Malloc((sites+1)*sizeof(long)); aliasweight = (steptr)Malloc((sites+1)*sizeof(long)); } void allocrest() { long i; y = (Char **)Malloc(spp*sizeof(Char *)); for (i = 0; i < spp; i++) y[i] = (Char *)Malloc(sites*sizeof(Char)); nayme = (naym *)Malloc(spp*sizeof(naym)); enterorder = (long *)Malloc(spp*sizeof(long)); weight = (steptr)Malloc((sites+1)*sizeof(long)); alias = (steptr)Malloc((sites+1)*sizeof(long)); aliasweight = (steptr)Malloc((sites+1)*sizeof(long)); } /* allocrest */ void freelrsaves() { long i,j; for ( i = 0 ; i < NLRSAVES ; i++ ) { for (j = 0; j < endsite + 1; j++) free(lrsaves[i]->x2[j]); free(lrsaves[i]->x2); free(lrsaves[i]->underflows); free(lrsaves[i]); } free(lrsaves); } void resetlrsaves() { freelrsaves(); alloclrsaves(); } void alloclrsaves() { long i,j; lrsaves = Malloc(NLRSAVES * sizeof(node*)); for ( i = 0 ; i < NLRSAVES ; i++ ) { lrsaves[i] = Malloc(sizeof(node)); lrsaves[i]->x2 = Malloc((endsite + 1)*sizeof(sitelike2)); for ( j = 0 ; j < endsite + 1 ; j++ ) { lrsaves[i]->x2[j] = Malloc((sitelength + 1) * sizeof(double)); } } } /* alloclrsaves */ void setuppie() { /* set up equilibrium probabilities of being a given number of bases away from a restriction site */ long i; double sum; pie = init_sitelike(sitelength); pie[0] = 1.0; sum = pie[0]; for (i = 1; i <= sitelength; i++) { pie[i] = 3 * pie[i - 1] * (sitelength - i + 1) / i; sum += pie[i]; } for (i = 0; i <= sitelength; i++) pie[i] /= sum; } /* setuppie */ void doinit() { /* initializes variables */ long i,j; restml_inputnumbers(); getoptions(); if (!usertree) nonodes2--; if 
(printdata) fprintf(outfile, "%4ld Species, %4ld Sites,%4ld Enzymes\n", spp, sites, enzymes); tempmatrix = Malloc(NTEMPMATS * sizeof(transmatrix)); for ( i = 0 ; i < NTEMPMATS ; i++ ) { tempmatrix[i] = Malloc((sitelength+1) * sizeof(double *)); for ( j = 0 ; j <= sitelength ; j++) tempmatrix[i][j] = (double *)Malloc((sitelength+1) * sizeof(double)); } tempslope = (transmatrix)Malloc((sitelength+1) * sizeof(double *)); for (i=0; i<=sitelength; i++) tempslope[i] = (double *)Malloc((sitelength+1) * sizeof(double)); tempcurve = (transmatrix)Malloc((sitelength+1) * sizeof(double *)); for (i=0; i<=sitelength; i++) tempcurve[i] = (double *)Malloc((sitelength+1) * sizeof(double)); setuppie(); alloctrans(&curtree, nonodes2, sitelength); alloctree(&curtree.nodep, nonodes2, usertree); allocrest(); if (usertree) return; alloctrans(&bestree, nonodes2, sitelength); alloctree(&bestree.nodep, nonodes2, 0); alloctrans(&priortree, nonodes2, sitelength); alloctree(&priortree.nodep, nonodes2, 0); if (njumble == 1) return; alloctrans(&bestree2, nonodes2, sitelength); alloctree(&bestree2.nodep, nonodes2, 0); } /* doinit */ void cleanup() { long i, j; for (i = 0; i < NTEMPMATS; i++) { for (j = 0; j <= sitelength; j++) free(tempmatrix[i][j]); free(tempmatrix[i]); } free(tempmatrix); tempmatrix = NULL; for (i = 0; i <= sitelength; i++) { free(tempslope[i]); free(tempcurve[i]); } free(tempslope); tempslope = NULL; free(tempcurve); tempcurve = NULL; freelrsaves(); } void inputoptions() { /* read the options information */ Char ch; long i, extranum, cursp, curst, curenz; if (!firstset) { if (eoln(infile)) scan_eoln(infile); fscanf(infile, "%ld%ld%ld", &cursp, &curst, &curenz); if (cursp != spp) { printf("\nERROR: INCONSISTENT NUMBER OF SPECIES IN DATA SET %4ld\n", ith); exxit(-1); } if (curenz != enzymes) { printf("\nERROR: INCONSISTENT NUMBER OF ENZYMES IN DATA SET %4ld\n", ith); exxit(-1); } sites = curst; } if ( !firstset ) reallocsites(); for (i = 1; i <= sites; i++) weight[i] = 1; 
weightsum = sites; extranum = 0; readoptions(&extranum, "W"); for (i = 1; i <= extranum; i++) { matchoptions(&ch, "W"); if (ch == 'W') inputweights2(1, sites+1, &weightsum, weight, &weights, "RESTML"); } fprintf(outfile, "\n Recognition sequences all%2ld bases long\n", sitelength); if (trunc8) fprintf(outfile, "\nSites absent from all species are assumed to have been omitted\n\n"); if (weights) printweights(outfile, 1, sites, weight, "Sites"); } /* inputoptions */ void restml_inputdata() { /* read the species and sites data */ long i, j, k, l, sitesread, sitesnew=0; Char ch; boolean allread, done; if (printdata) putc('\n', outfile); j = nmlngth + (sites + (sites - 1) / 10) / 2 - 5; if (j < nmlngth - 1) j = nmlngth - 1; if (j > 39) j = 39; if (printdata) { fprintf(outfile, "Name"); for (i = 1; i <= j; i++) putc(' ', outfile); fprintf(outfile, "Sites\n"); fprintf(outfile, "----"); for (i = 1; i <= j; i++) putc(' ', outfile); fprintf(outfile, "-----\n\n"); } sitesread = 0; allread = false; while (!(allread)) { /* eat white space -- if the separator line has spaces on it*/ do { ch = gettc(infile); } while (ch == ' ' || ch == '\t'); ungetc(ch, infile); if (eoln(infile)) scan_eoln(infile); i = 1; while (i <= spp ) { if ((interleaved && sitesread == 0) || !interleaved) initname(i - 1); if (interleaved) j = sitesread; else j = 0; done = false; while (!done && !eoff(infile)) { if (interleaved) done = true; while (j < sites && !(eoln(infile) || eoff(infile))) { ch = gettc(infile); if (ch == '\n' || ch == '\t') ch = ' '; if (ch == ' ') continue; uppercase(&ch); if (ch != '1' && ch != '0' && ch != '+' && ch != '-' && ch != '?') { printf(" ERROR: Bad symbol %c", ch); printf(" at position %ld of species %ld\n", j+1, i); exxit(-1); } if (ch == '1') ch = '+'; if (ch == '0') ch = '-'; j++; y[i - 1][j - 1] = ch; } if (interleaved) continue; if (j < sites) scan_eoln(infile); else if (j == sites) done = true; } if (interleaved && i == 1) sitesnew = j; scan_eoln(infile); if 
((interleaved && j != sitesnew ) || ((!interleaved) && j != sites)){ printf("ERROR: SEQUENCES OUT OF ALIGNMENT\n"); exxit(-1);} i++; } if (interleaved) { sitesread = sitesnew; allread = (sitesread == sites); } else allread = (i > spp); } if (printdata) { for (i = 1; i <= ((sites - 1) / 60 + 1); i++) { for (j = 0; j < spp; j++) { for (k = 0; k < nmlngth; k++) putc(nayme[j][k], outfile); fprintf(outfile, " "); l = i * 60; if (l > sites) l = sites; for (k = (i - 1) * 60 + 1; k <= l; k++) { putc(y[j][k - 1], outfile); if (k % 10 == 0 && k % 60 != 0) putc(' ', outfile); } putc('\n', outfile); } putc('\n', outfile); } putc('\n', outfile); } putc('\n', outfile); } /* restml_inputdata */ void restml_sitesort() { /* Shell sort keeping alias, aliasweight in same order */ long gap, i, j, jj, jg, k, itemp; boolean flip, tied; gap = sites / 2; while (gap > 0) { for (i = gap + 1; i <= sites; i++) { j = i - gap; flip = true; while (j > 0 && flip) { jj = alias[j]; jg = alias[j + gap]; flip = false; tied = true; k = 1; while (k <= spp && tied) { flip = (y[k - 1][jj - 1] > y[k - 1][jg - 1]); tied = (tied && y[k - 1][jj - 1] == y[k - 1][jg - 1]); k++; } if (tied) { aliasweight[j] += aliasweight[j + gap]; aliasweight[j + gap] = 0; } if (!flip) break; itemp = alias[j]; alias[j] = alias[j + gap]; alias[j + gap] = itemp; itemp = aliasweight[j]; aliasweight[j] = aliasweight[j + gap]; aliasweight[j + gap] = itemp; j -= gap; } } gap /= 2; } } /* restml_sitesort */ void restml_sitecombine() { /* combine sites that have identical patterns */ long i, j, k; boolean tied; i = 1; while (i < sites) { j = i + 1; tied = true; while (j <= sites && tied) { k = 1; while (k <= spp && tied) { tied = (tied && y[k - 1][alias[i] - 1] == y[k - 1][alias[j] - 1]); k++; } if (tied && aliasweight[j] > 0) { aliasweight[i] += aliasweight[j]; aliasweight[j] = 0; alias[j] = alias[i]; } j++; } i = j - 1; } } /* restml_sitecombine */ void makeweights() { /* make up weights vector to avoid duplicate computations */ 
long i; for (i = 1; i <= sites; i++) { alias[i] = i; aliasweight[i] = weight[i]; } restml_sitesort(); restml_sitecombine(); sitescrunch2(sites + 1, 2, 3, aliasweight); for (i = 1; i <= sites; i++) { weight[i] = aliasweight[i]; if (weight[i] > 0) endsite = i; } weight[0] = 1; } /* makeweights */ void restml_makevalues() { /* set up fractional likelihoods at tips */ long i, j, k, l, m; boolean found; for (k = 1; k <= endsite; k++) { j = alias[k]; for (i = 0; i < spp; i++) { for (l = 0; l <= sitelength; l++) curtree.nodep[i]->x2[k][l] = 1.0; switch (y[i][j - 1]) { case '+': for (m = 1; m <= sitelength; m++) curtree.nodep[i]->x2[k][m] = 0.0; break; case '-': curtree.nodep[i]->x2[k][0] = 0.0; break; case '?': /* blank case */ break; } } } for (i = 0; i < spp; i++) { for (k = 1; k <= sitelength; k++) curtree.nodep[i]->x2[0][k] = 1.0; curtree.nodep[i]->x2[0][0] = 0.0; } if (trunc8) return; found = false; i = 1; while (!found && i <= endsite) { found = true; for (k = 0; k < spp; k++) found = (found && y[k][alias[i] - 1] == '-'); if (!found) i++; } if (found) { weightsum += (enzymes - 1) * weight[i]; weight[i] *= enzymes; } } /* restml_makevalues */ void getinput() { /* reads the input data */ inputoptions(); restml_inputdata(); if ( !firstset ) freelrsaves(); makeweights(); alloclrsaves(); if (!usertree) { setuptree2(&curtree); setuptree2(&priortree); setuptree2(&bestree); if (njumble > 1) setuptree2(&bestree2); } allocx2(nonodes2, sitelength, curtree.nodep, usertree); if (!usertree) { allocx2(nonodes2, sitelength, priortree.nodep, 0); allocx2(nonodes2, sitelength, bestree.nodep, 0); if (njumble > 1) allocx2(nonodes2, sitelength, bestree2.nodep, 0); } restml_makevalues(); } /* getinput */ void copymatrix(transmatrix tomat, transmatrix frommat) { /* copy a matrix the size of transition matrix */ int i,j; for (i=0;i<=sitelength;++i){ for (j=0;j<=sitelength;++j) tomat[i][j] = frommat[i][j]; } } /* copymatrix */ void maketrans(double p, boolean nr) { /* make transition matrix, 
product matrix with change probability p. Put the results in tempmatrix, tempslope, tempcurve */ long i, j, k, m1, m2; double sump, sums=0, sumc=0, pover3, pijk, term; sitelike2 binom1, binom2; binom1 = init_sitelike(sitelength); binom2 = init_sitelike(sitelength); pover3 = p / 3; for (i = 0; i <= sitelength; i++) { if (p > 1.0 - epsilon) p = 1.0 - epsilon; if (p < epsilon) p = epsilon; binom1[0] = exp((sitelength - i) * log(1 - p)); for (k = 1; k <= sitelength - i; k++) binom1[k] = binom1[k - 1] * (p / (1 - p)) * (sitelength - i - k + 1) / k; binom2[0] = exp(i * log(1 - pover3)); for (k = 1; k <= i; k++) binom2[k] = binom2[k - 1] * (pover3 / (1 - pover3)) * (i - k + 1) / k; for (j = 0; j <= sitelength; ++j) { sump = 0.0; if (nr) { sums = 0.0; sumc = 0.0; } if (i - j > 0) m1 = i - j; else m1 = 0; if (sitelength - j < i) m2 = sitelength - j; else m2 = i; for (k = m1; k <= m2; k++) { pijk = binom1[j - i + k] * binom2[k]; sump += pijk; if (nr) { term = (j-i+2*k)/p - (sitelength-j-k)/(1.0-p) - (i-k)/(3.0-p); sums += pijk * term; sumc += pijk * (term * term - (j-i+2*k)/(p*p) - (sitelength-j-k)/((1.0-p)*(1.0-p)) - (i-k)/((3.0-p)*(3.0-p)) ); } } tempmatrix[0][i][j] = sump; if (nr) { tempslope[i][j] = sums; tempcurve[i][j] = sumc; } } } free_sitelike(binom1); free_sitelike(binom2); } /* maketrans */ void branchtrans(long i, double p) { /* make branch transition matrix for branch i with probability of change p */ boolean nr; nr = false; maketrans(p, nr); copymatrix(curtree.trans[i - 1], tempmatrix[0]); } /* branchtrans */ double evaluate(tree *tr, node *p) { /* evaluates the likelihood, using info. 
at one branch */ double sum, sum2, y, liketerm, like0, lnlike0=0, term; long i, j, k,branchnum; node *q; sitelike2 x1, x2; x1 = init_sitelike(sitelength); x2 = init_sitelike(sitelength); sum = 0.0; q = p->back; nuview(p); nuview(q); y = p->v; branchnum = p->branchnum; copy_sitelike(x1,p->x2[0],sitelength); copy_sitelike(x2,q->x2[0],sitelength); if (trunc8) { like0 = 0.0; for (j = 0; j <= sitelength; j++) { liketerm = pie[j] * x1[j]; for (k = 0; k <= sitelength; k++) like0 += liketerm * tr->trans[branchnum-1][j][k] * x2[k]; } lnlike0 = log(enzymes * (1.0 - like0)); } for (i = 1; i <= endsite; i++) { copy_sitelike(x1,p->x2[i],sitelength); copy_sitelike(x2,q->x2[i],sitelength); sum2 = 0.0; for (j = 0; j <= sitelength; j++) { liketerm = pie[j] * x1[j]; for (k = 0; k <= sitelength; k++) sum2 += liketerm * tr->trans[branchnum-1][j][k] * x2[k]; } term = log(sum2); if (trunc8) term -= lnlike0; if (usertree && (which <= shimotrees)) l0gf[which - 1][i - 1] = term; sum += weight[i] * term; } /* *** debug put a variable "saveit" in evaluate as third argument as to whether to save the KHT suff */ if (usertree) { if(which <= shimotrees) l0gl[which - 1] = sum; if (which == 1) { maxwhich = 1; maxlogl = sum; } else if (sum > maxlogl) { maxwhich = which; maxlogl = sum; } } tr->likelihood = sum; free_sitelike(x1); free_sitelike(x2); return sum; } /* evaluate */ boolean nuview(node *p) { /* recompute fractional likelihoods for one part of tree */ long i, j, k, lowlim; double sumq; node *q, *s; if (p->tip) return false; for (s = p->next; s != p; s = s->next) { if ( nuview(s->back) ) p->initialized = false; } if (p->initialized) return false; lowlim = trunc8 ? 
0 : 1; /* recalculates p->x2[*][*] in place */ for (i = lowlim; i <= endsite; i++) { for (j = 0; j <= sitelength; j++) p->x2[i][j] = 1.0; for (s = p->next; s != p; s = s->next) { q = s->back; for (j = 0; j <= sitelength; j++) { sumq = 0.0; for (k = 0; k <= sitelength; k++) sumq += curtree.trans[q->branchnum-1][j][k] * q->x2[i][k]; p->x2[i][j] *= sumq; } } } return true; } /* nuview */ void makenewv(node *p) { /* Newton-Raphson algorithm improvement of a branch length */ long i, j, k, lowlim, it, ite; double sum, sums, sumc, like, slope, curve, liketerm, liket, y, yold=0, yorig, like0=0, slope0=0, curve0=0, oldlike=0, temp; boolean done, nr, firsttime, better; node *q; sitelike2 xx1, xx2; double *tm, *ts, *tc; q = p->back; y = p->v; yorig = y; if (trunc8) lowlim = 0; else lowlim = 1; done = false; nr = true; firsttime = true; it = 1; ite = 0; while ((it < iterations) && (ite < 20) && (!done)) { like = 0.0; slope = 0.0; curve = 0.0; maketrans(y, nr); for (i = lowlim; i <= endsite; i++) { xx1 = p->x2[i]; xx2 = q->x2[i]; sum = 0.0; sums = 0.0; sumc = 0.0; for (j = 0; j <= sitelength; j++) { liket = xx1[j] * pie[j]; tm = tempmatrix[0][j]; ts = tempslope[j]; tc = tempcurve[j]; for (k = 0; k <= sitelength; k++) { liketerm = liket * xx2[k]; sum += tm[k] * liketerm; sums += ts[k] * liketerm; sumc += tc[k] * liketerm; } } if (i == 0) { like0 = sum; slope0 = sums; curve0 = sumc; } else { like += weight[i] * log(sum); slope += weight[i] * sums/sum; temp = sums/sum; curve += weight[i] * (sumc/sum-temp*temp); } } if (trunc8 && fabs(like0 - 1.0) > 1.0e-10) { like -= weightsum * log(enzymes * (1.0 - like0)); slope += weightsum * slope0 /(1.0 - like0); curve += weightsum * (curve0 /(1.0 - like0) + slope0*slope0/((1.0 - like0)*(1.0 - like0))); } better = false; if (firsttime) { yold = y; oldlike = like; firsttime = false; better = true; } else { if (like > oldlike) { yold = y; oldlike = like; better = true; it++; } } if (better) { y = y + slope/fabs(curve); if (y < epsilon) y = 10.0 
* epsilon; if (y > 0.75) y = 0.75; } else { if (fabs(y - yold) < epsilon) ite = 20; y = (y + yold) / 2.0; } ite++; done = fabs(y-yold) < epsilon; } smoothed = (fabs(yold-yorig) < epsilon) && (yorig > 1000.0*epsilon); p->v = yold; q->v = yold; branchtrans(p->branchnum, yold); curtree.likelihood = oldlike; } /* makenewv */ void update(node *p) { /* improve branch length and views for one branch */ nuview(p); nuview(p->back); if ( !(usertree && lengths) ) { makenewv(p); if (smoothit ) { inittrav(p); inittrav(p->back); } else { if (inserting && !p->tip) { p->next->initialized = false; p->next->next->initialized = false; } } } } /* update */ void smooth(node *p) { /* update nodes throughout the tree, recursively */ smoothed = false; update(p); if (!p->tip) { if (smoothit && !smoothed) { smooth(p->next->back); } if (smoothit && !smoothed) { smooth(p->next->next->back); } } } /* smooth */ void insert_(node *p, node *q) { /* insert a subtree into a branch, improve lengths in tree */ long i; node *r; r = p->next->next; hookup(r, q->back); hookup(p->next, q); if (q->v >= 0.75) q->v = 0.75; else q->v = 0.75 * (1 - sqrt(1 - 1.333333 * q->v)); if ( q->v < epsilon) q->v = epsilon; q->back->v = q->v; r->v = q->v; r->back->v = r->v; set_branchnum(q->back, q->branchnum); set_branchnum(r, get_trans(&curtree)); set_branchnum(r->back, r->branchnum); branchtrans(q->branchnum, q->v); branchtrans(r->branchnum, r->v); if ( smoothit ) { inittrav(p); inittrav(p->back); } p->initialized = false; i = 1; inserting = true; while (i <= smoothings) { smooth(p); if (!p->tip) { smooth (p->next->back); smooth (p->next->next->back); } i++; } inserting = false; } /* insert_ */ void restml_re_move(node **p, node **q) { /* remove p and record in q where it was */ long i; *q = (*p)->next->back; hookup(*q, (*p)->next->next->back); free_trans(&curtree,(*q)->back->branchnum); set_branchnum((*q)->back, (*q)->branchnum); (*q)->v = 0.75*(1 - (1 - 1.333333*(*q)->v) * (1 - 1.333333*(*p)->next->v)); if ( (*q)->v 
> 1 - epsilon) (*q)->v = 1 - epsilon; else if ( (*q)->v < epsilon) (*q)->v = epsilon; (*q)->back->v = (*q)->v; branchtrans((*q)->branchnum, (*q)->v); (*p)->next->back = NULL; (*p)->next->next->back = NULL; if ( smoothit ) { inittrav((*q)->back); inittrav(*q); } if ( smoothit ) { for ( i = 0 ; i < smoothings ; i++ ) { smooth(*q); smooth((*q)->back); } } else ( smooth(*q)); } /* restml_re_move */ void restml_copynode(node *c, node *d) { /* copy a node */ long i; set_branchnum(d, c->branchnum); for ( i = 0 ; i <= endsite ; i++) copy_sitelike(d->x2[i],c->x2[i],sitelength); d->v = c->v; d->iter = c->iter; d->xcoord = c->xcoord; d->ycoord = c->ycoord; d->ymin = c->ymin; d->ymax = c->ymax; d->initialized = c->initialized; } /* restml_copynode */ void restml_copy_(tree *a, tree *b) { /* copy tree a to tree b */ long i,j; node *p, *q; for (i = 0; i < spp; i++) { restml_copynode(a->nodep[i], b->nodep[i]); if (a->nodep[i]->back) { if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]; else if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]->next) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next; else b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next->next; } else b->nodep[i]->back = NULL; } for (i = spp; i < nonodes2; i++) { p = a->nodep[i]; q = b->nodep[i]; for (j = 1; j <= 3; j++) { restml_copynode(p, q); if (p->back) { if (p->back == a->nodep[p->back->index - 1]) q->back = b->nodep[p->back->index - 1]; else if (p->back == a->nodep[p->back->index - 1]->next) q->back = b->nodep[p->back->index - 1]->next; else q->back = b->nodep[p->back->index - 1]->next->next; } else q->back = NULL; p = p->next; q = q->next; } } b->likelihood = a->likelihood; for (i=0;i<nonodes2;++i) copymatrix(b->trans[i],a->trans[i]); b->transindex = a->transindex; memcpy(b->freetrans,a->freetrans,nonodes*sizeof(long)); b->start = a->start; } /* restml_copy_ */ void buildnewtip(long m,tree *tr) { /* set up a new tip and 
interior node it is connected to */ node *p; long i, j; p = tr->nodep[nextsp + spp - 3]; for (i = 0; i <= endsite; i++) { for (j = 0; j < sitelength; j++) { /* trunc8 */ p->x2[i][j] = 1.0; p->next->x2[i][j] = 1.0; p->next->next->x2[i][j] = 1.0; } } hookup(tr->nodep[m - 1], p); p->v = initialv; p->back->v = initialv; set_branchnum(p, get_trans(tr)); set_branchnum(p->back, p->branchnum); branchtrans(p->branchnum, initialv); } /* buildnewtip */ void buildsimpletree(tree *tr) { /* set up and adjust branch lengths of a three-species tree */ long branch; hookup(tr->nodep[enterorder[0] - 1], tr->nodep[enterorder[1] - 1]); tr->nodep[enterorder[0] - 1]->v = initialv; tr->nodep[enterorder[1] - 1]->v = initialv; branchtrans(enterorder[1], initialv); branch = get_trans(tr); set_branchnum(tr->nodep[enterorder[0] - 1], branch); set_branchnum(tr->nodep[enterorder[1] - 1], branch); buildnewtip(enterorder[2], tr); insert_(tr->nodep[enterorder[2] - 1]->back, tr->nodep[enterorder[1] - 1]); tr->start = tr->nodep[enterorder[2]-1]->back; } /* buildsimpletree */ void addtraverse(node *p, node *q, boolean contin) { /* try adding p at q, proceed recursively through tree */ double like, vsave = 0; node *qback =NULL; if (!smoothit) { copymatrix (tempmatrix[1], curtree.trans[q->branchnum - 1]); vsave = q->v; qback = q->back; } insert_(p, q); like = evaluate(&curtree, p); if (like > bestyet) { bestyet = like; if (smoothit) { restml_copy_(&curtree, &bestree); addwhere = q; } else qwhere = q; succeeded = true; } if (smoothit) restml_copy_(&priortree, &curtree); else { hookup (q, qback); q->v = vsave; q->back->v = vsave; free_trans(&curtree,q->back->branchnum); set_branchnum(q->back, q->branchnum); copymatrix (curtree.trans[q->branchnum - 1], tempmatrix[1]); /* curtree.likelihood = bestyet; */ evaluate(&curtree, curtree.start); } if (!q->tip && contin) { /* assumes bifurcation (OK) */ addtraverse(p, q->next->back, contin); addtraverse(p, q->next->next->back, contin); } if ( contin && q == 
curtree.root ) { /* FIXME!! curtree.root->back == NULL? curtree.root == NULL? */ addtraverse(p,q->back,contin); } } /* addtraverse */ void globrearrange() { /* does global rearrangements */ tree globtree; tree oldtree; int i,j,k,l,num_sibs,num_sibs2; node *where,*sib_ptr,*sib_ptr2; double oldbestyet = curtree.likelihood; int success = false; printf("\n "); alloctree(&globtree.nodep,nonodes2,0); alloctree(&oldtree.nodep,nonodes2,0); alloctrans(&globtree, nonodes2, sitelength); alloctrans(&oldtree, nonodes2, sitelength); setuptree2(&globtree); setuptree2(&oldtree); allocx2(nonodes2, sitelength,globtree.nodep, 0); allocx2(nonodes2, sitelength,oldtree.nodep, 0); restml_copy_(&curtree,&globtree); restml_copy_(&curtree,&oldtree); bestyet = curtree.likelihood; for ( i = spp ; i < nonodes2 ; i++ ) { num_sibs = count_sibs(curtree.nodep[i]); sib_ptr = curtree.nodep[i]; if ( (i - spp) % (( nonodes2 / 72 ) + 1 ) == 0 ) putchar('.'); fflush(stdout); for ( j = 0 ; j <= num_sibs ; j++ ) { restml_re_move(&sib_ptr,&where); restml_copy_(&curtree,&priortree); qwhere = where; if (where->tip) { restml_copy_(&oldtree,&curtree); restml_copy_(&oldtree,&bestree); sib_ptr=sib_ptr->next; continue; } else num_sibs2 = count_sibs(where); sib_ptr2 = where; for ( k = 0 ; k < num_sibs2 ; k++ ) { addwhere = NULL; addtraverse(sib_ptr,sib_ptr2->back,true); if ( !smoothit ) { if (succeeded && qwhere != where && qwhere != where->back) { insert_(sib_ptr,qwhere); smoothit = true; for (l = 1; l<=smoothings; l++) { smooth (where); smooth (where->back); } smoothit = false; success = true; restml_copy_(&curtree,&globtree); } restml_copy_(&priortree,&curtree); } else if ( addwhere && where != addwhere && where->back != addwhere && bestyet > globtree.likelihood) { restml_copy_(&bestree,&globtree); success = true; } sib_ptr2 = sib_ptr2->next; } restml_copy_(&oldtree,&curtree); restml_copy_(&oldtree,&bestree); sib_ptr = sib_ptr->next; } } restml_copy_(&globtree,&curtree); restml_copy_(&globtree,&bestree); if 
(success && globtree.likelihood > oldbestyet) { succeeded = true; } else { succeeded = false; } bestyet = globtree.likelihood; freex2(nonodes2,globtree.nodep); freex2(nonodes2,oldtree.nodep); freetrans(&globtree, nonodes2, sitelength); freetrans(&oldtree, nonodes2, sitelength); freetree2(globtree.nodep,nonodes2); freetree2(oldtree.nodep,nonodes2); } void printnode(node* p); void printnode(node* p) { if (p->back) printf("p->index = %3ld, p->back->index = %3ld, p->branchnum = %3ld,evaluates" " to %f\n",p->index,p->back->index,p->branchnum,evaluate(&curtree,p)); else printf("p->index = %3ld, p->back->index =none, p->branchnum = %3ld,evaluates" " to nothing\n",p->index,p->branchnum); } void printvals(void); void printvals(void) { int i; node* p; for ( i = 0 ; i < nextsp ; i++ ) { p = curtree.nodep[i]; printnode(p); } for ( i = spp ; i <= spp + nextsp - 3 ; i++ ) { p = curtree.nodep[i]; printnode(p); printnode(p->next); printnode(p->next->next); } } void rearrange(node *p, node *pp) { /* rearranges the tree locally */ long i; node *q; node *r; node *rnb; node *rnnb; if (p->tip) return; if (p->back->tip) { rearrange(p->next->back, p); rearrange(p->next->next->back, p); return; } else /* if !p->tip && !p->back->tip */ { /* evaluate(&curtree, curtree.start); bestyet = curtree.likelihood; */ if (p->back->next != pp) r = p->back->next; else r = p->back->next->next; if (smoothit) { /* Copy the whole tree, because we may change all lengths */ restml_copy_(&curtree, &bestree); restml_re_move(&r, &q); nuview(p->next); nuview(p->next->next); restml_copy_(&curtree, &priortree); addtraverse(r, p->next->back, false); addtraverse(r, p->next->next->back, false); restml_copy_(&bestree, &curtree); } else { /* Save node data and matrices, so we can undo */ rnb = r->next->back; rnnb = r->next->next->back; restml_copynode(r,lrsaves[0]); restml_copynode(r->next,lrsaves[1]); restml_copynode(r->next->next,lrsaves[2]); restml_copynode(p->next,lrsaves[3]); 
restml_copynode(p->next->next,lrsaves[4]); copymatrix (tempmatrix[2], curtree.trans[r->branchnum - 1]); copymatrix (tempmatrix[3], curtree.trans[r->next->branchnum - 1]); copymatrix (tempmatrix[4], curtree.trans[r->next->next->branchnum-1]); copymatrix (tempmatrix[5], curtree.trans[p->next->branchnum-1]); copymatrix (tempmatrix[6], curtree.trans[p->next->next->branchnum-1]); restml_re_move(&r, &q); nuview(p->next); nuview(p->next->next); qwhere = q; addtraverse(r, p->next->back, false); addtraverse(r, p->next->next->back, false); if (qwhere == q) { hookup(rnb,r->next); hookup(rnnb,r->next->next); restml_copynode(lrsaves[0],r); restml_copynode(lrsaves[1],r->next); restml_copynode(lrsaves[2],r->next->next); restml_copynode(lrsaves[3],p->next); restml_copynode(lrsaves[4],p->next->next); r->back->v = r->v; r->next->back->v = r->next->v; r->next->next->back->v = r->next->next->v; p->next->back->v = p->next->v; p->next->next->back->v = p->next->next->v; set_branchnum(r->back, r->branchnum); set_branchnum(r->next->back, r->next->branchnum); set_branchnum(p->next->back, p->next->branchnum); set_branchnum(p->next->next->back, p->next->next->branchnum); copymatrix (curtree.trans[r->branchnum-1], tempmatrix[2]); copymatrix (curtree.trans[r->next->branchnum-1], tempmatrix[3]); copymatrix (curtree.trans[r->next->next->branchnum-1], tempmatrix[4]); copymatrix (curtree.trans[p->next->branchnum-1], tempmatrix[5]); copymatrix (curtree.trans[p->next->next->branchnum-1], tempmatrix[6]); curtree.likelihood = bestyet; } else { smoothit = true; insert_(r, qwhere); for (i = 1; i<=smoothings; i++) { smooth (r); smooth (r->back); } smoothit = false; } } } } /* rearrange */ void restml_coordinates(node *p, double lengthsum, long *tipy, double *tipmax, double *x) { /* establishes coordinates of nodes */ node *q, *first, *last; if (p->tip) { p->xcoord = (long)(over * lengthsum + 0.5); p->ycoord = (*tipy); p->ymin = (*tipy); p->ymax = (*tipy); (*tipy) += down; if (lengthsum > (*tipmax)) 
(*tipmax) = lengthsum; return; } q = p->next; do { (*x) = -0.75 * log(1.0 - 1.333333 * q->v); restml_coordinates(q->back, lengthsum + (*x),tipy,tipmax,x); q = q->next; } while ((p == curtree.start || p != q) && (p != curtree.start || p->next != q)); first = p->next->back; q = p; while (q->next != p) q = q->next; last = q->back; p->xcoord = (long)(over * lengthsum + 0.5); if (p == curtree.start) p->ycoord = p->next->next->back->ycoord; else p->ycoord = (first->ycoord + last->ycoord) / 2; p->ymin = first->ymin; p->ymax = last->ymax; } /* restml_coordinates */ void restml_fprintree(FILE *fp) { /* prints out diagram of the tree */ long tipy,i; double scale, tipmax, x; putc('\n', fp); if (!treeprint) return; putc('\n', fp); tipy = 1; tipmax = 0.0; restml_coordinates(curtree.start, 0.0, &tipy,&tipmax,&x); scale = 1.0 / (tipmax + 1.000); for (i = 1; i <= tipy - down; i++) fdrawline2(fp, i, scale, &curtree); putc('\n', fp); } /* restml_fprintree */ void restml_printree() { restml_fprintree(outfile); } double sigma(node *q, double *sumlr) { /* get 1.95996 * approximate standard error of branch length */ double sump, sumr, sums, sumc, p, pover3, pijk, Qjk, liketerm, f; double slopef,curvef; long i, j, k, m1, m2; sitelike2 binom1, binom2; transmatrix Prob, slopeP, curveP; node *r; sitelike2 x1, x2; double term, TEMP; x1 = init_sitelike(sitelength); x2 = init_sitelike(sitelength); binom1 = init_sitelike(sitelength); binom2 = init_sitelike(sitelength); Prob = (transmatrix)Malloc((sitelength+1) * sizeof(double *)); slopeP = (transmatrix)Malloc((sitelength+1) * sizeof(double *)); curveP = (transmatrix)Malloc((sitelength+1) * sizeof(double *)); for (i=0; i<=sitelength; ++i) { Prob[i] = (double *)Malloc((sitelength+1) * sizeof(double)); slopeP[i] = (double *)Malloc((sitelength+1) * sizeof(double)); curveP[i] = (double *)Malloc((sitelength+1) * sizeof(double)); } p = q->v; pover3 = p / 3; for (i = 0; i <= sitelength; i++) { binom1[0] = exp((sitelength - i) * log(1 - p)); for (k = 1; 
k <= (sitelength - i); k++) binom1[k] = binom1[k - 1] * (p / (1 - p)) * (sitelength - i - k + 1) / k; binom2[0] = exp(i * log(1 - pover3)); for (k = 1; k <= i; k++) binom2[k] = binom2[k - 1] * (pover3 / (1 - pover3)) * (i - k + 1) / k; for (j = 0; j <= sitelength; j++) { sump = 0.0; sums = 0.0; sumc = 0.0; if (i - j > 0) m1 = i - j; else m1 = 0; if (sitelength - j < i) m2 = sitelength - j; else m2 = i; for (k = m1; k <= m2; k++) { pijk = binom1[j - i + k] * binom2[k]; sump += pijk; term = (j-i+2*k)/p - (sitelength-j-k)/(1.0-p) - (i-k)/(3.0-p); sums += pijk * term; sumc += pijk * (term * term - (j-i+2*k)/(p*p) - (sitelength-j-k)/((1.0-p)*(1.0-p)) - (i-k)/((3.0-p)*(3.0-p)) ); } Prob[i][j] = sump; slopeP[i][j] = sums; curveP[i][j] = sumc; } } (*sumlr) = 0.0; sumc = 0.0; sums = 0.0; r = q->back; for (i = 1; i <= endsite; i++) { f = 0.0; slopef = 0.0; curvef = 0.0; sumr = 0.0; copy_sitelike(x1,q->x2[i],sitelength); copy_sitelike(x2,r->x2[i],sitelength); for (j = 0; j <= sitelength; j++) { liketerm = pie[j] * x1[j]; sumr += liketerm * x2[j]; for (k = 0; k <= sitelength; k++) { Qjk = liketerm * x2[k]; f += Qjk * Prob[j][k]; slopef += Qjk * slopeP[j][k]; curvef += Qjk * curveP[j][k]; } } (*sumlr) += weight[i] * log(f / sumr); sums += weight[i] * slopef / f; TEMP = slopef / f; sumc += weight[i] * (curvef / f - TEMP * TEMP); } if (trunc8) { f = 0.0; slopef = 0.0; curvef = 0.0; sumr = 0.0; copy_sitelike(x1,q->x2[0],sitelength); copy_sitelike(x2,r->x2[0],sitelength); for (j = 0; j <= sitelength; j++) { liketerm = pie[j] * x1[j]; sumr += liketerm * x2[j]; for (k = 0; k <= sitelength; k++) { Qjk = liketerm * x2[k]; f += Qjk * Prob[j][k]; slopef += Qjk * slopeP[j][k]; curvef += Qjk * curveP[j][k]; } } (*sumlr) += weightsum * log((1.0 - sumr) / (1.0 - f)); sums += weightsum * slopef / (1.0 - f); TEMP = slopef / (1.0 - f); sumc += weightsum * (curvef / (1.0 - f) + TEMP * TEMP); } for (i=0;i<=sitelength;++i){ free(Prob[i]); free(slopeP[i]); free(curveP[i]); } free(Prob); 
free(slopeP); free(curveP); free_sitelike(x1); free_sitelike(x2); free_sitelike(binom1); free_sitelike(binom2); if (sumc < -1.0e-6) return ((-sums - sqrt(sums * sums - 3.841 * sumc)) / sumc); else return -1.0; } /* sigma */ void fdescribe(FILE *fp, node *p) { /* print out information on one branch */ double sumlr; long i; node *q; double s; double realv; q = p->back; fprintf(fp, "%4ld ", q->index - spp); fprintf(fp, " "); if (p->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[p->index - 1][i], fp); } else fprintf(fp, "%4ld ", p->index - spp); if (q->v >= 0.75) fprintf(fp, " infinity"); else { realv = -0.75 * log(1 - 4.0/3.0 * q->v); fprintf(fp, "%13.5f", realv); } if (p->iter) { s = sigma(q, &sumlr); if (s < 0.0) fprintf(fp, " ( zero, infinity)"); else { fprintf(fp, " ("); if (q->v - s <= 0.0) fprintf(fp, " zero"); else fprintf(fp, "%9.5f", -0.75 * log(1 - 1.333333 * (q->v - s))); putc(',', fp); if (q->v + s >= 0.75) fprintf(fp, " infinity"); else fprintf(fp, "%12.5f", -0.75 * log(1 - 1.333333 * (q->v + s))); putc(')', fp); } if (sumlr > 1.9205) fprintf(fp, " *"); if (sumlr > 2.995) putc('*', fp); } else fprintf(fp, " (not varied)"); putc('\n', fp); if (!p->tip) { for (q = p->next; q != p; q = q->next) fdescribe(fp, q->back); } } /* fdescribe */ void summarize() { /* print out information on branches of tree */ node *q; fprintf(outfile, "\nremember: "); if (outgropt) fprintf(outfile, "(although rooted by outgroup) "); fprintf(outfile, "this is an unrooted tree!\n\n"); fprintf(outfile, "Ln Likelihood = %11.5f\n\n", curtree.likelihood); fprintf(outfile, " \n"); fprintf(outfile, " Between And Length"); fprintf(outfile, " Approx. 
Confidence Limits\n"); fprintf(outfile, " ------- --- ------"); fprintf(outfile, " ------- ---------- ------\n"); for (q = curtree.start->next; q != curtree.start; q = q->next) fdescribe(outfile, q->back); fdescribe(outfile, curtree.start->back); fprintf(outfile, "\n * = significantly positive, P < 0.05\n"); fprintf(outfile, " ** = significantly positive, P < 0.01\n\n\n"); } /* summarize */ void restml_treeout(node *p) { /* write out file with representation of final tree */ long i, n, w; Char c; double x; if (p->tip) { n = 0; for (i = 1; i <= nmlngth; i++) { if (nayme[p->index - 1][i - 1] != ' ') n = i; } for (i = 0; i < n; i++) { c = nayme[p->index - 1][i]; if (c == ' ') c = '_'; putc(c, outtree); } col += n; } else { putc('(', outtree); col++; restml_treeout(p->next->back); putc(',', outtree); col++; if (col > 45) { putc('\n', outtree); col = 0; } restml_treeout(p->next->next->back); if (p == curtree.start) { putc(',', outtree); col++; if (col > 45) { putc('\n', outtree); col = 0; } restml_treeout(p->back); } putc(')', outtree); col++; } if (p->v >= 0.75) x = -1.0; else x = -0.75 * log(1 - 1.333333 * p->v); if (x > 0.0) w = (long)(0.43429448222 * log(x)); else if (x == 0.0) w = 0; else w = (long)(0.43429448222 * log(-x)) + 1; if (w < 0) w = 0; if (p == curtree.start) fprintf(outtree, ";\n"); else { fprintf(outtree, ":%*.5f", (int)(w + 7), x); col += w + 8; } } /* restml_treeout */ static phenotype2 restml_pheno_new(long endsite, long sitelength) { phenotype2 ret; long k, l; endsite++; ret = (phenotype2)Malloc(endsite*sizeof(sitelike2)); for (k = 0; k < endsite; k++) { ret[k] = Malloc((sitelength + 1) * sizeof(double)); for (l = 0; l < sitelength; l++) ret[k][l] = 1.0; } return ret; } /* static void restml_pheno_delete(phenotype2 x2) { long k; for (k = 0; k < endsite+1; k++) free(x2[k]); free(x2); } */ void initrestmlnode(node **p, node **grbg, node *q, long len, long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char 
*str, Char *ch, FILE *intree) { /* initializes a node */ boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnu(grbg, p); (*p)->index = nodei; (*p)->tip = false; (*p)->branchnum = 0; (*p)->x2 = restml_pheno_new(endsite, sitelength); nodep[(*p)->index - 1] = (*p); break; case nonbottom: gnu(grbg, p); (*p)->x2 = restml_pheno_new(endsite, sitelength); (*p)->index = nodei; break; case tip: match_names_to_data (str, nodep, p, spp); break; case iter: (*p)->initialized = false; (*p)->v = initialv; (*p)->iter = true; if ((*p)->back != NULL){ (*p)->back->iter = true; (*p)->back->v = initialv; (*p)->back->initialized = false; } break; case length: processlength(&valyew, &divisor, ch, &minusread, intree, parens); (*p)->v = valyew / divisor; (*p)->iter = false; if ((*p)->back != NULL) { (*p)->back->v = (*p)->v; (*p)->back->iter = false; } break; case hsnolength: break; default: /* cases hslength, treewt, unittrwt */ break; } } /* initrestmlnode */ static void restml_unroot(node* root, node** nodep, long nonodes) { node *p,*r,*q; double newl; long i; long numsibs; numsibs = count_sibs(root); if ( numsibs > 2 ) { q = root; r = root; while (!(q->next == root)) q = q->next; q->next = root->next; /* FIXME? 
for(i=0 ; i < endsite ; i++){ free(r->x[i]); r->x[i] = NULL; } free(r->x); r->x = NULL; */ chuck(&grbg, r); curtree.nodep[spp] = q; } else { /* Bifurcating root - remove entire root fork */ /* Join v on each side of root */ newl = root->next->v + root->next->next->v; root->next->back->v = newl; root->next->next->back->v = newl; /* Connect root's children */ hookup(root->next->back, root->next->next->back); /* Move nodep entries down one and set indices */ for ( i = spp; i < nonodes-1; i++ ) { p = nodep[i+1]; nodep[i] = p; nodep[i+1] = NULL; if ( nodep[i] == NULL ) /* This may happen in a multifurcating intree */ break; do { p->index = i+1; p = p->next; } while (p != nodep[i]); } /* Free protx arrays from old root */ /* for(i=0 ; i < endsite ; i++){ free(root->x[i]); free(root->next->x[i]); free(root->next->next->x[i]); root->x[i] = NULL; root->next->x[i] = NULL; root->next->next->x[i] = NULL; } free(root->x); free(root->next->x); free(root->next->next->x); */ chuck(&grbg,root->next->next); chuck(&grbg,root->next); chuck(&grbg,root); } } /* dnaml_unroot */ void inittravtree(tree* t,node *p) { /* traverse tree to set initialized and v to initial values */ node* q; if ( p->branchnum == 0) { set_branchnum(p, get_trans(t)); set_branchnum(p->back, p->branchnum); } p->initialized = false; p->back->initialized = false; if ( usertree && (!lengths || p->iter)) { branchtrans(p->branchnum, initialv); p->v = initialv; p->back->v = initialv; } else branchtrans(p->branchnum, p->v); if ( !p->tip ) { q = p->next; while ( q != p ) { inittravtree(t,q->back); q = q->next; } } } /* inittravtree */ double adjusted_v(double v) { return 3.0/4.0 * (1.0-exp(-4.0/3.0 * v)); } static void adjust_lengths_r(node *p) { node *q; p->v = adjusted_v(p->v); p->back->v = p->v; if (!p->tip) { for (q = p->next; q != p; q = q->next) adjust_lengths_r(q->back); } } void adjust_lengths(tree *t) { assert(t->start->back->tip); adjust_lengths_r(t->start); } void treevaluate() { /* find maximum likelihood 
branch lengths of user tree */ long i; if ( lengths) adjust_lengths(&curtree); nonodes2--; inittravtree(&curtree,curtree.start); inittravtree(&curtree,curtree.start->back); smoothit = true; for (i = 1; i <= smoothings * 4; i++) { smooth (curtree.start); smooth (curtree.start->back); } evaluate(&curtree, curtree.start); nonodes2++; } /* treevaluate */ void maketree() { /* construct and rearrange tree */ long i,j; long nextnode; if (usertree) { /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree,INTREE,"input tree file","rb",progname,intreename); numtrees = countsemic(&intree); if(numtrees > MAXSHIMOTREES) shimotrees = MAXSHIMOTREES; else shimotrees = numtrees; if (numtrees > 2) initseed(&inseed, &inseed0, seed); l0gl = (double *) Malloc(shimotrees * sizeof(double)); l0gf = (double **) Malloc(shimotrees * sizeof(double *)); for (i=0; i < shimotrees; ++i) l0gf[i] = (double *)Malloc(endsite * sizeof(double)); if (treeprint) { fprintf(outfile, "User-defined tree"); if (numtrees > 1) putc('s', outfile); fprintf(outfile, ":\n\n"); } which = 1; while (which <= numtrees) { /* treeread2 (intree, &curtree.start, curtree.nodep, lengths, &trweight, &goteof, &haslengths, &spp,false,nonodes2); */ /* These initializations required each time through the loop since multiple trees require re-initialization */ nextnode = 0; goteof = false; treeread(intree, &curtree.start, NULL, &goteof, NULL, curtree.nodep, &nextnode, NULL, &grbg, initrestmlnode, false, nonodes2); restml_unroot(curtree.start, curtree.nodep, nonodes2); if ( outgropt ) curtree.start = curtree.nodep[outgrno - 1]->back; else curtree.start = curtree.nodep[0]->back; treevaluate(); restml_fprintree(outfile); summarize(); if (trout) { col = 0; restml_treeout(curtree.start); } clear_connections(&curtree,nonodes2); which++; } FClose(intree); if (numtrees > 1 && weightsum > 1 ) standev2(numtrees, maxwhich, 0, endsite-1, maxlogl, l0gl, l0gf, aliasweight, seed); } else { 
free_all_trans(&curtree); for (i = 1; i <= spp; i++) enterorder[i - 1] = i; if (jumble) randumize(seed, enterorder); if (progress) { printf("\nAdding species:\n"); writename(0, 3, enterorder); } nextsp = 3; smoothit = improve; buildsimpletree(&curtree); curtree.start = curtree.nodep[enterorder[0] - 1]->back; nextsp = 4; while (nextsp <= spp) { buildnewtip(enterorder[nextsp - 1], &curtree); /* bestyet = - nextsp*sites*sitelength*log(4.0); */ bestyet = -DBL_MAX; if (smoothit) restml_copy_(&curtree, &priortree); addtraverse(curtree.nodep[enterorder[nextsp - 1] - 1]->back, curtree.nodep[enterorder[0]-1]->back, true); if (smoothit) restml_copy_(&bestree, &curtree); else { smoothit = true; insert_(curtree.nodep[enterorder[nextsp - 1] - 1]->back, qwhere); for (i = 1; i<=smoothings; i++) { smooth (curtree.start); smooth (curtree.start->back); } smoothit = false; /* bestyet = curtree.likelihood; */ } if (progress) writename(nextsp - 1, 1, enterorder); if (global && nextsp == spp) { if (progress) { printf("Doing global rearrangements\n"); printf(" !"); for (j = spp ; j < nonodes2 ; j++) if ( (j - spp) % (( nonodes2 / 72 ) + 1 ) == 0 ) putchar('-'); putchar('!'); } } succeeded = true; while (succeeded) { succeeded = false; if (global && nextsp == spp) globrearrange(); else rearrange(curtree.start, curtree.start->back); } nextsp++; } if (global && progress) { putchar('\n'); fflush(stdout); } restml_copy_(&curtree, &bestree); if (njumble > 1) { if (jumb == 1) restml_copy_(&bestree, &bestree2); else if (bestree2.likelihood < bestree.likelihood) restml_copy_(&bestree, &bestree2); } if (jumb == njumble) { if (njumble > 1) restml_copy_(&bestree2, &curtree); curtree.start = curtree.nodep[outgrno - 1]->back; restml_fprintree(outfile); summarize(); if (trout) { col = 0; restml_treeout(curtree.start); } } } if ( jumb < njumble ) return; freex2(nonodes2, curtree.nodep); if (!usertree) { freex2(nonodes2, priortree.nodep); freex2(nonodes2, bestree.nodep); if (njumble > 1) freex2(nonodes2, 
bestree2.nodep); } else { free(l0gl); for (i=0;i 1) { fprintf(outfile, "Data set # %ld:\n",ith); if (progress) printf("\nData set # %ld:\n",ith); } getinput(); if (ith == 1) firstset = false; for (jumb = 1; jumb <= njumble; jumb++) maketree(); } cleanup(); FClose(infile); FClose(outfile); FClose(outtree); #ifdef MAC fixmacfile(outfilename); fixmacfile(outtreename); #endif printf("\nDone.\n\n"); #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } /* maximum likelihood phylogenies from restriction sites */ phylip-3.697/src/retree.c0000644004732000473200000024144112406201117014766 0ustar joefelsenst_g #include "phylip.h" #include "moves.h" /* version 3.696. Written by Joseph Felsenstein and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* maximum number of species */ #define maxsp 5000 /* size of pointer array. >= 2*maxsp - 1 */ /* (this can be large without eating memory */ #define maxsz 9999 #define overr 4 #define which 1 typedef enum {valid, remoov, quit} reslttype; typedef enum { horiz, vert, up, updel, ch_over, upcorner, midcorner, downcorner, aa, cc, gg, tt, deleted } chartype; typedef struct treeset_t { node *root; pointarray nodep; long nonodes; boolean waswritten, hasmult, haslengths, nolengths, initialized; } treeset_t; treeset_t treesets[2]; treeset_t simplifiedtree; typedef enum { arb, use, spec } howtree; typedef enum {beforenode, atnode} movet; movet fromtype; #ifndef OLDC /* function prototypes */ void initretreenode(node **, node **, node *, long, long, long *, long *, initops, pointarray, pointarray, Char *, Char *, FILE *); void gdispose(node *); void maketriad(node **, long); void maketip(node **, long); void copynode(node *, node *); node *copytrav(node *); void copytree(void); void getoptions(void); void configure(void); void prefix(chartype); void postfix(chartype); void ltrav(node *, boolean *); boolean ifhaslengths(void); void add_at(node *, node *, node *); void add_before(node *, node *); void add_child(node *, node *); void re_move(node **, node **); void reroot(node *); void ltrav_(node *, double, double, double *, long *, long *); void precoord(node *, boolean *, double *, long *); void coordinates(node *, double, long *, long *, double *); void 
flatcoordinates(node *, long *); void grwrite(chartype, long, long *); void drawline(long, node *, boolean *); void printree(void); void togglelengths(void); void arbitree(void); void yourtree(void); void buildtree(void); void unbuildtree(void); void retree_help(void); void consolidatetree(long); void rearrange(void); boolean any_deleted(node *); void fliptrav(node *, boolean); void flip(long); void transpose(long); void ifdeltrav(node *, boolean *); double oltrav(node *); void outlength(void); void midpoint(void); void deltrav(node *, boolean ); void reg_del(node *, boolean); boolean isdeleted(long); void deletebranch(void); void restorebranch(void); void del_or_restore(void); void undo(void); void treetrav(node *); void simcopynode(node *, node *); node *simcopytrav(node *); void simcopytree(void); void writebranchlength(double); void treeout(node *, boolean, double, long); void maketemptriad(node **, long); void roottreeout(boolean *); void notrootedtorooted(void); void rootedtonotrooted(void); void treewrite(boolean *); void retree_window(adjwindow); void getlength(double *, reslttype *, boolean *); void changelength(void); void changename(void); void clade(void); void changeoutgroup(void); void redisplay(void); void treeconstruct(void); void fill_del(node*p); /* function prototypes */ #endif node *root, *garbage; long nonodes, outgrno, screenwidth, vscreenwidth, screenlines, col, treenumber, leftedge, topedge, treelines, hscroll, vscroll, scrollinc, whichtree, othertree, numtrees, treesread; double trweight; boolean waswritten, onfirsttree, hasmult, haslengths, nolengths, nexus, xmltree; node **treeone, **treetwo; pointarray nodep; /* pointers to all nodes in current tree */ node *grbg; boolean reversed[14]; boolean graphic[14]; unsigned char cch[14]; howtree how; char intreename[FNMLNGTH], outtreename[FNMLNGTH]; boolean subtree, written, readnext; node *nuroot; Char ch; boolean delarray[maxsz]; void initretreenode(node **p, node **grbg, node *q, long len, 
long nodei, long *ntips, long *parens, initops whichinit, pointarray treenode, pointarray nodep, Char *str, Char *ch, FILE *intree) { /* initializes a node */ long i; boolean minusread; double valyew, divisor; switch (whichinit) { case bottom: gnu(grbg, p); (*p)->index = nodei; (*p)->tip = false; (*p)->deleted=false; (*p)->deadend=false; (*p)->onebranch=false; (*p)->onebranchhaslength=false; for (i=0;i<MAXNCH;i++) (*p)->nayme[i] = '\0'; nodep[(*p)->index - 1] = (*p); break; case nonbottom: gnu(grbg, p); (*p)->index = nodei; break; case hslength: if ((*p)->back) { (*p)->back->back = *p; (*p)->haslength = (*p)->back->haslength; if ((*p)->haslength) (*p)->length = (*p)->back->length; } break; case tip: (*ntips)++; gnu(grbg, p); nodep[(*ntips) - 1] = *p; (*p)->index = *ntips; (*p)->tip = true; (*p)->hasname = true; strncpy ((*p)->nayme, str, MAXNCH); break; case length: (*p)->haslength = true; if ((*p)->back != NULL) (*p)->back->haslength = (*p)->haslength; processlength(&valyew, &divisor, ch, &minusread, intree, parens); if (!minusread) (*p)->length = valyew / divisor; else (*p)->length = 0.0; (*p)->back = q; if (haslengths && q != NULL) { (*p)->back->haslength = (*p)->haslength; (*p)->back->length = (*p)->length; } break; case hsnolength: haslengths = (haslengths && q == NULL); (*p)->haslength = false; (*p)->back = q; break; default: /*cases iter, treewt, unittrwt */ break; /*should not occur */ } } /* initretreenode */ void gdispose(node *p) { /* go through tree throwing away nodes */ node *q, *r; if (p->tip) return; q = p->next; while (q != p) { gdispose(q->back); q->tip = false; q->hasname = false; q->haslength = false; r = q; q = q->next; chuck(&grbg, r); } q->tip = false; q->hasname = false; q->haslength = false; chuck(&grbg, q); } /* gdispose */ void maketriad(node **p, long index) { /* Initiate an internal node with stubs for two children */ long i, j; node *q; q = NULL; for (i = 1; i <= 3; i++) { gnu(&grbg, p); (*p)->index = index; (*p)->hasname = false; (*p)->haslength = false; 
(*p)->deleted=false; (*p)->deadend=false; (*p)->onebranch=false; (*p)->onebranchhaslength=false; for (j=0;j<MAXNCH;j++) (*p)->nayme[j] = '\0'; (*p)->next = q; q = *p; } (*p)->next->next->next = *p; q = (*p)->next; while (*p != q) { (*p)->back = NULL; (*p)->tip = false; *p = (*p)->next; } nodep[index - 1] = *p; } /* maketriad */ void maketip(node **p, long index) { /* Initiate a tip node */ gnu(&grbg, p); (*p)->index = index; (*p)->tip = true; (*p)->hasname = false; (*p)->haslength = false; nodep[index - 1] = *p; } /* maketip */ void copynode(node *fromnode, node *tonode) { /* Copy the contents of a node from fromnode to tonode. */ int i; tonode->index = fromnode->index; tonode->deleted = fromnode->deleted; tonode->tip = fromnode->tip; tonode->hasname = fromnode->hasname; if (fromnode->hasname) for (i=0;i<MAXNCH;i++) tonode->nayme[i] = fromnode->nayme[i]; tonode->haslength = fromnode->haslength; if (fromnode->haslength) tonode->length = fromnode->length; } /* copynode */ node *copytrav(node *p) { /* Traverse the tree from p on down, copying nodes to the other tree */ node *q, *newnode, *newnextnode, *temp; gnu(&grbg, &newnode); copynode(p,newnode); if (nodep[p->index-1] == p) treesets[othertree].nodep[p->index-1] = newnode; /* if this is a tip, return now */ if (p->tip) return newnode; /* go around the ring, copying as we go */ q = p->next; gnu(&grbg, &newnextnode); copynode(q, newnextnode); newnode->next = newnextnode; do { newnextnode->back = copytrav(q->back); newnextnode->back->back = newnextnode; q = q->next; if (q == p) newnextnode->next = newnode; else { temp = newnextnode; gnu(&grbg, &newnextnode); copynode(q, newnextnode); temp->next = newnextnode; } } while (q != p); return newnode; } /* copytrav */ void copytree() { /* Make a complete copy of the current tree for undo purposes */ if (whichtree == 1) othertree = 0; else othertree = 1; treesets[othertree].root = copytrav(root); treesets[othertree].nonodes = nonodes; treesets[othertree].waswritten = waswritten; treesets[othertree].hasmult = 
hasmult; treesets[othertree].haslengths = haslengths; treesets[othertree].nolengths = nolengths; treesets[othertree].initialized = true; } /* copytree */ void getoptions() { /* interactively set options */ long loopcount; Char ch; boolean done, gotopt; how = use; outgrno = 1; loopcount = 0; onfirsttree = true; do { cleerhome(); printf("\nTree Rearrangement, version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" U Initial tree (arbitrary, user, specify)?"); if (how == arb) printf(" Arbitrary\n"); else if (how == use) printf(" User tree from tree file\n"); else printf(" Tree you specify\n"); printf(" N Format to write out trees (PHYLIP, Nexus, XML)?"); if (nexus) printf(" Nexus\n"); else { if (xmltree) printf(" XML\n"); else printf(" PHYLIP\n"); } printf(" 0 Graphics type (IBM PC, ANSI, none)?"); if (ibmpc) printf(" IBM PC\n"); if (ansi ) printf(" ANSI\n"); if (!(ibmpc || ansi)) printf(" (none)\n"); printf(" W Width of terminal screen, of plotting area?"); printf("%4ld, %2ld\n", screenwidth, vscreenwidth); printf(" L Number of lines on screen?"); printf("%4ld\n", screenlines); printf("\nAre these settings correct?"); printf(" (type Y or the letter for one to change)\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; ch = (isupper(ch)) ? 
ch : toupper(ch); done = (ch == 'Y'); gotopt = (ch == 'U' || ch == 'N' || ch == '0' || ch == 'L' || ch == 'W'); if (gotopt) { switch (ch) { case 'U': if (how == arb) how = use; else if (how == use) how = spec; else how = arb; break; case 'N': if (nexus) { nexus = false; xmltree = true; } else if (xmltree) xmltree = false; else nexus = true; break; case '0': initterminal(&ibmpc, &ansi); break; case 'L': initnumlines(&screenlines); break; case 'W': screenwidth= readlong("Width of terminal screen (in characters)?\n"); vscreenwidth=readlong("Width of plotting area (in characters)?\n"); break; } } if (!(gotopt || done)) printf("Not a possible option!\n"); countup(&loopcount, 100); } while (!done); if (scrollinc < screenwidth / 2.0) hscroll = scrollinc; else hscroll = screenwidth / 2; if (scrollinc < screenlines / 2.0) vscroll = scrollinc; else vscroll = screenlines / 2; } /* getoptions */ void configure() { /* configure to machine -- set up special characters */ chartype a; for (a = horiz; (long)a <= (long)deleted; a = (chartype)((long)a + 1)) reversed[(long)a] = false; for (a = horiz; (long)a <= (long)deleted; a = (chartype)((long)a + 1)) graphic[(long)a] = false; cch[(long)deleted] = '.'; cch[(long)updel] = ':'; if (ibmpc) { cch[(long)horiz] = '>'; cch[(long)vert] = 186; graphic[(long)vert] = true; cch[(long)up] = 186; graphic[(long)up] = true; cch[(long)ch_over] = 205; graphic[(long)ch_over] = true; cch[(long)upcorner] = 200; graphic[(long)upcorner] = true; cch[(long)midcorner] = 204; graphic[(long)midcorner] = true; cch[(long)downcorner] = 201; graphic[(long)downcorner] = true; return; } if (ansi) { cch[(long)horiz] = '>'; cch[(long)vert] = cch[(long)horiz]; reversed[(long)vert] = true; cch[(long)up] = 'x'; graphic[(long)up] = true; cch[(long)ch_over] = 'q'; graphic[(long)ch_over] = true; cch[(long)upcorner] = 'm'; graphic[(long)upcorner] = true; cch[(long)midcorner] = 't'; graphic[(long)midcorner] = true; cch[(long)downcorner] = 'l'; graphic[(long)downcorner] = 
true; return; } cch[(long)horiz] = '>'; cch[(long)vert] = ' '; cch[(long)up] = '!'; cch[(long)upcorner] = '`'; cch[(long)midcorner] = '+'; cch[(long)downcorner] = ','; cch[(long)ch_over] = '-'; } /* configure */ void prefix(chartype a) { /* give prefix appropriate for this character */ if (reversed[(long)a]) prereverse(ansi); if (graphic[(long)a]) pregraph2(ansi); } /* prefix */ void postfix(chartype a) { /* give postfix appropriate for this character */ if (reversed[(long)a]) postreverse(ansi); if (graphic[(long)a]) postgraph2(ansi); } /* postfix */ void ltrav(node *p, boolean *localhl) { /* Traversal function for ifhaslengths() */ node *q; if (p->tip) { (*localhl) = ((*localhl) && p->haslength); return; } q = p->next; do { (*localhl) = ((*localhl) && q->haslength); if ((*localhl)) ltrav(q->back, localhl); q = q->next; } while (p != q); } /* ltrav */ boolean ifhaslengths() { /* return true if every branch in tree has a length */ boolean localhl; localhl = true; ltrav(root, &localhl); return localhl; } /* ifhaslengths */ void add_at(node *below, node *newtip, node *newfork) { /* inserts the nodes newfork and its left descendant, newtip, to the tree. 
below becomes newfork's right descendant */ node *leftdesc, *rtdesc; double length; if (below != nodep[below->index - 1]) below = nodep[below->index - 1]; if (newfork == NULL) { nonodes++; maketriad (&newfork, nonodes); if (haslengths) { newfork->haslength = true; newfork->next->haslength = true; newfork->next->next->haslength = true; } } if (below->back != NULL) { below->back->back = newfork; } newfork->back = below->back; leftdesc = newtip; rtdesc = below; rtdesc->back = newfork->next->next; newfork->next->next->back = rtdesc; newfork->next->back = leftdesc; leftdesc->back = newfork->next; if (root == below) root = newfork; root->back = NULL; if (!haslengths) return; if (newfork->back != NULL) { length = newfork->back->length / 2.0; newfork->length = length; newfork->back->length = length; below->length = length; below->back->length = length; } else { length = newtip->length / 2.0; newtip->length = length; newtip->back->length = length; below->length = length; below->back->length = length; below->haslength = true; } newtip->back->length = newtip->length; } /* add_at */ void add_before(node *atnode, node *newtip) { /* inserts the node newtip together with its ancestral fork into the tree next to the node atnode. */ /*xx ?? debug what to do if no ancestral node -- have to create one */ /*xx this case is handled by add_at. 
However, add_at does not account for when there is more than one sibling for the relocated newtip */ node *q; if (atnode != nodep[atnode->index - 1]) atnode = nodep[atnode->index - 1]; q = nodep[newtip->index-1]->back; if (q != NULL) { q = nodep[q->index-1]; if (newtip == q->next->next->back) { q->next->back = newtip; newtip->back = q->next; q->next->next->back = NULL; } } if (newtip->back != NULL) { add_at(atnode, newtip, nodep[newtip->back->index-1]); } else { add_at(atnode, newtip, NULL); } } /* add_before */ void add_child(node *parent, node *newchild) { /* adds the node newchild into the tree as the last child of parent */ int i; node *newnode, *q; if (parent != nodep[parent->index - 1]) parent = nodep[parent->index - 1]; gnu(&grbg, &newnode); newnode->tip = false; newnode->deleted=false; newnode->deadend=false; newnode->onebranch=false; newnode->onebranchhaslength=false; for (i=0;i<MAXNCH;i++) newnode->nayme[i] = '\0'; newnode->index = parent->index; q = parent; do { q = q->next; } while (q->next != parent); newnode->next = parent; q->next = newnode; newnode->back = newchild; newchild->back = newnode; if (newchild->haslength) { newnode->length = newchild->length; newnode->haslength = true; } else newnode->haslength = false; } /* add_child */ void re_move(node **item, node **fork) { /* Removes node item from the tree. If item has one sibling, removes its ancestor, fork, from the tree as well and attach item's sib to fork's ancestor. In this case, it returns a pointer to the removed fork node which is still attached to item. 
*/ node *p =NULL, *q; int nodecount; if ((*item)->back == NULL) { *fork = NULL; return; } *fork = nodep[(*item)->back->index - 1]; nodecount = 0; if ((*fork)->next->back == *item) p = *fork; q = (*fork)->next; do { nodecount++; if (q->next->back == *item) p = q; q = q->next; } while (*fork != q); if (nodecount > 2) { fromtype = atnode; p->next = (*item)->back->next; chuck(&grbg, (*item)->back); (*item)->back = NULL; /*xx*/ *fork = NULL; } else { /* traditional (binary tree) remove code */ if (*item == (*fork)->next->back) { if (root == *fork) root = (*fork)->next->next->back; } else { if (root == *fork) root = (*fork)->next->back; } fromtype = beforenode; /* stitch nodes together, leaving out item */ p = (*item)->back->next->back; q = (*item)->back->next->next->back; if (p != NULL) p->back = q; if (q != NULL) q->back = p; if (haslengths) { if (p != NULL && q != NULL) { p->length += q->length; q->length = p->length; } else (*item)->length = (*fork)->next->length + (*fork)->next->next->length; } (*fork)->back = NULL; p = (*fork)->next; while (p != *fork) { p->back = NULL; p = p->next; } (*item)->back = NULL; } /* endif nodecount > 2 else */ } /* re_move */ void reroot(node *outgroup) { /* Reorient tree so that outgroup is by itself on the left of the root */ node *p, *q, *r; long nodecount = 0; double templen; q = root->next; do { /* when this loop exits, p points to the internal */ p = q; /* node to the right of root */ nodecount++; q = p->next; } while (q != root); r = p; /* There is no point in proceeding if 1. outgroup is a child of root, and 2. the tree bifurcates at the root. */ if((outgroup->back->index == root->index) && !(nodecount > 2)) return; /* reorient nodep array The nodep array must point to the ring member of each ring that is closest to the root. The while loop changes the ring member pointed to by nodep[] for those nodes that will have their orientation changed by the reroot operation. 
*/ p = outgroup->back; while (p->index != root->index) { q = nodep[p->index - 1]->back; nodep[p->index - 1] = p; p = q; } if (nodecount > 2) nodep[p->index - 1] = p; /* If nodecount > 2, the current node ring to which root is pointing will remain in place and root will point somewhere else. */ /* detach root from old location */ if (nodecount > 2) { r->next = root->next; root->next = NULL; nonodes++; maketriad(&root, nonodes); if (haslengths) { /* root->haslength remains false, or else treeout() will generate a bogus extra length */ root->next->haslength = true; root->next->next->haslength = true; } } else { /* if (nodecount > 2) else */ q = root->next; q->back->back = r->back; r->back->back = q->back; if (haslengths) { r->back->length = r->back->length + q->back->length; q->back->length = r->back->length; } } /* if (nodecount > 2) endif */ /* tie root into new location */ root->next->back = outgroup; root->next->next->back = outgroup->back; outgroup->back->back = root->next->next; outgroup->back = root->next; /* place root equidistant between left child (outgroup) and right child by dividing outgroup's length */ if (haslengths) { templen = outgroup->length / 2.0; outgroup->length = templen; outgroup->back->length = templen; root->next->next->length = templen; root->next->next->back->length = templen; } } /* reroot */ void ltrav_(node *p, double lengthsum, double lmin, double *tipmax, long *across, long *maxchar) { node *q; long rchar, nl; double sublength; if (p->tip) { if (lengthsum > (*tipmax)) (*tipmax) = lengthsum; if (lmin == 0.0) return; rchar = (long)(lengthsum / (*tipmax) * (*across) + 0.5); nl = strlen(nodep[p->index - 1]->nayme); if (rchar + nl > (*maxchar)) (*across) = (*maxchar) - (long)(nl * (*tipmax) / lengthsum + 0.5); return; } q = p->next; do { if (q->length >= lmin) sublength = q->length; else sublength = lmin; ltrav_(q->back, lengthsum + sublength, lmin, tipmax, across, maxchar); q = q->next; } while (p != q); } /* ltrav */ void precoord(node 
*nuroot,boolean *subtree,double *tipmax,long *across) { /* set tipmax and across so that tree is scaled to screenwidth */ double oldtipmax, minimum; long i, maxchar; (*tipmax) = 0.0; if ((*subtree)) maxchar = vscreenwidth - 13; else maxchar = vscreenwidth - 5; (*across) = maxchar; ltrav_(nuroot, 0.0, 0.0, tipmax, across, &maxchar); i = 0; do { oldtipmax = (*tipmax); minimum = 3.0 / (*across) * (*tipmax); ltrav_(nuroot, 0.0, minimum, tipmax, across, &maxchar); i++; } while (fabs((*tipmax) - oldtipmax) > 0.01 * oldtipmax && i <= 40); } /* precoord */ void coordinates(node *p, double lengthsum, long *across, long *tipy, double *tipmax) { /* establishes coordinates of nodes for display with lengths */ node *q, *first, *last; if (p->tip) { p->xcoord = (long)((*across) * lengthsum / (*tipmax) + 0.5); p->ycoord = (*tipy); p->ymin = (*tipy); p->ymax = (*tipy); (*tipy) += down; return; } q = p->next; do { coordinates(q->back, lengthsum + q->length, across, tipy, tipmax); q = q->next; } while (p != q); first = p->next->back; q = p; while (q->next != p) q = q->next; last = q->back; p->xcoord = (long)((*across) * lengthsum / (*tipmax) + 0.5); if (p == root) { if (root->next->next->next == root) p->ycoord = (first->ycoord + last->ycoord) / 2; else p->ycoord = p->next->next->back->ycoord; } else p->ycoord = (first->ycoord + last->ycoord) / 2; p->ymin = first->ymin; p->ymax = last->ymax; } /* coordinates */ void flatcoordinates(node *p, long *tipy) { /* establishes coordinates of nodes for display without lengths */ node *q, *first, *last; if (p->tip) { p->xcoord = 0; p->ycoord = (*tipy); p->ymin = (*tipy); p->ymax = (*tipy); (*tipy) += down; return; } q = p->next; do { flatcoordinates(q->back, tipy); q = q->next; } while (p != q); first = p->next->back; q = p->next; while (q->next != p) q = q->next; last = q->back; p->xcoord = (last->ymax - first->ymin) * 3 / 2; if (p == root) { if (root->next->next->next == root) p->ycoord = (first->ycoord + last->ycoord) / 2; else p->ycoord = 
p->next->next->back->ycoord; } else p->ycoord = (first->ycoord + last->ycoord) / 2; p->ymin = first->ymin; p->ymax = last->ymax; } /* flatcoordinates */ void grwrite(chartype c, long num, long *pos) { long i; prefix(c); for (i = 1; i <= num; i++) { if ((*pos) >= leftedge && (*pos) - leftedge + 1 < screenwidth) putchar(cch[(long)c]); (*pos)++; } postfix(c); } /* grwrite */ void drawline(long i, node *nuroot, boolean *subtree) { /* draws one row of the tree diagram by moving up tree */ long pos; node *p, *q, *r, *s, *first =NULL, *last =NULL; long n, j; long up_nondel, down_nondel; boolean extra, done; chartype c, d; pos = 1; p = nuroot; q = nuroot; extra = false; if (i == (long)p->ycoord && (p == root || (*subtree))) { c = ch_over; if ((*subtree)) stwrite("Subtree:", 8, &pos, leftedge, screenwidth); if (p->index >= 100) nnwrite(p->index, 3, &pos, leftedge, screenwidth); else if (p->index >= 10) { grwrite(c, 1, &pos); nnwrite(p->index, 2, &pos, leftedge, screenwidth); } else { grwrite(c, 2, &pos); nnwrite(p->index, 1, &pos, leftedge, screenwidth); } extra = true; } else { if ((*subtree)) stwrite(" ", 10, &pos, leftedge, screenwidth); else stwrite(" ", 2, &pos, leftedge, screenwidth); } do { if (!p->tip) { r = p->next; done = false; do { if (i >= r->back->ymin && i <= r->back->ymax) { q = r->back; done = true; } r = r->next; } while (!(done || r == p)); first = p->next->back; r = p->next; while (r->next != p) r = r->next; last = r->back; } done = (p == q); if (haslengths && !nolengths) n = (long)(q->xcoord - p->xcoord); else n = (long)(p->xcoord - q->xcoord); if (n < 3 && !q->tip) n = 3; if (extra) { n--; extra = false; } if ((long)q->ycoord == i && !done) { c = ch_over; if (!haslengths && !q->haslength) c = horiz; if (q->deleted) c = deleted; if (q == first) d = downcorner; else if (q == last) d = upcorner; else if ((long)q->ycoord == (long)p->ycoord) d = c; else d = midcorner; if (n > 1 || q->tip) { grwrite(d, 1, &pos); grwrite(c, n - 3, &pos); } if (q->index >= 
100) nnwrite(q->index, 3, &pos, leftedge, screenwidth); else if (q->index >= 10) { grwrite(c, 1, &pos); nnwrite(q->index, 2, &pos, leftedge, screenwidth); } else { grwrite(c, 2, &pos); nnwrite(q->index, 1, &pos, leftedge, screenwidth); } extra = true; } else if (!q->tip) { if ((long)last->ycoord > i && (long)first->ycoord < i && i != (long)p->ycoord) { c = up; if(p->deleted) c = updel; if (!p->tip) { up_nondel = 0; down_nondel = 0; r = p->next; do { s = r->back; if ((long)s->ycoord < (long)p->ycoord && !s->deleted) up_nondel = (long)s->ycoord; if (s->ycoord > p->ycoord && !s->deleted && (down_nondel == 0)) down_nondel = (long)s->ycoord; if (i < (long)p->ycoord && s->deleted && i > (long)s->ycoord) c = updel; if (i > (long)p->ycoord && s->deleted && i < (long)s->ycoord) c = updel; r = r->next; } while (r != p); if ((up_nondel != 0) && i < (long)p->ycoord && i > up_nondel) c = up; if ((down_nondel != 0) && i > (long)p->ycoord && i < down_nondel) c = up; } grwrite(c, 1, &pos); chwrite(' ', n - 1, &pos, leftedge, screenwidth); } else chwrite(' ', n, &pos, leftedge, screenwidth); } else chwrite(' ', n, &pos, leftedge, screenwidth); if (p != q) p = q; } while (!done); if ((long)p->ycoord == i && p->tip) { if (p->hasname) { n = 0; for (j = 1; j <= MAXNCH; j++) { if (nodep[p->index - 1]->nayme[j - 1] != '\0') n = j; } chwrite(':', 1, &pos, leftedge, screenwidth); for (j = 0; j < n; j++) chwrite(nodep[p->index - 1]->nayme[j], 1, &pos, leftedge, screenwidth); } } putchar('\n'); } /* drawline */ void printree() { /* prints out diagram of the tree */ long across; long tipy; double tipmax; long i, dow, vmargin; haslengths = ifhaslengths(); if (!subtree) nuroot = root; cleerhome(); tipy = 1; dow = down; if (spp * dow > screenlines && !subtree) { dow--; } if (haslengths && !nolengths) { precoord(nuroot, &subtree, &tipmax, &across); /* protect coordinates() from div/0 errors if user decides to examine a tip as a subtree */ if (tipmax == 0) tipmax = 0.01; coordinates(nuroot, 0.0, 
&across, &tipy, &tipmax); } else flatcoordinates(nuroot, &tipy); vmargin = 2; treelines = tipy - dow; if (topedge != 1) { printf("** %ld lines above screen **\n", topedge - 1); vmargin++; } if ((treelines - topedge + 1) > (screenlines - vmargin)) vmargin++; for (i = 1; i <= treelines; i++) { if (i >= topedge && i < topedge + screenlines - vmargin) drawline(i, nuroot,&subtree); } if (leftedge > 1) printf("** %ld characters to left of screen ", leftedge); if ((treelines - topedge + 1) > (screenlines - vmargin)) { printf("** %ld", treelines - (topedge - 1 + screenlines - vmargin)); printf(" lines below screen **\n"); } if (treelines - topedge + vmargin + 1 < screenlines) putchar('\n'); } /* printree */ void togglelengths() { nolengths = !nolengths; printree(); } /* togglengths */ void arbitree() { long i, maxinput; node *newtip, *newfork; maxinput = 1; do { spp = readlong("How many species?\n"); maxinput++; if (maxinput == 100) { printf("ERROR: too many tries at choosing species\n"); exxit(-1); } } while (spp <= 0); nonodes = spp * 2 - 1; maketip(&root, 1); maketip(&newtip, 2); maketriad(&newfork, spp + 1); add_at(root, newtip, newfork); for (i = 3; i <= spp; i++) { maketip(&newtip, i); maketriad(&newfork, spp + i - 1); add_at(nodep[spp + i - 3], newtip, newfork); } } /* arbitree */ void yourtree() { long uniquearray[maxsz]; long uniqueindex = 0; long i, j, k, k_max, maxinput; boolean ok, done; node *newtip, *newfork; Char ch; uniquearray[0] = 0; spp = 2; nonodes = spp * 2 - 1; maketip(&root, 1); maketip(&newtip, 2); maketriad(&newfork, spp + 3); add_at(root, newtip, newfork); i = 2; maxinput = 1; k_max = 5; do { i++; printree(); printf("Enter 0 to stop building tree.\n"); printf("Add species%3ld", i); do { printf("\n at or before which node (type number): "); inpnum(&j, &ok); ok = (ok && ((unsigned long)j < i || (j > spp + 2 && j < spp + i + 1))); if (!ok) printf("Impossible number. 
Please try again:\n"); maxinput++; if (maxinput == 100) { printf("ERROR: too many tries at choosing number\n"); exxit(-1); } } while (!ok); maxinput = 1; if (j >= i) { /* has user chosen a non-tip? if so, offer choice */ do { printf(" Insert at node (A) or before node (B)? "); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; ch = isupper(ch) ? ch : toupper(ch); maxinput++; if (maxinput == 100) { printf("ERROR: too many tries at choosing option\n"); exxit(-1); } } while (ch != 'A' && ch != 'B'); } else ch = 'B'; /* if user has chosen a tip, set Before */ if (j != 0) { if (ch == 'A') { if (!nodep[j - 1]->tip) { maketip(&newtip, i); add_child(nodep[j - 1], nodep[i - 1]); } } else { maketip(&newtip, i); maketriad(&newfork, spp + i + 1); nodep[i-1]->back = newfork; newfork->back = nodep[i-1]; add_before(nodep[j - 1], nodep[i - 1]); } /* endif (before or at node) */ } done = (j == 0); if (!done) { if (ch == 'B') k = spp * 2 + 3; else k = spp * 2 + 2; k_max = k; do { if (nodep[k - 2] != NULL) { nodep[k - 1] = nodep[k - 2]; nodep[k - 1]->index = k; nodep[k - 1]->next->index = k; nodep[k - 1]->next->next->index = k; } k--; } while (k != spp + 3); if (j > spp + 1) j++; spp++; nonodes = spp * 2 - 1; } } while (!done); for (i = spp + 1; i <= k_max; i++) { if ((nodep[i - 1] != nodep[i]) && (nodep[i - 1] != NULL)) { uniquearray[uniqueindex++] = i; uniquearray[uniqueindex] = 0; } } for ( i = 0; uniquearray[i] != 0; i++) { nodep[spp + i] = nodep[uniquearray[i] - 1]; nodep[spp + i]->index = spp + i + 1; nodep[spp + i]->next->index = spp + i + 1; nodep[spp + i]->next->next->index = spp + i + 1; } for (i = spp + uniqueindex; i <= k_max; i++) nodep[i] = NULL; nonodes = spp * 2 - 1; } /* yourtree */ void buildtree() { /* variables needed to be passed to treeread() */ long nextnode = 0; pointarray dummy_treenode=NULL; /* Ignore what happens to this */ boolean goteof = false; boolean haslengths = false; boolean 
firsttree; node *p, *q; long nodecount = 0; /* These assignments moved from treeconstruct -- they seem to happen only here. */ /*xx treeone & treetwo assignments should probably happen in treeconstruct. Memory leak if user reads multiple trees. */ treeone = (node **)Malloc(maxsz*sizeof(node *)); treetwo = (node **)Malloc(maxsz*sizeof(node *)); simplifiedtree.nodep = (node **)Malloc(maxsz*sizeof(node *)); subtree = false; topedge = 1; leftedge = 1; switch (how) { case arb: nodep = treeone; treesets[othertree].nodep = treetwo; arbitree(); break; case use: printf("\nReading tree file ...\n\n"); if (!readnext) { /* This is the first time through here, act accordingly */ firsttree = true; /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree,INTREE,"input tree file", "rb","retree",intreename); numtrees = countsemic(&intree); treesread = 0; } else { /* This isn't the first time through here ... */ firsttree = false; } allocate_nodep(&nodep, &intree, &spp); treesets[whichtree].nodep = nodep; if (firsttree) nayme = (naym *)Malloc(spp*sizeof(naym)); treeread(intree, &root, dummy_treenode, &goteof, &firsttree, nodep, &nextnode, &haslengths, &grbg, initretreenode,true,-1); nonodes = nextnode; treesread++; treesets[othertree].nodep = treetwo; break; case spec: nodep = treeone; treesets[othertree].nodep = treetwo; yourtree(); break; } q = root->next; do { p = q; nodecount++; q = p->next; } while (q != root); outgrno = root->next->back->index; if(!(nodecount > 2)) { reroot(nodep[outgrno - 1]); } } /* buildtree */ void unbuildtree() { /* throw all nodes of the tree onto the garbage heap */ long i; gdispose(root); for (i = 0; i < nonodes; i++) nodep[i] = NULL; } /* unbuildtree */ void retree_help() { /* display help information */ char tmp[100]; printf("\n\n . 
Redisplay the same tree again\n"); if (haslengths) { printf(" = Redisplay the same tree with"); if (!nolengths) printf("out/with"); else printf("/without"); printf(" lengths\n"); } printf(" U Undo the most recent change in the tree\n"); printf(" W Write tree to a file\n"); printf(" + Read next tree from file (may blow up if none is there)\n"); printf("\n"); printf(" R Rearrange a tree by moving a node or group\n"); printf(" O select an Outgroup for the tree\n"); if (haslengths) printf(" M Midpoint root the tree\n"); printf(" T Transpose immediate branches at a node\n"); printf(" F Flip (rotate) subtree at a node\n"); printf(" D Delete or restore nodes\n"); printf(" B Change or specify the length of a branch\n"); printf(" N Change or specify the name(s) of tip(s)\n"); printf("\n"); printf(" H Move viewing window to the left\n"); printf(" J Move viewing window downward\n"); printf(" K Move viewing window upward\n"); printf(" L Move viewing window to the right\n"); printf(" C show only one Clade (subtree) (might be useful if tree is "); printf("too big)\n"); printf(" ? Help (this screen)\n"); printf(" Q (Quit) Exit from program\n"); printf(" X Exit from program\n\n"); printf(" TO CONTINUE, PRESS ON THE Return OR Enter KEY"); getstryng(tmp); printree(); } /* retree_help */ void consolidatetree(long index) { node *start, *r, *q; int i; start = nodep[index - 1]; q = start->next; while (q != start) { r = q; q = q->next; chuck(&grbg, r); } chuck(&grbg, q); i = index; while (nodep[i-1] != NULL) { r = nodep[i - 1]; if (!(r->tip)) r->index--; if (!(r->tip)) { q = r->next; do { q->index--; q = q->next; } while (r != q && q != NULL); } nodep[i - 1] = nodep[i]; i++; } nonodes--; } /* consolidatetree */ void rearrange() { long i, j, maxinput; boolean ok; node *p, *q; char ch; printf("Remove everything to the right of which node? 
"); inpnum(&i, &ok); if ( ok == false ) { /* fall through */ } else if ( i < 1 || i > spp*2 - 1 ) { /* i is not in range */ ok = false; } else if (i == root->index ) { /* i is root */ ok = false; } else if ( nodep[i-1]->deleted ) { /* i has been deleted */ ok = false; } else { printf("Add at or before which node? "); inpnum(&j, &ok); if ( ok == false ) { /* fall through */ } else if ( j < 1 || j > spp*2 - 1 ) { /* j is not in range */ ok = false; } else if ( nodep[j-1]->deleted ) { /* j has been deleted */ ok = false; } else if (j != root->index && nodep[nodep[j-1]->back->index - 1]->deleted ) { /* parent of j has been deleted */ ok = false; } else if ( nodep[j-1] == nodep[nodep[i-1]->back->index -1] ) { /* i is j's parent */ ok = false; } else { /* make sure that j is not a descendant of i */ for ( p = nodep[j-1]; p != root; p = nodep[p->back->index - 1] ) { if ( p == nodep[i-1] ) { ok = false; break; } } if ( ok ) { maxinput = 1; do { printf("Insert at node (A) or before node (B)? "); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; ch = toupper(ch); maxinput++; if (maxinput == 100) { printf("ERROR: Input failed too many times.\n"); exxit(-1); } } while (ch != 'A' && ch != 'B'); if (ch == 'A') { if ( nodep[j - 1]->deleted || nodep[j - 1]->tip ) { /* If j is a tip or has been deleted */ ok = false; } else if ( nodep[j-1] == nodep[nodep[i-1]->back->index -1] ) { /* If j is i's parent */ ok = false; } else { copytree(); re_move(&nodep[i - 1], &q); add_child(nodep[j - 1], nodep[i - 1]); if (fromtype == beforenode) consolidatetree(q->index); } } else { /* ch == 'B' */ if (j == root->index) { /* can't insert at root */ ok = false; } else { copytree(); printf("Insert before node %ld\n",j); re_move(&nodep[i - 1], &q); if (q != NULL) { nodep[q->index-1]->next->back = nodep[i-1]; nodep[i-1]->back = nodep[q->index-1]->next; } add_before(nodep[j - 1], nodep[i - 1]); } } /* endif (before or at node) */ } 
        /* endif (ok to do move) */
    } /* endif (destination node ok) */
  } /* endif (from node ok) */

  printree();
  if ( !ok )
    printf("Not a possible rearrangement. Try again: \n");
  else {
    written = false;
  }
} /* rearrange */


boolean any_deleted(node *p)
{
  /* return true if there are any deleted branches from branch on down */
  boolean localdl;

  localdl = false;
  ifdeltrav(p, &localdl);
  return localdl;
} /* any_deleted */


void fliptrav(node *p, boolean recurse)
{
  /* reverse the order of the branches in the ring at p;
     if recurse is true, do the same in every subtree below p */
  node *q, *temp, *r =NULL, *rprev =NULL, *l, *lprev;
  boolean lprevflag;
  int nodecount, loopcount, i;

  if (p->tip)
    return;

  /* count the branches; find the leftmost (l) and rightmost (r)
     ring nodes and their predecessors in the ring */
  q = p->next;
  l = q;
  lprev = p;
  nodecount = 0;

  do {
    nodecount++;
    if (q->next->next == p) {
      rprev = q;
      r = q->next;
    }
    q = q->next;
  } while (p != q);
  if (nodecount == 1)
    return;
  loopcount = nodecount / 2;

  /* swap l and r, then move both one step inward, and repeat */
  for (i = 0; i < loopcount; i++) {
    lprev->next = r;
    rprev->next = l;
    temp = r->next;
    r->next = l->next;
    l->next = temp;
    if (i < (loopcount - 1)) {
      lprevflag = false;
      q = p->next;
      do {
        if (q == lprev->next && !lprevflag) {
          lprev = q;
          l = q->next;
          lprevflag = true;
        }
        if (q->next == rprev) {
          rprev = q;
          r = q->next;
        }
        q = q->next;
      } while (p != q);
    }
  }
  if (recurse) {
    q = p->next;
    do {
      fliptrav(q->back, true);
      q = q->next;
    } while (p != q);
  }
} /* fliptrav */


void flip(long atnode)
{
  /* flip at a node left-right */
  long i;
  boolean ok;

  if (atnode == 0) {
    printf("Flip branches at which node? ");
    inpnum(&i, &ok);
    ok = (ok && i > spp && i <= nonodes);
    if (ok)
      ok = !any_deleted(nodep[i - 1]);
  } else {
    i = atnode;
    ok = true;
  }
  if (ok) {
    copytree();
    fliptrav(nodep[i - 1], true);
  }
  if (atnode == 0)
    printree();
  if (ok) {
    written = false;
    return;
  }
  if ((i >= 1 && i <= spp) ||
      (i > spp && i <= nonodes && any_deleted(nodep[i - 1])))
    printf("Can't flip there. ");
  else
    printf("No such node. ");
} /* flip */


void transpose(long atnode)
{
  /* transpose the immediate branches at a node, leaving subtrees as they are */
  long i;
  boolean ok;

  if (atnode == 0) {
    printf("Transpose branches at which node? 
"); inpnum(&i, &ok); ok = (ok && i > spp && i <= nonodes); if (ok) ok = !nodep[i - 1]->deleted; } else { i = atnode; ok = true; } if (ok) { copytree(); fliptrav(nodep[i - 1], false); } if (atnode == 0) printree(); if (ok) { written = false; return; } if ((i >= 1 && i <= spp) || (i > spp && i <= nonodes && nodep[i - 1]->deleted)) printf("Can't transpose there. "); else printf("No such node. "); } /* transpose */ void ifdeltrav(node *p, boolean *localdl) { node *q; if (*localdl) return; if (p->tip) { (*localdl) = ((*localdl) || p->deleted); return; } q = p->next; do { (*localdl) = ((*localdl) || q->deleted); ifdeltrav(q->back, localdl); q = q->next; } while (p != q); } /* ifdeltrav */ double oltrav(node *p) { node *q; double maxlen, templen; if (p->deleted) return 0.0; if (p->tip) { p->beyond = 0.0; return 0.0; } else { q = p->next; maxlen = 0; do { templen = q->back->deleted ? 0.0 : q->length + oltrav(q->back); maxlen = (maxlen > templen) ? maxlen : templen; q->beyond = templen; q = q->next; } while (p != q); p->beyond = maxlen; return (maxlen); } } /* oltrav */ void outlength() { /* compute the farthest combined length out from each node */ oltrav(root); } /* outlength */ void midpoint() { /* midpoint root the tree */ double balance, greatlen, lesslen, grlen, maxlen; node *maxnode, *grnode, *lsnode =NULL; boolean ok = true; boolean changed = false; node *p, *q; long nodecount = 0; boolean multi = false; copytree(); p = root; outlength(); q = p->next; greatlen = 0; grnode = q->back; lesslen = 0; q = root->next; do { p = q; nodecount++; q = p->next; } while (q != root); if (nodecount > 2) multi = true; /* Find the two greatest lengths reaching from root to tips. Also find the lengths and node pointers of the first nodes in the direction of those two greatest lengths. 
*/ p = root; q = root->next; do { if (greatlen <= q->beyond) { lesslen = greatlen; lsnode = grnode; greatlen = q->beyond; grnode = q->back; } if ((greatlen > q->beyond) && (q->beyond >= lesslen)) { lesslen = q->beyond; lsnode = q->back; } q = q->next; } while (p != q); /* If we don't have two non-deleted nodes to balance between then we can't midpoint root the tree */ if (grnode->deleted || lsnode->deleted || grnode == lsnode) ok = false; balance = greatlen - (greatlen + lesslen) / 2.0; grlen = grnode->length; while ((balance - grlen > 1e-10) && ok) { /* First, find the most distant immediate child of grnode and reroot to it. */ p = grnode; q = p->next; maxlen = 0; maxnode = q->back; do { if (maxlen <= q->beyond) { maxlen = q->beyond; maxnode = q->back; } q = q->next; } while (p != q); reroot(maxnode); changed = true; /* Reassess the situation, using the same "find the two greatest lengths" code as occurs before the while loop. If another reroot is necessary, this while loop will repeat. 
*/ p = root; outlength(); q = p->next; greatlen = 0; grnode = q->back; lesslen = 0; do { if (greatlen <= q->beyond) { lesslen = greatlen; lsnode = grnode; greatlen = q->beyond; grnode = q->back; } if ((greatlen > q->beyond) && (q->beyond >= lesslen)) { lesslen = q->beyond; lsnode = q->back; } q = q->next; } while (p != q); if (grnode->deleted || lsnode->deleted || grnode == lsnode) ok = false; balance = greatlen - (greatlen + lesslen) / 2.0; grlen = grnode->length; }; /* end of while ((balance > grlen) && ok) */ if (ok) { /*xx the following ignores deleted nodes */ /* this may be ok because deleted nodes are omitted from length calculations */ if (multi) { reroot(grnode); /*xx need length corrections */ p = root; outlength(); q = p->next; greatlen = 0; grnode = q->back; lesslen = 0; do { if (greatlen <= q->beyond) { lesslen = greatlen; lsnode = grnode; greatlen = q->beyond; grnode = q->back; } if ((greatlen > q->beyond) && (q->beyond >= lesslen)) { lesslen = q->beyond; lsnode = q->back; } q = q->next; } while (p != q); balance = greatlen - (greatlen + lesslen) / 2.0; } grnode->length -= balance; if (((grnode->length) < 0.0) && (grnode->length > -1.0e-10)) grnode->length = 0.0; grnode->back->length = grnode->length; lsnode->length += balance; if (((lsnode->length) < 0.0) && (lsnode->length > -1.0e-10)) lsnode->length = 0.0; lsnode->back->length = lsnode->length; } printree(); if (ok) { if (any_deleted(root)) printf("Deleted nodes were not used in midpoint calculations.\n"); } else { printf("Can't perform midpoint because of deleted branches.\n"); if (changed) { undo(); printf("Tree restored to original state. 
Undo information lost.\n"); } } } /* midpoint */ void deltrav(node *p, boolean value) { /* register p and p's children as deleted or extant, depending on value */ node *q; p->deleted = value; if (p->tip) return; q = p->next; do { deltrav(q->back, value); q = q->next; } while (p != q); } /* deltrav */ void fill_del(node*p) { int alldell; node *q = p; if ( p->next == NULL) return; q=p->next; while ( q != p) { fill_del(q->back); q=q->next; } alldell = 1; q=p->next; while ( q != p) { if ( !q->back->deleted ) { alldell = 0; } q=q->next; } p->deleted = alldell; } void reg_del(node *delp, boolean value) { /* register delp and all of delp's children as deleted */ deltrav(delp, value); fill_del(root); } /* reg_del */ boolean isdeleted(long nodenum) { /* true if nodenum is a node number in a deleted branch */ return(nodep[nodenum - 1]->deleted); } /* isdeleted */ void deletebranch() { /* delete a node */ long i; boolean ok1; printf("Delete everything to the right of which node? "); inpnum(&i, &ok1); ok1 = (ok1 && i >= 1 && i <= nonodes && i != root->index && !isdeleted(i)); if (ok1) { copytree(); reg_del(nodep[i - 1],true); } printree(); if (!ok1) printf("Not a possible deletion. Try again.\n"); else { written = false; } } /* deletebranch */ void restorebranch() { /* restore deleted branches */ long i; boolean ok1; printf("Restore everything to the right of which node? "); inpnum(&i, &ok1); ok1 = (ok1 && i >= 1 && i < spp * 2 && isdeleted(i) && ( i == root->index || !nodep[nodep[i - 1]->back->index - 1]->deleted)); if (ok1) { reg_del(nodep[i - 1],false); } printree(); if (!ok1) printf("Not a possible restoration. 
Try again: \n"); else { written = false; } } /* restorebranch */ void del_or_restore() { /* delete or restore a branch */ long maxinput; Char ch; if (any_deleted(root)) { maxinput = 1; do { printf("Enter D to delete a branch\n"); printf("OR enter R to restore a branch: "); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; ch = (isupper(ch)) ? ch : toupper(ch); maxinput++; if (maxinput == 100) { printf("ERROR: too many tries at choosing option\n"); exxit(-1); } } while (ch != 'D' && ch != 'R'); if (ch == 'R') restorebranch(); else deletebranch(); } else deletebranch(); } /* del_or_restore */ void undo() { /* don't undo to an uninitialized tree */ if (!treesets[othertree].initialized) { printree(); printf("Nothing to undo.\n"); return; } treesets[whichtree].root = root; treesets[whichtree].nodep = nodep; treesets[whichtree].nonodes = nonodes; treesets[whichtree].waswritten = waswritten; treesets[whichtree].hasmult = hasmult; treesets[whichtree].haslengths = haslengths; treesets[whichtree].nolengths = nolengths; treesets[whichtree].initialized = true; whichtree = othertree; root = treesets[whichtree].root; nodep = treesets[whichtree].nodep; nonodes = treesets[whichtree].nonodes; waswritten = treesets[whichtree].waswritten; hasmult = treesets[whichtree].hasmult; haslengths = treesets[whichtree].haslengths; nolengths = treesets[whichtree].nolengths; if (othertree == 0) othertree = 1; else othertree = 0; printree(); } /* undo */ /* These attributes of nodes in the tree are modified by treetrav() in preparation for writing a tree to disk. boolean deadend This node is not deleted but all of its children are, so this node will be treated as such when the tree is written or displayed. boolean onebranch This node has only one valid child, so that this node will not be written and its child will be written as a child of its grandparent with the appropriate summing of lengths. 
nodep *onebranchnode Used if onebranch is true. Onebranchnode points to the one valid child. This child may be one or more generations down from the current node. double onebranchlength Used if onebranch is true. Onebranchlength is the length from the current node to the valid child. */ void treetrav(node *p) { long branchcount = 0; node *q, *onebranchp =NULL; /* Count the non-deleted branches hanging off of this node into branchcount. If there is only one such branch, onebranchp points to that branch. */ if (p->tip) return; q = p->next; do { if (!q->back->deleted) { if (!q->back->tip) treetrav(q->back); if (!q->back->deadend && !q->back->deleted) { branchcount++; onebranchp = q->back; } } q = q->next; } while (p != q); if (branchcount == 0) p->deadend = true; else p->deadend = false; p->onebranch = false; if (branchcount == 1 && onebranchp->tip) { p->onebranch = true; p->onebranchnode = onebranchp; p->onebranchhaslength = (p->haslength || (p == root)) && onebranchp->haslength; if (p->onebranchhaslength) p->onebranchlength = onebranchp->length + p->length; } if (branchcount == 1 && !onebranchp->tip) { p->onebranch = true; if (onebranchp->onebranch) { p->onebranchnode = onebranchp->onebranchnode; p->onebranchhaslength = (p->haslength || (p == root)) && onebranchp->onebranchhaslength; if (p->onebranchhaslength) p->onebranchlength = onebranchp->onebranchlength + p->length; } else { p->onebranchnode = onebranchp; p->onebranchhaslength = p->haslength && onebranchp->haslength; if (p->onebranchhaslength) p->onebranchlength = onebranchp->length + p->length; } } } /* treetrav */ void simcopynode(node *fromnode, node *tonode) { /* Copy the contents of a node from fromnode to tonode. 
*/
  int i;

  tonode->index = fromnode->index;
  tonode->deleted = fromnode->deleted;
  tonode->tip = fromnode->tip;
  tonode->hasname = fromnode->hasname;
  if (fromnode->hasname)
    for (i = 0; i < MAXNCH; i++)
      tonode->nayme[i] = fromnode->nayme[i];
  tonode->haslength = fromnode->haslength;
  if (fromnode->haslength)
    tonode->length = fromnode->length;
} /* simcopynode */


node *simcopytrav(node *p)
{
  /* Traverse the tree from p on down, copying nodes to the other tree */
  node *q, *newnode, *newnextnode, *temp;
  long lastnodeidx = 0;

  gnu(&grbg, &newnode);
  simcopynode(p, newnode);

  if (nodep[p->index - 1] == p)
    simplifiedtree.nodep[p->index - 1] = newnode;

  /* if this is a tip, return now */
  if (p->tip)
    return newnode;
  if (p->onebranch && p->onebranchnode->tip) {
    simcopynode(p->onebranchnode, newnode);
    if (p->onebranchhaslength)
      newnode->length = p->onebranchlength;
    return newnode;
  } else if (p->onebranch && !p->onebranchnode->tip) {
    /* recurse down p->onebranchnode */
    p->onebranchnode->length = p->onebranchlength;
    p->onebranchnode->haslength = p->onebranchnode->haslength;
    return simcopytrav(p->onebranchnode);
  } else {
    /* Multiple non-deleted branch case: go round the node recursing
       down the branches. Don't go down deleted branches or dead ends. */
    q = p->next;
    while (q != p) {
      if (!q->back->deleted && !q->back->deadend)
        lastnodeidx = q->back->index;
      q = q->next;
    }
    q = p->next;
    gnu(&grbg, &newnextnode);
    simcopynode(q, newnextnode);
    newnode->next = newnextnode;
    do {
      /* If branch is deleted or is a dead end, do not recurse down the branch. 
*/ if (!q->back->deleted && !q->back->deadend) { newnextnode->back = simcopytrav(q->back); newnextnode->back->back = newnextnode; q = q->next; if (newnextnode->back->index == lastnodeidx) { newnextnode->next = newnode; break; } if (q == p) { newnextnode->next = newnode; } else { temp = newnextnode; gnu(&grbg, &newnextnode); simcopynode(q, newnextnode); temp->next = newnextnode; } } else { /*xx this else and q=q->next are experimental (seems to be working) */ q = q->next; } } while (q != p); } return newnode; } /* simcopytrav */ void simcopytree() { /* Make a simplified copy of the current tree for rooting/unrooting on output. Deleted notes are removed and lengths are consolidated. */ simplifiedtree.root = simcopytrav(root); /*xx If there are deleted nodes, nonodes will be different. However, nonodes is not used in the simplified tree. */ simplifiedtree.nonodes = nonodes; simplifiedtree.waswritten = waswritten; simplifiedtree.hasmult = hasmult; simplifiedtree.haslengths = haslengths; simplifiedtree.nolengths = nolengths; simplifiedtree.initialized = true; } /* simcopytree */ void writebranchlength(double x) { long w; /* write branch length onto output file, keeping track of what column of line you are in, and writing to correct precision */ if (x > 0.0) w = (long)(0.43429448222 * log(x)); else if (x == 0.0) w = 0; else w = (long)(0.43429448222 * log(-x)) + 1; if (w < 0) w = 0; if ((long)(100000*x) == 100000*(long)x) { if (!xmltree) putc(':', outtree); fprintf(outtree, "%*.1f", (int)(w + 2), x); col += w + 3; } else { if ((long)(100000*x) == 10000*(long)(10*x)) { if (!xmltree) putc(':', outtree); fprintf(outtree, "%*.1f", (int)(w + 3), x); col += w + 4; } else { if ((long)(100000*x) == 1000*(long)(100*x)) { if (!xmltree) putc(':', outtree); fprintf(outtree, "%*.2f", (int)(w + 4), x); col += w + 5; } else { if ((long)(100000*x) == 100*(long)(1000*x)) { if (!xmltree) putc(':', outtree); fprintf(outtree, "%*.3f", (int)(w + 5), x); col += w + 6; } else { if 
((long)(100000*x) == 10*(long)(10000*x)) {
            if (!xmltree)
              putc(':', outtree);
            fprintf(outtree, "%*.4f", (int)(w + 6), x);
            col += w + 7;
          } else {
            if (!xmltree)
              putc(':', outtree);
            fprintf(outtree, "%*.5f", (int)(w + 7), x);
            col += w + 8;
          }
        }
      }
    }
  }
} /* writebranchlength */


void treeout(node *p, boolean writeparens, double addlength, long indent)
{
  /* write out file with representation of final tree */
  long i, n, lastnodeidx = 0;
  Char c;
  double x;
  boolean comma;
  node *q;

  /* If this is a tip or there are no non-deleted branches from this node,
     render this node as a tip (write its name). */

  if (p == root) {
    if (xmltree)
      indent = 0;
    else
      indent = 0;
    if (xmltree) {
      fprintf(outtree, "<phylogeny>"); /* assumes no length at root! */
    } else
      putc('(', outtree);
  }
  if (p->tip) {
    if (p->hasname) {
      n = 0;
      for (i = 1; i <= MAXNCH; i++) {
        if ((nodep[p->index - 1]->nayme[i - 1] != '\0')
            && (nodep[p->index - 1]->nayme[i - 1] != ' '))
          n = i;
      }
      indent += 2;
      if (xmltree) {
        putc('\n', outtree);
        for (i = 1; i <= indent; i++)
          putc(' ', outtree);
        fprintf(outtree, "<clade");
        if (p->haslength) {
          fprintf(outtree, " length=\"");
          x = p->length;
          writebranchlength(x);
          fprintf(outtree,"\"");
        }
        putc('>', outtree);
        fprintf(outtree, "<name>");
      }
      for (i = 0; i < n; i++) {
        c = nodep[p->index - 1]->nayme[i];
        if (c == ' ')
          c = '_';
        putc(c, outtree);
      }
      col += n;
      if (xmltree)
        fprintf(outtree, "</name></clade>");
    }
  } else if (p->onebranch && p->onebranchnode->tip) {
    if (p->onebranchnode->hasname) {
      n = 0;
      for (i = 1; i <= MAXNCH; i++) {
        if ((nodep[p->index - 1]->nayme[i - 1] != '\0')
            && (nodep[p->index - 1]->nayme[i - 1] != ' '))
          n = i;
        indent += 2;
        if (xmltree) {
          putc('\n', outtree);
          for (i = 1; i <= indent; i++)
            putc(' ', outtree);
          fprintf(outtree, "<clade");
          if ((p->haslength && writeparens) || p->onebranch) {
            if (!(p->onebranch && !p->onebranchhaslength)) {
              fprintf(outtree, " length=");
              if (p->onebranch)
                x = p->onebranchlength;
              else
                x = p->length;
              x += addlength;
              writebranchlength(x);
            }
            fprintf(outtree, "<name>");
          }
        }
        for (i = 0; i < n; i++) {
          c = p->onebranchnode->nayme[i];
          if (c == '_')
            c = ' ';
          putc(c, outtree);
        }
        col += n;
        if (xmltree)
          fprintf(outtree, "</name></clade>");
      }
    }
  } else if (p->onebranch && !p->onebranchnode->tip) {
    treeout(p->onebranchnode, true, 0.0, indent);
  } else {
    /* Multiple non-deleted branch case: go round the node recursing
       down the branches. */
    if (xmltree) {
      putc('\n', outtree);
      indent += 2;
      for (i = 1; i <= indent; i++)
        putc(' ', outtree);
      if (p == root)
        fprintf(outtree, "<clade>");
    }
    if (p != root) {
      if (xmltree) {
        fprintf(outtree, "<clade");
        if ((p->haslength && writeparens) || p->onebranch) {
          if (!(p->onebranch && !p->onebranchhaslength)) {
            fprintf(outtree, " length=\"");
            if (p->onebranch)
              x = p->onebranchlength;
            else
              x = p->length;
            x += addlength;
            writebranchlength(x);
          }
          fprintf(outtree, "\">");
        }
        else
          fprintf(outtree, ">");
      }
      else
        putc('(', outtree);
    }
    (col)++;
    /* find the index of the last branch that will actually be written */
    q = p->next;
    while (q != p) {
      if (!q->back->deleted && !q->back->deadend)
        lastnodeidx = q->back->index;
      q = q->next;
    }
    q = p->next;
    while (q != p) {
      comma = true;
      /* If branch is deleted or is a dead end, do not recurse down the
         branch and do not write a comma afterwards. */
      if (!q->back->deleted && !q->back->deadend)
        treeout(q->back, true, 0.0, indent);
      else
        comma = false;
      if (q->back->index == lastnodeidx)
        comma = false;
      q = q->next;
      if (q == p)
        break;
      if ((q->next == p) && (q->back->deleted || q->back->deadend))
        break;
      if (comma && !xmltree)
        putc(',', outtree);
      (col)++;
      if ((!xmltree) && col > 65) {
        putc('\n', outtree);
        col = 0;
      }
    }
    /* The right paren ')' closes off this level of recursion. 
*/
    if (p != root) {
      if (xmltree) {
        fprintf(outtree, "\n");
        for (i = 1; i <= indent; i++)
          putc(' ', outtree);
      }
      if (xmltree) {
        fprintf(outtree, "</clade>");
      } else
        putc(')', outtree);
    }
    (col)++;
  }
  if (!xmltree)
    if ((p->haslength && writeparens) || p->onebranch) {
      if (!(p->onebranch && !p->onebranchhaslength)) {
        if (p->onebranch)
          x = p->onebranchlength;
        else
          x = p->length;
        x += addlength;
        writebranchlength(x);
      }
    }
  if (p == root) {
    if (xmltree) {
      fprintf(outtree, "\n </clade>\n</phylogeny>\n");
    } else
      putc(')', outtree);
  }
} /* treeout */


void maketemptriad(node **p, long index)
{
  /* Initiate an internal node with stubs for two children */
  long i, j;
  node *q;
  q = NULL;

  for (i = 1; i <= 3; i++) {
    gnu(&grbg, p);
    (*p)->index = index;
    (*p)->hasname = false;
    (*p)->haslength = false;
    (*p)->deleted=false;
    (*p)->deadend=false;
    (*p)->onebranch=false;
    (*p)->onebranchhaslength=false;
    for (j = 0; j < MAXNCH; j++)
      (*p)->nayme[j] = '\0';
    (*p)->next = q;
    q = *p;
  }
  (*p)->next->next->next = *p;
  q = (*p)->next;
  while (*p != q) {
    (*p)->back = NULL;
    (*p)->tip = false;
    *p = (*p)->next;
  }
} /* maketemptriad */


void roottreeout(boolean *userwantsrooted)
{
  /* write out file with representation of final tree */
  long trnum, trnumwide;
  boolean treeisrooted = false;

  treetrav(root);
  simcopytree(); /* Prepare a copy of the going tree without deleted branches */
  treesets[whichtree].root = root; /* Store the current root */

  if (nexus) {
    trnum = treenumber;
    trnumwide = 1;
    while (trnum >= 10) {
      trnum /= 10;
      trnumwide++;
    }
    fprintf(outtree, "TREE PHYLIP_%*ld = ", (int)trnumwide, treenumber);
    if (!(*userwantsrooted))
      fprintf(outtree, "[&U] ");
    else
      fprintf(outtree, "[&R] ");
    col += 15;
  }
  root = simplifiedtree.root; /* Point root at simplified tree */
  root->haslength = false; /* Root should not have a length */
  if (root->tip)
    treeisrooted = true;
  else {
    /* a ring of three nodes at the root means a bifurcating (rooted) base */
    if (root->next->next->next == root)
      treeisrooted = true;
    else
      treeisrooted = false;
  }
  if (*userwantsrooted && !treeisrooted)
    notrootedtorooted();
  if (!(*userwantsrooted) && treeisrooted)
    rootedtonotrooted();
if ((*userwantsrooted && treeisrooted) || (!(*userwantsrooted) && !treeisrooted)) { treeout(root,true,0.0, 0); } root = treesets[whichtree].root; /* Point root at original (real) tree */ if (!xmltree) { if (hasmult) fprintf(outtree, "[%6.4f];\n", trweight); else fprintf(outtree, ";\n"); } } /* roottreeout */ void notrootedtorooted() { node *newbase, *temp; /* root halfway along leftmost branch of unrooted tree */ /* create a new triad for the new base */ maketemptriad(&newbase,nonodes+1); /* Take left branch and make it the left branch of newbase */ newbase->next->back = root->next->back; newbase->next->next->back = root; /* If needed, divide length between left and right branches */ if (newbase->next->back->haslength) { newbase->next->back->length /= 2.0; newbase->next->next->back->length = newbase->next->back->length; newbase->next->next->back->haslength = true; } /* remove leftmost ring node from old base ring */ temp = root->next->next; chuck(&grbg, root->next); root->next = temp; /* point root at new base and write the tree */ root = newbase; treeout(root,true,0.0, 0); /* (since tree mods are to simplified tree and will not be used for general purpose tree editing, much initialization can be skipped.) */ } /* notrootedtorooted */ void rootedtonotrooted() { node *q, *r, *temp, *newbase; boolean sumhaslength = false; double sumlength = 0; /* Use the leftmost non-tip immediate descendant of the root, root at that, write a multifurcation with that as the base. If both descendants are tips, write tree as is. 
*/ root = simplifiedtree.root; /* first, search for leftmost non-tip immediate descendant of root */ q = root->next->back; r = root->next->next->back; if (q->tip && r->tip) { treeout(root,true,0.0, 0); } else if (!(q->tip)) { /* allocate new base pointer */ gnu(&grbg,&newbase); newbase->next = q->next; q->next = newbase; q->back = r; r->back = q; if (q->haslength && r->haslength) { sumlength = q->length + r->length; sumhaslength = true; } if (sumhaslength) { q->length = sumlength; q->back->length = sumlength; } else { q->haslength = false; r->haslength = false; } chuck(&grbg, root->next->next); chuck(&grbg, root->next); chuck(&grbg, root); root = newbase; treeout(root, true, 0.0, 0); } else if (q->tip && !(r->tip)) { temp = r; do { temp = temp->next; } while (temp->next != r); gnu(&grbg,&newbase); newbase->next = temp->next; temp->next = newbase; q->back = r; r->back = q; if (q->haslength && r->haslength) { sumlength = q->length + r->length; sumhaslength = true; } if (sumhaslength) { q->length = sumlength; q->back->length = sumlength; } else { q->haslength = false; r->haslength = false; } chuck(&grbg, root->next->next); chuck(&grbg, root->next); chuck(&grbg, root); root = newbase; treeout(root, true, 0.0, 0); } } /* rootedtonotrooted */ void treewrite(boolean *done) { /* write out tree to a file */ long maxinput; boolean rooted; if ( root->deleted ) { printf("Cannot write tree because every branch in the tree is deleted\n"); return; } openfile(&outtree,OUTTREE,"output tree file","w","retree",outtreename); if (nexus && onfirsttree) { fprintf(outtree, "#NEXUS\n"); fprintf(outtree, "BEGIN TREES\n"); fprintf(outtree, "TRANSLATE;\n"); /* MacClade needs this */ } if (xmltree && onfirsttree) { fprintf(outtree, "\n"); } onfirsttree = false; maxinput = 1; do { printf("Enter R if the tree is to be rooted\n"); printf("OR enter U if the tree is to be unrooted: "); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = 
' '; ch = (isupper(ch)) ? ch : toupper(ch); maxinput++; if (maxinput == 100) { printf("ERROR: too many tries at choosing option\n"); exxit(-1); } } while (ch != 'R' && ch != 'U'); col = 0; rooted = (ch == 'R'); roottreeout(&rooted); treenumber++; printf("\nTree written to file \"%s\"\n\n", outtreename); waswritten = true; written = true; if (!(*done)) printree(); FClose(outtree); } /* treewrite */ void retree_window(adjwindow action) { /* move viewing window of tree */ switch (action) { case left: if (leftedge != 1) leftedge -= hscroll; break; case downn: /* The 'topedge + 3' is needed to allow downward scrolling when part of the tree is above the screen and only 1 or 2 lines are below it. */ if (treelines - topedge + 3 >= screenlines) topedge += vscroll; break; case upp: if (topedge != 1) topedge -= vscroll; break; case right: if (leftedge < vscreenwidth+2) { if (hscroll > leftedge - vscreenwidth + 1) leftedge = vscreenwidth; else leftedge += hscroll; } break; } printree(); } /* retree_window */ void getlength(double *length, reslttype *reslt, boolean *hslngth) { long maxinput; double valyew; char tmp[100]; valyew = 0.0; maxinput = 1; do { printf("\nEnter the new branch length\n"); printf("OR enter U to leave the length unchanged\n"); if (*hslngth) printf("OR enter R to remove the length from this branch: \n"); getstryng(tmp); if (tmp[0] == 'u' || tmp[0] == 'U'){ *reslt = quit; break; } else if (tmp[0] == 'r' || tmp[0] == 'R') { (*reslt) = remoov; break;} else if (sscanf(tmp,"%lf",&valyew) == 1){ (*reslt) = valid; break;} maxinput++; if (maxinput == 100) { printf("ERROR: too many tries at choosing option\n"); exxit(-1); } } while (1); (*length) = valyew; } /* getlength */ void changelength() { /* change or specify the length of a tip */ boolean hslngth; boolean ok; long i, w, maxinput; double length, x; Char ch; reslttype reslt; node *p; maxinput = 1; do { printf("Specify length of which branch (0 = all branches)? 
"); inpnum(&i, &ok); ok = (ok && (unsigned long)i <= nonodes); if (ok && (i != 0)) ok = (ok && !nodep[i - 1]->deleted); if (i == 0) ok = (nodep[i - 1] != root); maxinput++; if (maxinput == 100) { printf("ERROR: too many tries at choosing option\n"); exxit(-1); } } while (!ok); if (i != 0) { p = nodep[i - 1]; putchar('\n'); if (p->haslength) { x = p->length; if (x > 0.0) w = (long)(0.43429448222 * log(x)); else if (x == 0.0) w = 0; else w = (long)(0.43429448222 * log(-x)) + 1; if (w < 0) w = 0; printf("The current length of this branch is %*.5f\n", (int)(w + 7), x); } else printf("This branch does not have a length\n"); hslngth = p->haslength; getlength(&length, &reslt, &hslngth); switch (reslt) { case valid: copytree(); p->length = length; p->haslength = true; if (p->back != NULL) { p->back->length = length; p->back->haslength = true; } break; case remoov: copytree(); p->haslength = false; if (p->back != NULL) p->back->haslength = false; break; case quit: /* blank case */ break; } } else { printf("\n (this operation cannot be undone)\n"); maxinput = 1; do { printf("\n enter U to leave the lengths unchanged\n"); printf("OR enter R to remove the lengths from all branches: \n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; maxinput++; if (maxinput == 100) { printf("ERROR: too many tries at choosing option\n"); exxit(-1); } } while (ch != 'U' && ch != 'u' && ch != 'R' && ch != 'r'); if (ch == 'R' || ch == 'r') { copytree(); for (i = 0; i < spp; i++) nodep[i]->haslength = false; for (i = spp; i < nonodes; i++) { if (nodep[i] != NULL) { nodep[i]->haslength = false; nodep[i]->next->haslength = false; nodep[i]->next->next->haslength = false; } } } } printree(); } /* changelength */ void changename() { /* change or specify the name of a tip */ boolean ok; long i, n, tipno; char tipname[100]; for(;;) { for(;;) { printf("Specify name of which tip? 
(enter its number or 0 to quit): "); inpnum(&i, &ok); if (i > 0 && ((unsigned long)i <= spp) && ok) if (!nodep[i - 1]->deleted) { tipno = i; break; } if (i == 0) { tipno = 0; break; } } if (tipno == 0) break; if (nodep[tipno - 1]->hasname) { n = 0; /* this is valid because names are padded out to MAXNCH with nulls */ for (i = 1; i <= MAXNCH; i++) { if (nodep[tipno - 1]->nayme[i - 1] != '\0') n = i; } printf("The current name of tip %ld is \"", tipno); for (i = 0; i < n; i++) putchar(nodep[tipno - 1]->nayme[i]); printf("\"\n"); } copytree(); for (i = 0; i < MAXNCH; i++) nodep[tipno - 1]->nayme[i] = ' '; printf("Enter new tip name: "); i = 1; getstryng(tipname); strncpy(nodep[tipno-1]->nayme,tipname,MAXNCH); nodep[tipno - 1]->hasname = true; printree(); } printree(); } /* changename */ void clade() { /* pick a subtree and show only that on screen */ long i; boolean ok; printf("Select subtree rooted at which node (0 for whole tree)? "); inpnum(&i, &ok); ok = (ok && (unsigned long)i <= nonodes); if (ok) { subtree = (i > 0); if (subtree) nuroot = nodep[i - 1]; else nuroot = root; } printree(); if (!ok) printf("Not possible to use this node. "); } /* clade */ void changeoutgroup() { long i, maxinput; boolean ok; maxinput = 1; do { printf("Which node should be the new outgroup? "); inpnum(&i, &ok); ok = (ok && i >= 1 && i <= nonodes && i != root->index); if (ok) ok = (ok && !nodep[i - 1]->deleted); if (ok) ok = !nodep[nodep[i - 1]->back->index - 1]->deleted; if (ok) outgrno = i; maxinput++; if (maxinput == 100) { printf("ERROR: too many tries at choosing option\n"); exxit(-1); } } while (!ok); copytree(); reroot(nodep[outgrno - 1]); printree(); written = false; } /* changeoutgroup */ void redisplay() { long maxinput; boolean done; char ch; done = false; maxinput = 1; do { printf("\nNEXT? (Options: R . "); if (haslengths) printf("= "); printf("U W O "); if (haslengths) printf("M "); printf("T F D B N H J K L C + ? X Q) (? 
for Help) "); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); if (ch == '\n') ch = ' '; ch = isupper(ch) ? ch : toupper(ch); if (ch == 'C' || ch == 'F' || ch == 'O' || ch == 'R' || ch == 'U' || ch == 'X' || ch == 'Q' || ch == '.' || ch == 'W' || ch == 'B' || ch == 'N' || ch == '?' || ch == 'H' || ch == 'J' || ch == 'K' || ch == 'L' || ch == '+' || ch == 'T' || ch == 'D' || (haslengths && ch == 'M') || (haslengths && ch == '=')) { switch (ch) { case 'R': rearrange(); break; case '.': printree(); break; case '=': togglelengths(); break; case 'U': undo(); break; case 'W': treewrite(&done); break; case 'O': changeoutgroup(); break; case 'M': midpoint(); break; case 'T': transpose(0); break; case 'F': flip(0); break; case 'C': clade(); break; case 'D': del_or_restore(); break; case 'B': changelength(); break; case 'N': changename(); break; case 'H': retree_window(left); break; case 'J': retree_window(downn); break; case 'K': retree_window(upp); break; case 'L': retree_window(right); break; case '?': retree_help(); break; case '+': if (treesread \n"); else if (nexus) fprintf(outtree, "END;\n"); } FClose(intree); FClose(outtree); #ifdef MAC fixmacfile(outtreename); #endif #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } /* Retree */ phylip-3.697/src/seq.c0000644004732000473200000033705612407047317014313 0ustar joefelsenst_g #include "phylip.h" #include "seq.h" /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. 
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ long nonodes, endsite, outgrno, nextree, which; boolean interleaved, printdata, outgropt, treeprint, dotdiff, transvp; steptr weight, category, alias, location, ally; sequence y; void fix_x(node* p,long site, double maxx, long rcategs) { /* dnaml dnamlk */ long i,j; p->underflows[site] += log(maxx); for ( i = 0 ; i < rcategs ; i++ ) { for ( j = 0 ; j < ((long)T - (long)A + 1) ; j++) p->x[site][i][j] /= maxx; } } /* fix_x */ void fix_protx(node* p,long site, double maxx, long rcategs) { /* proml promlk */ long i,m; p->underflows[site] += log(maxx); for ( i = 0 ; i < rcategs ; i++ ) for (m = 0; m <= 19; m++) p->protx[site][i][m] /= maxx; } /* fix_protx */ void alloctemp(node **temp, long *zeros, long endsite) { /*used in dnacomp and dnapenny */ *temp = (node *)Malloc(sizeof(node)); (*temp)->numsteps = (steptr)Malloc(endsite*sizeof(long)); (*temp)->base = (baseptr)Malloc(endsite*sizeof(long)); (*temp)->numnuc = (nucarray *)Malloc(endsite*sizeof(nucarray)); memcpy((*temp)->base, zeros, endsite*sizeof(long)); 
memcpy((*temp)->numsteps, zeros, endsite*sizeof(long)); zeronumnuc(*temp, endsite); } /* alloctemp */ void freetemp(node **temp) { /* used in dnacomp, dnapars, & dnapenny */ free((*temp)->numsteps); free((*temp)->base); free((*temp)->numnuc); free(*temp); } /* freetemp */ void freetree2 (pointarray treenode, long nonodes) { /* The natural complement to alloctree2. Free all elements of all the rings (normally triads) in treenode */ long i; node *p, *q; /* The first spp elements are just nodes, not rings */ for (i = 0; i < spp; i++) free (treenode[i]); /* The rest are rings */ for (i = spp; i < nonodes; i++) { p = treenode[i]->next; while (p != treenode[i]) { q = p->next; free (p); p = q; } /* p should now point to treenode[i], which has yet to be freed */ free (p); } free (treenode); } /* freetree2 */ void inputdata(long chars) { /* input the names and sequences for each species */ /* used by dnacomp, dnadist, dnainvar, dnaml, dnamlk, dnapars, & dnapenny */ long i, j, k, l, basesread, basesnew=0; Char charstate; boolean allread, done; if (printdata) headings(chars, "Sequences", "---------"); basesread = 0; allread = false; while (!(allread)) { /* eat white space -- if the separator line has spaces on it*/ do { charstate = gettc(infile); } while (charstate == ' ' || charstate == '\t'); ungetc(charstate, infile); if (eoln(infile)) scan_eoln(infile); i = 1; while (i <= spp) { if ((interleaved && basesread == 0) || !interleaved) initname(i-1); j = (interleaved) ? basesread : 0; done = false; while (!done && !eoff(infile)) { if (interleaved) done = true; while (j < chars && !(eoln(infile) || eoff(infile))) { charstate = gettc(infile); if (charstate == '\n' || charstate == '\t') charstate = ' '; if (charstate == ' ' || (charstate >= '0' && charstate <= '9')) continue; uppercase(&charstate); if ((strchr("ABCDGHKMNRSTUVWXY?O-",charstate)) == NULL){ printf("ERROR: bad base: %c at site %5ld of species %3ld\n", charstate, j+1, i); if (charstate == '.') { printf(" Periods (.) 
may not be used as gap characters.\n"); printf(" The correct gap character is (-)\n"); } exxit(-1); } j++; y[i - 1][j - 1] = charstate; } if (interleaved) continue; if (j < chars) scan_eoln(infile); else if (j == chars) done = true; } if (interleaved && i == 1) basesnew = j; scan_eoln(infile); if ((interleaved && j != basesnew) || (!interleaved && j != chars)) { printf("\nERROR: sequences out of alignment at position %ld", j+1); printf(" of species %ld\n\n", i); exxit(-1); } i++; } if (interleaved) { basesread = basesnew; allread = (basesread == chars); } else allread = (i > spp); } if (!printdata) return; for (i = 1; i <= ((chars - 1) / 60 + 1); i++) { for (j = 1; j <= spp; j++) { for (k = 0; k < nmlngth; k++) putc(nayme[j - 1][k], outfile); fprintf(outfile, " "); l = i * 60; if (l > chars) l = chars; for (k = (i - 1) * 60 + 1; k <= l; k++) { if (dotdiff && (j > 1 && y[j - 1][k - 1] == y[0][k - 1])) charstate = '.'; else charstate = y[j - 1][k - 1]; putc(charstate, outfile); if (k % 10 == 0 && k % 60 != 0) putc(' ', outfile); } putc('\n', outfile); } putc('\n', outfile); } putc('\n', outfile); } /* inputdata */ void alloctree(pointarray *treenode, long nonodes, boolean usertree) { /* allocate treenode dynamically */ /* used in dnapars, dnacomp, dnapenny & dnamove */ long i, j; node *p, *q; *treenode = (pointarray)Malloc(nonodes*sizeof(node *)); for (i = 0; i < spp; i++) { (*treenode)[i] = (node *)Malloc(sizeof(node)); (*treenode)[i]->tip = true; (*treenode)[i]->index = i+1; (*treenode)[i]->iter = true; (*treenode)[i]->branchnum = 0; (*treenode)[i]->initialized = true; } if (!usertree) for (i = spp; i < nonodes; i++) { q = NULL; for (j = 1; j <= 3; j++) { p = (node *)Malloc(sizeof(node)); p->tip = false; p->index = i+1; p->iter = true; p->branchnum = 0; p->initialized = false; p->next = q; q = p; } p->next->next->next = p; (*treenode)[i] = p; } } /* alloctree */ void allocx(long nonodes, long rcategs, pointarray treenode, boolean usertree) { /* allocate x 
dynamically */ /* used in dnaml & dnamlk */ long i, j, k; node *p; for (i = 0; i < spp; i++){ treenode[i]->x = (phenotype)Malloc(endsite*sizeof(ratelike)); treenode[i]->underflows = (double *)Malloc(endsite * sizeof (double)); for (j = 0; j < endsite; j++) treenode[i]->x[j] = (ratelike)Malloc(rcategs*sizeof(sitelike)); } if (!usertree) { for (i = spp; i < nonodes; i++) { p = treenode[i]; for (j = 1; j <= 3; j++) { p->underflows = (double *)Malloc (endsite * sizeof (double)); p->x = (phenotype)Malloc(endsite*sizeof(ratelike)); for (k = 0; k < endsite; k++) p->x[k] = (ratelike)Malloc(rcategs*sizeof(sitelike)); p = p->next; } } } } /* allocx */ void prot_allocx(long nonodes, long rcategs, pointarray treenode, boolean usertree) { /* allocate x dynamically */ /* used in proml */ long i, j, k; node *p; for (i = 0; i < spp; i++){ treenode[i]->protx = (pphenotype)Malloc(endsite*sizeof(pratelike)); treenode[i]->underflows = (double *)Malloc(endsite*sizeof(double)); for (j = 0; j < endsite; j++) treenode[i]->protx[j] = (pratelike)Malloc(rcategs*sizeof(psitelike)); } if (!usertree) { for (i = spp; i < nonodes; i++) { p = treenode[i]; for (j = 1; j <= 3; j++) { p->protx = (pphenotype)Malloc(endsite*sizeof(pratelike)); p->underflows = (double *)Malloc(endsite*sizeof(double)); for (k = 0; k < endsite; k++) p->protx[k] = (pratelike)Malloc(rcategs*sizeof(psitelike)); p = p->next; } } } } /* prot_allocx */ void setuptree(pointarray treenode, long nonodes, boolean usertree) { /* initialize treenodes */ long i; node *p; for (i = 1; i <= nonodes; i++) { if (i <= spp || !usertree) { treenode[i-1]->back = NULL; treenode[i-1]->tip = (i <= spp); treenode[i-1]->index = i; treenode[i-1]->numdesc = 0; treenode[i-1]->iter = true; treenode[i-1]->initialized = true; treenode[i-1]->tyme = 0.0; } } if (!usertree) { for (i = spp + 1; i <= nonodes; i++) { p = treenode[i-1]->next; while (p != treenode[i-1]) { p->back = NULL; p->tip = false; p->index = i; p->numdesc = 0; p->iter = true; 
p->initialized = false; p->tyme = 0.0; p = p->next; } } } } /* setuptree */ void setuptree2(tree *a) { /* initialize a tree */ /* used in dnaml, dnamlk, & restml */ a->likelihood = -999999.0; a->start = a->nodep[0]->back; a->root = NULL; } /* setuptree2 */ void alloctip(node *p, long *zeros) { /* allocate a tip node */ /* used by dnacomp, dnapars, & dnapenny */ p->numsteps = (steptr)Malloc(endsite*sizeof(long)); p->oldnumsteps = (steptr)Malloc(endsite*sizeof(long)); p->base = (baseptr)Malloc(endsite*sizeof(long)); p->oldbase = (baseptr)Malloc(endsite*sizeof(long)); memcpy(p->base, zeros, endsite*sizeof(long)); memcpy(p->numsteps, zeros, endsite*sizeof(long)); memcpy(p->oldbase, zeros, endsite*sizeof(long)); memcpy(p->oldnumsteps, zeros, endsite*sizeof(long)); } /* alloctip */ void getbasefreqs(double freqa, double freqc, double freqg, double freqt, double *freqr, double *freqy, double *freqar, double *freqcy, double *freqgr, double *freqty, double *ttratio, double *xi, double *xv, double *fracchange, boolean freqsfrom, boolean printdata) { /* used by dnadist, dnaml, & dnamlk */ double aa, bb; if (printdata) { putc('\n', outfile); if (freqsfrom) fprintf(outfile, "Empirical "); fprintf(outfile, "Base Frequencies:\n\n"); fprintf(outfile, " A %10.5f\n", freqa); fprintf(outfile, " C %10.5f\n", freqc); fprintf(outfile, " G %10.5f\n", freqg); fprintf(outfile, " T(U) %10.5f\n", freqt); fprintf(outfile, "\n"); } *freqr = freqa + freqg; *freqy = freqc + freqt; *freqar = freqa / *freqr; *freqcy = freqc / *freqy; *freqgr = freqg / *freqr; *freqty = freqt / *freqy; aa = *ttratio * (*freqr) * (*freqy) - freqa * freqg - freqc * freqt; bb = freqa * (*freqgr) + freqc * (*freqty); *xi = aa / (aa + bb); *xv = 1.0 - *xi; if (*xi < 0.0) { printf("\n WARNING: This transition/transversion ratio\n"); printf(" is impossible with these base frequencies!\n"); *xi = 0.0; *xv = 1.0; (*ttratio) = (freqa*freqg+freqc*freqt)/((*freqr)*(*freqy)); printf(" Transition/transversion parameter 
reset\n"); printf(" so transition/transversion ratio is %10.6f\n\n", (*ttratio)); } if (freqa <= 0.0) freqa = 0.000001; if (freqc <= 0.0) freqc = 0.000001; if (freqg <= 0.0) freqg = 0.000001; if (freqt <= 0.0) freqt = 0.000001; *fracchange = (*xi) * (2 * freqa * (*freqgr) + 2 * freqc * (*freqty)) + (*xv) * (1.0 - freqa * freqa - freqc * freqc - freqg * freqg - freqt * freqt); } /* getbasefreqs */ void empiricalfreqs(double *freqa, double *freqc, double *freqg, double *freqt, steptr weight, pointarray treenode) { /* Get empirical base frequencies from the data */ /* used in dnaml & dnamlk */ long i, j, k; double sum, suma, sumc, sumg, sumt, w; *freqa = 0.25; *freqc = 0.25; *freqg = 0.25; *freqt = 0.25; for (k = 1; k <= 8; k++) { suma = 0.0; sumc = 0.0; sumg = 0.0; sumt = 0.0; for (i = 0; i < spp; i++) { for (j = 0; j < endsite; j++) { w = weight[j]; sum = (*freqa) * treenode[i]->x[j][0][0]; sum += (*freqc) * treenode[i]->x[j][0][(long)C - (long)A]; sum += (*freqg) * treenode[i]->x[j][0][(long)G - (long)A]; sum += (*freqt) * treenode[i]->x[j][0][(long)T - (long)A]; suma += w * (*freqa) * treenode[i]->x[j][0][0] / sum; sumc += w * (*freqc) * treenode[i]->x[j][0][(long)C - (long)A] / sum; sumg += w * (*freqg) * treenode[i]->x[j][0][(long)G - (long)A] / sum; sumt += w * (*freqt) * treenode[i]->x[j][0][(long)T - (long)A] / sum; } } sum = suma + sumc + sumg + sumt; *freqa = suma / sum; *freqc = sumc / sum; *freqg = sumg / sum; *freqt = sumt / sum; } if (*freqa <= 0.0) *freqa = 0.000001; if (*freqc <= 0.0) *freqc = 0.000001; if (*freqg <= 0.0) *freqg = 0.000001; if (*freqt <= 0.0) *freqt = 0.000001; } /* empiricalfreqs */ void sitesort(long chars, steptr weight) { /* Shell sort keeping sites, weights in same order */ /* used in dnainvar, dnapars, dnacomp & dnapenny */ long gap, i, j, jj, jg, k, itemp; boolean flip, tied; gap = chars / 2; while (gap > 0) { for (i = gap + 1; i <= chars; i++) { j = i - gap; flip = true; while (j > 0 && flip) { jj = alias[j - 1]; jg = alias[j 
+ gap - 1]; tied = true; k = 1; while (k <= spp && tied) { flip = (y[k - 1][jj - 1] > y[k - 1][jg - 1]); tied = (tied && y[k - 1][jj - 1] == y[k - 1][jg - 1]); k++; } if (!flip) break; itemp = alias[j - 1]; alias[j - 1] = alias[j + gap - 1]; alias[j + gap - 1] = itemp; itemp = weight[j - 1]; weight[j - 1] = weight[j + gap - 1]; weight[j + gap - 1] = itemp; j -= gap; } } gap /= 2; } } /* sitesort */ void sitecombine(long chars) { /* combine sites that have identical patterns */ /* used in dnapars, dnapenny, & dnacomp */ long i, j, k; boolean tied; i = 1; while (i < chars) { j = i + 1; tied = true; while (j <= chars && tied) { k = 1; while (k <= spp && tied) { tied = (tied && y[k - 1][alias[i - 1] - 1] == y[k - 1][alias[j - 1] - 1]); k++; } if (tied) { weight[i - 1] += weight[j - 1]; weight[j - 1] = 0; ally[alias[j - 1] - 1] = alias[i - 1]; } j++; } i = j - 1; } } /* sitecombine */ void sitescrunch(long chars) { /* move so one representative of each pattern of sites comes first */ /* used in dnapars & dnacomp */ long i, j, itemp; boolean done, found; done = false; i = 1; j = 2; while (!done) { if (ally[alias[i - 1] - 1] != alias[i - 1]) { if (j <= i) j = i + 1; if (j <= chars) { do { found = (ally[alias[j - 1] - 1] == alias[j - 1]); j++; } while (!(found || j > chars)); if (found) { j--; itemp = alias[i - 1]; alias[i - 1] = alias[j - 1]; alias[j - 1] = itemp; itemp = weight[i - 1]; weight[i - 1] = weight[j - 1]; weight[j - 1] = itemp; } else done = true; } else done = true; } i++; done = (done || i >= chars); } } /* sitescrunch */ void sitesort2(long sites, steptr aliasweight) { /* Shell sort keeping sites, weights in same order */ /* used in dnaml & dnamlk */ long gap, i, j, jj, jg, k, itemp; boolean flip, tied; gap = sites / 2; while (gap > 0) { for (i = gap + 1; i <= sites; i++) { j = i - gap; flip = true; while (j > 0 && flip) { jj = alias[j - 1]; jg = alias[j + gap - 1]; tied = (category[jj - 1] == category[jg - 1]); flip = (category[jj - 1] > category[jg - 
1]); k = 1; while (k <= spp && tied) { flip = (y[k - 1][jj - 1] > y[k - 1][jg - 1]); tied = (tied && y[k - 1][jj - 1] == y[k - 1][jg - 1]); k++; } if (!flip) break; itemp = alias[j - 1]; alias[j - 1] = alias[j + gap - 1]; alias[j + gap - 1] = itemp; itemp = aliasweight[j - 1]; aliasweight[j - 1] = aliasweight[j + gap - 1]; aliasweight[j + gap - 1] = itemp; j -= gap; } } gap /= 2; } } /* sitesort2 */ void sitecombine2(long sites, steptr aliasweight) { /* combine sites that have identical patterns */ /* used in dnaml & dnamlk */ long i, j, k; boolean tied; i = 1; while (i < sites) { j = i + 1; tied = true; while (j <= sites && tied) { tied = (category[alias[i - 1] - 1] == category[alias[j - 1] - 1]); k = 1; while (k <= spp && tied) { tied = (tied && y[k - 1][alias[i - 1] - 1] == y[k - 1][alias[j - 1] - 1]); k++; } if (!tied) break; aliasweight[i - 1] += aliasweight[j - 1]; aliasweight[j - 1] = 0; ally[alias[j - 1] - 1] = alias[i - 1]; j++; } i = j; } } /* sitecombine2 */ void sitescrunch2(long sites, long i, long j, steptr aliasweight) { /* move so positively weighted sites come first */ /* used by dnainvar, dnaml, dnamlk, & restml */ long itemp; boolean done, found; done = false; while (!done) { if (aliasweight[i - 1] > 0) i++; else { if (j <= i) j = i + 1; if (j <= sites) { do { found = (aliasweight[j - 1] > 0); j++; } while (!(found || j > sites)); if (found) { j--; itemp = alias[i - 1]; alias[i - 1] = alias[j - 1]; alias[j - 1] = itemp; itemp = aliasweight[i - 1]; aliasweight[i - 1] = aliasweight[j - 1]; aliasweight[j - 1] = itemp; } else done = true; } else done = true; } done = (done || i >= sites); } } /* sitescrunch2 */ void makevalues(pointarray treenode, long *zeros, boolean usertree) { /* set up fractional likelihoods at tips */ /* used by dnacomp, dnapars, & dnapenny */ long i, j; char ns = 0; node *p; setuptree(treenode, nonodes, usertree); for (i = 0; i < spp; i++) alloctip(treenode[i], zeros); if (!usertree) { for (i = spp; i < nonodes; i++) { p = 
treenode[i]; do { allocnontip(p, zeros, endsite); p = p->next; } while (p != treenode[i]); } } for (j = 0; j < endsite; j++) { for (i = 0; i < spp; i++) { switch (y[i][alias[j] - 1]) { case 'A': ns = 1 << A; break; case 'C': ns = 1 << C; break; case 'G': ns = 1 << G; break; case 'U': ns = 1 << T; break; case 'T': ns = 1 << T; break; case 'M': ns = (1 << A) | (1 << C); break; case 'R': ns = (1 << A) | (1 << G); break; case 'W': ns = (1 << A) | (1 << T); break; case 'S': ns = (1 << C) | (1 << G); break; case 'Y': ns = (1 << C) | (1 << T); break; case 'K': ns = (1 << G) | (1 << T); break; case 'B': ns = (1 << C) | (1 << G) | (1 << T); break; case 'D': ns = (1 << A) | (1 << G) | (1 << T); break; case 'H': ns = (1 << A) | (1 << C) | (1 << T); break; case 'V': ns = (1 << A) | (1 << C) | (1 << G); break; case 'N': ns = (1 << A) | (1 << C) | (1 << G) | (1 << T); break; case 'X': ns = (1 << A) | (1 << C) | (1 << G) | (1 << T); break; case '?': ns = (1 << A) | (1 << C) | (1 << G) | (1 << T) | (1 << O); break; case 'O': ns = 1 << O; break; case '-': ns = 1 << O; break; } treenode[i]->base[j] = ns; treenode[i]->numsteps[j] = 0; } } } /* makevalues */ void makevalues2(long categs, pointarray treenode, long endsite, long spp, sequence y, steptr alias) { /* set up fractional likelihoods at tips */ /* used by dnaml & dnamlk */ long i, j, k, l; bases b; for (k = 0; k < endsite; k++) { j = alias[k]; for (i = 0; i < spp; i++) { for (l = 0; l < categs; l++) { for (b = A; (long)b <= (long)T; b = (bases)((long)b + 1)) treenode[i]->x[k][l][(long)b - (long)A] = 0.0; switch (y[i][j - 1]) { case 'A': treenode[i]->x[k][l][0] = 1.0; break; case 'C': treenode[i]->x[k][l][(long)C - (long)A] = 1.0; break; case 'G': treenode[i]->x[k][l][(long)G - (long)A] = 1.0; break; case 'T': treenode[i]->x[k][l][(long)T - (long)A] = 1.0; break; case 'U': treenode[i]->x[k][l][(long)T - (long)A] = 1.0; break; case 'M': treenode[i]->x[k][l][0] = 1.0; treenode[i]->x[k][l][(long)C - (long)A] = 1.0; break; case 
'R': treenode[i]->x[k][l][0] = 1.0; treenode[i]->x[k][l][(long)G - (long)A] = 1.0; break; case 'W': treenode[i]->x[k][l][0] = 1.0; treenode[i]->x[k][l][(long)T - (long)A] = 1.0; break; case 'S': treenode[i]->x[k][l][(long)C - (long)A] = 1.0; treenode[i]->x[k][l][(long)G - (long)A] = 1.0; break; case 'Y': treenode[i]->x[k][l][(long)C - (long)A] = 1.0; treenode[i]->x[k][l][(long)T - (long)A] = 1.0; break; case 'K': treenode[i]->x[k][l][(long)G - (long)A] = 1.0; treenode[i]->x[k][l][(long)T - (long)A] = 1.0; break; case 'B': treenode[i]->x[k][l][(long)C - (long)A] = 1.0; treenode[i]->x[k][l][(long)G - (long)A] = 1.0; treenode[i]->x[k][l][(long)T - (long)A] = 1.0; break; case 'D': treenode[i]->x[k][l][0] = 1.0; treenode[i]->x[k][l][(long)G - (long)A] = 1.0; treenode[i]->x[k][l][(long)T - (long)A] = 1.0; break; case 'H': treenode[i]->x[k][l][0] = 1.0; treenode[i]->x[k][l][(long)C - (long)A] = 1.0; treenode[i]->x[k][l][(long)T - (long)A] = 1.0; break; case 'V': treenode[i]->x[k][l][0] = 1.0; treenode[i]->x[k][l][(long)C - (long)A] = 1.0; treenode[i]->x[k][l][(long)G - (long)A] = 1.0; break; case 'N': for (b = A; (long)b <= (long)T; b = (bases)((long)b + 1)) treenode[i]->x[k][l][(long)b - (long)A] = 1.0; break; case 'X': for (b = A; (long)b <= (long)T; b = (bases)((long)b + 1)) treenode[i]->x[k][l][(long)b - (long)A] = 1.0; break; case '?': for (b = A; (long)b <= (long)T; b = (bases)((long)b + 1)) treenode[i]->x[k][l][(long)b - (long)A] = 1.0; break; case 'O': for (b = A; (long)b <= (long)T; b = (bases)((long)b + 1)) treenode[i]->x[k][l][(long)b - (long)A] = 1.0; break; case '-': for (b = A; (long)b <= (long)T; b = (bases)((long)b + 1)) treenode[i]->x[k][l][(long)b - (long)A] = 1.0; break; } } } } } /* makevalues2 */ void fillin(node *p, node *left, node *rt) { /* sets up for each node in the tree the base sequence at that point and counts the changes. 
*/ long i, j, k, n, purset, pyrset; node *q; purset = (1 << (long)A) + (1 << (long)G); pyrset = (1 << (long)C) + (1 << (long)T); if (!left) { memcpy(p->base, rt->base, endsite*sizeof(long)); memcpy(p->numsteps, rt->numsteps, endsite*sizeof(long)); q = rt; } else if (!rt) { memcpy(p->base, left->base, endsite*sizeof(long)); memcpy(p->numsteps, left->numsteps, endsite*sizeof(long)); q = left; } else { for (i = 0; i < endsite; i++) { p->base[i] = left->base[i] & rt->base[i]; p->numsteps[i] = left->numsteps[i] + rt->numsteps[i]; if (p->base[i] == 0) { p->base[i] = left->base[i] | rt->base[i]; if (transvp) { if (!((p->base[i] == purset) || (p->base[i] == pyrset))) p->numsteps[i] += weight[i]; } else p->numsteps[i] += weight[i]; } } q = rt; } if (left && rt) n = 2; else n = 1; for (i = 0; i < endsite; i++) for (j = (long)A; j <= (long)O; j++) p->numnuc[i][j] = 0; for (k = 1; k <= n; k++) { if (k == 2) q = left; for (i = 0; i < endsite; i++) { for (j = (long)A; j <= (long)O; j++) { if (q->base[i] & (1 << j)) p->numnuc[i][j]++; } } } } /* fillin */ long getlargest(long *numnuc) { /* find the largest in array numnuc */ long i, largest; largest = 0; for (i = (long)A; i <= (long)O; i++) if (numnuc[i] > largest) largest = numnuc[i]; return largest; } /* getlargest */ void multifillin(node *p, node *q, long dnumdesc) { /* sets up for each node in the tree the base sequence at that point and counts the changes according to the changes in q's base */ long i, j, b, largest, descsteps, purset, pyrset; memcpy(p->oldbase, p->base, endsite*sizeof(long)); memcpy(p->oldnumsteps, p->numsteps, endsite*sizeof(long)); purset = (1 << (long)A) + (1 << (long)G); pyrset = (1 << (long)C) + (1 << (long)T); for (i = 0; i < endsite; i++) { descsteps = 0; for (j = (long)A; j <= (long)O; j++) { b = 1 << j; if ((descsteps == 0) && (p->base[i] & b)) descsteps = p->numsteps[i] - (p->numdesc - dnumdesc - p->numnuc[i][j]) * weight[i]; } if (dnumdesc == -1) descsteps -= q->oldnumsteps[i]; else if (dnumdesc 
== 0) descsteps += (q->numsteps[i] - q->oldnumsteps[i]); else descsteps += q->numsteps[i]; if (q->oldbase[i] != q->base[i]) { for (j = (long)A; j <= (long)O; j++) { b = 1 << j; if (transvp) { if (b & purset) b = purset; if (b & pyrset) b = pyrset; } if ((q->oldbase[i] & b) && !(q->base[i] & b)) p->numnuc[i][j]--; else if (!(q->oldbase[i] & b) && (q->base[i] & b)) p->numnuc[i][j]++; } } largest = getlargest(p->numnuc[i]); if (q->oldbase[i] != q->base[i]) { p->base[i] = 0; for (j = (long)A; j <= (long)O; j++) { if (p->numnuc[i][j] == largest) p->base[i] |= (1 << j); } } p->numsteps[i] = (p->numdesc - largest) * weight[i] + descsteps; } } /* multifillin */ void sumnsteps(node *p, node *left, node *rt, long a, long b) { /* sets up for each node in the tree the base sequence at that point and counts the changes. */ long i; long ns, rs, ls, purset, pyrset; if (!left) { memcpy(p->numsteps, rt->numsteps, endsite*sizeof(long)); memcpy(p->base, rt->base, endsite*sizeof(long)); } else if (!rt) { memcpy(p->numsteps, left->numsteps, endsite*sizeof(long)); memcpy(p->base, left->base, endsite*sizeof(long)); } else { purset = (1 << (long)A) + (1 << (long)G); pyrset = (1 << (long)C) + (1 << (long)T); for (i = a; i < b; i++) { ls = left->base[i]; rs = rt->base[i]; ns = ls & rs; p->numsteps[i] = left->numsteps[i] + rt->numsteps[i]; if (ns == 0) { ns = ls | rs; if (transvp) { if (!((ns == purset) || (ns == pyrset))) p->numsteps[i] += weight[i]; } else p->numsteps[i] += weight[i]; } p->base[i] = ns; } } } /* sumnsteps */ void sumnsteps2(node *p,node *left,node *rt,long a,long b,long *threshwt) { /* counts the changes at each node. 
*/ long i, steps; long ns, rs, ls, purset, pyrset; long term; if (a == 0) p->sumsteps = 0.0; if (!left) memcpy(p->numsteps, rt->numsteps, endsite*sizeof(long)); else if (!rt) memcpy(p->numsteps, left->numsteps, endsite*sizeof(long)); else { purset = (1 << (long)A) + (1 << (long)G); pyrset = (1 << (long)C) + (1 << (long)T); for (i = a; i < b; i++) { ls = left->base[i]; rs = rt->base[i]; ns = ls & rs; p->numsteps[i] = left->numsteps[i] + rt->numsteps[i]; if (ns == 0) { ns = ls | rs; if (transvp) { if (!((ns == purset) || (ns == pyrset))) p->numsteps[i] += weight[i]; } else p->numsteps[i] += weight[i]; } } } for (i = a; i < b; i++) { steps = p->numsteps[i]; if ((long)steps <= threshwt[i]) term = steps; else term = threshwt[i]; p->sumsteps += (double)term; } } /* sumnsteps2 */ void multisumnsteps(node *p, node *q, long a, long b, long *threshwt) { /* computes the number of steps between p and q */ long i, j, steps, largest, descsteps, purset, pyrset, b1; long term; if (a == 0) p->sumsteps = 0.0; purset = (1 << (long)A) + (1 << (long)G); pyrset = (1 << (long)C) + (1 << (long)T); for (i = a; i < b; i++) { descsteps = 0; for (j = (long)A; j <= (long)O; j++) { if ((descsteps == 0) && (p->base[i] & (1 << j))) descsteps = p->numsteps[i] - (p->numdesc - 1 - p->numnuc[i][j]) * weight[i]; } descsteps += q->numsteps[i]; largest = 0; for (j = (long)A; j <= (long)O; j++) { b1 = (1 << j); if (transvp) { if (b1 & purset) b1 = purset; if (b1 & pyrset) b1 = pyrset; } if (q->base[i] & b1) p->numnuc[i][j]++; if (p->numnuc[i][j] > largest) largest = p->numnuc[i][j]; } steps = (p->numdesc - largest) * weight[i] + descsteps; if ((long)steps <= threshwt[i]) term = steps; else term = threshwt[i]; p->sumsteps += (double)term; } } /* multisumnsteps */ void multisumnsteps2(node *p) { /* counts the changes at each multi-way node. 
     Sums up steps of all descendants */
  long i, j, largest, purset, pyrset, b1;
  node *q;
  baseptr b;

  purset = (1 << (long)A) + (1 << (long)G);
  pyrset = (1 << (long)C) + (1 << (long)T);
  for (i = 0; i < endsite; i++) {
    p->numsteps[i] = 0;
    q = p->next;
    while (q != p) {
      if (q->back) {
        p->numsteps[i] += q->back->numsteps[i];
        b = q->back->base;
        for (j = (long)A; j <= (long)O; j++) {
          b1 = (1 << j);
          if (transvp) {
            if (b1 & purset)
              b1 = purset;
            if (b1 & pyrset)
              b1 = pyrset;
          }
          if (b[i] & b1)
            p->numnuc[i][j]++;
        }
      }
      q = q->next;
    }
    largest = getlargest(p->numnuc[i]);
    p->base[i] = 0;
    for (j = (long)A; j <= (long)O; j++) {
      if (p->numnuc[i][j] == largest)
        p->base[i] |= (1 << j);
    }
    p->numsteps[i] += ((p->numdesc - largest) * weight[i]);
  }
}  /* multisumnsteps2 */


boolean alltips(node *forknode, node *p)
{
  /* returns true if all descendants of forknode except p are tips;
     false otherwise. */
  node *q, *r;
  boolean tips;

  tips = true;
  r = forknode;
  q = forknode->next;
  do {
    if (q->back && q->back != p && !q->back->tip)
      tips = false;
    q = q->next;
  } while (tips && q != r);
  return tips;
}  /* alltips */


void gdispose(node *p, node **grbg, pointarray treenode)
{
  /* go through tree throwing away nodes */
  node *q, *r;

  p->back = NULL;
  if (p->tip)
    return;
  treenode[p->index - 1] = NULL;
  q = p->next;
  while (q != p) {
    gdispose(q->back, grbg, treenode);
    q->back = NULL;
    r = q;
    q = q->next;
    chuck(grbg, r);
  }
  chuck(grbg, q);
}  /* gdispose */


void preorder(node *p, node *r, node *root, node *removing,
                node *adding, node *changing, long dnumdesc)
{
  /* recompute number of steps in preorder taking both ancestral and
     descendant steps into account.
removing points to a node being removed, if any */ node *q, *p1, *p2; if (p && !p->tip && p != adding) { q = p; do { if (p->back != r) { if (p->numdesc > 2) { if (changing) multifillin (p, r, dnumdesc); else multifillin (p, r, 0); } else { p1 = p->next; if (!removing) while (!p1->back) p1 = p1->next; else while (!p1->back || p1->back == removing) p1 = p1->next; p2 = p1->next; if (!removing) while (!p2->back) p2 = p2->next; else while (!p2->back || p2->back == removing) p2 = p2->next; p1 = p1->back; p2 = p2->back; if (p->back == p1) p1 = NULL; else if (p->back == p2) p2 = NULL; memcpy(p->oldbase, p->base, endsite*sizeof(long)); memcpy(p->oldnumsteps, p->numsteps, endsite*sizeof(long)); fillin(p, p1, p2); } } p = p->next; } while (p != q); q = p; do { preorder(p->next->back, p->next, root, removing, adding, NULL, 0); p = p->next; } while (p->next != q); } } /* preorder */ void updatenumdesc(node *p, node *root, long n) { /* set p's numdesc to n. If p is the root, numdesc of p's descendants are set to n-1. */ node *q; q = p; if (p == root && n > 0) { p->numdesc = n; n--; q = q->next; } do { q->numdesc = n; q = q->next; } while (q != p); } /* updatenumdesc */ void add(node *below,node *newtip,node *newfork,node **root, boolean recompute,pointarray treenode,node **grbg,long *zeros) { /* inserts the nodes newfork and its left descendant, newtip, to the tree. below becomes newfork's right descendant. 
if newfork is NULL, newtip is added as below's sibling */ /* used in dnacomp & dnapars */ node *p; if (below != treenode[below->index - 1]) below = treenode[below->index - 1]; if (newfork) { if (below->back != NULL) below->back->back = newfork; newfork->back = below->back; below->back = newfork->next->next; newfork->next->next->back = below; newfork->next->back = newtip; newtip->back = newfork->next; if (*root == below) *root = newfork; updatenumdesc(newfork, *root, 2); } else { gnutreenode(grbg, &p, below->index, endsite, zeros); p->back = newtip; newtip->back = p; p->next = below->next; below->next = p; updatenumdesc(below, *root, below->numdesc + 1); } if (!newtip->tip) updatenumdesc(newtip, *root, newtip->numdesc); (*root)->back = NULL; if (!recompute) return; if (!newfork) { memcpy(newtip->back->base, below->base, endsite*sizeof(long)); memcpy(newtip->back->numsteps, below->numsteps, endsite*sizeof(long)); memcpy(newtip->back->numnuc, below->numnuc, endsite*sizeof(nucarray)); if (below != *root) { memcpy(below->back->oldbase, zeros, endsite*sizeof(long)); memcpy(below->back->oldnumsteps, zeros, endsite*sizeof(long)); multifillin(newtip->back, below->back, 1); } if (!newtip->tip) { memcpy(newtip->back->oldbase, zeros, endsite*sizeof(long)); memcpy(newtip->back->oldnumsteps, zeros, endsite*sizeof(long)); preorder(newtip, newtip->back, *root, NULL, NULL, below, 1); } memcpy(newtip->oldbase, zeros, endsite*sizeof(long)); memcpy(newtip->oldnumsteps, zeros, endsite*sizeof(long)); preorder(below, newtip, *root, NULL, newtip, below, 1); if (below != *root) preorder(below->back, below, *root, NULL, NULL, NULL, 0); } else { fillin(newtip->back, newtip->back->next->back, newtip->back->next->next->back); if (!newtip->tip) { memcpy(newtip->back->oldbase, zeros, endsite*sizeof(long)); memcpy(newtip->back->oldnumsteps, zeros, endsite*sizeof(long)); preorder(newtip, newtip->back, *root, NULL, NULL, newfork, 1); } if (newfork != *root) { memcpy(below->back->base, 
newfork->back->base, endsite*sizeof(long)); memcpy(below->back->numsteps, newfork->back->numsteps, endsite*sizeof(long)); preorder(newfork, newtip, *root, NULL, newtip, NULL, 0); } else { fillin(below->back, newtip, NULL); fillin(newfork, newtip, below); memcpy(below->back->oldbase, zeros, endsite*sizeof(long)); memcpy(below->back->oldnumsteps, zeros, endsite*sizeof(long)); preorder(below, below->back, *root, NULL, NULL, newfork, 1); } if (newfork != *root) { memcpy(newfork->oldbase, below->base, endsite*sizeof(long)); memcpy(newfork->oldnumsteps, below->numsteps, endsite*sizeof(long)); preorder(newfork->back, newfork, *root, NULL, NULL, NULL, 0); } } } /* add */ void findbelow(node **below, node *item, node *fork) { /* decide which of fork's binary children is below */ if (fork->next->back == item) *below = fork->next->next->back; else *below = fork->next->back; } /* findbelow */ void re_move(node *item, node **fork, node **root, boolean recompute, pointarray treenode, node **grbg, long *zeros) { /* removes nodes item and its ancestor, fork, from the tree. the new descendant of fork's ancestor is made to be fork's second descendant (other than item). Also returns pointers to the deleted nodes, item and fork. 
If item belongs to a node with more than 2 descendants, fork will not be deleted */ /* used in dnacomp & dnapars */ node *p, *q, *other = NULL, *otherback = NULL; if (item->back == NULL) { *fork = NULL; return; } *fork = treenode[item->back->index - 1]; if ((*fork)->numdesc == 2) { updatenumdesc(*fork, *root, 0); findbelow(&other, item, *fork); otherback = other->back; if (*root == *fork) { *root = other; if (!other->tip) updatenumdesc(other, *root, other->numdesc); } p = item->back->next->back; q = item->back->next->next->back; if (p != NULL) p->back = q; if (q != NULL) q->back = p; (*fork)->back = NULL; p = (*fork)->next; while (p != *fork) { p->back = NULL; p = p->next; } } else { updatenumdesc(*fork, *root, (*fork)->numdesc - 1); p = *fork; while (p->next != item->back) p = p->next; p->next = item->back->next; } if (!item->tip) { updatenumdesc(item, item, item->numdesc); if (recompute) { memcpy(item->back->oldbase, item->back->base, endsite*sizeof(long)); memcpy(item->back->oldnumsteps, item->back->numsteps, endsite*sizeof(long)); memcpy(item->back->base, zeros, endsite*sizeof(long)); memcpy(item->back->numsteps, zeros, endsite*sizeof(long)); preorder(item, item->back, *root, item->back, NULL, item, -1); } } if ((*fork)->numdesc >= 2) chuck(grbg, item->back); item->back = NULL; if (!recompute) return; if ((*fork)->numdesc == 0) { memcpy(otherback->oldbase, otherback->base, endsite*sizeof(long)); memcpy(otherback->oldnumsteps, otherback->numsteps, endsite*sizeof(long)); if (other == *root) { memcpy(otherback->base, zeros, endsite*sizeof(long)); memcpy(otherback->numsteps, zeros, endsite*sizeof(long)); } else { memcpy(otherback->base, other->back->base, endsite*sizeof(long)); memcpy(otherback->numsteps, other->back->numsteps, endsite*sizeof(long)); } p = other->back; other->back = otherback; if (other == *root) preorder(other, otherback, *root, otherback, NULL, other, -1); else preorder(other, otherback, *root, NULL, NULL, NULL, 0); other->back = p; if (other != 
*root) {
      memcpy(other->oldbase,(*fork)->base, endsite*sizeof(long));
      memcpy(other->oldnumsteps,(*fork)->numsteps, endsite*sizeof(long));
      preorder(other->back, other, *root, NULL, NULL, NULL, 0);
    }
  } else {
    memcpy(item->oldbase, item->base, endsite*sizeof(long));
    memcpy(item->oldnumsteps, item->numsteps, endsite*sizeof(long));
    memcpy(item->base, zeros, endsite*sizeof(long));
    memcpy(item->numsteps, zeros, endsite*sizeof(long));
    preorder(*fork, item, *root, NULL, NULL, *fork, -1);
    if (*fork != *root)
      preorder((*fork)->back, *fork, *root, NULL, NULL, NULL, 0);
    memcpy(item->base, item->oldbase, endsite*sizeof(long));
    memcpy(item->numsteps, item->oldnumsteps, endsite*sizeof(long));
  }
}  /* re_move */


void postorder(node *p)
{
  /* traverses an n-ary tree, summing up steps at a node's descendants */
  /* used in dnacomp, dnapars, & dnapenny */
  node *q;

  if (p->tip)
    return;
  q = p->next;
  while (q != p) {
    postorder(q->back);
    q = q->next;
  }
  zeronumnuc(p, endsite);
  if (p->numdesc > 2)
    multisumnsteps2(p);
  else
    fillin(p, p->next->back, p->next->next->back);
}  /* postorder */


void getnufork(node **nufork,node **grbg,pointarray treenode,long *zeros)
{
  /* find a fork not used currently */
  long i;

  i = spp;
  while (treenode[i] && treenode[i]->numdesc > 0)
    i++;
  if (!treenode[i])
    gnutreenode(grbg, &treenode[i], i, endsite, zeros);
  *nufork = treenode[i];
}  /* getnufork */


void reroot(node *outgroup, node *root)
{
  /* reorients tree, putting outgroup in desired position. used if
     the root is binary. */
  /* used in dnacomp & dnapars */
  node *p, *q;

  if (outgroup->back->index == root->index)
    return;
  p = root->next;
  q = root->next->next;
  p->back->back = q->back;
  q->back->back = p->back;
  p->back = outgroup;
  q->back = outgroup->back;
  outgroup->back->back = q;
  outgroup->back = p;
}  /* reroot */


void reroot2(node *outgroup, node *root)
{
  /* reorients tree, putting outgroup in desired position.
*/ /* used in dnacomp & dnapars */ node *p; p = outgroup->back->next; while (p->next != outgroup->back) p = p->next; root->next = outgroup->back; p->next = root; } /* reroot2 */ void reroot3(node *outgroup, node *root, node *root2, node *lastdesc, node **grbg) { /* reorients tree, putting back outgroup in original position. */ /* used in dnacomp & dnapars */ node *p; p = root->next; while (p->next != root) p = p->next; chuck(grbg, root); p->next = outgroup->back; root2->next = lastdesc->next; lastdesc->next = root2; } /* reroot3 */ void savetraverse(node *p) { /* sets BOOLEANs that indicate which way is down */ node *q; p->bottom = true; if (p->tip) return; q = p->next; while (q != p) { q->bottom = false; savetraverse(q->back); q = q->next; } } /* savetraverse */ void newindex(long i, node *p) { /* assigns index i to node p */ while (p->index != i) { p->index = i; p = p->next; } } /* newindex */ void flipindexes(long nextnode, pointarray treenode) { /* flips index of nodes between nextnode and last node. 
*/ long last; node *temp; last = nonodes; while (treenode[last - 1]->numdesc == 0) last--; if (last > nextnode) { temp = treenode[nextnode - 1]; treenode[nextnode - 1] = treenode[last - 1]; treenode[last - 1] = temp; newindex(nextnode, treenode[nextnode - 1]); newindex(last, treenode[last - 1]); } } /* flipindexes */ boolean parentinmulti(node *anode) { /* sees if anode's parent has more than 2 children */ node *p; while (!anode->bottom) anode = anode->next; p = anode->back; while (!p->bottom) p = p->next; return (p->numdesc > 2); } /* parentinmulti */ long sibsvisited(node *anode, long *place) { /* computes the number of nodes which are visited earlier than anode among its siblings */ node *p; long nvisited; while (!anode->bottom) anode = anode->next; p = anode->back->next; nvisited = 0; do { if (!p->bottom && place[p->back->index - 1] != 0) nvisited++; p = p->next; } while (p != anode->back); return nvisited; } /* sibsvisited */ long smallest(node *anode, long *place) { /* finds the smallest index of sibling of anode */ node *p; long min; while (!anode->bottom) anode = anode->next; p = anode->back->next; if (p->bottom) p = p->next; min = nonodes; do { if (p->back && place[p->back->index - 1] != 0) { if (p->back->index <= spp) { if (p->back->index < min) min = p->back->index; } else { if (place[p->back->index - 1] < min) min = place[p->back->index - 1]; } } p = p->next; if (p->bottom) p = p->next; } while (p != anode->back); return min; } /* smallest */ void bintomulti(node **root, node **binroot, node **grbg, long *zeros) { /* attaches root's left child to its right child and makes the right child new root */ node *left, *right, *newnode, *temp; right = (*root)->next->next->back; left = (*root)->next->back; if (right->tip) { (*root)->next = right->back; (*root)->next->next = left->back; temp = left; left = right; right = temp; right->back->next = *root; } gnutreenode(grbg, &newnode, right->index, endsite, zeros); newnode->next = right->next; newnode->back = left; 
  left->back = newnode;
  right->next = newnode;
  (*root)->next->back = (*root)->next->next->back = NULL;
  *binroot = *root;
  (*binroot)->numdesc = 0;
  *root = right;
  (*root)->numdesc++;
  (*root)->back = NULL;
}  /* bintomulti */


void backtobinary(node **root, node *binroot, node **grbg)
{
  /* restores binary root */
  node *p;

  binroot->next->back = (*root)->next->back;
  (*root)->next->back->back = binroot->next;
  p = (*root)->next;
  (*root)->next = p->next;
  binroot->next->next->back = *root;
  (*root)->back = binroot->next->next;
  chuck(grbg, p);
  (*root)->numdesc--;
  *root = binroot;
  (*root)->numdesc = 2;
}  /* backtobinary */


boolean outgrin(node *root, node *outgrnode)
{
  /* checks if outgroup node is a child of root */
  node *p;

  p = root->next;
  while (p != root) {
    if (p->back == outgrnode)
      return true;
    p = p->next;
  }
  return false;
}  /* outgrin */


void flipnodes(node *nodea, node *nodeb)
{
  /* swaps the back connections of nodea and nodeb */
  node *backa, *backb;

  backa = nodea->back;
  backb = nodeb->back;
  backa->back = nodeb;
  backb->back = nodea;
  nodea->back = backb;
  nodeb->back = backa;
}  /* flipnodes */


void moveleft(node *root, node *outgrnode, node **flipback)
{
  /* makes outgroup node the leftmost child of root */
  node *p;
  boolean done;

  p = root->next;
  done = false;
  while (p != root && !done) {
    if (p->back == outgrnode) {
      *flipback = p;
      flipnodes(root->next->back, p->back);
      done = true;
    }
    p = p->next;
  }
}  /* moveleft */


void savetree(node *root, long *place, pointarray treenode,
                node **grbg, long *zeros)
{
  /* record in place where each species has to be
     added to reconstruct this tree */
  /* used by dnacomp & dnapars */
  long i, j, nextnode, nvisited;
  node *p, *q, *r = NULL, *root2, *lastdesc,
                *outgrnode, *binroot, *flipback;
  boolean done, newfork;

  binroot = NULL;
  lastdesc = NULL;
  root2 = NULL;
  flipback = NULL;
  outgrnode = treenode[outgrno - 1];
  if (root->numdesc == 2)
    bintomulti(&root, &binroot, grbg, zeros);
  if (outgrin(root, outgrnode)) {
    if (outgrnode != root->next->back)
      moveleft(root, outgrnode, &flipback);
  } else {
    root2 = root;
    lastdesc = root->next;
    while (lastdesc->next != root)
      lastdesc = lastdesc->next;
    lastdesc->next = root->next;
    gnutreenode(grbg, &root, outgrnode->back->index, endsite, zeros);
    root->numdesc = root2->numdesc;
    reroot2(outgrnode, root);
  }
  savetraverse(root);
  nextnode = spp + 1;
  for (i = nextnode; i <= nonodes; i++)
    if (treenode[i - 1]->numdesc == 0)
      flipindexes(i, treenode);
  for (i = 0; i < nonodes; i++)
    place[i] = 0;
  place[root->index - 1] = 1;
  for (i = 1; i <= spp; i++) {
    p = treenode[i - 1];
    while (place[p->index - 1] == 0) {
      place[p->index - 1] = i;
      while (!p->bottom)
        p = p->next;
      r = p;
      p = p->back;
    }
    if (i > 1) {
      q = treenode[i - 1];
      newfork = true;
      nvisited = sibsvisited(q, place);
      if (nvisited == 0) {
        if (parentinmulti(r)) {
          nvisited = sibsvisited(r, place);
          if (nvisited == 0)
            place[i - 1] = place[p->index - 1];
          else if (nvisited == 1)
            place[i - 1] = smallest(r, place);
          else {
            place[i - 1] = -smallest(r, place);
            newfork = false;
          }
        } else
          place[i - 1] = place[p->index - 1];
      } else if (nvisited == 1) {
        place[i - 1] = place[p->index - 1];
      } else {
        place[i - 1] = -smallest(q, place);
        newfork = false;
      }
      if (newfork) {
        j = place[p->index - 1];
        done = false;
        while (!done) {
          place[p->index - 1] = nextnode;
          while (!p->bottom)
            p = p->next;
          p = p->back;
          done = (p == NULL);
          if (!done)
            done = (place[p->index - 1] != j);
          if (done) {
            nextnode++;
          }
        }
      }
    }
  }
  if (flipback)
    flipnodes(outgrnode, flipback->back);
  else {
    if (root2) {
      reroot3(outgrnode, root, root2, lastdesc, grbg);
      root = root2;
    }
  }
  if (binroot)
    backtobinary(&root, binroot, grbg);
}  /* savetree */


void addnsave(node *p, node *item, node *nufork, node **root, node **grbg,
                boolean multf, pointarray treenode, long *place, long *zeros)
{
  /* adds item to tree and saves it. Then removes item.
*/ node *dummy; if (!multf) add(p, item, nufork, root, false, treenode, grbg, zeros); else add(p, item, NULL, root, false, treenode, grbg, zeros); savetree(*root, place, treenode, grbg, zeros); if (!multf) re_move(item, &nufork, root, false, treenode, grbg, zeros); else re_move(item, &dummy, root, false, treenode, grbg, zeros); } /* addnsave */ void addbestever(long *pos, long *nextree, long maxtrees, boolean collapse, long *place, bestelm *bestrees) { /* adds first best tree */ *pos = 1; *nextree = 1; initbestrees(bestrees, maxtrees, true); initbestrees(bestrees, maxtrees, false); addtree(*pos, nextree, collapse, place, bestrees); } /* addbestever */ void addtiedtree(long pos, long *nextree, long maxtrees, boolean collapse, long *place, bestelm *bestrees) { /* add tied tree */ if (*nextree <= maxtrees) addtree(pos, nextree, collapse, place, bestrees); } /* addtiedtree */ void clearcollapse(pointarray treenode) { /* clears collapse status at a node */ long i; node *p; for (i = 0; i < nonodes; i++) { treenode[i]->collapse = undefined; if (!treenode[i]->tip) { p = treenode[i]->next; while (p != treenode[i]) { p->collapse = undefined; p = p->next; } } } } /* clearcollapse */ void clearbottom(pointarray treenode) { /* clears boolean bottom at a node */ long i; node *p; for (i = 0; i < nonodes; i++) { treenode[i]->bottom = false; if (!treenode[i]->tip) { p = treenode[i]->next; while (p != treenode[i]) { p->bottom = false; p = p->next; } } } } /* clearbottom */ void collabranch(node *collapfrom, node *tempfrom, node *tempto) { /* collapse branch from collapfrom */ long i, j, b, largest, descsteps; boolean done; for (i = 0; i < endsite; i++) { descsteps = 0; for (j = (long)A; j <= (long)O; j++) { b = 1 << j; if ((descsteps == 0) && (collapfrom->base[i] & b)) descsteps = tempfrom->oldnumsteps[i] - (collapfrom->numdesc - collapfrom->numnuc[i][j]) * weight[i]; } done = false; for (j = (long)A; j <= (long)O; j++) { b = 1 << j; if (!done && (tempto->base[i] & b)) { descsteps 
+= (tempto->numsteps[i] - (tempto->numdesc - collapfrom->numdesc - tempto->numnuc[i][j]) * weight[i]); done = true; } } for (j = (long)A; j <= (long)O; j++) tempto->numnuc[i][j] += collapfrom->numnuc[i][j]; largest = getlargest(tempto->numnuc[i]); tempto->base[i] = 0; for (j = (long)A; j <= (long)O; j++) { if (tempto->numnuc[i][j] == largest) tempto->base[i] |= (1 << j); } tempto->numsteps[i] = (tempto->numdesc - largest) * weight[i] + descsteps; } } /* collabranch */ boolean allcommonbases(node *a, node *b, boolean *allsame) { /* see if bases are common at all sites for nodes a and b */ long i; boolean allcommon; allcommon = true; *allsame = true; for (i = 0; i < endsite; i++) { if ((a->base[i] & b->base[i]) == 0) allcommon = false; else if (a->base[i] != b->base[i]) *allsame = false; } return allcommon; } /* allcommonbases */ void findbottom(node *p, node **bottom) { /* find a node with field bottom set at node p */ node *q; if (p->bottom) *bottom = p; else { q = p->next; while(!q->bottom && q != p) q = q->next; *bottom = q; } } /* findbottom */ boolean moresteps(node *a, node *b) { /* see if numsteps of node a exceeds those of node b */ long i; for (i = 0; i < endsite; i++) if (a->numsteps[i] > b->numsteps[i]) return true; return false; } /* moresteps */ boolean passdown(node *desc, node *parent, node *start, node *below, node *item, node *added, node *total, node *tempdsc, node *tempprt, boolean multf) { /* track down to node start to see if an ancestor branch can be collapsed */ node *temp; boolean done, allsame; done = (parent == start); while (!done) { desc = parent; findbottom(parent->back, &parent); if (multf && start == below && parent == below) parent = added; memcpy(tempdsc->base, tempprt->base, endsite*sizeof(long)); memcpy(tempdsc->numsteps, tempprt->numsteps, endsite*sizeof(long)); memcpy(tempdsc->oldbase, desc->base, endsite*sizeof(long)); memcpy(tempdsc->oldnumsteps, desc->numsteps, endsite*sizeof(long)); memcpy(tempprt->base, parent->base, 
endsite*sizeof(long)); memcpy(tempprt->numsteps, parent->numsteps, endsite*sizeof(long)); memcpy(tempprt->numnuc, parent->numnuc, endsite*sizeof(nucarray)); tempprt->numdesc = parent->numdesc; multifillin(tempprt, tempdsc, 0); if (!allcommonbases(tempprt, parent, &allsame)) return false; else if (moresteps(tempprt, parent)) return false; else if (allsame) return true; if (parent == added) parent = below; done = (parent == start); if (done && ((start == item) || (!multf && start == below))) { memcpy(tempdsc->base, tempprt->base, endsite*sizeof(long)); memcpy(tempdsc->numsteps, tempprt->numsteps, endsite*sizeof(long)); memcpy(tempdsc->oldbase, start->base, endsite*sizeof(long)); memcpy(tempdsc->oldnumsteps, start->numsteps, endsite*sizeof(long)); multifillin(added, tempdsc, 0); tempprt = added; } } temp = tempdsc; if (start == below || start == item) fillin(temp, tempprt, below->back); else fillin(temp, tempprt, added); return !moresteps(temp, total); } /* passdown */ boolean trycollapdesc(node *desc, node *parent, node *start, node *below, node *item, node *added, node *total, node *tempdsc, node *tempprt, boolean multf, long *zeros) { /* see if branch between nodes desc and parent can be collapsed */ boolean allsame; if (desc->numdesc == 1) return true; if (multf && start == below && parent == below) parent = added; memcpy(tempdsc->base, zeros, endsite*sizeof(long)); memcpy(tempdsc->numsteps, zeros, endsite*sizeof(long)); memcpy(tempdsc->oldbase, desc->base, endsite*sizeof(long)); memcpy(tempdsc->oldnumsteps, desc->numsteps, endsite*sizeof(long)); memcpy(tempprt->base, parent->base, endsite*sizeof(long)); memcpy(tempprt->numsteps, parent->numsteps, endsite*sizeof(long)); memcpy(tempprt->numnuc, parent->numnuc, endsite*sizeof(nucarray)); tempprt->numdesc = parent->numdesc - 1; multifillin(tempprt, tempdsc, -1); tempprt->numdesc += desc->numdesc; collabranch(desc, tempdsc, tempprt); if (!allcommonbases(tempprt, parent, &allsame) || moresteps(tempprt, parent)) { if 
(parent != added) { desc->collapse = nocollap; parent->collapse = nocollap; } return false; } else if (allsame) { if (parent != added) { desc->collapse = tocollap; parent->collapse = tocollap; } return true; } if (parent == added) parent = below; if ((start == item && parent == item) || (!multf && start == below && parent == below)) { memcpy(tempdsc->base, tempprt->base, endsite*sizeof(long)); memcpy(tempdsc->numsteps, tempprt->numsteps, endsite*sizeof(long)); memcpy(tempdsc->oldbase, start->base, endsite*sizeof(long)); memcpy(tempdsc->oldnumsteps, start->numsteps, endsite*sizeof(long)); memcpy(tempprt->base, added->base, endsite*sizeof(long)); memcpy(tempprt->numsteps, added->numsteps, endsite*sizeof(long)); memcpy(tempprt->numnuc, added->numnuc, endsite*sizeof(nucarray)); tempprt->numdesc = added->numdesc; multifillin(tempprt, tempdsc, 0); if (!allcommonbases(tempprt, added, &allsame)) return false; else if (moresteps(tempprt, added)) return false; else if (allsame) return true; } return passdown(desc, parent, start, below, item, added, total, tempdsc, tempprt, multf); } /* trycollapdesc */ void setbottom(node *p) { /* set field bottom at node p */ node *q; p->bottom = true; q = p->next; do { q->bottom = false; q = q->next; } while (q != p); } /* setbottom */ boolean zeroinsubtree(node *subtree, node *start, node *below, node *item, node *added, node *total, node *tempdsc, node *tempprt, boolean multf, node* root, long *zeros) { /* sees if subtree contains a zero length branch */ node *p; if (!subtree->tip) { setbottom(subtree); p = subtree->next; do { if (p->back && !p->back->tip && !((p->back->collapse == nocollap) && (subtree->collapse == nocollap)) && (subtree->numdesc != 1)) { if ((p->back->collapse == tocollap) && (subtree->collapse == tocollap) && multf && (subtree != below)) return true; /* when root->numdesc == 2 * there is no mandatory step at the root, * instead of checking at the root we check around it * we only need to check p because the first if * 
statement already gets rid of it for the subtree */ else if ((p->back->index != root->index || root->numdesc > 2) && trycollapdesc(p->back, subtree, start, below, item, added, total, tempdsc, tempprt, multf, zeros)) return true; else if ((p->back->index == root->index && root->numdesc == 2) && !(root->next->back->tip) && !(root->next->next->back->tip) && trycollapdesc(root->next->back, root->next->next->back, start, below, item,added, total, tempdsc, tempprt, multf, zeros)) return true; } p = p->next; } while (p != subtree); p = subtree->next; do { if (p->back && !p->back->tip) { if (zeroinsubtree(p->back, start, below, item, added, total, tempdsc, tempprt, multf, root, zeros)) return true; } p = p->next; } while (p != subtree); } return false; } /* zeroinsubtree */ boolean collapsible(node *item, node *below, node *temp, node *temp1, node *tempdsc, node *tempprt, node *added, node *total, boolean multf, node *root, long *zeros, pointarray treenode) { /* sees if any branch can be collapsed */ node *belowbk; boolean allsame; if (multf) { memcpy(tempdsc->base, item->base, endsite*sizeof(long)); memcpy(tempdsc->numsteps, item->numsteps, endsite*sizeof(long)); memcpy(tempdsc->oldbase, zeros, endsite*sizeof(long)); memcpy(tempdsc->oldnumsteps, zeros, endsite*sizeof(long)); memcpy(added->base, below->base, endsite*sizeof(long)); memcpy(added->numsteps, below->numsteps, endsite*sizeof(long)); memcpy(added->numnuc, below->numnuc, endsite*sizeof(nucarray)); added->numdesc = below->numdesc + 1; multifillin(added, tempdsc, 1); } else { fillin(added, item, below); added->numdesc = 2; } fillin(total, added, below->back); clearbottom(treenode); if (below->back) { if (zeroinsubtree(below->back, below->back, below, item, added, total, tempdsc, tempprt, multf, root, zeros)) return true; } if (multf) { if (zeroinsubtree(below, below, below, item, added, total, tempdsc, tempprt, multf, root, zeros)) return true; } else if (!below->tip) { if (zeroinsubtree(below, below, below, item, 
added, total, tempdsc, tempprt, multf, root, zeros)) return true; } if (!item->tip) { if (zeroinsubtree(item, item, below, item, added, total, tempdsc, tempprt, multf, root, zeros)) return true; } if (multf && below->back && !below->back->tip) { memcpy(tempdsc->base, zeros, endsite*sizeof(long)); memcpy(tempdsc->numsteps, zeros, endsite*sizeof(long)); memcpy(tempdsc->oldbase, added->base, endsite*sizeof(long)); memcpy(tempdsc->oldnumsteps, added->numsteps, endsite*sizeof(long)); if (below->back == treenode[below->back->index - 1]) belowbk = below->back->next; else belowbk = treenode[below->back->index - 1]; memcpy(tempprt->base, belowbk->base, endsite*sizeof(long)); memcpy(tempprt->numsteps, belowbk->numsteps, endsite*sizeof(long)); memcpy(tempprt->numnuc, belowbk->numnuc, endsite*sizeof(nucarray)); tempprt->numdesc = belowbk->numdesc - 1; multifillin(tempprt, tempdsc, -1); tempprt->numdesc += added->numdesc; collabranch(added, tempdsc, tempprt); if (!allcommonbases(tempprt, belowbk, &allsame)) return false; else if (allsame && !moresteps(tempprt, belowbk)) return true; else if (belowbk->back) { fillin(temp, tempprt, belowbk->back); fillin(temp1, belowbk, belowbk->back); return !moresteps(temp, temp1); } } return false; } /* collapsible */ void replaceback(node **oldback, node *item, node *forknode, node **grbg, long *zeros) { /* replaces back node of item with another */ node *p; p = forknode; while (p->next->back != item) p = p->next; *oldback = p->next; gnutreenode(grbg, &p->next, forknode->index, endsite, zeros); p->next->next = (*oldback)->next; p->next->back = (*oldback)->back; p->next->back->back = p->next; (*oldback)->next = (*oldback)->back = NULL; } /* replaceback */ void putback(node *oldback, node *item, node *forknode, node **grbg) { /* restores node to back of item */ node *p, *q; p = forknode; while (p->next != item->back) p = p->next; q = p->next; oldback->next = p->next->next; p->next = oldback; oldback->back = item; item->back = oldback; 
oldback->index = forknode->index; chuck(grbg, q); } /* putback */ void savelocrearr(node *item, node *forknode, node *below, node *tmp, node *tmp1, node *tmp2, node *tmp3, node *tmprm, node *tmpadd, node **root, long maxtrees, long *nextree, boolean multf, boolean bestever, boolean *saved, long *place, bestelm *bestrees, pointarray treenode, node **grbg, long *zeros) { /* saves tied or better trees during local rearrangements by removing item from forknode and adding to below */ node *other, *otherback = NULL, *oldfork, *nufork, *oldback; long pos; boolean found, collapse; if (forknode->numdesc == 2) { findbelow(&other, item, forknode); otherback = other->back; oldback = NULL; } else { other = NULL; replaceback(&oldback, item, forknode, grbg, zeros); } re_move(item, &oldfork, root, false, treenode, grbg, zeros); if (!multf) getnufork(&nufork, grbg, treenode, zeros); else nufork = NULL; addnsave(below, item, nufork, root, grbg, multf, treenode, place, zeros); pos = 0; findtree(&found, &pos, *nextree, place, bestrees); if (other) { add(other, item, oldfork, root, false, treenode, grbg, zeros); if (otherback->back != other) flipnodes(item, other); } else add(forknode, item, NULL, root, false, treenode, grbg, zeros); *saved = false; if (found) { if (oldback) putback(oldback, item, forknode, grbg); } else { if (oldback) chuck(grbg, oldback); re_move(item, &oldfork, root, true, treenode, grbg, zeros); collapse = collapsible(item, below, tmp, tmp1, tmp2, tmp3, tmprm, tmpadd, multf, *root, zeros, treenode); if (!collapse) { if (bestever) addbestever(&pos, nextree, maxtrees, collapse, place, bestrees); else addtiedtree(pos, nextree, maxtrees, collapse, place, bestrees); } if (other) add(other, item, oldfork, root, true, treenode, grbg, zeros); else add(forknode, item, NULL, root, true, treenode, grbg, zeros); *saved = !collapse; } } /* savelocrearr */ void clearvisited(pointarray treenode) { /* clears boolean visited at a node */ long i; node *p; for (i = 0; i < nonodes; 
i++) { treenode[i]->visited = false; if (!treenode[i]->tip) { p = treenode[i]->next; while (p != treenode[i]) { p->visited = false; p = p->next; } } } } /* clearvisited */ void hyprint(long b1, long b2, struct LOC_hyptrav *htrav, pointarray treenode, Char *basechar) { /* print out states in sites b1 through b2 at node */ long i, j, k, n; boolean dot; bases b; if (htrav->bottom) { if (!outgropt) fprintf(outfile, " "); else fprintf(outfile, "root "); } else fprintf(outfile, "%4ld ", htrav->r->back->index - spp); if (htrav->r->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[htrav->r->index - 1][i], outfile); } else fprintf(outfile, "%4ld ", htrav->r->index - spp); if (htrav->bottom) fprintf(outfile, " "); else if (htrav->nonzero) fprintf(outfile, " yes "); else if (htrav->maybe) fprintf(outfile, " maybe "); else fprintf(outfile, " no "); for (i = b1; i <= b2; i++) { j = location[ally[i - 1] - 1]; htrav->tempset = htrav->r->base[j - 1]; htrav->anc = htrav->hypset[j - 1]; if (!htrav->bottom) htrav->anc = treenode[htrav->r->back->index - 1]->base[j - 1]; dot = dotdiff && (htrav->tempset == htrav->anc && !htrav->bottom); if (dot) putc('.', outfile); else if (htrav->tempset == (1 << A)) putc('A', outfile); else if (htrav->tempset == (1 << C)) putc('C', outfile); else if (htrav->tempset == (1 << G)) putc('G', outfile); else if (htrav->tempset == (1 << T)) putc('T', outfile); else if (htrav->tempset == (1 << O)) putc('-', outfile); else { k = 1; n = 0; for (b = A; b <= O; b = b + 1) { if (((1 << b) & htrav->tempset) != 0) n += k; k += k; } putc(basechar[n - 1], outfile); } if (i % 10 == 0) putc(' ', outfile); } putc('\n', outfile); } /* hyprint */ void gnubase(gbases **p, gbases **garbage, long endsite) { /* this and the following are do-it-yourself garbage collectors. 
Make a new node or pull one off the garbage list */ if (*garbage != NULL) { *p = *garbage; *garbage = (*garbage)->next; } else { *p = (gbases *)Malloc(sizeof(gbases)); (*p)->base = (baseptr)Malloc(endsite*sizeof(long)); } (*p)->next = NULL; } /* gnubase */ void chuckbase(gbases *p, gbases **garbage) { /* collect garbage on p -- put it on front of garbage list */ p->next = *garbage; *garbage = p; } /* chuckbase */ void hyptrav(node *r_, long *hypset_, long b1, long b2, boolean bottom_, pointarray treenode, gbases **garbage, Char *basechar) { /* compute, print out states at one interior node */ struct LOC_hyptrav Vars; long i, j, k; long largest; gbases *ancset; nucarray *tempnuc; node *p, *q; Vars.bottom = bottom_; Vars.r = r_; Vars.hypset = hypset_; gnubase(&ancset, garbage, endsite); tempnuc = (nucarray *)Malloc(endsite*sizeof(nucarray)); Vars.maybe = false; Vars.nonzero = false; if (!Vars.r->tip) zeronumnuc(Vars.r, endsite); for (i = b1 - 1; i < b2; i++) { j = location[ally[i] - 1]; Vars.anc = Vars.hypset[j - 1]; if (!Vars.r->tip) { p = Vars.r->next; for (k = (long)A; k <= (long)O; k++) if (Vars.anc & (1 << k)) Vars.r->numnuc[j - 1][k]++; do { for (k = (long)A; k <= (long)O; k++) if (p->back->base[j - 1] & (1 << k)) Vars.r->numnuc[j - 1][k]++; p = p->next; } while (p != Vars.r); largest = getlargest(Vars.r->numnuc[j - 1]); Vars.tempset = 0; for (k = (long)A; k <= (long)O; k++) { if (Vars.r->numnuc[j - 1][k] == largest) Vars.tempset |= (1 << k); } Vars.r->base[j - 1] = Vars.tempset; } if (!Vars.bottom) Vars.anc = treenode[Vars.r->back->index - 1]->base[j - 1]; Vars.nonzero = (Vars.nonzero || (Vars.r->base[j - 1] & Vars.anc) == 0); Vars.maybe = (Vars.maybe || Vars.r->base[j - 1] != Vars.anc); } hyprint(b1, b2, &Vars, treenode, basechar); Vars.bottom = false; if (!Vars.r->tip) { memcpy(tempnuc, Vars.r->numnuc, endsite*sizeof(nucarray)); q = Vars.r->next; do { memcpy(Vars.r->numnuc, tempnuc, endsite*sizeof(nucarray)); for (i = b1 - 1; i < b2; i++) { j = 
location[ally[i] - 1]; for (k = (long)A; k <= (long)O; k++) if (q->back->base[j - 1] & (1 << k)) Vars.r->numnuc[j - 1][k]--; largest = getlargest(Vars.r->numnuc[j - 1]); ancset->base[j - 1] = 0; for (k = (long)A; k <= (long)O; k++) if (Vars.r->numnuc[j - 1][k] == largest) ancset->base[j - 1] |= (1 << k); if (!Vars.bottom) Vars.anc = ancset->base[j - 1]; } hyptrav(q->back, ancset->base, b1, b2, Vars.bottom, treenode, garbage, basechar); q = q->next; } while (q != Vars.r); } free(tempnuc); /* release the per-call scratch copy of numnuc */ chuckbase(ancset, garbage); } /* hyptrav */ void hypstates(long chars, node *root, pointarray treenode, gbases **garbage, Char *basechar) { /* fill in and describe states at interior nodes */ /* used in dnacomp, dnapars, & dnapenny */ long i, n; baseptr nothing; fprintf(outfile, "\nFrom To Any Steps? State at upper node\n"); fprintf(outfile, " "); if (dotdiff) fprintf(outfile, " ( . means same as in the node below it on tree)\n"); nothing = (baseptr)Malloc(endsite*sizeof(long)); for (i = 0; i < endsite; i++) nothing[i] = 0; for (i = 1; i <= ((chars - 1) / 40 + 1); i++) { putc('\n', outfile); n = i * 40; if (n > chars) n = chars; hyptrav(root, nothing, i * 40 - 39, n, true, treenode, garbage, basechar); } free(nothing); } /* hypstates */ void initbranchlen(node *p) { node *q; p->v = 0.0; if (p->back) p->back->v = 0.0; if (p->tip) return; q = p->next; while (q != p) { initbranchlen(q->back); q = q->next; } q = p->next; while (q != p) { q->v = 0.0; q = q->next; } } /* initbranchlen */ void initmin(node *p, long sitei, boolean internal) { long i; if (internal) { for (i = (long)A; i <= (long)O; i++) { p->cumlengths[i] = 0; p->numreconst[i] = 1; } } else { for (i = (long)A; i <= (long)O; i++) { if (p->base[sitei - 1] & (1 << i)) { p->cumlengths[i] = 0; p->numreconst[i] = 1; } else { p->cumlengths[i] = -1; p->numreconst[i] = 0; } } } } /* initmin */ void initbase(node *p, long sitei) { /* traverse tree to initialize base at internal nodes */ node *q; long i, largest; if (p->tip) return; q = p->next; 
while (q != p) { if (q->back) { memcpy(q->numnuc, p->numnuc, endsite*sizeof(nucarray)); for (i = (long)A; i <= (long)O; i++) { if (q->back->base[sitei - 1] & (1 << i)) q->numnuc[sitei - 1][i]--; } if (p->back) { for (i = (long)A; i <= (long)O; i++) { if (p->back->base[sitei - 1] & (1 << i)) q->numnuc[sitei - 1][i]++; } } largest = getlargest(q->numnuc[sitei - 1]); q->base[sitei - 1] = 0; for (i = (long)A; i <= (long)O; i++) { if (q->numnuc[sitei - 1][i] == largest) q->base[sitei - 1] |= (1 << i); } } q = q->next; } q = p->next; while (q != p) { initbase(q->back, sitei); q = q->next; } } /* initbase */ void inittreetrav(node *p, long sitei) { /* traverse tree to clear boolean initialized and set up base */ node *q; if (p->tip) { initmin(p, sitei, false); p->initialized = true; return; } q = p->next; while (q != p) { inittreetrav(q->back, sitei); q = q->next; } initmin(p, sitei, true); p->initialized = false; q = p->next; while (q != p) { initmin(q, sitei, true); q->initialized = false; q = q->next; } } /* inittreetrav */ void compmin(node *p, node *desc) { /* computes minimum lengths up to p */ long i, j, minn, cost, desclen, descrecon=0, maxx; maxx = 10 * spp; for (i = (long)A; i <= (long)O; i++) { minn = maxx; for (j = (long)A; j <= (long)O; j++) { if (transvp) { if ( ( ((i == (long)A) || (i == (long)G)) && ((j == (long)A) || (j == (long)G)) ) || ( ((j == (long)C) || (j == (long)T)) && ((i == (long)C) || (i == (long)T)) ) ) cost = 0; else cost = 1; } else { if (i == j) cost = 0; else cost = 1; } if (desc->cumlengths[j] == -1) { desclen = maxx; } else { desclen = desc->cumlengths[j]; } if (minn > cost + desclen) { minn = cost + desclen; descrecon = 0; } if (minn == cost + desclen) { descrecon += desc->numreconst[j]; } } p->cumlengths[i] += minn; p->numreconst[i] *= descrecon; } p->initialized = true; } /* compmin */ void minpostorder(node *p, pointarray treenode) { /* traverses an n-ary tree, computing minimum steps at each node */ node *q; if (p->tip) { return; } 
q = p->next; while (q != p) { if (q->back) minpostorder(q->back, treenode); q = q->next; } if (!p->initialized) { q = p->next; while (q != p) { if (q->back) compmin(p, q->back); q = q->next; } } } /* minpostorder */ void branchlength(node *subtr1, node *subtr2, double *brlen, pointarray treenode) { /* computes a branch length between two subtrees for a given site */ long i, j, minn, cost, nom, denom; node *temp; if (subtr1->tip) { temp = subtr1; subtr1 = subtr2; subtr2 = temp; } if (subtr1->index == outgrno) { temp = subtr1; subtr1 = subtr2; subtr2 = temp; } minpostorder(subtr1, treenode); minpostorder(subtr2, treenode); minn = 10 * spp; nom = 0; denom = 0; for (i = (long)A; i <= (long)O; i++) { for (j = (long)A; j <= (long)O; j++) { if (transvp) { if ( ( ((i == (long)A) || (i == (long)G)) && ((j == (long)A) || (j == (long)G)) ) || ( ((j == (long)C) || (j == (long)T)) && ((i == (long)C) || (i == (long)T)) ) ) cost = 0; else cost = 1; } else { if (i == j) cost = 0; else cost = 1; } if (subtr1->cumlengths[i] != -1 && (subtr2->cumlengths[j] != -1)) { if (subtr1->cumlengths[i] + cost + subtr2->cumlengths[j] < minn) { minn = subtr1->cumlengths[i] + cost + subtr2->cumlengths[j]; nom = 0; denom = 0; } if (subtr1->cumlengths[i] + cost + subtr2->cumlengths[j] == minn) { nom += subtr1->numreconst[i] * subtr2->numreconst[j] * cost; denom += subtr1->numreconst[i] * subtr2->numreconst[j]; } } } } *brlen = (double)nom/(double)denom; } /* branchlength */ void printbranchlengths(node *p) { node *q; long i; if (p->tip) return; q = p->next; do { fprintf(outfile, "%6ld ",q->index - spp); if (q->back->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[q->back->index - 1][i], outfile); } else fprintf(outfile, "%6ld ", q->back->index - spp); fprintf(outfile, " %f\n",q->v); if (q->back) printbranchlengths(q->back); q = q->next; } while (q != p); } /* printbranchlengths */ void branchlentrav(node *p, node *root, long sitei, long chars, double *brlen, pointarray treenode) { /* traverses the 
tree computing tree length at each branch */ node *q; if (p->tip) return; if (p->index == outgrno) p = p->back; q = p->next; do { if (q->back) { branchlength(q, q->back, brlen, treenode); q->v += ((weight[sitei - 1] / 10.0) * (*brlen)/chars); q->back->v += ((weight[sitei - 1] / 10.0) * (*brlen)/chars); if (!q->back->tip) branchlentrav(q->back, root, sitei, chars, brlen, treenode); } q = q->next; } while (q != p); } /* branchlentrav */ void treelength(node *root, long chars, pointarray treenode) { /* calls branchlentrav at each site */ long sitei; double trlen; initbranchlen(root); for (sitei = 1; sitei <= endsite; sitei++) { trlen = 0.0; initbase(root, sitei); inittreetrav(root, sitei); branchlentrav(root, root, sitei, chars, &trlen, treenode); } } /* treelength */ void coordinates(node *p, long *tipy, double f, long *fartemp) { /* establishes coordinates of nodes for display without lengths */ node *q, *first, *last; node *mid1 = NULL, *mid2 = NULL; long numbranches, numb2; if (p->tip) { p->xcoord = 0; p->ycoord = *tipy; p->ymin = *tipy; p->ymax = *tipy; (*tipy) += down; return; } numbranches = 0; q = p->next; do { coordinates(q->back, tipy, f, fartemp); numbranches += 1; q = q->next; } while (p != q); first = p->next->back; q = p->next; while (q->next != p) q = q->next; last = q->back; numb2 = 1; q = p->next; while (q != p) { if (numb2 == (long)(numbranches + 1)/2) mid1 = q->back; if (numb2 == (long)(numbranches/2 + 1)) mid2 = q->back; numb2 += 1; q = q->next; } p->xcoord = (long)((double)(last->ymax - first->ymin) * f); p->ycoord = (long)((mid1->ycoord + mid2->ycoord) / 2); p->ymin = first->ymin; p->ymax = last->ymax; if (p->xcoord > *fartemp) *fartemp = p->xcoord; } /* coordinates */ void drawline(long i, double scale, node *root) { /* draws one row of the tree diagram by moving up tree */ node *p, *q, *r, *first =NULL, *last =NULL; long n, j; boolean extra, done, noplus; p = root; q = root; extra = false; noplus = false; if (i == (long)p->ycoord && p == root) 
{ if (p->index - spp >= 10) fprintf(outfile, " %2ld", p->index - spp); else fprintf(outfile, " %ld", p->index - spp); extra = true; noplus = true; } else fprintf(outfile, " "); do { if (!p->tip) { r = p->next; done = false; do { if (i >= r->back->ymin && i <= r->back->ymax) { q = r->back; done = true; } r = r->next; } while (!(done || r == p)); first = p->next->back; r = p->next; while (r->next != p) r = r->next; last = r->back; } done = (p == q); n = (long)(scale * (p->xcoord - q->xcoord) + 0.5); if (n < 3 && !q->tip) n = 3; if (extra) { n--; extra = false; } if ((long)q->ycoord == i && !done) { if (noplus) { putc('-', outfile); noplus = false; } else putc('+', outfile); if (!q->tip) { for (j = 1; j <= n - 2; j++) putc('-', outfile); if (q->index - spp >= 10) fprintf(outfile, "%2ld", q->index - spp); else fprintf(outfile, "-%ld", q->index - spp); extra = true; noplus = true; } else { for (j = 1; j < n; j++) putc('-', outfile); } } else if (!p->tip) { if ((long)last->ycoord > i && (long)first->ycoord < i && i != (long)p->ycoord) { putc('!', outfile); for (j = 1; j < n; j++) putc(' ', outfile); } else { for (j = 1; j <= n; j++) putc(' ', outfile); } noplus = false; } else { for (j = 1; j <= n; j++) putc(' ', outfile); noplus = false; } if (p != q) p = q; } while (!done); if ((long)p->ycoord == i && p->tip) { for (j = 0; j < nmlngth; j++) putc(nayme[p->index - 1][j], outfile); } putc('\n', outfile); } /* drawline */ void printree(node *root, double f) { /* prints out diagram of the tree */ /* used in dnacomp, dnapars, & dnapenny */ long i, tipy, dummy; double scale; putc('\n', outfile); if (!treeprint) return; putc('\n', outfile); tipy = 1; dummy = 0; coordinates(root, &tipy, f, &dummy); scale = 1.5; putc('\n', outfile); for (i = 1; i <= (tipy - down); i++) drawline(i, scale, root); fprintf(outfile, "\n remember:"); if (outgropt) fprintf(outfile, " (although rooted by outgroup)"); fprintf(outfile, " this is an unrooted tree!\n\n"); } /* printree */ void 
writesteps(long chars, boolean weights, steptr oldweight, node *root) { /* used in dnacomp, dnapars, & dnapenny */ long i, j, k, l; putc('\n', outfile); if (weights) fprintf(outfile, "weighted "); fprintf(outfile, "steps in each site:\n"); fprintf(outfile, " "); for (i = 0; i <= 9; i++) fprintf(outfile, "%4ld", i); fprintf(outfile, "\n *------------------------------------"); fprintf(outfile, "-----\n"); for (i = 0; i <= (chars / 10); i++) { fprintf(outfile, "%5ld", i * 10); putc('|', outfile); for (j = 0; j <= 9; j++) { k = i * 10 + j; if (k == 0 || k > chars) fprintf(outfile, " "); else { l = location[ally[k - 1] - 1]; if (oldweight[k - 1] > 0) fprintf(outfile, "%4ld", oldweight[k - 1] * (root->numsteps[l - 1] / weight[l - 1])); else fprintf(outfile, " 0"); } } putc('\n', outfile); } } /* writesteps */ void treeout(node *p, long nextree, long *col, node *root) { /* write out file with representation of final tree */ /* used in dnacomp, dnamove, dnapars, & dnapenny */ node *q; long i, n; Char c; if (p->tip) { n = 0; for (i = 1; i <= nmlngth; i++) { if (nayme[p->index - 1][i - 1] != ' ') n = i; } for (i = 0; i < n; i++) { c = nayme[p->index - 1][i]; if (c == ' ') c = '_'; putc(c, outtree); } *col += n; } else { putc('(', outtree); (*col)++; q = p->next; while (q != p) { treeout(q->back, nextree, col, root); q = q->next; if (q == p) break; putc(',', outtree); (*col)++; if (*col > 60) { putc('\n', outtree); *col = 0; } } putc(')', outtree); (*col)++; } if (p != root) return; if (nextree > 2) fprintf(outtree, "[%6.4f];\n", 1.0 / (nextree - 1)); else fprintf(outtree, ";\n"); } /* treeout */ void treeout3(node *p, long nextree, long *col, node *root) { /* write out file with representation of final tree */ /* used in dnapars -- writes branch lengths */ node *q; long i, n, w; double x; Char c; if (p->tip) { n = 0; for (i = 1; i <= nmlngth; i++) { if (nayme[p->index - 1][i - 1] != ' ') n = i; } for (i = 0; i < n; i++) { c = nayme[p->index - 1][i]; if (c == ' ') c = '_'; 
putc(c, outtree); } *col += n; } else { putc('(', outtree); (*col)++; q = p->next; while (q != p) { treeout3(q->back, nextree, col, root); q = q->next; if (q == p) break; putc(',', outtree); (*col)++; if (*col > 60) { putc('\n', outtree); *col = 0; } } putc(')', outtree); (*col)++; } x = p->v; if (x > 0.0) w = (long)(0.43429448222 * log(x)); else if (x == 0.0) w = 0; else w = (long)(0.43429448222 * log(-x)) + 1; if (w < 0) w = 0; if (p != root) { fprintf(outtree, ":%*.5f", (int)(w + 7), x); *col += w + 8; } if (p != root) return; if (nextree > 2) fprintf(outtree, "[%6.4f];\n", 1.0 / (nextree - 1)); else fprintf(outtree, ";\n"); } /* treeout3 */ /* FIXME curtree should probably be passed by reference */ void drawline2(long i, double scale, tree curtree) { fdrawline2(outfile, i, scale, &curtree); } void fdrawline2(FILE *fp, long i, double scale, tree *curtree) { /* draws one row of the tree diagram by moving up tree */ /* used in dnaml, proml, & restml */ node *p, *q; long n, j; boolean extra; node *r, *first =NULL, *last =NULL; boolean done; p = curtree->start; q = curtree->start; extra = false; if (i == (long)p->ycoord && p == curtree->start) { if (p->index - spp >= 10) fprintf(fp, " %2ld", p->index - spp); else fprintf(fp, " %ld", p->index - spp); extra = true; } else fprintf(fp, " "); do { if (!p->tip) { r = p->next; done = false; do { if (i >= r->back->ymin && i <= r->back->ymax) { q = r->back; done = true; } r = r->next; } while (!(done || (p != curtree->start && r == p) || (p == curtree->start && r == p->next))); first = p->next->back; r = p; while (r->next != p) r = r->next; last = r->back; if (p == curtree->start) last = p->back; } done = (p->tip || p == q); n = (long)(scale * (q->xcoord - p->xcoord) + 0.5); if (n < 3 && !q->tip) n = 3; if (extra) { n--; extra = false; } if ((long)q->ycoord == i && !done) { if ((long)p->ycoord != (long)q->ycoord) putc('+', fp); else putc('-', fp); if (!q->tip) { for (j = 1; j <= n - 2; j++) putc('-', fp); if (q->index - spp 
>= 10) fprintf(fp, "%2ld", q->index - spp); else fprintf(fp, "-%ld", q->index - spp); extra = true; } else { for (j = 1; j < n; j++) putc('-', fp); } } else if (!p->tip) { if ((long)last->ycoord > i && (long)first->ycoord < i && (i != (long)p->ycoord || p == curtree->start)) { putc('|', fp); for (j = 1; j < n; j++) putc(' ', fp); } else { for (j = 1; j <= n; j++) putc(' ', fp); } } else { for (j = 1; j <= n; j++) putc(' ', fp); } if (q != p) p = q; } while (!done); if ((long)p->ycoord == i && p->tip) { for (j = 0; j < nmlngth; j++) putc(nayme[p->index-1][j], fp); } putc('\n', fp); } /* drawline2 */ void drawline3(long i, double scale, node *start) { /* draws one row of the tree diagram by moving up tree */ /* used in dnapars */ node *p, *q; long n, j; boolean extra; node *r, *first =NULL, *last =NULL; boolean done; p = start; q = start; extra = false; if (i == (long)p->ycoord) { if (p->index - spp >= 10) fprintf(outfile, " %2ld", p->index - spp); else fprintf(outfile, " %ld", p->index - spp); extra = true; } else fprintf(outfile, " "); do { if (!p->tip) { r = p->next; done = false; do { if (i >= r->back->ymin && i <= r->back->ymax) { q = r->back; done = true; } r = r->next; } while (!(done || (r == p))); first = p->next->back; r = p; while (r->next != p) r = r->next; last = r->back; } done = (p->tip || p == q); n = (long)(scale * (q->xcoord - p->xcoord) + 0.5); if (n < 3 && !q->tip) n = 3; if (extra) { n--; extra = false; } if ((long)q->ycoord == i && !done) { if ((long)p->ycoord != (long)q->ycoord) putc('+', outfile); else putc('-', outfile); if (!q->tip) { for (j = 1; j <= n - 2; j++) putc('-', outfile); if (q->index - spp >= 10) fprintf(outfile, "%2ld", q->index - spp); else fprintf(outfile, "-%ld", q->index - spp); extra = true; } else { for (j = 1; j < n; j++) putc('-', outfile); } } else if (!p->tip) { if ((long)last->ycoord > i && (long)first->ycoord < i && (i != (long)p->ycoord || p == start)) { putc('|', outfile); for (j = 1; j < n; j++) putc(' ', 
outfile); } else { for (j = 1; j <= n; j++) putc(' ', outfile); } } else { for (j = 1; j <= n; j++) putc(' ', outfile); } if (q != p) p = q; } while (!done); if ((long)p->ycoord == i && p->tip) { for (j = 0; j < nmlngth; j++) putc(nayme[p->index-1][j], outfile); } putc('\n', outfile); } /* drawline3 */ void copynode(node *c, node *d, long categs) { long i, j; for (i = 0; i < endsite; i++) for (j = 0; j < categs; j++) memcpy(d->x[i][j], c->x[i][j], sizeof(sitelike)); memcpy(d->underflows,c->underflows,sizeof(double) * endsite); d->tyme = c->tyme; d->v = c->v; d->xcoord = c->xcoord; d->ycoord = c->ycoord; d->ymin = c->ymin; d->ymax = c->ymax; d->iter = c->iter; /* iter used in dnaml only */ d->haslength = c->haslength; /* haslength used in dnamlk only */ d->initialized = c->initialized; /* initialized used in dnamlk only */ } /* copynode */ void prot_copynode(node *c, node *d, long categs) { /* a version of copynode for proml */ long i, j; for (i = 0; i < endsite; i++) for (j = 0; j < categs; j++) memcpy(d->protx[i][j], c->protx[i][j], sizeof(psitelike)); memcpy(d->underflows,c->underflows,sizeof(double) * endsite); d->tyme = c->tyme; d->v = c->v; d->xcoord = c->xcoord; d->ycoord = c->ycoord; d->ymin = c->ymin; d->ymax = c->ymax; d->iter = c->iter; /* iter used in dnaml only */ d->haslength = c->haslength; /* haslength used in dnamlk only */ d->initialized = c->initialized; /* initialized used in dnamlk only */ } /* prot_copynode */ void copy_(tree *a, tree *b, long nonodes, long categs) { /* used in dnamlk */ long i; node *p, *q, *r, *s, *t; for (i = 0; i < spp; i++) { copynode(a->nodep[i], b->nodep[i], categs); if (a->nodep[i]->back) { if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]; else if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]->next) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next; else b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 
1]->next->next; } else b->nodep[i]->back = NULL; } for (i = spp; i < nonodes; i++) { if (a->nodep[i]) { p = a->nodep[i]; q = b->nodep[i]; r = p; do { copynode(p, q, categs); if (p->back) { s = a->nodep[p->back->index - 1]; t = b->nodep[p->back->index - 1]; if (s->tip) { if(p->back == s) q->back = t; } else { do { if (p->back == s) q->back = t; s = s->next; t = t->next; } while (s != a->nodep[p->back->index - 1]); } } else q->back = NULL; p = p->next; q = q->next; } while (p != r); } } b->likelihood = a->likelihood; b->start = a->start; /* start used in dnaml only */ b->root = a->root; /* root used in dnamlk only */ } /* copy_ */ void prot_copy_(tree *a, tree *b, long nonodes, long categs) { /* used in promlk */ /* identical to copy_() except for calls to prot_copynode rather */ /* than copynode. */ long i; node *p, *q, *r, *s, *t; for (i = 0; i < spp; i++) { prot_copynode(a->nodep[i], b->nodep[i], categs); if (a->nodep[i]->back) { if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]; else if (a->nodep[i]->back == a->nodep[a->nodep[i]->back->index - 1]->next ) b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next; else b->nodep[i]->back = b->nodep[a->nodep[i]->back->index - 1]->next->next; } else b->nodep[i]->back = NULL; } for (i = spp; i < nonodes; i++) { if (a->nodep[i]) { p = a->nodep[i]; q = b->nodep[i]; r = p; do { prot_copynode(p, q, categs); if (p->back) { s = a->nodep[p->back->index - 1]; t = b->nodep[p->back->index - 1]; if (s->tip) { if(p->back == s) q->back = t; } else { do { if (p->back == s) q->back = t; s = s->next; t = t->next; } while (s != a->nodep[p->back->index - 1]); } } else q->back = NULL; p = p->next; q = q->next; } while (p != r); } } b->likelihood = a->likelihood; b->start = a->start; /* start used in dnaml only */ b->root = a->root; /* root used in dnamlk only */ } /* prot_copy_ */ void standev(long chars, long numtrees, long minwhich, double minsteps, double 
*nsteps, long **fsteps, longer seed) { /* do paired sites test (KHT or SH test) on user-defined trees */ /* used in dnapars & protpars */ long i, j, k; double wt, sumw, sum, sum2, sd; double temp; double **covar, *P, *f, *r; #define SAMPLES 1000 if (numtrees == 2) { fprintf(outfile, "Kishino-Hasegawa-Templeton test\n\n"); fprintf(outfile, "Tree Steps Diff Steps Its S.D."); fprintf(outfile, " Significantly worse?\n\n"); which = 1; while (which <= numtrees) { fprintf(outfile, "%3ld%10.1f", which, nsteps[which - 1] / 10); if (minwhich == which) fprintf(outfile, " <------ best\n"); else { sumw = 0.0; sum = 0.0; sum2 = 0.0; for (i = 0; i < endsite; i++) { if (weight[i] > 0) { wt = weight[i] / 10.0; sumw += wt; temp = (fsteps[which - 1][i] - fsteps[minwhich - 1][i]) / 10.0; sum += wt * temp; sum2 += wt * temp * temp; } } sd = sqrt(sumw / (sumw - 1.0) * (sum2 - sum * sum / sumw)); fprintf(outfile, "%10.1f%12.4f", (nsteps[which - 1] - minsteps) / 10, sd); if ((sum > 0.0) && (sum > 1.95996 * sd)) fprintf(outfile, " Yes\n"); else fprintf(outfile, " No\n"); } which++; } fprintf(outfile, "\n\n"); } else { /* Shimodaira-Hasegawa test using normal approximation */ if(numtrees > MAXSHIMOTREES){ fprintf(outfile, "Shimodaira-Hasegawa test on first %d of %ld trees\n\n" , MAXSHIMOTREES, numtrees); numtrees = MAXSHIMOTREES; } else { fprintf(outfile, "Shimodaira-Hasegawa test\n\n"); } covar = (double **)Malloc(numtrees*sizeof(double *)); sumw = 0.0; for (i = 0; i < endsite; i++) sumw += weight[i] / 10.0; for (i = 0; i < numtrees; i++) covar[i] = (double *)Malloc(numtrees*sizeof(double)); for (i = 0; i < numtrees; i++) { /* compute covariances of trees */ sum = nsteps[i]/(10.0*sumw); for (j = 0; j <=i; j++) { sum2 = nsteps[j]/(10.0*sumw); temp = 0.0; for (k = 0; k < endsite; k++) { if (weight[k] > 0) { wt = weight[k]/10.0; temp = temp + wt*(fsteps[i][k]/10.0-sum) *(fsteps[j][k]/10.0-sum2); } } covar[i][j] = temp; if (i != j) covar[j][i] = temp; } } for (i = 0; i < numtrees; i++) { /* 
in-place Cholesky decomposition of trees x trees covariance matrix */ sum = 0.0; for (j = 0; j <= i-1; j++) sum = sum + covar[i][j] * covar[i][j]; if (covar[i][i] <= sum) temp = 0.0; else temp = sqrt(covar[i][i] - sum); covar[i][i] = temp; for (j = i+1; j < numtrees; j++) { sum = 0.0; for (k = 0; k < i; k++) sum = sum + covar[i][k] * covar[j][k]; if (fabs(temp) < 1.0E-12) covar[j][i] = 0.0; else covar[j][i] = (covar[j][i] - sum)/temp; } } f = (double *)Malloc(numtrees*sizeof(double)); /* resampled sums */ P = (double *)Malloc(numtrees*sizeof(double)); /* vector of P's of trees */ r = (double *)Malloc(numtrees*sizeof(double)); /* store Normal variates */ for (i = 0; i < numtrees; i++) P[i] = 0.0; sum2 = nsteps[0]/10.0; /* sum2 will be smallest # of steps */ for (i = 1; i < numtrees; i++) if (sum2 > nsteps[i]/10.0) sum2 = nsteps[i]/10.0; for (i = 1; i <= SAMPLES; i++) { /* loop over resampled trees */ for (j = 0; j < numtrees; j++) /* draw Normal variates */ r[j] = normrand(seed); for (j = 0; j < numtrees; j++) { /* compute vectors */ sum = 0.0; for (k = 0; k <= j; k++) sum += covar[j][k]*r[k]; f[j] = sum; } sum = f[0]; /* start from the first tree so that f[0] is included in the minimum */ for (j = 1; j < numtrees; j++) /* get min of vector */ if (f[j] < sum) sum = f[j]; for (j = 0; j < numtrees; j++) /* accumulate P's */ if (nsteps[j]/10.0-sum2 <= f[j] - sum) P[j] += 1.0/SAMPLES; } fprintf(outfile, "Tree Steps Diff Steps P value"); fprintf(outfile, " Significantly worse?\n\n"); for (i = 0; i < numtrees; i++) { fprintf(outfile, "%3ld%10.1f", i+1, nsteps[i]/10); if ((minwhich-1) == i) fprintf(outfile, " <------ best\n"); else { fprintf(outfile, " %9.1f %10.3f", nsteps[i]/10.0-sum2, P[i]); if (P[i] < 0.05) fprintf(outfile, " Yes\n"); else fprintf(outfile, " No\n"); } } fprintf(outfile, "\n"); free(P); /* free the variables we Malloc'ed */ free(f); free(r); for (i = 0; i < numtrees; i++) free(covar[i]); free(covar); } } /* standev */ void standev2(long numtrees, long maxwhich, long a, long b, double maxlogl, double *l0gl, double **l0gf, 
steptr aliasweight, longer seed) { /* do paired sites test (KHT or SH) for user-defined trees */ /* used in dnaml, dnamlk, proml, promlk, and restml */ double **covar, *P, *f, *r; long i, j, k; double wt, sumw, sum, sum2, sd; double temp; #define SAMPLES 1000 if (numtrees == 2) { fprintf(outfile, "Kishino-Hasegawa-Templeton test\n\n"); fprintf(outfile, "Tree logL Diff logL Its S.D."); fprintf(outfile, " Significantly worse?\n\n"); which = 1; while (which <= numtrees) { fprintf(outfile, "%3ld %9.1f", which, l0gl[which - 1]); if (maxwhich == which) fprintf(outfile, " <------ best\n"); else { sumw = 0.0; sum = 0.0; sum2 = 0.0; for (i = a; i <= b; i++) { if (aliasweight[i] > 0) { wt = aliasweight[i]; sumw += wt; temp = l0gf[which - 1][i] - l0gf[maxwhich - 1][i]; sum += temp * wt; sum2 += wt * temp * temp; } } temp = sum / sumw; sd = sqrt(sumw / (sumw - 1.0) * (sum2 - sum * sum / sumw )); fprintf(outfile, "%10.1f %11.4f", (l0gl[which - 1])-maxlogl, sd); if ((sum < 0.0) && ((-sum) > 1.95996 * sd)) fprintf(outfile, " Yes\n"); else fprintf(outfile, " No\n"); } which++; } fprintf(outfile, "\n\n"); } else { /* Shimodaira-Hasegawa test using normal approximation */ if(numtrees > MAXSHIMOTREES){ fprintf(outfile, "Shimodaira-Hasegawa test on first %d of %ld trees\n\n" , MAXSHIMOTREES, numtrees); numtrees = MAXSHIMOTREES; } else { fprintf(outfile, "Shimodaira-Hasegawa test\n\n"); } covar = (double **)Malloc(numtrees*sizeof(double *)); sumw = 0.0; for (i = a; i <= b; i++) sumw += aliasweight[i]; for (i = 0; i < numtrees; i++) covar[i] = (double *)Malloc(numtrees*sizeof(double)); for (i = 0; i < numtrees; i++) { /* compute covariances of trees */ sum = l0gl[i]/sumw; for (j = 0; j <=i; j++) { sum2 = l0gl[j]/sumw; temp = 0.0; for (k = a; k <= b ; k++) { if (aliasweight[k] > 0) { wt = aliasweight[k]; temp = temp + wt*(l0gf[i][k]-sum) *(l0gf[j][k]-sum2); } } covar[i][j] = temp; if (i != j) covar[j][i] = temp; } } for (i = 0; i < numtrees; i++) { /* in-place Cholesky decomposition of 
trees x trees covariance matrix */ sum = 0.0; for (j = 0; j <= i-1; j++) sum = sum + covar[i][j] * covar[i][j]; if (covar[i][i] <= sum) temp = 0.0; else temp = sqrt(covar[i][i] - sum); covar[i][i] = temp; for (j = i+1; j < numtrees; j++) { sum = 0.0; for (k = 0; k < i; k++) sum = sum + covar[i][k] * covar[j][k]; if (fabs(temp) < 1.0E-12) covar[j][i] = 0.0; else covar[j][i] = (covar[j][i] - sum)/temp; } } f = (double *)Malloc(numtrees*sizeof(double)); /* resampled likelihoods */ P = (double *)Malloc(numtrees*sizeof(double)); /* vector of P's of trees */ r = (double *)Malloc(numtrees*sizeof(double)); /* store Normal variates */ for (i = 0; i < numtrees; i++) P[i] = 0.0; for (i = 1; i <= SAMPLES; i++) { /* loop over resampled trees */ for (j = 0; j < numtrees; j++) /* draw Normal variates */ r[j] = normrand(seed); for (j = 0; j < numtrees; j++) { /* compute vectors */ sum = 0.0; for (k = 0; k <= j; k++) sum += covar[j][k]*r[k]; f[j] = sum; } sum = f[0]; /* start from the first tree so that f[0] is included in the maximum */ for (j = 1; j < numtrees; j++) /* get max of vector */ if (f[j] > sum) sum = f[j]; for (j = 0; j < numtrees; j++) /* accumulate P's */ if (maxlogl-l0gl[j] <= sum-f[j]) P[j] += 1.0/SAMPLES; } fprintf(outfile, "Tree logL Diff logL P value"); fprintf(outfile, " Significantly worse?\n\n"); for (i = 0; i < numtrees; i++) { fprintf(outfile, "%3ld%10.1f", i+1, l0gl[i]); if ((maxwhich-1) == i) fprintf(outfile, " <------ best\n"); else { fprintf(outfile, " %9.1f %10.3f", l0gl[i]-maxlogl, P[i]); if (P[i] < 0.05) fprintf(outfile, " Yes\n"); else fprintf(outfile, " No\n"); } } fprintf(outfile, "\n"); free(P); /* free the variables we Malloc'ed */ free(f); free(r); for (i = 0; i < numtrees; i++) free(covar[i]); free(covar); } } /* standev2 */ void freetip(node *anode) { /* used in dnacomp, dnapars, & dnapenny */ free(anode->numsteps); free(anode->oldnumsteps); free(anode->base); free(anode->oldbase); } /* freetip */ void freenontip(node *anode) { /* used in dnacomp, dnapars, & dnapenny */ free(anode->numsteps); 
free(anode->oldnumsteps); free(anode->base); free(anode->oldbase); free(anode->numnuc); } /* freenontip */ void freenodes(long nonodes, pointarray treenode) { /* used in dnacomp, dnapars, & dnapenny */ long i; node *p; for (i = 0; i < spp; i++) freetip(treenode[i]); for (i = spp; i < nonodes; i++) { if (treenode[i] != NULL) { p = treenode[i]->next; do { freenontip(p); p = p->next; } while (p != treenode[i]); freenontip(p); } } } /* freenodes */ void freenode(node **anode) { /* used in dnacomp, dnapars, & dnapenny */ freenontip(*anode); free(*anode); } /* freenode */ void freetree(long nonodes, pointarray treenode) { /* used in dnacomp, dnapars, & dnapenny */ long i; node *p, *q; for (i = 0; i < spp; i++) free(treenode[i]); for (i = spp; i < nonodes; i++) { if (treenode[i] != NULL) { p = treenode[i]->next; do { q = p->next; free(p); p = q; } while (p != treenode[i]); free(p); } } free(treenode); } /* freetree */ void prot_freex_notip(long nonodes, pointarray treenode) { /* used in proml */ long i, j; node *p; for (i = spp; i < nonodes; i++) { p = treenode[i]; if ( p == NULL ) continue; do { for (j = 0; j < endsite; j++){ free(p->protx[j]); p->protx[j] = NULL; } free(p->underflows); p->underflows = NULL; free(p->protx); p->protx = NULL; p = p->next; } while (p != treenode[i]); } } /* prot_freex_notip */ void prot_freex(long nonodes, pointarray treenode) { /* used in proml */ long i, j; node *p; for (i = 0; i < spp; i++) { for (j = 0; j < endsite; j++) free(treenode[i]->protx[j]); free(treenode[i]->protx); free(treenode[i]->underflows); } for (i = spp; i < nonodes; i++) { p = treenode[i]; do { for (j = 0; j < endsite; j++) free(p->protx[j]); free(p->protx); free(p->underflows); p = p->next; } while (p != treenode[i]); } } /* prot_freex */ void freex_notip(long nonodes, pointarray treenode) { /* used in dnaml & dnamlk */ long i, j; node *p; for (i = spp; i < nonodes; i++) { p = treenode[i]; if ( p == NULL ) continue; do { for (j = 0; j < endsite; j++) free(p->x[j]); 
free(p->underflows); free(p->x); p = p->next; } while (p != treenode[i]); } } /* freex_notip */ void freex(long nonodes, pointarray treenode) { /* used in dnaml & dnamlk */ long i, j; node *p; for (i = 0; i < spp; i++) { for (j = 0; j < endsite; j++) free(treenode[i]->x[j]); free(treenode[i]->x); free(treenode[i]->underflows); } for (i = spp; i < nonodes; i++) { if(treenode[i]){ p = treenode[i]; do { for (j = 0; j < endsite; j++) free(p->x[j]); free(p->x); free(p->underflows); p = p->next; } while (p != treenode[i]); } } } /* freex */ void freegarbage(gbases **garbage) { /* used in dnacomp, dnapars, & dnapenny */ gbases *p; while (*garbage) { p = *garbage; *garbage = (*garbage)->next; free(p->base); free(p); } } /* freegarbage */ void freegrbg(node **grbg) { /* used in dnacomp, dnapars, & dnapenny */ node *p; while (*grbg) { p = *grbg; *grbg = (*grbg)->next; freenontip(p); free(p); } } /* freegrbg */ void collapsetree(node *p, node *root, node **grbg, pointarray treenode, long *zeros) { /* Recurse through tree searching for zero-length branches between */ /* nodes (not to tips). If one exists, collapse the nodes together, */ /* removing the branch. */ node *q, *x1, *y1, *x2, *y2; long i, j, index, index2, numd; if (p->tip) return; q = p->next; do { if (!q->back->tip && q->v == 0.000000) { /* merge the two nodes. */ x1 = y2 = q->next; x2 = y1 = q->back->next; while(x1->next != q) x1 = x1-> next; while(y1->next != q->back) y1 = y1-> next; x1->next = x2; y1->next = y2; index = q->index; index2 = q->back->index; numd = treenode[index-1]->numdesc + q->back->numdesc -1; chuck(grbg, q->back); chuck(grbg, q); q = x2; /* update the indices around the node circle */ do{ if(q->index != index){ q->index = index; } q = q-> next; }while(x2 != q); updatenumdesc(treenode[index-1], root, numd); /* Alter treenode to point to real nodes, and update indices */ /* accordingly. 
*/ j = 0; i=0; for(i = (index2-1); i < nonodes-1 && treenode[i+1]; i++){ treenode[i]=treenode[i+1]; treenode[i+1] = NULL; x1=x2=treenode[i]; do{ x1->index = i+1; x1 = x1 -> next; } while(x1 != x2); } /* Create a new empty fork in the blank spot of treenode */ x1=NULL; for(i=1; i <=3 ; i++){ gnutreenode(grbg, &x2, index2, endsite, zeros); x2->next = x1; x1 = x2; } x2->next->next->next = x2; treenode[nonodes-1]=x2; if (q->back) collapsetree(q->back, root, grbg, treenode, zeros); } else { if (q->back) collapsetree(q->back, root, grbg, treenode, zeros); q = q->next; } } while (q != p); } /* collapsetree */ void collapsebestrees(node **root, node **grbg, pointarray treenode, bestelm *bestrees, long *place, long *zeros, long chars, boolean recompute, boolean progress) { /* Goes through all best trees, collapsing trees where possible, and */ /* deleting trees that are not unique. */ long i,j, k, pos, nextnode, oldnextree; boolean found; node *dummy; oldnextree = nextree; for(i = 0 ; i < (oldnextree - 1) ; i++){ bestrees[i].collapse = true; } if(progress) printf("Collapsing best trees\n "); k = 0; for(i = 0 ; i < (oldnextree - 1) ; i++){ if(progress){ if(i % (((oldnextree-1) / 72) + 1) == 0) putchar('.'); fflush(stdout); } while(!bestrees[k].collapse) k++; /* Reconstruct tree. 
*/ *root = treenode[0]; add(treenode[0], treenode[1], treenode[spp], root, recompute, treenode, grbg, zeros); nextnode = spp + 2; for (j = 3; j <= spp; j++) { if (bestrees[k].btree[j - 1] > 0) add(treenode[bestrees[k].btree[j - 1] - 1], treenode[j - 1], treenode[nextnode++ - 1], root, recompute, treenode, grbg, zeros); else add(treenode[treenode[-bestrees[k].btree[j - 1]-1]->back->index-1], treenode[j - 1], NULL, root, recompute, treenode, grbg, zeros); } reroot(treenode[outgrno - 1], *root); treelength(*root, chars, treenode); collapsetree(*root, *root, grbg, treenode, zeros); savetree(*root, place, treenode, grbg, zeros); /* move everything down in the bestree list */ for(j = k ; j < (nextree - 2) ; j++){ memcpy(bestrees[j].btree, bestrees[j + 1].btree, spp * sizeof(long)); bestrees[j].gloreange = bestrees[j + 1].gloreange; bestrees[j + 1].gloreange = false; bestrees[j].locreange = bestrees[j + 1].locreange; bestrees[j + 1].locreange = false; bestrees[j].collapse = bestrees[j + 1].collapse; } pos=0; findtree(&found, &pos, nextree-1, place, bestrees); /* put the new tree at the end of the list if it wasn't found */ nextree--; if(!found) addtree(pos, &nextree, false, place, bestrees); /* Deconstruct the tree */ for (j = 1; j < spp; j++){ re_move(treenode[j], &dummy, root, recompute, treenode, grbg, zeros); } } if (progress) { putchar('\n'); #ifdef WIN32 phyFillScreenColor(); #endif } } /* file: phylip-3.697/src/seq.h */ /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. 
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* seq.h: included in dnacomp, dnadist, dnainvar, dnaml, dnamlk, dnamove, dnapars, dnapenny, protdist, protpars, restdist & restml */ #ifndef SEQ_H #define SEQ_H #define ebcdic EBCDIC #define MAXNCH 20 /* All of this came over from cons.h -plc*/ #define OVER 7 #define ADJACENT_PAIRS 1 #define CORR_IN_1_AND_2 2 #define ALL_IN_1_AND_2 3 #define NO_PAIRING 4 #define ALL_IN_FIRST 5 #define TREE1 8 #define TREE2 9 #define FULL_MATRIX 11 #define VERBOSE 22 #define SPARSE 33 /* Number of columns per block in a matrix output */ #define COLUMNS_PER_BLOCK 10 /*end move*/ typedef struct gbases { baseptr base; struct gbases *next; } gbases; typedef struct nuview_data { /* A big 'ol collection of pointers used in nuview */ double *yy, *wwzz, *vvzz, *vzsumr, *vzsumy, *sum, *sumr, *sumy; sitelike *xx; } nuview_data; struct LOC_hyptrav { boolean bottom; node *r; long *hypset; boolean maybe, nonzero; long tempset, anc; } ; extern long nonodes, endsite, outgrno, nextree, which; extern boolean interleaved, printdata, outgropt, treeprint, 
dotdiff, transvp; extern steptr weight, category, alias, location, ally; extern sequence y; #ifndef OLDC /* function prototypes */ void alloctemp(node **, long *, long); void freetemp(node **); void freetree2 (pointarray, long); void inputdata(long); void alloctree(pointarray *, long, boolean); void allocx(long, long, pointarray, boolean); void prot_allocx(long, long, pointarray, boolean); void setuptree(pointarray, long, boolean); void setuptree2(tree *); void alloctip(node *, long *); void getbasefreqs(double, double, double, double, double *, double *, double *, double *, double *, double *, double *, double *xi, double *, double *, boolean, boolean); void empiricalfreqs(double *,double *,double *,double *,steptr,pointarray); void sitesort(long, steptr); void sitecombine(long); void sitescrunch(long); void sitesort2(long, steptr); void sitecombine2(long, steptr); void sitescrunch2(long, long, long, steptr); void makevalues(pointarray, long *, boolean); void makevalues2(long, pointarray, long, long, sequence, steptr); void fillin(node *, node *, node *); long getlargest(long *); void multifillin(node *, node *, long); void sumnsteps(node *, node *, node *, long, long); void sumnsteps2(node *, node *, node *, long, long, long *); void multisumnsteps(node *, node *, long, long, long *); void multisumnsteps2(node *); boolean alltips(node *, node *); void gdispose(node *, node **, pointarray); void preorder(node *, node *, node *, node *, node *, node *, long); void updatenumdesc(node *, node *, long); void add(node *,node *,node *,node **,boolean,pointarray,node **,long *); void findbelow(node **below, node *item, node *fork); void re_move(node *item, node **fork, node **root, boolean recompute, pointarray, node **, long *); void postorder(node *p); void getnufork(node **, node **, pointarray, long *); void reroot(node *, node *); void reroot2(node *, node *); void reroot3(node *, node *, node *, node *, node **); void savetraverse(node *); void newindex(long, node 
*); void flipindexes(long, pointarray); boolean parentinmulti(node *); long sibsvisited(node *, long *); long smallest(node *, long *); void bintomulti(node **, node **, node **, long *); void backtobinary(node **, node *, node **); boolean outgrin(node *, node *); void flipnodes(node *, node *); void moveleft(node *, node *, node **); void savetree(node *, long *, pointarray, node **, long *); void addnsave(node *, node *, node *, node **, node **,boolean, pointarray, long *, long *); void addbestever(long *, long *, long, boolean, long *, bestelm *); void addtiedtree(long, long *, long, boolean,long *, bestelm *); void clearcollapse(pointarray); void clearbottom(pointarray); void collabranch(node *, node *, node *); boolean allcommonbases(node *, node *, boolean *); void findbottom(node *, node **); boolean moresteps(node *, node *); boolean passdown(node *, node *, node *, node *, node *, node *, node *, node *, node *, boolean); boolean trycollapdesc(node *, node *, node *, node *, node *, node *, node *, node *, node *, boolean , long *); void setbottom(node *); boolean zeroinsubtree(node *, node *, node *, node *, node *, node *, node *, node *, boolean, node *, long *); boolean collapsible(node *, node *, node *, node *, node *, node *, node *, node *, boolean, node *, long *, pointarray); void replaceback(node **, node *, node *, node **, long *); void putback(node *, node *, node *, node **); void savelocrearr(node *, node *, node *, node *, node *, node *, node *, node *, node *, node **, long, long *, boolean, boolean , boolean *, long *, bestelm *, pointarray , node **, long *); void clearvisited(pointarray); void hyprint(long, long, struct LOC_hyptrav *,pointarray, Char *); void gnubase(gbases **, gbases **, long); void chuckbase(gbases *, gbases **); void hyptrav(node *, long *, long, long, boolean,pointarray, gbases **, Char *); void hypstates(long , node *, pointarray, gbases **, Char *); void initbranchlen(node *p); void initmin(node *, long, 
boolean); void initbase(node *, long); void inittreetrav(node *, long); void compmin(node *, node *); void minpostorder(node *, pointarray); void branchlength(node *,node *,double *,pointarray); void printbranchlengths(node *); void branchlentrav(node *,node *,long,long,double *,pointarray); void treelength(node *, long, pointarray); void coordinates(node *, long *, double, long *); void drawline(long, double, node *); void printree(node *, double); void writesteps(long, boolean, steptr, node *); void treeout(node *, long, long *, node *); void treeout3(node *, long, long *, node *); void fdrawline2(FILE *fp, long i, double scale, tree *curtree); void drawline2(long, double, tree); void drawline3(long, double, node *); void copynode(node *, node *, long); void prot_copynode(node *, node *, long); void copy_(tree *, tree *, long, long); void prot_copy_(tree *, tree *, long, long); void standev(long, long, long, double, double *, long **, longer); void standev2(long, long, long, long, double, double *, double **, steptr, longer); void freetip(node *); void freenontip(node *); void freenodes(long, pointarray); void freenode(node **); void freetree(long, pointarray); void freex(long, pointarray); void freex_notip(long, pointarray); void prot_freex_notip(long nonodes, pointarray treenode); void prot_freex(long nonodes, pointarray treenode); void freegarbage(gbases **); void freegrbg(node **); void collapsetree(node *, node *, node **, pointarray, long *); void collapsebestrees(node **, node **, pointarray, bestelm *, long *, long *, long, boolean, boolean); void fix_x(node* p,long site, double maxx, long rcategs); void fix_protx(node* p,long site,double maxx, long rcategs); /*function prototypes*/ #endif #endif /* SEQ_H */ /* file: phylip-3.697/src/seqboot.c */ /* version 3.696. Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, Andrew Keeffe, and Doug Buxton. Copyright (c) 1993-2014, Joseph Felsenstein. 
All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ #include "phylip.h" #include "seq.h" typedef enum { seqs, morphology, restsites, genefreqs } datatype; typedef enum { dna, rna, protein } seqtype; #ifndef OLDC /* function prototypes */ void getoptions(void); void seqboot_inputnumbers(void); void seqboot_inputfactors(void); void inputoptions(void); char **matrix_char_new(long rows, long cols); void matrix_char_delete(char **mat, long rows); double **matrix_double_new(long rows, long cols); void matrix_double_delete(double **mat, long rows); void seqboot_inputdata(void); void allocrest(void); void freerest(void); void allocnew(void); void freenew(void); void allocnewer(long newergroups, long newersites); void doinput(int argc, Char *argv[]); void bootweights(void); void permute_vec(long *a, long n); void sppermute(long); void charpermute(long, long); void writedata(void); void writeweights(void); void writecategories(void); void writeauxdata(steptr, FILE*); void writefactors(void); void bootwrite(void); void seqboot_inputaux(steptr, FILE*); void freenewer(void); /* function prototypes */ #endif /*** Config vars ***/ /* Mutually exclusive booleans for bootstrap type */ boolean bootstrap, jackknife; boolean permute; /* permute species for each char */ boolean ild; /* permute char order */ boolean lockhart; /* permute chars within species */ boolean rewrite; boolean factors = false; /* Use factors (only with morph data) */ /* Bootstrap/jackknife sample frequency */ boolean regular = true; /* Use default sampling (100% bootstrap, 50% jackknife) */ double fracsample = 0.5; /* ...or user-defined sample freq, [0..inf) */ /* Output format: mutually exclusive, none indicates PHYLIP */ boolean xml = false; boolean nexus = false; boolean weights = false;/* Read weights file */ boolean categories = false;/* Use categories (permuted with dataset) */ boolean enzymes; boolean all; /* All alleles present in infile? 
*/ boolean justwts = false; /* Write boot'd/jack'd weights, no datasets */ boolean mixture; boolean ancvar; boolean progress = true; /* Enable progress indications */ boolean firstrep; /* TODO Must this be global? */ longer seed; /* Filehandles and paths */ /* Usual suspects declared in phylip.c/h */ FILE *outcatfile, *outweightfile, *outmixfile, *outancfile, *outfactfile; Char infilename[FNMLNGTH], outfilename[FNMLNGTH], catfilename[FNMLNGTH], outcatfilename[FNMLNGTH], weightfilename[FNMLNGTH], outweightfilename[FNMLNGTH], mixfilename[FNMLNGTH], outmixfilename[FNMLNGTH], ancfilename[FNMLNGTH], outancfilename[FNMLNGTH], factfilename[FNMLNGTH], outfactfilename[FNMLNGTH]; long sites, loci, maxalleles, groups, nenzymes, reps, ws, blocksize, categs, maxnewsites; datatype data; seqtype seq; steptr oldweight, where, how_many, mixdata, ancdata; /* Original dataset */ /* [0..spp-1][0..sites-1] */ Char **nodep = NULL; /* molecular or morph data */ double **nodef = NULL; /* gene freqs */ Char *factor = NULL; /* factor[sites] - direct read-in of factors file */ long *factorr = NULL; /* [0..sites-1] => nondecreasing [1..groups] */ long *alleles = NULL; /* Mapping with read-in weights eliminated * Allocated once in allocnew() */ long newsites; long newgroups; long *newwhere = NULL; /* Map [0..newgroups-1] => [1..newsites] */ long *newhowmany = NULL; /* Number of chars for each [0..newgroups-1] */ /* Mapping with bootstrapped weights applied */ /* (re)allocated by allocnewer() */ long newersites, newergroups; long *newerfactor = NULL; /* Map [0..newersites-1] => [1..newergroups] */ long *newerwhere = NULL; /* Map [0..newergroups-1] => [1..newersites] */ long *newerhowmany = NULL; /* Number of chars for each [0..newergroups-1] */ long **charorder = NULL; /* Permutation [0..spp-1][0..newergroups-1] */ long **sppord = NULL; /* Permutation [0..newergroups-1][0..spp-1] */ void getoptions() { /* interactively set options */ long reps0; long inseed, inseed0, loopcount, loopcount2; Char 
ch; boolean done1; data = seqs; seq = dna; bootstrap = true; jackknife = false; permute = false; ild = false; lockhart = false; blocksize = 1; regular = true; fracsample = 1.0; all = false; reps = 100; weights = false; mixture = false; ancvar = false; categories = false; justwts = false; printdata = false; dotdiff = true; progress = true; interleaved = true; xml = false; nexus = false; factors = false; loopcount = 0; for (;;) { cleerhome(); printf("\nBootstrapping algorithm, version %s\n\n",VERSION); printf("Settings for this run:\n"); printf(" D Sequence, Morph, Rest., Gene Freqs? %s\n", (data == seqs ) ? "Molecular sequences" : (data == morphology ) ? "Discrete Morphology" : (data == restsites) ? "Restriction Sites" : (data == genefreqs) ? "Gene Frequencies" : ""); if (data == restsites) printf(" E Number of enzymes? %s\n", enzymes ? "Present in input file" : "Not present in input file"); if (data == genefreqs) printf(" A All alleles present at each locus? %s\n", all ? "Yes" : "No, one absent at each locus"); if ((!lockhart) && (data == morphology)) printf(" F Use factors information? %s\n", factors ? "Yes" : "No"); printf(" J Bootstrap, Jackknife, Permute, Rewrite? %s\n", regular && jackknife ? "Delete-half jackknife" : (!regular) && jackknife ? "Delete-fraction jackknife" : permute ? "Permute species for each character" : ild ? "Permute character order" : lockhart ? "Permute within species" : regular && bootstrap ? "Bootstrap" : (!regular) && bootstrap ? "Partial bootstrap" : rewrite ? "Rewrite data" : "(unknown)" ); if (bootstrap || jackknife) { printf(" %% Regular or altered sampling fraction? "); if (regular) printf("regular\n"); else { if (fabs(fracsample*100 - (int)(fracsample*100)) > 0.01) { printf("%.1f%% sampled\n", 100.0*fracsample); } else { printf("%.0f%% sampled\n", 100.0*fracsample); } } } if ((data == seqs) && rewrite) { printf(" P PHYLIP, NEXUS, or XML output format? %s\n", nexus ? "NEXUS" : xml ? 
"XML" : "PHYLIP"); if (xml || ((data == seqs) && nexus)) { printf(" S Type of molecular sequences? " ); switch (seq) { case (dna) : printf("DNA\n"); break; case (rna) : printf("RNA\n"); break; case (protein) : printf("Protein\n"); break; } } } if ((data == morphology) && rewrite) printf(" P PHYLIP or NEXUS output format? %s\n", nexus ? "NEXUS" : "PHYLIP"); if (bootstrap) { if (blocksize > 1) printf(" B Block size for block-bootstrapping? %ld\n", blocksize); else printf(" B Block size for block-bootstrapping? %ld (regular bootstrap)\n", blocksize); } if (!rewrite) printf(" R How many replicates? %ld\n", reps); if (jackknife || bootstrap || permute || ild) { printf(" W Read weights of characters? %s\n", (weights ? "Yes" : "No")); if (data == morphology) { printf(" X Read mixture file? %s\n", (mixture ? "Yes" : "No")); printf(" N Read ancestors file? %s\n", (ancvar ? "Yes" : "No")); } if (data == seqs) printf(" C Read categories of sites? %s\n", (categories ? "Yes" : "No")); if ((!permute)) { printf(" S Write out data sets or just weights? %s\n", (justwts ? "Just weights" : "Data sets")); } } if (data == seqs || data == restsites) printf(" I Input sequences interleaved? %s\n", interleaved ? "Yes" : "No, sequential"); printf(" 0 Terminal type (IBM PC, ANSI, none)? %s\n", ibmpc ? "IBM PC" : ansi ? "ANSI" : "(none)"); printf(" 1 Print out the data at start of run %s\n", printdata ? "Yes" : "No"); if (printdata) printf(" . Use dot-differencing to display them %s\n", dotdiff ? "Yes" : "No"); printf(" 2 Print indications of progress of run %s\n", progress ? 
"Yes" : "No"); printf("\n Y to accept these or type the letter for one to change\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); if (ch == 'Y') break; if ( (bootstrap && (strchr("ABCDEFSJPRWXNI%1.20",ch) != NULL)) || (jackknife && (strchr("ACDEFSJPRWXNI%1.20",ch) != NULL)) || (permute && (strchr("ACDEFSJPRWXNI%1.20",ch) != NULL)) || ((ild || lockhart) && (strchr("ACDESJPRXNI%1.20",ch) != NULL)) || ((!(bootstrap || jackknife || permute || ild || lockhart)) && ((!xml) && (strchr("ADEFJPI1.20",ch) != NULL))) || (((data == morphology) || (data == seqs)) && (nexus || xml) && (strchr("ADEFJPSI1.20",ch) != NULL)) ) { switch (ch) { case 'D': if (data == genefreqs) data = seqs; else data = (datatype)((long)data + 1); break; case 'A': all = !all; break; case 'E': enzymes = !enzymes; break; case 'J': if (bootstrap) { bootstrap = false; jackknife = true; } else if (jackknife) { jackknife = false; permute = true; } else if (permute) { permute = false; ild = true; } else if (ild) { ild = false; lockhart = true; } else if (lockhart) { lockhart = false; rewrite = true; } else if (rewrite) { rewrite = false; bootstrap = true; } else { assert(0); /* Shouldn't happen */ bootstrap = true; jackknife = permute = ild = lockhart = rewrite = false; } break; case '%': regular = !regular; if (!regular) { loopcount2 = 0; do { printf("Samples as percentage of"); if ((data == seqs) || (data == restsites)) printf(" sites?\n"); if (data == morphology) printf(" characters?\n"); if (data == genefreqs) printf(" loci?\n"); fflush(stdout); scanf("%lf%*[^\n]", &fracsample); getchar(); done1 = (fracsample > 0.0); if (!done1) { printf("BAD NUMBER: must be positive\n"); } fracsample = fracsample/100.0; countup(&loopcount2, 10); } while (done1 != true); } break; case 'P': if (data == seqs) { if (!xml && !nexus) nexus = true; else { if (nexus) { nexus = false; xml = true; } else xml = false; } } if (data == morphology) { nexus = !nexus; xml 
= false; } break; case 'S': if(!rewrite){ justwts = !justwts; } else { switch (seq) { case (dna): seq = rna; break; case (rna): seq = protein; break; case (protein): seq = dna; break; } } break; case 'B': loopcount2 = 0; do { printf("Block size?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%ld%*[^\n]", &blocksize); getchar(); done1 = (blocksize > 0); if (!done1) { printf("BAD NUMBER: must be positive\n"); } countup(&loopcount2, 10); } while (done1 != true); break; case 'R': reps0 = reps; loopcount2 = 0; do { printf("Number of replicates?\n"); #ifdef WIN32 phyFillScreenColor(); #endif fflush(stdout); scanf("%ld%*[^\n]", &reps); getchar(); done1 = (reps > 0); if (!done1) { printf("BAD NUMBER: must be positive\n"); reps = reps0; } countup(&loopcount2, 10); } while (done1 != true); break; case 'W': weights = !weights; break; case 'X': mixture = !mixture; break; case 'N': ancvar = !ancvar; break; case 'C': categories = !categories; break; case 'F': factors = !factors; break; case 'I': interleaved = !interleaved; break; case '0': initterminal(&ibmpc, &ansi); break; case '1': printdata = !printdata; break; case '.': dotdiff = !dotdiff; break; case '2': progress = !progress; break; } } else printf("Not a possible option!\n"); countup(&loopcount, 100); } if (bootstrap || jackknife) { if (jackknife && regular) fracsample = 0.5; if (bootstrap && regular) fracsample = 1.0; } if (!rewrite) initseed(&inseed, &inseed0, seed); /* These warnings only appear when user has changed an option that * subsequently became inapplicable */ if (factors && lockhart) { printf("Warning: Cannot use factors when permuting within species.\n"); factors = false; } if (data != seqs) { if (xml) { printf("warning: XML output not available for this type of data\n"); xml = false; } if (categories) { printf("warning: cannot use categories file with this type of data\n"); categories = false; } } if (data != morphology) { if (mixture) { printf("warning: cannot use mixture file with 
this type of data\n"); mixture = false; } if (ancvar) { printf("warning: cannot use ancestors file with this type of data\n"); ancvar = false; } } } /* getoptions */ void seqboot_inputnumbers() { /* read numbers of species and of sites */ long i; fscanf(infile, "%ld%ld", &spp, &sites); loci = sites; maxalleles = 1; if (data == restsites && enzymes) fscanf(infile, "%ld", &nenzymes); if (data == genefreqs) { alleles = (long *)Malloc(sites*sizeof(long)); scan_eoln(infile); sites = 0; for (i = 0; i < (loci); i++) { if (eoln(infile)) scan_eoln(infile); fscanf(infile, "%ld", &alleles[i]); if (alleles[i] > maxalleles) maxalleles = alleles[i]; if (all) sites += alleles[i]; else sites += alleles[i] - 1; } if (!all) maxalleles--; scan_eoln(infile); } } /* seqboot_inputnumbers */ void seqboot_inputfactors() { long i, j; Char ch, prevch; prevch = ' '; j = 0; for (i = 0; i < (sites); i++) { do { if (eoln(factfile)) scan_eoln(factfile); ch = gettc(factfile); } while (ch == ' '); if (ch != prevch) j++; prevch = ch; factorr[i] = j; } scan_eoln(factfile); } /* seqboot_inputfactors */ void inputoptions() { /* input the information on the options */ long weightsum, maxfactsize, i, j, k, l, m; if (data == genefreqs) { k = 0; l = 0; for (i = 0; i < (loci); i++) { if (all) m = alleles[i]; else m = alleles[i] - 1; k++; for (j = 1; j <= m; j++) { l++; factorr[l - 1] = k; } } } else { for (i = 1; i <= (sites); i++) factorr[i - 1] = i; } if(factors){ seqboot_inputfactors(); } for (i = 0; i < (sites); i++) oldweight[i] = 1; if (weights) inputweights2(0, sites, &weightsum, oldweight, &weights, "seqboot"); if (factors && printdata) { for(i = 0; i < sites; i++) factor[i] = (char)('0' + (factorr[i]%10)); printfactors(outfile, sites, factor, " (least significant digit)"); } if (weights && printdata) printweights(outfile, 0, sites, oldweight, "Sites"); for (i = 0; i < (loci); i++) how_many[i] = 0; for (i = 0; i < (loci); i++) where[i] = 0; for (i = 1; i <= (sites); i++) { how_many[factorr[i - 1] - 
1]++; if (where[factorr[i - 1] - 1] == 0) where[factorr[i - 1] - 1] = i; } groups = factorr[sites - 1]; newgroups = 0; newsites = 0; maxfactsize = 0; for(i = 0 ; i < loci ; i++){ if(how_many[i] > maxfactsize){ maxfactsize = how_many[i]; } } maxnewsites = groups * maxfactsize; allocnew(); for (i = 0; i < groups; i++) { if (oldweight[where[i] - 1] > 0) { newgroups++; newsites += how_many[i]; newwhere[newgroups - 1] = where[i]; newhowmany[newgroups - 1] = how_many[i]; } } } /* inputoptions */ char **matrix_char_new(long rows, long cols) { char **mat; long i; assert(rows > 0); assert(cols > 0); mat = (char **)Malloc(rows*sizeof(char *)); for (i = 0; i < rows; i++) mat[i] = (char *)Malloc(cols*sizeof(char)); return mat; } void matrix_char_delete(char **mat, long rows) { long i; assert(mat != NULL); for (i = 0; i < rows; i++) free(mat[i]); free(mat); } double **matrix_double_new(long rows, long cols) { double **mat; long i; assert(rows > 0); assert(cols > 0); mat = (double **)Malloc(rows*sizeof(double *)); for (i = 0; i < rows; i++) mat[i] = (double *)Malloc(cols*sizeof(double)); return mat; } void matrix_double_delete(double **mat, long rows) { long i; assert(mat != NULL); for (i = 0; i < rows; i++) free(mat[i]); free(mat); } void seqboot_inputdata() { /* input the names and sequences for each species */ long i, j, k, l, m, n, basesread, basesnew=0; double x; Char charstate; boolean allread, done; if (data == genefreqs) { nodef = matrix_double_new(spp, sites); } else { nodep = matrix_char_new(spp, sites); } j = nmlngth + (sites + (sites - 1) / 10) / 2 - 5; if (j < nmlngth - 1) j = nmlngth - 1; if (j > 37) j = 37; if (printdata) { fprintf(outfile, "\nBootstrapping algorithm, version %s\n\n\n",VERSION); if (bootstrap) { if (blocksize > 1) { if (regular) fprintf(outfile, "Block-bootstrap with block size %ld\n\n", blocksize); else fprintf(outfile, "Partial (%2.0f%%) block-bootstrap with block size %ld\n\n", 100*fracsample, blocksize); } else { if (regular) fprintf(outfile, 
"Bootstrap\n\n"); else fprintf(outfile, "Partial (%2.0f%%) bootstrap\n\n", 100*fracsample); } } else { if (jackknife) { if (regular) fprintf(outfile, "Delete-half Jackknife\n\n"); else fprintf(outfile, "Delete-%2.0f%% Jackknife\n\n", 100*(1.0-fracsample)); } else { if (permute) { fprintf(outfile, "Species order permuted separately for each"); if (data == genefreqs) fprintf(outfile, " locus\n\n"); if (data == seqs) fprintf(outfile, " site\n\n"); if (data == morphology) fprintf(outfile, " character\n\n"); if (data == restsites) fprintf(outfile, " site\n\n"); } else { if (ild) { if (data == genefreqs) fprintf(outfile, "Locus"); if (data == seqs) fprintf(outfile, "Site"); if (data == morphology) fprintf(outfile, "Character"); if (data == restsites) fprintf(outfile, "Site"); fprintf(outfile, " order permuted\n\n"); } else { if (lockhart) { /* braces keep the whole message conditional on lockhart */ if (data == genefreqs) fprintf(outfile, "Locus"); if (data == seqs) fprintf(outfile, "Site"); if (data == morphology) fprintf(outfile, "Character"); if (data == restsites) fprintf(outfile, "Site"); fprintf(outfile, " order permuted separately for each species\n\n"); } } } } } if (data == genefreqs) fprintf(outfile, "%3ld species, %3ld loci\n\n", spp, loci); else { fprintf(outfile, "%3ld species, ", spp); if (data == seqs) fprintf(outfile, "%3ld sites\n\n", sites); else if (data == morphology) fprintf(outfile, "%3ld characters\n\n", sites); else if (data == restsites) fprintf(outfile, "%3ld sites\n\n", sites); } fprintf(outfile, "Name"); for (i = 1; i <= j; i++) putc(' ', outfile); fprintf(outfile, "Data\n"); fprintf(outfile, "----"); for (i = 1; i <= j; i++) putc(' ', outfile); fprintf(outfile, "----\n\n"); } interleaved = (interleaved && ((data == seqs) || (data == restsites))); if (data == genefreqs) { for (i = 1; i <= (spp); i++) { initname(i - 1); j = 1; while (j <= sites && !eoff(infile)) { if (eoln(infile)) scan_eoln(infile); if ( fscanf(infile, "%lf", &x) != 1) { printf("ERROR: Invalid value for locus %ld of species %ld\n", j, i); 
exxit(-1); } else if (x < 0.0 || x > 1.0) { /* explicit range test; casting to unsigned truncated values such as 1.5 to 1 and let them pass */ printf("GENE FREQ OUTSIDE [0,1] in species %ld\n", i); exxit(-1); } else { nodef[i - 1][j - 1] = x; j++; } } scan_eoln(infile); } return; } basesread = 0; allread = false; while (!allread) { /* eat white space -- if the separator line has spaces on it*/ do { charstate = gettc(infile); } while (charstate == ' ' || charstate == '\t'); ungetc(charstate, infile); if (eoln(infile)) scan_eoln(infile); i = 1; while (i <= spp) { if ((interleaved && basesread == 0) || !interleaved) initname(i-1); j = interleaved ? basesread : 0; done = false; while (!done && !eoff(infile)) { if (interleaved) done = true; while (j < sites && !(eoln(infile) ||eoff(infile))) { charstate = gettc(infile); if (charstate == '\n' || charstate == '\t') charstate = ' '; if (charstate == ' ' || (data == seqs && charstate >= '0' && charstate <= '9')) continue; uppercase(&charstate); j++; if (charstate == '.') charstate = nodep[0][j-1]; nodep[i-1][j-1] = charstate; } if (interleaved) continue; if (j < sites) scan_eoln(infile); else if (j == sites) done = true; } if (interleaved && i == 1) basesnew = j; scan_eoln(infile); if ((interleaved && j != basesnew) || ((!interleaved) && j != sites)){ printf("\n\nERROR: sequences out of alignment at site %ld", j+1); printf(" of species %ld\n\n", i); exxit(-1);} i++; } if (interleaved) { basesread = basesnew; allread = (basesread == sites); } else allread = (i > spp); } if (!printdata) return; if (data == genefreqs) m = (sites - 1) / 8 + 1; else m = (sites - 1) / 60 + 1; for (i = 1; i <= m; i++) { for (j = 0; j < spp; j++) { for (k = 0; k < nmlngth; k++) putc(nayme[j][k], outfile); fprintf(outfile, " "); if (data == genefreqs) l = i * 8; else l = i * 60; if (l > sites) l = sites; if (data == genefreqs) n = (i - 1) * 8; else n = (i - 1) * 60; for (k = n; k < l; k++) { if (data == genefreqs) fprintf(outfile, "%8.5f", nodef[j][k]); else { if (j + 1 > 1 && nodep[j][k] == nodep[0][k]) charstate = '.'; else charstate = 
nodep[j][k]; putc(charstate, outfile); if ((k + 1) % 10 == 0 && (k + 1) % 60 != 0) putc(' ', outfile); } } putc('\n', outfile); } putc('\n', outfile); } putc('\n', outfile); } /* seqboot_inputdata */ void allocrest() { /* allocate memory for bookkeeping arrays */ oldweight = (steptr)Malloc(sites*sizeof(long)); weight = (steptr)Malloc(sites*sizeof(long)); if (categories) category = (steptr)Malloc(sites*sizeof(long)); if (mixture) mixdata = (steptr)Malloc(sites*sizeof(long)); if (ancvar) ancdata = (steptr)Malloc(sites*sizeof(long)); where = (steptr)Malloc(loci*sizeof(long)); how_many = (steptr)Malloc(loci*sizeof(long)); factor = (Char *)Malloc(sites*sizeof(Char)); factorr = (steptr)Malloc(sites*sizeof(long)); nayme = (naym *)Malloc(spp*sizeof(naym)); } /* allocrest */ void freerest() { /* Free bookkeeping arrays */ if (alleles) free(alleles); free(oldweight); free(weight); if (categories) free(category); if (mixture) free(mixdata); if (ancvar) free(ancdata); free(where); free(how_many); free(factor); free(factorr); free(nayme); } void allocnew(void) { /* allocate memory for arrays that depend on the length of the output sequence */ /* Only call this function once */ assert(newwhere == NULL && newhowmany == NULL); newwhere = (steptr)Malloc(loci*sizeof(long)); newhowmany = (steptr)Malloc(loci*sizeof(long)); } /* allocnew */ void freenew(void) { /* free arrays allocated by allocnew() */ /* Only call this function once */ assert(newwhere != NULL); assert(newhowmany != NULL); free(newwhere); free(newhowmany); } void allocnewer(long newergroups, long newersites) { /* allocate memory for arrays that depend on the length of the bootstrapped output sequence */ /* Assumes that spp remains constant */ static long curnewergroups = 0; static long curnewersites = 0; long i; if (newerwhere != NULL) { if (newergroups > curnewergroups) { free(newerwhere); free(newerhowmany); for (i = 0; i < spp; i++) free(charorder[i]); newerwhere = NULL; } if (newersites > curnewersites) { 
free(newerfactor); newerfactor = NULL; } } if (charorder == NULL) charorder = (steptr *)Malloc(spp*sizeof(steptr)); /* Malloc() will fail if either is 0, so add a dummy element */ if (newergroups == 0) newergroups++; if (newersites == 0) newersites++; if (newerwhere == NULL) { newerwhere = (steptr)Malloc(newergroups*sizeof(long)); newerhowmany = (steptr)Malloc(newergroups*sizeof(long)); for (i = 0; i < spp; i++) charorder[i] = (steptr)Malloc(newergroups*sizeof(long)); curnewergroups = newergroups; } if (newerfactor == NULL) { newerfactor = (steptr)Malloc(newersites*sizeof(long)); curnewersites = newersites; } } void freenewer() { /* Free memory allocated by allocnewer() */ /* spp must be the same as when allocnewer was called */ long i; if (newerwhere) { free(newerwhere); free(newerhowmany); free(newerfactor); for (i = 0; i < spp; i++) free(charorder[i]); free(charorder); } } void doinput(int argc, Char *argv[]) { /* reads the input data */ getoptions(); seqboot_inputnumbers(); allocrest(); if (weights) openfile(&weightfile,WEIGHTFILE,"input weight file", "r",argv[0],weightfilename); if (mixture){ openfile(&mixfile,MIXFILE,"mixture file", "r",argv[0],mixfilename); openfile(&outmixfile,"outmixture","output mixtures file","w",argv[0], outmixfilename); seqboot_inputaux(mixdata, mixfile); } if (ancvar){ openfile(&ancfile,ANCFILE,"ancestor file", "r",argv[0],ancfilename); openfile(&outancfile,"outancestors","output ancestors file","w",argv[0], outancfilename); seqboot_inputaux(ancdata, ancfile); } if (categories) { openfile(&catfile,CATFILE,"input category file","r",argv[0],catfilename); openfile(&outcatfile,"outcategories","output category file","w",argv[0], outcatfilename); inputcategs(0, sites, category, 9, "SeqBoot"); } if (factors){ openfile(&factfile,FACTFILE,"factors file","r",argv[0],factfilename); openfile(&outfactfile,"outfactors","output factors file","w",argv[0], outfactfilename); } if (justwts && !permute) openfile(&outweightfile,"outweights","output weight 
file", "w",argv[0],outweightfilename); else { openfile(&outfile,OUTFILE,"output data file","w",argv[0],outfilename); } inputoptions(); seqboot_inputdata(); } /* doinput */ void bootweights() { /* sets up weights by resampling data */ long i, j, k, blocks; double p, q, r; long grp = 0, site = 0; ws = newgroups; for (i = 0; i < (ws); i++) weight[i] = 0; if (jackknife) { if (fabs(newgroups*fracsample - (long)(newgroups*fracsample+0.5)) > 0.00001) { if (randum(seed) < (newgroups*fracsample - (long)(newgroups*fracsample)) /((long)(newgroups*fracsample+1.0)-(long)(newgroups*fracsample))) q = (long)(newgroups*fracsample)+1; else q = (long)(newgroups*fracsample); } else q = (long)(newgroups*fracsample+0.5); r = newgroups; p = q / r; ws = 0; for (i = 0; i < (newgroups); i++) { if (randum(seed) < p) { weight[i]++; ws++; q--; } r--; if (i + 1 < newgroups) p = q / r; } } else if (permute) { for (i = 0; i < (newgroups); i++) weight[i] = 1; } else if (bootstrap) { blocks = fracsample * newgroups / blocksize; for (i = 1; i <= (blocks); i++) { j = (long)(newgroups * randum(seed)) + 1; for (k = 0; k < blocksize; k++) { weight[j - 1]++; j++; if (j > newgroups) j = 1; } } } else /* case of rewriting data */ for (i = 0; i < (newgroups); i++) weight[i] = 1; /* Count number of replicated groups */ newergroups = 0; newersites = 0; for (i = 0; i < newgroups; i++) { newergroups += weight[i]; newersites += newhowmany[i] * weight[i]; } if (newergroups < 1) { fprintf(stdout, "ERROR: sampling frequency or number of sites is too small\n"); exxit(-1); } /* reallocate "newer" arrays, sized by output groups: * newerfactor, newerwhere, newerhowmany, and charorder */ allocnewer(newergroups, newersites); /* Replicate each group i weight[i] times */ grp = 0; site = 0; for (i = 0; i < newgroups; i++) { for (j = 0; j < weight[i]; j++) { for (k = 0; k < newhowmany[i]; k++) { newerfactor[site] = grp + 1; site++; } newerwhere[grp] = newwhere[i]; newerhowmany[grp] = newhowmany[i]; grp++; } } } /* 
bootweights */ void permute_vec(long *a, long n) { long i, j, k; for (i = 1; i < n; i++) { k = (long)((i+1) * randum(seed)); j = a[i]; a[i] = a[k]; a[k] = j; } } void sppermute(long n) { /* permute the species order as given in array sppord */ permute_vec(sppord[n-1], spp); } /* sppermute */ void charpermute(long m, long n) { /* permute the n+1 characters of species m+1 */ permute_vec(charorder[m], n); } /* charpermute */ void writedata() { /* write out one set of bootstrapped sequences */ long i, j, k, l, m, n, n2; double x; Char charstate; sppord = (long **)Malloc(newergroups*sizeof(long *)); for (i = 0; i < (newergroups); i++) sppord[i] = (long *)Malloc(spp*sizeof(long)); for (j = 1; j <= spp; j++) sppord[0][j - 1] = j; for (i = 1; i < newergroups; i++) { for (j = 1; j <= (spp); j++) sppord[i][j - 1] = sppord[i - 1][j - 1]; } if (!justwts || permute) { if (data == restsites && enzymes) fprintf(outfile, "%5ld %5ld% 4ld\n", spp, newergroups, nenzymes); else if (data == genefreqs) fprintf(outfile, "%5ld %5ld\n", spp, newergroups); else { if ((data == seqs) && rewrite && xml) fprintf(outfile, "\n"); else if (rewrite && nexus) { fprintf(outfile, "#NEXUS\n"); fprintf(outfile, "BEGIN DATA;\n"); fprintf(outfile, " DIMENSIONS NTAX=%ld NCHAR=%ld;\n", spp, newersites); fprintf(outfile, " FORMAT"); if (interleaved) fprintf(outfile, " interleave=yes"); else fprintf(outfile, " interleave=no"); fprintf(outfile, " DATATYPE="); if (data == seqs) { switch (seq) { case (dna): fprintf(outfile, "DNA missing=N gap=-"); break; case (rna): fprintf(outfile, "RNA missing=N gap=-"); break; case (protein): fprintf(outfile, "protein missing=? 
gap=-"); break; } } if (data == morphology) fprintf(outfile, "STANDARD"); fprintf(outfile, ";\n MATRIX\n"); } else fprintf(outfile, "%5ld %5ld\n", spp, newersites); } if (data == genefreqs) { for (i = 0; i < (newergroups); i++) fprintf(outfile, " %3ld", alleles[factorr[newerwhere[i] - 1] - 1]); putc('\n', outfile); } } l = 1; /* When rewriting to PHYLIP, only convert interleaved <-> sequential * for molecular and restriction sites. */ if (( (rewrite && !nexus) ) && ((data == seqs) || (data == restsites))) { interleaved = !interleaved; if (rewrite && xml) interleaved = false; } m = interleaved ? 60 : newergroups; do { if (m > newergroups) m = newergroups; for (j = 0; j < spp; j++) { n = 0; if ((l == 1) || (interleaved && nexus)) { if (rewrite && xml) { fprintf(outfile, " \n"); fprintf(outfile, " "); } n2 = nmlngth; if (rewrite && (xml || nexus)) { while (nayme[j][n2-1] == ' ') n2--; } if (nexus) fprintf(outfile, " "); for (k = 0; k < n2; k++) if (nexus && (nayme[j][k] == ' ') && (k < n2)) putc('_', outfile); else putc(nayme[j][k], outfile); if (rewrite && xml) fprintf(outfile, "\n "); } else { if (rewrite && xml) { fprintf(outfile, " "); } } if (!xml) { for (k = 0; k < nmlngth-n2; k++) fprintf(outfile, " "); fprintf(outfile, " "); } for (k = l - 1; k < m; k++) { if (permute && j + 1 == 1) sppermute(newerfactor[n]); /* we can assume chars not permuted */ for (n2 = -1; n2 <= (newerhowmany[charorder[j][k]] - 2); n2++) { n++; if (data == genefreqs) { if (n > 1 && (n & 7) == 1) fprintf(outfile, "\n "); x = nodef[sppord[charorder[j][k]][j] - 1] [newerwhere[charorder[j][k]] + n2]; fprintf(outfile, "%8.5f", x); } else { if (rewrite && xml && (n > 1) && (n % 60 == 1)) fprintf(outfile, "\n "); else if (!nexus && !interleaved && (n > 1) && (n % 60 == 1)) fprintf(outfile, "\n "); charstate = nodep[sppord[charorder[j][k]][j] - 1] [newerwhere[charorder[j][k]] + n2]; putc(charstate, outfile); if (n % 10 == 0 && n % 60 != 0) putc(' ', outfile); } } } if (rewrite && xml) { 
fprintf(outfile, "\n \n"); } putc('\n', outfile); } if (interleaved) { if ((m <= newersites) && (newersites > 60)) putc('\n', outfile); l += 60; m += 60; } } while (interleaved && l <= newersites); if ((data == seqs) && (rewrite && xml)) fprintf(outfile, "\n"); if (rewrite && nexus) fprintf(outfile, " ;\nEND;\n"); for (i = 0; i < (newergroups); i++) free(sppord[i]); free(sppord); } /* writedata */ void writeweights() { /* write out one set of post-bootstrapping weights */ long j, k, l, m, n, o; j = 0; l = 1; if (interleaved) m = 60; else m = sites; do { if(m > sites) m = sites; n = 0; for (k = l - 1; k < m; k++) { for(o = 0 ; o < how_many[k] ; o++){ if(oldweight[k]==0){ fprintf(outweightfile, "0"); j++; } else{ if (weight[k-j] < 10) fprintf(outweightfile, "%c", (char)('0'+weight[k-j])); else fprintf(outweightfile, "%c", (char)('A'+weight[k-j]-10)); n++; if (!interleaved && n > 1 && n % 60 == 1) { fprintf(outweightfile, "\n"); if (n % 10 == 0 && n % 60 != 0) putc(' ', outweightfile); } } } } putc('\n', outweightfile); if (interleaved) { l += 60; m += 60; } } while (interleaved && l <= sites); } /* writeweights */ void writecategories() { /* write out categories for the bootstrapped sequences */ long k, l, m, n, n2; Char charstate; if(justwts){ if (interleaved) m = 60; else m = sites; l=1; do { if(m > sites) m = sites; n=0; for(k=l-1 ; k < m ; k++){ n++; if (!interleaved && n > 1 && n % 60 == 1) fprintf(outcatfile, "\n "); charstate = '0' + category[k]; putc(charstate, outcatfile); } if (interleaved) { l += 60; m += 60; } }while(interleaved && l <= sites); fprintf(outcatfile, "\n"); return; } l = 1; if (interleaved) m = 60; else m = newergroups; do { if (m > newergroups) m = newergroups; n = 0; for (k = l - 1; k < m; k++) { for (n2 = -1; n2 <= (newerhowmany[k] - 2); n2++) { n++; if (!interleaved && n > 1 && n % 60 == 1) fprintf(outcatfile, "\n "); charstate = '0' + category[newerwhere[k] + n2]; putc(charstate, outcatfile); if (n % 10 == 0 && n % 60 != 0) putc(' ', 
outcatfile); } } if (interleaved) { l += 60; m += 60; } } while (interleaved && l <= newersites); fprintf(outcatfile, "\n"); } /* writecategories */ void writeauxdata(steptr auxdata, FILE *outauxfile) { /* write out auxiliary option data (mixtures, ancestors, etc.) to appropriate file. Samples parallel to data, or just gives one output entry if justwts is true */ long k, l, m, n, n2; Char charstate; /* if we just output weights (justwts), and this is first set just output the data unsampled */ if(justwts){ if(firstrep){ if (interleaved) m = 60; else m = sites; l=1; do { if(m > sites) m = sites; n = 0; for(k=l-1 ; k < m ; k++){ n++; if (!interleaved && n > 1 && n % 60 == 1) fprintf(outauxfile, "\n "); charstate = auxdata[k]; putc(charstate, outauxfile); } if (interleaved) { l += 60; m += 60; } }while(interleaved && l <= sites); fprintf(outauxfile, "\n"); } return; } l = 1; if (interleaved) m = 60; else m = newergroups; do { if (m > newergroups) m = newergroups; n = 0; for (k = l - 1; k < m; k++) { for (n2 = -1; n2 <= (newerhowmany[k] - 2); n2++) { n++; if (!interleaved && n > 1 && n % 60 == 1) fprintf(outauxfile, "\n "); charstate = auxdata[newerwhere[k] + n2]; putc(charstate, outauxfile); if (n % 10 == 0 && n % 60 != 0) putc(' ', outauxfile); } } if (interleaved) { l += 60; m += 60; } } while (interleaved && l <= newersites); fprintf(outauxfile, "\n"); } /* writeauxdata */ void writefactors(void) { long i, k, l, m, n, writesites; char symbol; steptr wfactor; long grp; if(!justwts || firstrep){ if(justwts){ writesites = sites; wfactor = factorr; } else { writesites = newergroups; wfactor = newerfactor; } symbol = '+'; if (interleaved) m = 60; else m = writesites; l=1; do { if(m > writesites) m = writesites; n = 0; for(k=l-1 ; k < m ; k++){ grp = charorder[0][k]; for(i = 0; i < newerhowmany[grp]; i++) { putc(symbol, outfactfile); n++; if (!interleaved && n > 1 && n % 60 == 1) fprintf(outfactfile, "\n "); if (n % 10 == 0 && n % 60 != 0) putc(' ', outfactfile); } 
symbol = (symbol == '+') ? '-' : '+'; } if (interleaved) { l += 60; m += 60; } }while(interleaved && l <= writesites); fprintf(outfactfile, "\n"); } } /* writefactors */ void bootwrite() { /* does bootstrapping and writes out data sets */ long i, j, rr, repdiv10; if (rewrite) reps = 1; repdiv10 = reps / 10; if (repdiv10 < 1) repdiv10 = 1; if (progress) putchar('\n'); firstrep = true; for (rr = 1; rr <= (reps); rr++) { bootweights(); for (i = 0; i < spp; i++) for (j = 0; j < newergroups; j++) charorder[i][j] = j; if (ild) { charpermute(0, newergroups); for (i = 1; i < spp; i++) for (j = 0; j < newergroups; j++) charorder[i][j] = charorder[0][j]; } if (lockhart) for (i = 0; i < spp; i++) charpermute(i, newergroups); if (!justwts || permute || ild || lockhart) writedata(); if (justwts && !(permute || ild || lockhart)) writeweights(); if (categories) writecategories(); if (factors) writefactors(); if (mixture) writeauxdata(mixdata, outmixfile); if (ancvar) writeauxdata(ancdata, outancfile); if (progress && !rewrite && ((reps < 10) || rr % repdiv10 == 0)) { printf("completed replicate number %4ld\n", rr); #ifdef WIN32 phyFillScreenColor(); #endif firstrep = false; } } if (progress) { if (justwts) printf("\nOutput weights written to file \"%s\"\n\n", outweightfilename); else printf("\nOutput written to file \"%s\"\n\n", outfilename); } } /* bootwrite */ void seqboot_inputaux(steptr dataptr, FILE* auxfile) { /* input auxiliary option data (mixtures, ancestors, etc.) for new style input; assumes that data is correctly formatted in input files */ long i, j, k; Char ch; j = 0; k = 1; for (i = 0; i < (sites); i++) { do { if (eoln(auxfile)) scan_eoln(auxfile); ch = gettc(auxfile); if (ch == '\n') ch = ' '; } while (ch == ' '); dataptr[i] = ch; } scan_eoln(auxfile); } /* seqboot_inputaux */ int main(int argc, Char *argv[]) { /* Read in sequences or frequencies and bootstrap or jackknife them */ #ifdef MAC argc = 1; /* macsetup("SeqBoot",""); */ argv[0] = "SeqBoot"; #endif 
init(argc,argv); openfile(&infile, INFILE, "input file", "r", argv[0], infilename); ibmpc = IBMCRT; ansi = ANSICRT; doinput(argc, argv); bootwrite(); freenewer(); freenew(); freerest(); if (nodep) matrix_char_delete(nodep, spp); if (nodef) matrix_double_delete(nodef, spp); FClose(infile); if (factors) { FClose(factfile); FClose(outfactfile); } if (weights) FClose(weightfile); if (categories) { FClose(catfile); FClose(outcatfile); } if(mixture) FClose(outmixfile); if(ancvar) FClose(outancfile); if (justwts && !permute) { FClose(outweightfile); } else FClose(outfile); #ifdef MAC fixmacfile(outfilename); if (justwts && !permute) fixmacfile(outweightfilename); if (categories) fixmacfile(outcatfilename); if (mixture) fixmacfile(outmixfilename); #endif printf("Done.\n\n"); #ifdef WIN32 phyRestoreConsoleAttributes(); #endif return 0; } phylip-3.697/src/treedist.c0000644004732000473200000011656712406201117015335 0ustar joefelsenst_g/* version 3.696. Written by Dan Fineman, Joseph Felsenstein, Mike Palczewski, Hisashi Horino, Akiko Fuseki, Sean Lamont, and Andrew Keeffe. Copyright (c) 1993-2014, Joseph Felsenstein All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include "phylip.h" #include "cons.h" typedef enum { SYMMETRIC, BSD } distance_type; /* The following extern's refer to things declared in cons.c */ extern int tree_pairing; extern Char outfilename[FNMLNGTH], intreename[FNMLNGTH], intree2name[FNMLNGTH], outtreename[FNMLNGTH]; extern node *root; extern long numopts, outgrno, col; extern long maxgrp; /* max. no. of groups in all trees found */ extern boolean trout, firsttree, noroot, outgropt, didreroot, prntsets, progress, treeprint, goteof; extern pointarray treenode, nodep; extern group_type **grouping, **grping2, **group2;/* to store groups found */ extern long **order, **order2, lasti; extern group_type *fullset; extern node *grbg; extern long tipy; extern double **timesseen, **tmseen2, **times2; extern double trweight, ntrees; static distance_type dtype; static long output_scheme; #ifndef OLDC /* function prototypes */ void assign_tree(group_type **, pattern_elm ***, long, long *); boolean group_is_null(group_type **, long); void compute_distances(pattern_elm ***, long, long); void free_patterns(pattern_elm ***, long); void produce_square_matrix(long, long *); void produce_full_matrix(long, long, long *); void output_submenu(void); void pairing_submenu(void); void read_second_file(pattern_elm ***, long, long); void getoptions(void); void assign_lengths(double **lengths, pattern_elm ***pattern_array, long tree_index); void print_header(long trees_in_1, long trees_in_2); void output_distances(long 
trees_in_1, long trees_in_2); void output_long_distance(long diffl, long tree1, long tree2, long trees_in_1, long trees_in_2); void output_matrix_long(long diffl, long tree1, long tree2, long trees_in_1, long trees_in_2); void output_matrix_double(double diffl, long tree1, long tree2, long trees_in_1, long trees_in_2); void output_double_distance(double diffd, long tree1, long tree2, long trees_in_1, long trees_in_2); long symetric_diff(group_type **tree1, group_type **tree2, long ntree1, long ntree2, long patternsz1, long patternsz2); double bsd_tree_diff(group_type **tree1, group_type **tree2, long ntree1, long ntree2, double* lengths1, double *lengths2, long patternsz1, long patternsz2); void tree_diff(group_type **tree1, group_type **tree2, double *lengths1, double* lengths2, long patternsz1, long patternsz2, long ntree1, long ntree2, long trees_in_1, long trees_in_2); void print_line_heading(long tree); int get_num_columns(void); void print_matrix_heading(long tree, long maxtree); /* function prototpes */ #endif void assign_lengths(double **lengths, pattern_elm ***pattern_array, long tree_index) { *lengths = pattern_array[0][tree_index]->length; } void assign_tree(group_type **treeN, pattern_elm ***pattern_array, long tree_index, long *pattern_size) { /* set treeN to be the tree_index-th tree in pattern_elm */ long i; for ( i = 0 ; i < setsz ; i++ ) { treeN[i] = pattern_array[i][tree_index]->apattern; } *pattern_size = *pattern_array[0][tree_index]->patternsize; } /* assign_tree */ boolean group_is_null(group_type **treeN, long index) { /* Check to see if a given index to a tree array points to an empty group */ long i; for ( i = 0 ; i < setsz ; i++ ) if (treeN[i][index] != (group_type) 0) return false; /* If we've gotten this far, then the index is to an empty group in the tree. 
*/ return true; } /* group_is_null */ double bsd_tree_diff(group_type **tree1, group_type **tree2, long ntree1, long ntree2, double *lengths1, double* lengths2, long patternsz1, long patternsz2) { /* Compute the difference between 2 given trees. Return that value as a double. */ long index1, index2; double return_value = 0; boolean match_found; long i; if ( group_is_null(tree1, 0) || group_is_null(tree2, 0) ) { printf ("Error computing tree difference between tree %ld and tree %ld\n", ntree1, ntree2); exxit(-1); } for ( index1 = 0; index1 < patternsz1; index1++ ) { if ( !group_is_null(tree1, index1) ) { if ( lengths1[index1] == -1 ) { printf( "Error: tree %ld is missing a length from at least one branch\n", ntree1 ); exxit(-1); } } } for ( index2 = 0; index2 < patternsz2; index2++ ) { if ( !group_is_null(tree2, index2) ) { if ( lengths2[index2] == -1 ) { printf( "Error: tree %ld is missing a length from at least one branch\n", ntree2 ); exxit(-1); } } } for ( index1 = 0 ; index1 < patternsz1; index1++ ) { /* For every element in the first tree, see if there's a match to it in the second tree. */ match_found = false; if ( group_is_null(tree1, index1) ) { /* When we've gone over all the elements in tree1, greater number of elements in tree2 will constitute that much more of a difference... */ while ( !group_is_null(tree2, index1) ) { return_value += pow(lengths1[index1], 2); index1++; } break; } for ( index2 = 0 ; index2 < patternsz2 ; index2++ ) { /* For every element in the second tree, see if any match the current element in the first tree. */ if ( group_is_null(tree2, index2) ) { /* When we've gone over all the elements in tree2 */ match_found = false; break; } else { /* Tentatively set match_found; will be changed later if necessary. . . 
*/ match_found = true; for ( i = 0 ; i < setsz ; i++ ) { /* See if we've got a match, */ if ( tree1[i][index1] != tree2[i][index2] ) match_found = false; } if ( match_found == true ) { break; } } } if ( match_found == false ) { return_value += pow(lengths1[index1], 2); } } for ( index2 = 0 ; index2 < patternsz2 ; index2++ ) { /* For every element in the second tree, see if there's a match to it in the first tree. */ match_found = false; if ( group_is_null(tree2, index2) ) { /* When we've gone over all the elements in tree2, greater number of elements in tree1 will constitute that much more of a difference... */ while ( !group_is_null(tree1, index2) ) { return_value += pow(lengths2[index2], 2); index2++; } break; } for ( index1 = 0 ; index1 < patternsz1 ; index1++ ) { /* For every element in the first tree, see if any match the current element in the second tree. */ if ( group_is_null (tree1, index1) ) { /* When we've gone over all the elements in tree1 */ match_found = false; break; } else { /* Tentatively set match_found; will be changed later if necessary. . . */ match_found = true; for ( i = 0 ; i < setsz ; i++ ) { /* See if we've got a match, */ if ( tree2[i][index2] != tree1[i][index1] ) match_found = false; } if ( match_found == true ) { return_value += pow(lengths1[index1] - lengths2[index2], 2); break; } } } if ( match_found == false ) { return_value += pow(lengths2[index2], 2); } } if ( return_value > 0.0 ) return_value = sqrt(return_value); else return_value = 0.0; return return_value; } long symetric_diff(group_type **tree1, group_type **tree2, long ntree1, long ntree2, long patternsz1, long patternsz2) { /* Compute the symmetric difference between 2 given trees. Return that value as a long. 
*/ long index1, index2, return_value = 0; boolean match_found; long i; if ( group_is_null(tree1, 0) || group_is_null(tree2, 0) ) { printf ("Error computing tree difference.\n"); return 0; } for ( index1 = 0 ; index1 < patternsz1 ; index1++ ) { /* For every element in the first tree, see if there's a match to it in the second tree. */ match_found = false; if ( group_is_null (tree1, index1) ) { /* When we've gone over all the elements in tree1, greater number of elements in tree2 will constitute that much more of a difference... */ while ( !group_is_null(tree2, index1) ) { return_value++; index1++; } break; } for ( index2 = 0 ; index2 < patternsz2 ; index2++ ) { /* For every element in the second tree, see if any match the current element in the first tree. */ if ( group_is_null(tree2, index2) ) { /* When we've gone over all the elements in tree2 */ match_found = false; break; } else { /* Tentatively set match_found; will be changed later if necessary. . . */ match_found = true; for ( i = 0 ; i < setsz ; i++ ) { /* See if we've got a match, */ if ( tree1[i][index1] != tree2[i][index2] ) match_found = false; } if ( match_found == true ) { /* If the previous loop ran from 0 to setsz without setting match_found to false, */ break; } } } if ( match_found == false ) { return_value++; } } return return_value; } /* symetric_diff */ void output_double_distance(double diffd, long tree1, long tree2, long trees_in_1, long trees_in_2) { switch ( tree_pairing ) { case ADJACENT_PAIRS: if ( output_scheme == VERBOSE ) { fprintf (outfile, "Trees %ld and %ld: %e\n", tree1, tree2, diffd); } else if (output_scheme == SPARSE) { fprintf (outfile, "%ld %ld %e\n", tree1, tree2, diffd); } break; case ALL_IN_FIRST: if ( output_scheme == VERBOSE ) { fprintf (outfile, "Trees %ld and %ld: %e\n", tree1, tree2, diffd); } else if ( output_scheme == SPARSE ) { fprintf (outfile, "%ld %ld %e\n", tree1, tree2, diffd ); } else if ( output_scheme == FULL_MATRIX ) { output_matrix_double(diffd, tree1, 
tree2, trees_in_1, trees_in_2); } break; case CORR_IN_1_AND_2: if ( output_scheme == VERBOSE ) { fprintf(outfile, "Tree pair %ld: %e\n", tree1, diffd); } else if ( output_scheme == SPARSE ) { fprintf(outfile, "%ld %e\n", tree1, diffd); } break; case ALL_IN_1_AND_2: if ( output_scheme == VERBOSE ) fprintf(outfile, "Trees %ld and %ld: %e\n", tree1, tree2, diffd); else if ( output_scheme == SPARSE ) fprintf(outfile, "%ld %ld %e\n", tree1, tree2, diffd); else if ( output_scheme == FULL_MATRIX ) { output_matrix_double(diffd, tree1, tree2, trees_in_1, trees_in_2); } break; } } /* output_double_distance */ void print_matrix_heading(long tree, long maxtree) { long i; if ( tree_pairing == ALL_IN_1_AND_2 ) { fprintf(outfile, "\n\nFirst\\ Second tree file:\n"); fprintf(outfile, "tree \\\n"); fprintf(outfile, "file: \\"); } else fprintf(outfile, "\n\n "); for ( i = tree ; i <= maxtree ; i++ ) { if ( dtype == SYMMETRIC ) fprintf(outfile, "%5ld ", i); else fprintf(outfile, " %7ld ", i); } fprintf(outfile, "\n"); if ( tree_pairing == ALL_IN_1_AND_2 ) fprintf(outfile, " \\"); else fprintf(outfile, " \\"); for ( i = tree ; i <= maxtree ; i++ ) { if ( dtype == SYMMETRIC ) fprintf(outfile, "------"); else fprintf(outfile, "------------"); } } void print_line_heading(long tree) { if ( tree_pairing == ALL_IN_1_AND_2 ) fprintf(outfile, "\n%4ld |", tree); else fprintf(outfile, "\n%5ld |", tree); } void output_matrix_double(double diffl, long tree1, long tree2, long trees_in_1, long trees_in_2) { if ( tree1 == 1 && ((tree2 - 1) % get_num_columns() == 0 || tree2 == 1 ) ) { if ( (tree_pairing == ALL_IN_FIRST && tree2 + get_num_columns() - 1 < trees_in_1 ) || (tree_pairing == ALL_IN_1_AND_2 && tree2 + get_num_columns() - 1 < trees_in_2 ) ) { print_matrix_heading(tree2, tree2 + get_num_columns() - 1); } else { if ( tree_pairing == ALL_IN_FIRST ) print_matrix_heading(tree2, trees_in_1); else print_matrix_heading(tree2, trees_in_2); } } if ( (tree2 - 1) % get_num_columns() == 0 || tree2 == 1 ) 
{ print_line_heading(tree1); } fprintf(outfile, " %9g ", diffl); if ((tree_pairing == ALL_IN_FIRST && tree1 == trees_in_1 && tree2 == trees_in_1) || (tree_pairing == ALL_IN_1_AND_2 && tree1 == trees_in_1 && tree2 == trees_in_2)) fprintf(outfile, "\n\n\n"); } /* output_matrix_double */ void output_matrix_long(long diffl, long tree1, long tree2, long trees_in_1, long trees_in_2) { if ( tree1 == 1 && ((tree2 - 1) % get_num_columns() == 0 || tree2 == 1 )) { if ( (tree_pairing == ALL_IN_FIRST && tree2 + get_num_columns() - 1 < trees_in_1) || (tree_pairing == ALL_IN_1_AND_2 && tree2 + get_num_columns() - 1 < trees_in_2)) { print_matrix_heading(tree2, tree2 + get_num_columns() - 1); } else { if ( tree_pairing == ALL_IN_FIRST) print_matrix_heading(tree2, trees_in_1); else print_matrix_heading(tree2, trees_in_2); } } if ( (tree2 - 1) % get_num_columns() == 0 || tree2 == 1) { print_line_heading(tree1); } fprintf(outfile, "%4ld ", diffl); if ((tree_pairing == ALL_IN_FIRST && tree1 == trees_in_1 && tree2 == trees_in_1) || (tree_pairing == ALL_IN_1_AND_2 && tree1 == trees_in_1 && tree2 == trees_in_2)) fprintf(outfile, "\n\n\n"); } /* output_matrix_long */ void output_long_distance(long diffl, long tree1, long tree2, long trees_in_1, long trees_in_2) { switch (tree_pairing) { case ADJACENT_PAIRS: if (output_scheme == VERBOSE ) { fprintf (outfile, "Trees %ld and %ld: %ld\n", tree1, tree2, diffl); } else if (output_scheme == SPARSE) { fprintf (outfile, "%ld %ld %ld\n", tree1, tree2, diffl); } break; case ALL_IN_FIRST: if (output_scheme == VERBOSE) { fprintf (outfile, "Trees %ld and %ld: %ld\n", tree1, tree2, diffl); } else if (output_scheme == SPARSE) { fprintf (outfile, "%ld %ld %ld\n", tree1, tree2, diffl ); } else if (output_scheme == FULL_MATRIX) { output_matrix_long(diffl, tree1, tree2, trees_in_1, trees_in_2); } break; case CORR_IN_1_AND_2: if (output_scheme == VERBOSE) { fprintf (outfile, "Tree pair %ld: %ld\n", tree1, diffl); } else if (output_scheme == SPARSE) { fprintf 
(outfile, "%ld %ld\n", tree1, diffl); } break; case ALL_IN_1_AND_2: if (output_scheme == VERBOSE) fprintf (outfile, "Trees %ld and %ld: %ld\n", tree1, tree2, diffl); else if (output_scheme == SPARSE) fprintf (outfile, "%ld %ld %ld\n", tree1, tree2, diffl); else if (output_scheme == FULL_MATRIX ) { output_matrix_long(diffl, tree1, tree2, trees_in_1, trees_in_2); } break; } } void tree_diff(group_type **tree1, group_type **tree2, double *lengths1, double* lengths2, long patternsz1, long patternsz2, long ntree1, long ntree2, long trees_in_1, long trees_in_2) { long diffl; double diffd; switch (dtype) { case SYMMETRIC: diffl = symetric_diff (tree1, tree2, ntree1, ntree2, patternsz1, patternsz2); diffl += symetric_diff (tree2, tree1, ntree1, ntree2, patternsz2, patternsz1); output_long_distance(diffl, ntree1, ntree2, trees_in_1, trees_in_2); break; case BSD: diffd = bsd_tree_diff(tree1, tree2, ntree1, ntree2, lengths1, lengths2, patternsz1, patternsz2); output_double_distance(diffd, ntree1, ntree2, trees_in_1, trees_in_2); break; } } /* tree_diff */ int get_num_columns(void) { if ( dtype == SYMMETRIC ) return 10; else return 7; } /* get_num_columns */ void compute_distances(pattern_elm ***pattern_array, long trees_in_1, long trees_in_2) { /* Compute symmetric distances between arrays of trees */ long tree_index, end_tree, index1, index2, index3; group_type **treeA, **treeB; long patternsz1, patternsz2; double *length1 = NULL, *length2 = NULL; int num_columns = get_num_columns(); index1 = 0; /* Put together space for treeA and treeB */ treeA = (group_type **) Malloc (setsz * sizeof (group_type *)); treeB = (group_type **) Malloc (setsz * sizeof (group_type *)); print_header(trees_in_1, trees_in_2); switch (tree_pairing) { case ADJACENT_PAIRS: /* For every tree, compute the distance between it and the tree at the next location; do this in both directions */ end_tree = trees_in_1 - 1; for (tree_index = 0 ; tree_index < end_tree ; tree_index += 2) { assign_tree (treeA, 
pattern_array, tree_index, &patternsz1); assign_tree (treeB, pattern_array, tree_index + 1, &patternsz2); assign_lengths(&length1, pattern_array, tree_index); assign_lengths(&length2, pattern_array, tree_index + 1); tree_diff (treeA, treeB, length1, length2, patternsz1, patternsz2, tree_index+1, tree_index+2, trees_in_1, trees_in_2); if (tree_index + 2 == end_tree) printf("\nWARNING: extra tree at the end of input tree file.\n"); } break; case ALL_IN_FIRST: /* For every tree, compute the distance between it and every other tree in that file. */ end_tree = trees_in_1; if ( output_scheme != FULL_MATRIX ) { /* verbose or sparse output */ for (index1 = 0 ; index1 < end_tree ; index1++) { assign_tree (treeA, pattern_array, index1, &patternsz1); assign_lengths(&length1, pattern_array, index1); for (index2 = 0 ; index2 < end_tree ; index2++) { assign_tree (treeB, pattern_array, index2, &patternsz2); assign_lengths(&length2, pattern_array, index2); tree_diff (treeA, treeB, length1, length2, patternsz1, patternsz2, index1 + 1, index2 + 1, trees_in_1, trees_in_2); } } } else { /* full matrix output */ for ( index3 = 0 ; index3 < trees_in_1 ; index3 += num_columns) { for ( index1 = 0 ; index1 < trees_in_1 ; index1++) { assign_tree (treeA, pattern_array, index1, &patternsz1); assign_lengths(&length1, pattern_array, index1); for ( index2 = index3 ; index2 < index3 + num_columns && index2 < trees_in_1 ; index2++ ) { assign_tree (treeB, pattern_array, index2, &patternsz2); assign_lengths(&length2, pattern_array, index2); tree_diff (treeA, treeB, length1, length2, patternsz1, patternsz2, index1 + 1, index2 + 1, trees_in_1, trees_in_2); } } } } break; case CORR_IN_1_AND_2: if (trees_in_1 != trees_in_2) { /* Set end tree to the smaller of the two totals. */ end_tree = trees_in_1 > trees_in_2 ? trees_in_2 : trees_in_1; /* Print something out to the outfile and to the terminal. */ fprintf(outfile, "\n\n" "*** Warning: differing number of trees in first and second\n" "*** tree files. 
Only computing %ld pairs.\n" "\n", end_tree ); printf( "\n" " *** Warning: differing number of trees in first and second\n" " *** tree files. Only computing %ld pairs.\n" "\n", end_tree ); } else end_tree = trees_in_1; for (tree_index = 0 ; tree_index < end_tree ; tree_index++) { /* For every tree, compute the distance between it and the tree at the parallel location in the other file; do this in both directions */ assign_tree(treeA, pattern_array, tree_index, &patternsz1); assign_lengths(&length1, pattern_array, tree_index); /* (tree_index + trees_in_1) will be the corresponding tree in the second file. */ assign_tree(treeB, pattern_array, tree_index + trees_in_1, &patternsz2); assign_lengths(&length2, pattern_array, tree_index + trees_in_1); tree_diff( treeA, treeB, length1, length2, patternsz1, patternsz2, tree_index + 1, 0, trees_in_1, trees_in_2 ); } break; case ALL_IN_1_AND_2: end_tree = trees_in_1 + trees_in_2; if ( output_scheme != FULL_MATRIX ) { for (tree_index = 0 ; tree_index < trees_in_1 ; tree_index++) { /* For every tree in the first file, compute the distance between it and every tree in the second file. */ assign_tree (treeA, pattern_array, tree_index, &patternsz1); assign_lengths(&length1, pattern_array, tree_index); for (index2 = trees_in_1 ; index2 < end_tree ; index2++) { assign_tree (treeB, pattern_array, index2, &patternsz2); assign_lengths(&length2, pattern_array, index2); tree_diff(treeA, treeB, length1, length2, patternsz1, patternsz2, tree_index + 1 , index2 + 1, trees_in_1, trees_in_2); } } for ( ; tree_index < end_tree ; tree_index++) { /* For every tree in the second file, compute the distance between it and every tree in the first file. 
*/ assign_tree (treeA, pattern_array, tree_index, &patternsz1); assign_lengths(&length1, pattern_array, tree_index); for (index2 = 0 ; index2 < trees_in_1 ; index2++) { assign_tree (treeB, pattern_array, index2, &patternsz2); assign_lengths(&length2, pattern_array, index2); tree_diff (treeA, treeB, length1, length2 , patternsz1, patternsz2, tree_index + 1, index2 + 1, trees_in_1, trees_in_2); } } } else { for ( index3 = trees_in_1 ; index3 < end_tree ; index3 += num_columns) { for ( index1 = 0 ; index1 < trees_in_1 ; index1++) { assign_tree (treeA, pattern_array, index1, &patternsz1); assign_lengths(&length1, pattern_array, index1); for ( index2 = index3 ; index2 < index3 + num_columns && index2 < end_tree ; index2++) { assign_tree (treeB, pattern_array, index2, &patternsz2); assign_lengths(&length2, pattern_array, index2); tree_diff (treeA, treeB, length1, length2, patternsz1, patternsz2, index1 + 1, index2 - trees_in_1 + 1, trees_in_1, trees_in_2); } } } } break; } /* Free up treeA and treeB */ free (treeA); free (treeB); } /* compute_distances */ void free_patterns(pattern_elm ***pattern_array, long total_trees) { long i, j; /* Free each pattern array, */ for (i=0 ; i < setsz ; i++) { for (j = 0 ; j < total_trees ; j++) { free (pattern_array[i][j]->apattern); free (pattern_array[i][j]->patternsize); free (pattern_array[i][j]->length); free (pattern_array[i][j]); } free (pattern_array[i]); } free (pattern_array); } /* free_patterns */ void print_header(long trees_in_1, long trees_in_2) { long end_tree; switch (tree_pairing) { case ADJACENT_PAIRS: end_tree = trees_in_1 - 1; if (output_scheme == VERBOSE) { fprintf(outfile, "\n" "Tree distance program, version %s\n\n", VERSION ); if (dtype == BSD) fprintf(outfile, "Branch score distances between adjacent pairs of trees:\n" "\n" ); else fprintf (outfile, "Symmetric differences between adjacent pairs of trees:\n\n"); } else if ( output_scheme != SPARSE) printf ("Error -- cannot output adjacent pairs into a full 
matrix.\n");
      break;

    case ALL_IN_FIRST:
      end_tree = trees_in_1;

      if (output_scheme == VERBOSE) {
        fprintf(outfile, "\nTree distance program, version %s\n\n", VERSION);
        if (dtype == BSD)
          fprintf (outfile,
            "Branch score distances between all pairs of trees in tree file:\n\n");
        else
          fprintf (outfile,
            "Symmetric differences between all pairs of trees in tree file:\n\n");
      } else if (output_scheme == FULL_MATRIX) {
        fprintf(outfile, "\nTree distance program, version %s\n\n", VERSION);
        if (dtype == BSD)
          fprintf (outfile,
            "Branch score distances between all pairs of trees in tree file:\n\n");
        else
          fprintf (outfile,
            "Symmetric differences between all pairs of trees in tree file:\n\n");
      }
      break;

    case CORR_IN_1_AND_2:
      if (output_scheme == VERBOSE) {
        fprintf(outfile, "\nTree distance program, version %s\n\n", VERSION);
        if (dtype == BSD) {
          fprintf (outfile,
            "Branch score distances between corresponding pairs of trees\n");
          fprintf (outfile, " from first and second tree files:\n\n");
        } else {
          fprintf (outfile,
            "Symmetric differences between corresponding pairs of trees\n");
          fprintf (outfile, " from first and second tree files:\n\n");
        }
      } else if (output_scheme != SPARSE)
        printf (
          "Error -- cannot output corresponding pairs into a full matrix.\n");
      break;

    case (ALL_IN_1_AND_2) :
      if ( output_scheme == VERBOSE) {
        fprintf(outfile, "\nTree distance program, version %s\n\n", VERSION);
        if (dtype == BSD) {
          fprintf (outfile,
            "Branch score distances between all pairs of trees\n");
          fprintf (outfile, " from first and second tree files:\n\n");
        } else {
          fprintf(outfile,"Symmetric differences between all pairs of trees\n");
          fprintf(outfile," from first and second tree files:\n\n");
        }
      } else if ( output_scheme == FULL_MATRIX) {
        fprintf(outfile, "\nTree distance program, version %s\n\n", VERSION);
      }
      break;
  }
}  /* print_header */


void output_submenu()
{
  /* this allows the user to select a different scheme for the output of
     distances.
*/ long loopcount; boolean done = false; Char ch; if (tree_pairing == NO_PAIRING) return; loopcount = 0; while (!done) { printf ("\nDistances output options:\n"); if ((tree_pairing == ALL_IN_1_AND_2) || (tree_pairing == ALL_IN_FIRST)) printf (" F Full matrix.\n"); printf (" V One pair per line, verbose.\n"); printf (" S One pair per line, sparse.\n"); if ((tree_pairing == ALL_IN_1_AND_2) || (tree_pairing == ALL_IN_FIRST)) printf ("\n Choose one: (F,V,S)\n"); else printf ("\n Choose one: (V,S)\n"); fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); if (strchr("FVS", ch) != NULL) { switch (ch) { case 'F': if ((tree_pairing == ALL_IN_1_AND_2) || (tree_pairing == ALL_IN_FIRST)) output_scheme = FULL_MATRIX; else /* If this can't be a full matrix... */ continue; break; case 'V': output_scheme = VERBOSE; break; case 'S': output_scheme = SPARSE; break; } done = true; } countup(&loopcount, 10); } } /* output_submenu */ void pairing_submenu() { /* this allows the user to select a different tree pairing scheme. 
*/ long loopcount; boolean done = false; Char ch; loopcount = 0; while (!done) { cleerhome(); printf( "Tree Pairing Submenu:\n" " A Distances between adjacent pairs in tree file.\n" " P Distances between all possible pairs in tree file.\n" " C Distances between corresponding pairs in one tree file and another.\n" " L Distances between all pairs in one tree file and another.\n" "\n" " Choose one: (A,P,C,L)\n" ); fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); if ( strchr("APCL", ch) != NULL ) { switch ( ch ) { case 'A': tree_pairing = ADJACENT_PAIRS; break; case 'P': tree_pairing = ALL_IN_FIRST; break; case 'C': tree_pairing = CORR_IN_1_AND_2; break; case 'L': tree_pairing = ALL_IN_1_AND_2; break; } output_submenu(); done = true; } countup(&loopcount, 10); } } /* pairing_submenu */ void read_second_file( pattern_elm ***pattern_array, long trees_in_1, long trees_in_2 ) { boolean firsttree2, haslengths; long nextnode, trees_read=0; long j; firsttree2 = false; while ( !eoff(intree2) ) { goteof = false; nextnode = 0; haslengths = false; allocate_nodep(&nodep, &intree2, &spp); treeread( intree2, &root, treenode, &goteof, &firsttree2, nodep, &nextnode, &haslengths, &grbg, initconsnode, false, -1 ); missingname(root); reordertips(); if (goteof) continue; ntrees += trweight; if (noroot) { reroot(nodep[outgrno - 1], &nextnode); didreroot = outgropt; } accumulate(root); gdispose(root); for (j = 0; j < 2*(1 + spp); j++) nodep[j] = NULL; free(nodep); store_pattern(pattern_array, trees_in_1 + trees_read); trees_read++; } } /* read_second_file */ void getoptions() { /* interactively set options */ long loopcount; Char ch; boolean done; /* Initial settings */ dtype = BSD; tree_pairing = ADJACENT_PAIRS; output_scheme = VERBOSE; ibmpc = IBMCRT; ansi = ANSICRT; didreroot = false; spp = 0; grbg = NULL; col = 0; putchar('\n'); noroot = true; numopts = 0; outgrno = 1; outgropt = false; progress = true; /* The following are not used by treedist, but may be used in 
functions in cons.c, so we set them here. */ treeprint = false; trout = false; prntsets = false; loopcount = 0; do { cleerhome(); printf("\nTree distance program, version %s\n\n", VERSION); printf("Settings for this run:\n"); printf(" D Distance Type: "); switch (dtype) { case SYMMETRIC: printf("Symmetric Difference\n"); break; case BSD: printf("Branch Score Distance\n"); break; } printf(" R Trees to be treated as Rooted:"); if (noroot) printf(" No\n"); else printf(" Yes\n"); printf(" T Terminal type (IBM PC, ANSI, none):"); if (ibmpc) printf(" IBM PC\n"); if (ansi) printf(" ANSI\n"); if (!(ibmpc || ansi)) printf(" (none)\n"); printf(" 1 Print indications of progress of run: %s\n", (progress ? "Yes" : "No")); printf(" 2 Tree distance submenu:"); switch (tree_pairing) { case NO_PAIRING: printf("\n\nERROR: Unallowable option!\n\n"); exxit(-1); break; case ADJACENT_PAIRS: printf(" Distance between adjacent pairs\n"); break; case CORR_IN_1_AND_2: printf(" Distances between corresponding \n"); printf(" pairs in first and second tree files\n"); break; case ALL_IN_FIRST: printf(" Distances between all possible\n"); printf(" pairs in tree file.\n"); break; case ALL_IN_1_AND_2: printf(" Distances between all pairs in\n"); printf(" first and second tree files\n"); break; } printf("\nAre these settings correct? 
(type Y or the letter for one to change)\n"); fflush(stdout); scanf("%c%*[^\n]", &ch); getchar(); uppercase(&ch); done = (ch == 'Y'); if (!done) { if ((noroot && (ch == 'O')) || strchr("RTD12",ch) != NULL) { switch (ch) { case 'D': if ( dtype == SYMMETRIC ) dtype = BSD; else if ( dtype == BSD ) dtype = SYMMETRIC; break; case 'R': noroot = !noroot; break; case 'T': initterminal(&ibmpc, &ansi); break; case '1': progress = !progress; break; case '2': pairing_submenu(); break; } } else printf("Not a possible option!\n"); } countup(&loopcount, 100); } while (!done); } /* getoptions */ int main(int argc, Char *argv[]) { pattern_elm ***pattern_array; long trees_in_1 = 0, trees_in_2 = 0; long tip_count = 0; double ln_maxgrp; double ln_maxgrp1; double ln_maxgrp2; node * p; #ifdef MAC argc = 1; /* macsetup("Treedist", ""); */ argv[0] = "Treedist"; #endif init(argc, argv); /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */ openfile(&intree, INTREE, "input tree file", "rb", argv[0], intreename); openfile(&outfile, OUTFILE, "output file", "w", argv[0], outfilename); /* Initialize option-based variables, then ask for changes regarding their values. 
*/
  getoptions();

  ntrees = 0.0;
  lasti = -1;

  /* read files to determine size of structures we'll be working with */
  trees_in_1 = countsemic(&intree);
  countcomma(&intree,&tip_count);
  tip_count++; /* countcomma does a raw comma count, tips is one greater */

  if ( (tree_pairing == ALL_IN_1_AND_2) ||
       (tree_pairing == CORR_IN_1_AND_2) ) {
    /* If another intree file should exist, */
    /* Open in binary: ftell() is broken for UNIX line-endings under WIN32 */
    openfile(&intree2, INTREE2, "input tree file 2", "rb", argv[0],
             intree2name);
    trees_in_2 = countsemic(&intree2);
  }

  /*
   * EWFIX.BUG.756 -- this section may be killed if a good solution
   *                  to bug 756 is found
   *
   * Inside cons.c there are several arrays which are allocated
   * to size "maxgrp", the maximum number of groups (sets of
   * tips more closely connected than the rest of the tree) we
   * can see as the code executes.
   *
   * We have two measures we use to determine how much space to
   * allot:
   *   (1) based on the tip count of the trees in the infile
   *   (2) based on the total number of trees in the infile
   *
   * (1) -- Tip Count Method
   * Since each group is a subset of the set of tips, we must
   * represent at most pow(2,tips) different groups. (Technically
   * two fewer since we don't store the empty or complete subsets,
   * but let's keep this simple.)
   *
   * (2) -- Total Tree Size Method
   * Each tree we read results in
   *   singleton groups for each tip, plus
   *   a group for each interior node except the root
   * Since the singleton tips are identical for each tree, this gives
   * a bound of #tips + ( #trees * (#tips - 2) )
   *
   * Ignoring small terms where expedient, either of the following should
   * result in an adequate allocation:
   *   pow(2,#tips)
   *   (#trees + 1) * #tips
   *
   * Since "maxgrp" is a limit on the number of items we'll need to put
   * in a hash, we double it to make space for quick hashing
   *
   * BUT -- all of this has the possibility for overflow, so -- let's
   * make the initial calculations with doubles and then convert
   */

  /* limit chosen to make hash arithmetic work */
  maxgrp = LONG_MAX / 2;
  ln_maxgrp = log((double)maxgrp);

  /* ln of 2 * #trees * #tips (the "+ 1" is a small term, ignored here) */
  ln_maxgrp1 = log(2.0 * (double)tip_count *
                   ((double)trees_in_1 + (double)trees_in_2));

  /* ln of 2 * pow(2,#tips) */
  ln_maxgrp2 = (double)(1 + tip_count) * log(2.0);

  /* now -- find the smallest of the three */
  if(ln_maxgrp1 < ln_maxgrp) {
    maxgrp = 2 * (trees_in_1 + trees_in_2 + 1) * tip_count;
    ln_maxgrp = ln_maxgrp1;
  }
  if(ln_maxgrp2 < ln_maxgrp) {
    maxgrp = pow(2,tip_count+1);
  }

  /* NOTE: this assignment unconditionally replaces the estimate
     computed above */
  maxgrp = 4*tip_count;

  /* Read the (first) tree file and put together grouping, order, and
   * timesseen */
  read_groups (&pattern_array, trees_in_1 + trees_in_2, tip_count, intree);

  if ( (tree_pairing == ADJACENT_PAIRS) ||
       (tree_pairing == ALL_IN_FIRST) ) {
    /* Here deal with the adjacent or all-in-first pairing
       difference computation */
    compute_distances (pattern_array, trees_in_1, 0);
  } else if ( (tree_pairing == CORR_IN_1_AND_2) ||
              (tree_pairing == ALL_IN_1_AND_2) ) {
    /* Here, open the other tree file, parse it, and then put
       together the difference array */
    read_second_file(pattern_array, trees_in_1, trees_in_2);
    compute_distances (pattern_array, trees_in_1, trees_in_2);
  } else if (tree_pairing == NO_PAIRING) {
    /* Compute the consensus tree.
*/
    putc('\n', outfile);
    /* consensus(); Reserved for future development */
  }

  if (progress)
    printf("\nOutput written to file \"%s\"\n\n", outfilename);

  FClose(outtree);
  FClose(intree);
  FClose(outfile);

  if ((tree_pairing == ALL_IN_1_AND_2) || (tree_pairing == CORR_IN_1_AND_2))
    FClose(intree2);

#ifdef MAC
  fixmacfile(outfilename);
  fixmacfile(outtreename);
#endif

  free_patterns (pattern_array, trees_in_1 + trees_in_2);
  clean_up_final();
  /* clean up grbg */
  p = grbg;
  while (p != NULL) {
    node * r = p;
    p = p->next;
    free(r->nodeset);
    free(r->view);
    free(r);
  }

  printf("Done.\n\n");

#ifdef WIN32
  phyRestoreConsoleAttributes();
#endif

  return 0;
}  /* main */

phylip-3.697/src/wagner.c

#include "phylip.h"
#include "disc.h"
#include "wagner.h"

/* version 3.696.
   Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe.

   Copyright (c) 1993-2014, Joseph Felsenstein
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions are met:

   1. Redistributions of source code must retain the above copyright notice,
      this list of conditions and the following disclaimer.

   2. Redistributions in binary form must reproduce the above copyright notice,
      this list of conditions and the following disclaimer in the documentation
      and/or other materials provided with the distribution.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
   AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   ARE DISCLAIMED.
   IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
   SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE.
*/


void inputmixture(bitptr wagner0)
{
  /* input mixture of methods */
  /* used in mix, move, & penny */
  long i, j, k;
  Char ch;
  boolean wag;

  for (i = 0; i < (words); i++)
    wagner0[i] = 0;
  j = 0;
  k = 1;
  for (i = 1; i <= (chars); i++) {
    /* advance to the bit for character i before storing it, so that
       characters occupy bits 1..bits of each word; this is the layout
       that printmixture, count, and hyprint read back */
    j++;
    if (j > bits) {
      j = 1;
      k++;
    }
    do {
      if (eoln(mixfile))
        scan_eoln(mixfile);
      ch = gettc(mixfile);
      if (ch == '\n')
        ch = ' ';
    } while (ch == ' ');
    uppercase(&ch);
    wag = false;
    if (ch == 'W' || ch == '?')
      wag = true;
    else if (ch == 'S' || ch == 'C')
      wag = false;
    else {
      printf("BAD METHOD: %c\n", ch);
      exxit(-1);
    }
    if (wag)
      wagner0[k - 1] = (long)wagner0[k - 1] | (1L << j);
  }
  scan_eoln(mixfile);
}  /* inputmixture */


void printmixture(FILE *filename, bitptr wagner)
{
  /* print out list of parsimony methods */
  /* used in mix, move, & penny */
  long i, k, l;

  fprintf(filename, "Parsimony methods:\n");
  l = 0;
  k = 1;
  for (i = 1; i <= nmlngth + 3; i++)
    putc(' ', filename);
  for (i = 1; i <= (chars); i++) {
    newline(filename, i, 55, nmlngth + 3);
    l++;
    if (l > bits) {
      l = 1;
      k++;
    }
    if (((1L << l) & wagner[k - 1]) != 0)
      putc('W', filename);
    else
      putc('S', filename);
    if (i % 5 == 0)
      putc(' ', filename);
  }
  fprintf(filename, "\n\n");
}  /* printmixture */


void fillin(node2 *p,long fullset,boolean full,bitptr wagner,bitptr zeroanc)
{
  /* Sets up for each node in the tree two statesets.
stateone and statezero are the sets of character states that must be 1 or must be 0, respectively, in a most parsimonious reconstruction, based on the information at or above this node. Note that this state assignment may change based on information further down the tree. If a character is in both sets it is in state "P". If in neither, it is "?". */ long i; long l0, l1, r0, r1, st, wa, za; for (i = 0; i < (words); i++) { if (full) { l0 = p->next->back->fulstte0[i]; l1 = p->next->back->fulstte1[i]; r0 = p->next->next->back->fulstte0[i]; r1 = p->next->next->back->fulstte1[i]; } else { l0 = p->next->back->empstte0[i]; l1 = p->next->back->empstte1[i]; r0 = p->next->next->back->empstte0[i]; r1 = p->next->next->back->empstte1[i]; } st = (l1 & r0) | (l0 & r1); wa = wagner[i]; za = zeroanc[i]; if (full) { p->fulstte1[i] = (l1 | r1) & (~(st & (wa | za))); p->fulstte0[i] = (l0 | r0) & (~(st & (wa | (fullset & (~za))))); p->fulsteps[i] = st; } else { p->empstte1[i] = (l1 | r1) & (~(st & (wa | za))); p->empstte0[i] = (l0 | r0) & (~(st & (wa | (fullset & (~za))))); p->empsteps[i] = st; } } } /* fillin */ void count(long *stps, bitptr zeroanc, steptr numszero, steptr numsone) { /* counts the number of steps in a fork of the tree. 
The program spends much of its time in this PROCEDURE */ /* used in mix & penny */ long i, j, l; j = 1; l = 0; for (i = 0; i < (chars); i++) { l++; if (l > bits) { l = 1; j++; } if (((1L << l) & stps[j - 1]) != 0) { if (((1L << l) & zeroanc[j - 1]) != 0) numszero[i] += weight[i]; else numsone[i] += weight[i]; } } } /* count */ void postorder(node2 *p, long fullset, boolean full, bitptr wagner, bitptr zeroanc) { /* traverses a binary tree, calling PROCEDURE fillin at a node's descendants before calling fillin at the node2 */ /* used in mix & penny */ if (p->tip) return; postorder(p->next->back, fullset, full, wagner, zeroanc); postorder(p->next->next->back, fullset, full, wagner, zeroanc); if (!p->visited) { fillin(p, fullset, full, wagner, zeroanc); if (!full) p->visited = true; } } /* postorder */ void cpostorder(node2 *p, boolean full, bitptr zeroanc, steptr numszero, steptr numsone) { /* traverses a binary tree, calling PROCEDURE count at a node's descendants before calling count at the node2 */ /* used in mix & penny */ if (p->tip) return; cpostorder(p->next->back, full, zeroanc, numszero, numsone); cpostorder(p->next->next->back, full, zeroanc, numszero, numsone); if (full) count(p->fulsteps, zeroanc, numszero, numsone); else count(p->empsteps, zeroanc, numszero, numsone); } /* cpostorder */ void filltrav(node2 *r, long fullset, boolean full, bitptr wagner, bitptr zeroanc) { /* traverse to fill in interior node states */ if (r->tip) return; filltrav(r->next->back, fullset, full, wagner, zeroanc); filltrav(r->next->next->back, fullset, full, wagner, zeroanc); fillin(r, fullset, full, wagner, zeroanc); } /* filltrav */ void hyprint(struct htrav_vars2 *htrav, boolean unknown, boolean noroot, boolean didreroot, bitptr wagner, Char *guess) { /* print out states at node2 */ long i, j, k; char l; boolean dot, a0, a1, s0, s1; if (htrav->bottom) { if (noroot && !didreroot) fprintf(outfile, " "); else fprintf(outfile, "root "); } else fprintf(outfile, "%3ld ", 
htrav->r->back->index - spp); if (htrav->r->tip) { for (i = 0; i < nmlngth; i++) putc(nayme[htrav->r->index - 1][i], outfile); } else fprintf(outfile, "%4ld ", htrav->r->index - spp); if (htrav->bottom && noroot && !didreroot) fprintf(outfile, " "); else if (htrav->nonzero) fprintf(outfile, " yes "); else if (unknown) fprintf(outfile, " ? "); else if (htrav->maybe) fprintf(outfile, " maybe "); else fprintf(outfile, " no "); for (j = 1; j <= (chars); j++) { newline(outfile, j, 40, nmlngth + 17); k = (j - 1) / bits + 1; l = (j - 1) % bits + 1; dot = (((1L << l) & wagner[k - 1]) == 0 && guess[j - 1] == '?'); s0 = (((1L << l) & htrav->r->empstte0[k - 1]) != 0); s1 = (((1L << l) & htrav->r->empstte1[k - 1]) != 0); a0 = (((1L << l) & htrav->zerobelow->bits_[k - 1]) != 0); a1 = (((1L << l) & htrav->onebelow->bits_[k - 1]) != 0); dot = (dot || ((!htrav->bottom || !noroot || didreroot) && a1 == s1 && a0 == s0)); if (dot) putc('.', outfile); else { if (s0) putc('0', outfile); else if (s1) putc('1', outfile); else putc('?', outfile); } if (j % 5 == 0) putc(' ', outfile); } putc('\n', outfile); } /* hyprint */ void hyptrav(node2 *r_, boolean unknown, bitptr dohyp, long fullset, boolean noroot, boolean didreroot, bitptr wagner, bitptr zeroanc, bitptr oneanc, pointptr2 treenode, Char *guess, gbit *garbage) { /* compute, print out states at one interior node2 */ /* used in mix & penny */ struct htrav_vars2 vars; long i; long l0, l1, r0, r1, s0, s1, a0, a1, temp, dh, wa; vars.r = r_; disc_gnu(&vars.zerobelow, &garbage); disc_gnu(&vars.onebelow, &garbage); vars.bottom = (vars.r->back == NULL); vars.maybe = false; vars.nonzero = false; if (vars.bottom) { memcpy(vars.zerobelow->bits_, zeroanc, words*sizeof(long)); memcpy(vars.onebelow->bits_, oneanc, words*sizeof(long)); } else { memcpy(vars.zerobelow->bits_, treenode[vars.r->back->index - 1]->empstte0, words*sizeof(long)); memcpy(vars.onebelow->bits_, treenode[vars.r->back->index - 1]->empstte1, words*sizeof(long)); } for (i = 0; i 
< (words); i++) { dh = dohyp[i]; s0 = vars.r->empstte0[i]; s1 = vars.r->empstte1[i]; a0 = vars.zerobelow->bits_[i]; a1 = vars.onebelow->bits_[i]; if (!vars.r->tip) { wa = wagner[i]; l0 = vars.r->next->back->empstte0[i]; l1 = vars.r->next->back->empstte1[i]; r0 = vars.r->next->next->back->empstte0[i]; r1 = vars.r->next->next->back->empstte1[i]; s0 = (wa & ((a0 & l0) | (a0 & r0) | (l0 & r0))) | (dh & fullset & (~wa) & s0); s1 = (wa & ((a1 & l1) | (a1 & r1) | (l1 & r1))) | (dh & fullset & (~wa) & s1); temp = fullset & (~(s0 | s1 | l1 | l0 | r1 | r0)); s0 |= temp & a0; s1 |= temp & a1; vars.r->empstte0[i] = s0; vars.r->empstte1[i] = s1; } vars.maybe = (vars.maybe || (dh & (s0 | s1)) != (a0 | a1)); vars.nonzero = (vars.nonzero || ((s1 & a0) | (s0 & a1)) != 0); } hyprint(&vars,unknown, noroot, didreroot, wagner, guess); if (!vars.r->tip) { hyptrav(vars.r->next->back,unknown,dohyp, fullset, noroot,didreroot, wagner, zeroanc, oneanc, treenode, guess, garbage); hyptrav(vars.r->next->next->back, unknown,dohyp, fullset, noroot, didreroot, wagner, zeroanc, oneanc, treenode, guess, garbage); } disc_chuck(vars.zerobelow, &garbage); disc_chuck(vars.onebelow, &garbage); } /* hyptrav */ void hypstates(long fullset, boolean full, boolean noroot, boolean didreroot, node2 *root, bitptr wagner, bitptr zeroanc, bitptr oneanc, pointptr2 treenode, Char *guess, gbit *garbage) { /* fill in and describe states at interior nodes */ /* used in mix & penny */ boolean unknown; bitptr dohyp; long i, j, k; for (i = 0; i < (words); i++) { zeroanc[i] = 0; oneanc[i] = 0; } unknown = false; for (i = 0; i < (chars); i++) { j = i / bits + 1; k = i % bits + 1; if (guess[i] == '0') zeroanc[j - 1] = ((long)zeroanc[j - 1]) | (1L << k); if (guess[i] == '1') oneanc[j - 1] = ((long)oneanc[j - 1]) | (1L << k); unknown = (unknown || ((((1L << k) & wagner[j - 1]) == 0) && guess[i] == '?')); } dohyp = (bitptr)Malloc(words*sizeof(long)); for (i = 0; i < (words); i++) dohyp[i] = wagner[i] | zeroanc[i] | oneanc[i]; 
filltrav(root, fullset, full, wagner, zeroanc); fprintf(outfile, "From To Any Steps? "); fprintf(outfile, "State at upper node\n"); fprintf(outfile, " "); fprintf(outfile, "( . means same as in the node below it on tree)\n\n"); hyptrav(root,unknown,dohyp, fullset, noroot, didreroot, wagner, zeroanc, oneanc, treenode, guess, garbage); free(dohyp); } /* hypstates */ void drawline(long i, double scale, node2 *root) { /* draws one row of the tree diagram by moving up tree */ node2 *p, *q, *r, *first =NULL, *last =NULL; long n, j; boolean extra, done; p = root; q = root; extra = false; if (i == p->ycoord && p == root) { if (p->index - spp >= 10) fprintf(outfile, "-%2ld", p->index - spp); else fprintf(outfile, "--%ld", p->index - spp); extra = true; } else fprintf(outfile, " "); do { if (!p->tip) { r = p->next; done = false; do { if (i >= r->back->ymin && i <= r->back->ymax) { q = r->back; done = true; } r = r->next; } while (!(done || r == p)); first = p->next->back; r = p->next; while (r->next != p) r = r->next; last = r->back; } done = (p == q); n = (long)(scale * (p->xcoord - q->xcoord) + 0.5); if (n < 3 && !q->tip) n = 3; if (extra) { n--; extra = false; } if (q->ycoord == i && !done) { putc('+', outfile); if (!q->tip) { for (j = 1; j <= n - 2; j++) putc('-', outfile); if (q->index - spp >= 10) fprintf(outfile, "%2ld", q->index - spp); else fprintf(outfile, "-%ld", q->index - spp); extra = true; } else { for (j = 1; j < n; j++) putc('-', outfile); } } else if (!p->tip) { if (last->ycoord > i && first->ycoord < i && i != p->ycoord) { putc('!', outfile); for (j = 1; j < n; j++) putc(' ', outfile); } else { for (j = 1; j <= n; j++) putc(' ', outfile); } } else { for (j = 1; j <= n; j++) putc(' ', outfile); } if (p != q) p = q; } while (!done); if (p->ycoord == i && p->tip) { for (j = 0; j < nmlngth; j++) putc(nayme[p->index - 1][j], outfile); } putc('\n', outfile); } /* drawline */ void printree(boolean treeprint,boolean noroot,boolean didreroot,node2 *root) { /* 
     prints out diagram of the tree */
  /* used in mix & penny */
  long tipy, i;
  double scale;

  putc('\n', outfile);
  if (!treeprint)
    return;
  putc('\n', outfile);
  tipy = 1;
  coordinates2(root, &tipy);
  scale = 1.5;
  putc('\n', outfile);
  for (i = 1; i <= (tipy - down); i++)
    drawline(i, scale, root);
  if (noroot) {
    fprintf(outfile, "\n remember:");
    if (didreroot)
      fprintf(outfile, " (although rooted by outgroup)");
    fprintf(outfile, " this is an unrooted tree!\n");
  }
  putc('\n', outfile);
}  /* printree */


void writesteps(boolean weights, steptr numsteps)
{
  /* write number of steps */
  /* used in mix & penny */
  long i, j, k;

  if (weights)
    fprintf(outfile, "weighted ");
  fprintf(outfile, "steps in each character:\n");
  fprintf(outfile, " ");
  for (i = 0; i <= 9; i++)
    fprintf(outfile, "%4ld", i);
  fprintf(outfile, "\n *-----------------------------------------\n");
  for (i = 0; i <= (chars / 10); i++) {
    fprintf(outfile, "%5ld", i * 10);
    putc('!', outfile);
    for (j = 0; j <= 9; j++) {
      k = i * 10 + j;
      if (k == 0 || k > chars)
        fprintf(outfile, " ");
      else
        fprintf(outfile, "%4ld", numsteps[k - 1] + extras[k - 1]);
    }
    putc('\n', outfile);
  }
  putc('\n', outfile);
}  /* writesteps */

phylip-3.697/src/wagner.h

/* version 3.696.
   Written by Joseph Felsenstein, Akiko Fuseki, Sean Lamont, and Andrew Keeffe.

   Copyright (c) 1993-2014, Joseph Felsenstein
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions are met:

   1. Redistributions of source code must retain the above copyright notice,
      this list of conditions and the following disclaimer.

   2. Redistributions in binary form must reproduce the above copyright notice,
      this list of conditions and the following disclaimer in the documentation
      and/or other materials provided with the distribution.
   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
   AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
   SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE.
*/

/*
  wagner.h:  included in move, mix & penny
*/

#ifndef OLDC
/* function prototypes */
void   inputmixture(bitptr);
void   inputmixturenew(bitptr);
void   printmixture(FILE *, bitptr);
void   fillin(node2 *,long, boolean, bitptr, bitptr);
void   count(long *, bitptr, steptr, steptr);
void   postorder(node2 *, long, boolean, bitptr, bitptr);
void   cpostorder(node2 *, boolean, bitptr, steptr, steptr);
void   filltrav(node2 *, long, boolean, bitptr, bitptr);
void   hyprint(struct htrav_vars2 *,boolean,boolean,boolean,bitptr,Char *);
void   hyptrav(node2 *, boolean, bitptr, long, boolean, boolean, bitptr,
               bitptr, bitptr, pointptr2, Char *, gbit *);
void   hypstates(long, boolean, boolean, boolean, node2 *, bitptr, bitptr,
                 bitptr, pointptr2, Char *, gbit *);
void   drawline(long, double, node2 *);
void   printree(boolean, boolean, boolean, node2 *);
void   writesteps(boolean, steptr);
/* function prototypes */
#endif
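
The routines declared above share one convention for packing per-character flags into arrays of `long`: hyprint in wagner.c locates character j (counting from 1) with `k = (j - 1) / bits + 1` and `l = (j - 1) % bits + 1`, then tests `(1L << l) & wagner[k - 1]`, so bit 0 of every word goes unused. The following is a minimal standalone sketch of that layout, not code from PHYLIP. `DEMO_BITS` is an assumed stand-in for the `bits` constant, which is defined outside this file, and the helper names `word_of`, `bit_of`, `set_wagner`, `is_wagner`, and `methods_string` are hypothetical.

```c
#include <assert.h>

#define DEMO_BITS 30  /* assumption: stands in for PHYLIP's `bits` */

/* 0-based index of the word holding character j (j counts from 1) */
static long word_of(long j)
{
  return (j - 1) / DEMO_BITS;
}

/* bit position, 1..DEMO_BITS, of character j in its word; bit 0 is unused */
static long bit_of(long j)
{
  return (j - 1) % DEMO_BITS + 1;
}

/* mark character j as using the Wagner method */
static void set_wagner(long *set, long j)
{
  set[word_of(j)] |= 1L << bit_of(j);
}

/* nonzero if character j is marked as Wagner */
static int is_wagner(const long *set, long j)
{
  return (set[word_of(j)] & (1L << bit_of(j))) != 0;
}

/* write one 'W' or 'S' per character into out, which must have room
   for nchars + 1 bytes; mirrors the per-character test in printmixture */
static void methods_string(const long *set, long nchars, char *out)
{
  long j;

  for (j = 1; j <= nchars; j++)
    out[j - 1] = is_wagner(set, j) ? 'W' : 'S';
  out[nchars] = '\0';
}
```

With `DEMO_BITS` set to 30, character 30 is the last one stored in the first word and character 31 lands in bit 1 of the second word. A caller would zero an array of `nchars / DEMO_BITS + 1` words, call `set_wagner` for each Wagner character, and read the flags back with `is_wagner` or `methods_string`.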