This file is indexed.

/usr/share/doc/phylip/html/doc/main.html is in phylip-doc 1:3.696+dfsg-1.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<TITLE>main</TITLE>
<META NAME="description" CONTENT="main">
<META NAME="keywords" CONTENT="PHYLIP, main, documentation">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
</HEAD>
<BODY BGCOLOR="#ccffff">
<P>
<DIV ALIGN="CENTER">
<H1>PHYLIP</H1>
<H2>Phylogeny Inference Package</H2>
<P>
<IMG SRC="phylip.gif" ALT="PHYLIP Logo">
<P>
<H3>Version 3.696</H3>
<P>
<H3>September, 2014</H3>
<P>
<H2>by Joseph Felsenstein</H2>
<P>
<BR>
<TABLE>
<TR><TD>
<FONT SIZE="+2">
Department of Genome Sciences and Department of Biology<BR>
University of Washington<BR>
<p>
address:<br>
Department of Genome Sciences<br>
Box 355065<BR>
Seattle, WA &nbsp;&nbsp;98195-5065<BR>
USA
</FONT>
</TD></TR>
</TABLE>
<H2>E-mail address:&nbsp;&nbsp;&nbsp; <TT>joe<!deathtospam>&nbsp;(at)&nbsp;<!deathtospam>gs.washington.edu</TT></H2>
</DIV>
<P>
<DIV ALIGN="CENTER">
<A NAME="contents"><HR><P></A>
<H2>Contents of This Document</H2></DIV>
<P>
<BR>
<A HREF="#contents">Contents of This Document</A>
<BR>
<A HREF="#description">A Brief Description of the Programs</A>
<BR>
<A HREF="#copyright">Copyright Notice for PHYLIP</A>
<BR>
<A HREF="#documentation">The Documentation Files and How to Read Them</A>
<BR>
<A HREF="#programs">What The Programs Do</A>
<BR>
<A HREF="#running">Running the Programs</A>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;A word about input files
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Installing a recent version of Oracle Java
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Running the programs on a Windows machine
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Running the programs on a Macintosh with Mac OS X
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Running the programs on a Unix or Linux system
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Running the programs on a Macintosh with Mac OS 8 or 9 (deprecated)
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Running the programs in MSDOS
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Running the Drawgram and Drawtree Java interfaces
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Running the Drawgram and Drawtree Java GUI interfaces in Windows
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Running the programs in background or under control of a command file
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;An example (Unix, Linux or Mac OS X)
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Subtleties (in Unix, Linux, or Mac OS X)
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;An example (Windows)
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Testing for existence of files
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Prototyping keyboard response files
<BR>
<A HREF="#inputfiles">Preparing Input Files</A>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Input and output files
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Where the files are
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Data file format
<BR>
<A HREF="#menu">The Menu</A>
<BR>
<A HREF="#outputfile">The Output File</A>
<BR>
<A HREF="#treefile">The Tree File</A>
<BR>
<A HREF="#options">The Options and How To Invoke Them</A>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Common options in the menu
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The <TT>U</TT> (User tree) option
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The <TT>G</TT> (Global) option
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The <TT>J</TT> (Jumble) option
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The <TT>O</TT> (Outgroup) option
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The <TT>T</TT> (Threshold) option
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The <TT>M</TT> (Multiple data sets) option
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The <TT>W</TT> (Weights) option
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The option to write out the trees into a tree file
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The (<TT>0</TT>) terminal type option
<BR>
<A HREF="#algorithm">The Algorithm for Constructing Trees</A>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Local rearrangements
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Global rearrangements
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Multiple jumbles
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Saving multiple tied trees
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Strategy for finding the best tree
<BR>
<A HREF="#warning">A Warning on Interpreting Results</A>
<BR>
<A HREF="#speed">Relative Speed of Different Programs and Machines</A>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Relative speed of the different programs
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Speed with different numbers of species
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Relative speed of different machines
<BR>
<A HREF="#comments">General Comments on Adapting the Package to Different Computer Systems</A>
<BR>
<A HREF="#compiling">Compiling the programs</A>
<BR>
<! ??? correct entries here >
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<A HREF="#unix">Unix and Linux</a>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<A HREF="#win">On Windows systems</a>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<A HREF="#cyg">Compiling with Cygnus Gnu C++</a>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<A HREF="#c++">Compiling with Microsoft Visual C++</a>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<A HREF="#mac">Macintosh</a>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<A HREF="#gcc">Compiling with GCC on Mac OS X with our Makefile</a>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<A HREF="#osx-x11">Compiling with GCC on Mac OS X with X Windows</a>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<A HREF="#metrowerks">What about
the Metrowerks Codewarrior compiler?</a>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<A HREF="#vax">VMS VAX systems</a>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<A HREF="#parallel">Parallel computers</a>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<A HREF="#other">Other computer systems</a>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<A HREF="#java">Compiling the Java interfaces</a>
<BR>
<A HREF="#FAQ">Frequently Asked Questions</A>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;How to make it do various things
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Background information needed:
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Questions about distribution and citation:
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Questions about documentation
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Additional Frequently Asked Questions, or: "Why didn't it occur to you to ...
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(Fortunately) obsolete questions
<BR>
<A HREF="#newfeatures">New Features in This Version</A>
<BR>
<A HREF="#future">Coming Attractions, Future Plans</A>
<BR>
<A HREF="#endorsements">Endorsements</A>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;From the pages of <I>Cladistics</I>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;... in the pages of other journals:
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;... and in the comments made by users when they register:
<BR>
<A HREF="#references">References for the Documentation Files</A>
<BR>
<A HREF="#credits">Credits</A>
<BR>
<A HREF="#otherprograms">Other Phylogeny Programs Available Elsewhere</A>
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;PAUP*
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;MrBayes
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;MEGA
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;PAML
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Phyml
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;RAxML
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;TNT
<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;DAMBE
<BR>
<A HREF="#helpme">How You Can Help Me</A>
<BR>
<A HREF="#trouble">In Case of Trouble</A>
<P>
<A NAME="description"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>A Brief Description of the Programs</H2></DIV>
<P>
<TT>PHYLIP</TT>, the Phylogeny Inference Package, is a package of programs for
inferring phylogenies (evolutionary trees).  It has been distributed since
1980, and has over 30,000 registered users, making it the most widely
distributed package of phylogeny programs.  It is available free, from
its web site:
<P>
<DIV ALIGN="CENTER">
<FONT SIZE=+2><A HREF="http://evolution.gs.washington.edu/phylip.html">
<TT>http://evolution.gs.washington.edu/phylip.html</TT></A></FONT>

</DIV>
<P>
<TT>PHYLIP</TT> is available as source code in C, and also as executables for
some common computer systems.  It can infer phylogenies by parsimony,
compatibility, distance matrix methods, and likelihood.  It can also
compute consensus trees, compute distances between trees, draw trees,
resample data sets by bootstrapping or jackknifing, edit trees, and
compute distance matrices.  It can handle data that are nucleotide
sequences, protein sequences, gene frequencies, restriction sites,
restriction fragments, distances, discrete characters, and continuous
characters.
<P>
<BR>
<A NAME="copyright"><HR><P></A>
<DIV ALIGN=CENTER>
<TABLE BORDER=4 WIDTH=80%><TR><TD ALIGN=LEFT>
<DIV ALIGN="CENTER">
<H2>Copyright Notice for PHYLIP</H2></DIV>
<P>
The copyright notice given below is intended to cover all source code, all
documentation, and all executable programs of the PHYLIP package.  This is
a "BSD 2-Clause License" which is open source.  It is not a GNU license and
does not insist that other materials distributed with PHYLIP be under a similar
license.
<P>
&#169; Copyright 1980-2014, Joseph Felsenstein<BR>
All rights reserved.
<P>
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
<P>
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
<P>
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
<P>
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
<P>
</TD></TR></TABLE></DIV>

<BR>
<A NAME="documentation"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>The Documentation Files and How to Read Them</H2></DIV>
<P>
<TT>PHYLIP</TT> comes with an extensive set of documentation files.  These
include the main documentation file (this one), which you should read
fairly completely.  In addition there are files for groups of programs,
including ones for the <A HREF="sequence.html">molecular sequence</A>
programs, the <A HREF="distance.html">distance matrix</A>
programs, the
<A HREF="contchar.html">gene frequency and continuous characters</A>
programs, the <A HREF="discrete.html">discrete characters</A> programs,
and the <A HREF="draw.html">tree drawing</A> programs.  Finally,
each program has its own documentation file.  References for the
documentation files are all gathered together in this main documentation
file.  A good strategy is to:
<OL>
<LI>Read this main documentation file.
<LI>Tentatively decide which programs are of interest to you.
<LI>Read the documentation files for the groups of programs that
contain those.
<LI>Read the documentation files for those individual programs.
</OL>
<P>
There is an excellent guide to using PHYLIP 3.6 also available.  It was written by
Jarno Tuimala of the Center for Scientific Computing in Espoo, Finland and is
available as a PDF <A
HREF="http://koti.mbnet.fi/tuimala/oppaat/phylip2.pdf">here</A>.  It is also
distributed at the main PHYLIP web site.
<P>
<A NAME="programs"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>What The Programs Do</H2></DIV>
<P>
Here is a short description of each of the programs.  For more detailed
discussion you should definitely read the documentation file for the
individual program and the documentation file for the group of programs
it is in.  In this list the name of each program is a link which will
take you to the documentation file for that program.  Note that there is no
program in the PHYLIP package called PHYLIP.
<DL>
<DT><STRONG><A HREF="clique.html">Clique</A></STRONG>
<DD>Finds the largest clique of mutually compatible characters, and
   the phylogeny which they recommend, for discrete character data with two 
   states.  The largest clique (or all cliques within a given size range of 
   the largest one) are found by a very fast branch and bound search method.  
   The method does not allow for missing data.  For such cases the <TT>T</TT>
   (Threshold) option of Pars or Mix may be a useful alternative. 
   Compatibility methods are particularly useful when some characters are of
   poor quality and the rest of good quality, but when it is not known in
   advance which ones are which.
<DT><STRONG><A HREF="consense.html">Consense</A></STRONG>
<DD>Computes consensus trees by the majority-rule consensus tree 
   method, which also allows one to easily find the strict consensus tree.  
   Is not able to compute the Adams consensus tree.  Trees are input in a tree
   file in standard nested-parenthesis notation, which is produced by many of
   the tree estimation programs in the package.  This program can be used as
   the final step in doing bootstrap analyses for many of the methods in the
   package.
<DT><STRONG><A HREF="contml.html">Contml</A></STRONG>
<DD>Estimates phylogenies from gene frequency data by maximum
   likelihood under a model in which all divergence is due to genetic drift in
   the absence of new mutations.  Does not assume a molecular clock.  An 
   alternative method of analyzing this data is to compute Nei's genetic 
   distance and use one of the distance matrix programs.
   This program can also do maximum likelihood analysis of continuous
   characters that evolve by a Brownian Motion model, but it assumes that
   the characters evolve at equal rates and in an uncorrelated fashion, so
   that it does not take into account the usual correlations of characters.
<DT><STRONG><A HREF="contrast.html">Contrast</A></STRONG>
<DD>Reads a tree from a tree file, and a data set with continuous
   characters data, and produces the independent contrasts for those
   characters, for use in any multivariate statistics package.  Will also
   produce covariances, regressions and correlations between characters for
   those contrasts.  Can also correct for within-species sampling variation
   when individual phenotypes are available within a population.
<DT><STRONG><A HREF="dnacomp.html">Dnacomp</A></STRONG>
<DD>Estimates phylogenies from nucleic acid sequence data using
   the compatibility criterion, which searches for the largest number of sites 
   which could have all states (nucleotides) uniquely evolved on the same 
   tree.  Compatibility is particularly appropriate when sites vary greatly in 
   their rates of evolution, but we do not know in advance which are the less 
   reliable ones.
<DT><STRONG><A HREF="dnadist.html">Dnadist</A></STRONG>
<DD> Computes four different distances between species from nucleic acid
   sequences.  The distances can then be used in the distance matrix programs.
   The distances are the Jukes-Cantor formula, one based on Kimura's 2-
   parameter method, the F84 model used in Dnaml, and the LogDet distance.
   The distances can also be corrected for gamma-distributed and
   gamma-plus-invariant-sites-distributed rates of change in different sites.
   Rates of evolution can vary among sites in a prespecified way, and also
   according to a Hidden Markov model.  The program can also make a table of
   percentage similarity among sequences.
<DT><STRONG><A HREF="dnainvar.html">Dnainvar</A></STRONG>
<DD>For nucleic acid sequence data on four species, computes
   Lake's and Cavender's phylogenetic invariants, which test alternative tree
   topologies.  The program also tabulates the frequencies of occurrence of the
   different nucleotide patterns.  Lake's invariants are the method which he
   calls "evolutionary parsimony".
<DT><STRONG><A HREF="dnaml.html">Dnaml</A></STRONG>
<DD> Estimates phylogenies from nucleotide sequences by maximum
   likelihood.  The model employed allows for unequal expected frequencies of
   the four nucleotides, for unequal rates of transitions and transversions,
   and for different (prespecified) rates of change in different categories of
   sites, and also use of a Hidden Markov model of rates, with
   the program inferring which sites have which rates.  This also
   allows gamma-distribution and gamma-plus-invariant sites distributions of
   rates across sites.
<DT><STRONG><A HREF="dnamlk.html">Dnamlk</A></STRONG>
<DD>Same as Dnaml but assumes a molecular clock.  The use of the
   two programs together permits a likelihood ratio test of the
   molecular clock hypothesis to be made.
<DT><STRONG><A HREF="dnamove.html">Dnamove</A></STRONG>
<DD>Interactive construction of phylogenies from nucleic acid
   sequences, with their evaluation by parsimony and compatibility and the
   display of reconstructed ancestral bases.  This can be used to find
   parsimony or compatibility estimates by hand.  
<DT><STRONG><A HREF="dnapars.html">Dnapars</A></STRONG>
<DD>Estimates phylogenies by the parsimony method using nucleic acid
   sequences.  Allows use of the full IUB ambiguity codes, and estimates 
   ancestral nucleotide states.  Gaps are treated as a fifth nucleotide state.
   It can also do transversion parsimony.  Can cope with multifurcations,
   reconstruct ancestral states, use 0/1 character weights, and infer
   branch lengths.
<DT><STRONG><A HREF="dnapenny.html">Dnapenny</A></STRONG>
<DD>Finds all most parsimonious phylogenies for nucleic acid
   sequences by branch-and-bound search.  This may not be practical (depending
   on the data) for more than 10-11 species or so.
<DT><STRONG><A HREF="dollop.html">Dollop</A></STRONG>
<DD>Estimates phylogenies by the Dollo or polymorphism parsimony
   criteria for discrete character data with two states (0 and 1).  Also
   reconstructs ancestral states and allows weighting of characters.  Dollo
   parsimony is particularly appropriate for restriction sites data; with
   ancestor states specified as unknown it may be appropriate for restriction
   fragments data.
<DT><STRONG><A HREF="dolmove.html">Dolmove</A></STRONG>
<DD>Interactive construction of phylogenies from discrete
   character data with two states (0 and 1) using the Dollo or polymorphism
   parsimony criteria.  Evaluates parsimony and compatibility criteria for
   those phylogenies and displays reconstructed states throughout the tree. 
   This can be used to find parsimony or compatibility estimates by hand. 
<DT><STRONG><A HREF="dolpenny.html">Dolpenny</A></STRONG>
<DD>Finds all most parsimonious phylogenies for
    discrete-character data with two states, for the Dollo or polymorphism
   parsimony criteria using the branch-and-bound method of exact search.  May
   be impractical (depending on the data) for more than 10-11 species. 
<DT><STRONG><A HREF="drawgram.html">Drawgram</A></STRONG>
<DD>Plots rooted phylogenies, cladograms, circular trees and phenograms in a
   wide variety of user-controllable formats.  The program is interactive.
   It has an interface in the Java language which gives it a closely
   similar menu on all three major operating systems.
   Final output can be 
   to a file formatted for one of the drawing programs, for a ray-tracing or
   VRML browser, or one that can be sent to a laser printer (such as Postscript
   or PCL-compatible printers), on graphics screens or terminals, on pen
   plotters or on dot matrix printers capable of graphics. Many of these formats
   are historic so we no longer have hardware to test them.
   If you find a problem please report it.
<DT><STRONG><A HREF="drawtree.html">Drawtree</A></STRONG>
<DD>Similar to Drawgram but plots unrooted phylogenies.  It also has a
Java interface for previews.
<DT><STRONG><A HREF="factor.html">Factor</A></STRONG>
<DD>Takes discrete multistate data with character state trees and 
   produces the corresponding data set with two states (0 and 1).  Written by 
   Christopher Meacham.  This program was formerly used to accommodate
   multistate characters in Mix, but this is less necessary now that Pars is
   available.
<DT><STRONG><A HREF="fitch.html">Fitch</A></STRONG>
<DD>Estimates phylogenies from distance matrix data under the
   "additive tree model" according to which the distances are expected to
   equal the sums of branch lengths between the species.  Uses the
   Fitch-Margoliash criterion and some related least squares criteria, or
   the Minimum Evolution distance matrix method.  Does
   not assume an evolutionary clock.  This program will be useful with
   distances computed from molecular sequences, restriction sites or fragments
   distances, with DNA hybridization measurements, and with genetic distances 
   computed from gene frequencies.
<DT><STRONG><A HREF="gendist.html">Gendist</A></STRONG>
<DD>Computes one of three different genetic distance formulas
   from gene frequency data.  The formulas are Nei's genetic distance, the
   Cavalli-Sforza chord measure, and the genetic distance of Reynolds et al.
   The former is appropriate for data in which new mutations occur in an
   infinite isoalleles neutral mutation model, the latter two for a model
   without mutation and with pure genetic drift.  The distances are written to
   a file in a format appropriate for input to the distance matrix programs.
<DT><STRONG><A HREF="kitsch.html">Kitsch</A></STRONG>
<DD>Estimates phylogenies from distance matrix data under the 
   "ultrametric" model which is the same as the additive tree model except 
   that an evolutionary clock is assumed.  The Fitch-Margoliash criterion and 
   other least squares criteria, or the Minimum Evolution criterion are
   possible.  This program will be useful with 
   distances computed from molecular sequences, restriction sites or
   fragments distances, with distances from DNA hybridization measurements, 
   and with genetic distances computed from gene frequencies. 
<DT><STRONG><A HREF="mix.html">Mix</A></STRONG>
<DD>Estimates phylogenies by some parsimony methods for discrete
   character data with two states (0 and 1).  Allows use of the
   Wagner parsimony method, the Camin-Sokal parsimony method, or arbitrary
   mixtures of these.  Also reconstructs ancestral states and allows weighting
   of characters (does not infer branch lengths).
<DT><STRONG><A HREF="move.html">Move</A></STRONG>
<DD>Interactive construction of phylogenies from discrete character
   data with two states (0 and 1).  Evaluates parsimony and compatibility
   criteria for those phylogenies and displays reconstructed states throughout
   the tree.  This can be used to find parsimony or compatibility estimates by 
   hand. 
<DT><STRONG><A HREF="neighbor.html">Neighbor</A></STRONG>
<DD>An implementation by Mary Kuhner and John Yamato of Saitou and
   Nei's "Neighbor Joining Method," and of the UPGMA (Average Linkage
   clustering) method.  Neighbor Joining is a distance matrix method producing
   an unrooted tree without the assumption of a clock.  UPGMA does assume a
   clock.  The branch lengths are not optimized by the least squares criterion
   but the methods are very fast and thus can handle much larger data sets.
<DT><STRONG><A HREF="pars.html">Pars</A></STRONG>
<DD>Multistate discrete-characters parsimony method.   Up to 8 states
  (as well as "<TT>?</TT>") are allowed.  Cannot do Camin-Sokal or Dollo Parsimony.
  Can cope with multifurcations, reconstruct ancestral states, use character
  weights, and infer branch lengths.
<DT><STRONG><A HREF="penny.html">Penny</A></STRONG>
<DD>Finds all most parsimonious phylogenies for discrete-character
   data with two states, for the Wagner, Camin-Sokal, and mixed parsimony
   criteria using the branch-and-bound method of exact search.  May be
   impractical (depending on the data) for more than 10-11 species. 
<DT><STRONG><A HREF="proml.html">Proml</A></STRONG>
<DD>Estimates phylogenies from protein amino acid sequences by maximum 
   likelihood.  The PAM, JTT, or PMB models can be employed, and also use
   of a Hidden Markov model of rates, with the program inferring which sites
   have which rates.  This also allows gamma-distribution and
   gamma-plus-invariant sites distributions of rates across sites.
   It also allows different rates of change at known sites.
<DT><STRONG><A HREF="promlk.html">Promlk</A></STRONG>
<DD>Same as Proml but assumes a molecular clock.  The use of the
   two programs together permits a likelihood ratio test of the
   molecular clock hypothesis to be made.
<DT><STRONG><A HREF="protdist.html">Protdist</A></STRONG>
<DD>Computes a distance measure for protein sequences, using
   maximum likelihood estimates based on the Dayhoff PAM matrix,
   the JTT matrix model, the PMB model, Kimura's 1983 approximation to these,
   or a model based on the genetic code plus a constraint on changing to a
   different category of amino acid.  The distances can also be corrected for
   gamma-distributed and gamma-plus-invariant-sites-distributed rates of change
   in different sites.  Rates of evolution can vary among sites in a
   prespecified way, and also according to a Hidden Markov model.  The program
   can also make a table of percentage similarity among sequences.  The
   distances can be used in the distance matrix programs.
<DT><STRONG><A HREF="protpars.html">Protpars</A></STRONG>
<DD>Estimates phylogenies from protein sequences (input using the
   standard one-letter code for amino acids) using the parsimony method, in
   a variant which counts only those nucleotide changes that change the amino
   acid, on the assumption that silent changes are more easily accomplished.
<DT><STRONG><A HREF="restdist.html">Restdist</A></STRONG>
<DD>Distances calculated from restriction sites data or
   restriction fragments data.  The restriction sites option is the one to
   use to also make distances for RAPDs or AFLPs.
<DT><STRONG><A HREF="restml.html">Restml</A></STRONG>
<DD>Estimation of phylogenies by maximum likelihood using
   restriction sites data (not restriction fragments but presence/absence of
   individual sites).  It employs the Jukes-Cantor symmetrical model of
   nucleotide change, which does not allow for differences of rate between
   transitions and transversions.  This program is <I>very</I> slow.
<DT><STRONG><A HREF="retree.html">Retree</A></STRONG>
<DD>Reads in a tree (with branch lengths if necessary) and allows
   you to reroot the tree, to flip branches, to change species names and
   branch lengths, and then write the result out.  Can be used to convert
   between rooted and unrooted trees, and to write the tree into a
   preliminary version of a new XML tree file format which is under
   development and which is described in the
   <A HREF="retree.html">Retree documentation web page</A>.
<DT><STRONG><A HREF="seqboot.html">Seqboot</A></STRONG>
<DD>Reads in a data set, and produces multiple data sets from
   it by bootstrap resampling.  Since most programs in the current version of
   the package allow processing of multiple data sets, this can be used
   together with the consensus tree program Consense to do bootstrap (or
   delete-half-jackknife) analyses with most of the methods in this package.
   This program also allows the Archie/Faith technique of permutation of
   species within characters.  It can also rewrite a data set to convert
   it between the PHYLIP Interleaved and Sequential forms, and into
   a preliminary version of a new XML sequence alignment format
   which is under development and which is described in the
   <A HREF="seqboot.html">Seqboot documentation web page</A>.
<DT><STRONG><A HREF="threshml.html">Threshml</A></STRONG>
<DD>Reads a tree from a tree file, and a data set with discrete 0/1
   characters.  Using the threshold model of quantitative genetics,
   the program runs a Markov Chain Monte Carlo (MCMC) sampler to
   sample the underlying continuous characters (the liabilities) that
   cause the discrete characters.  The covariances of the liabilities
   are estimated, as well as the transformation from the liabilities to
   underlying independently evolving characters.
<DT><STRONG><A HREF="treedist.html">Treedist</A></STRONG>
<DD> Computes the Branch Score distance between trees, which allows for
differences in tree topology and which also makes use of branch lengths.  Also
computes another distance by Robinson and Foulds that uses branch lengths,
and the Symmetric Difference distance between trees, which
allows for differences in tree topology but does not use branch lengths.
</DL>
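<P>
To make one of the distance formulas mentioned above more concrete, here is a
small sketch in Python of the Jukes-Cantor distance that the Dnadist entry
refers to.  It is only an illustration, not the code that Dnadist uses: the
function names are made up for this example, and it simply counts mismatched
sites without any special handling of gaps, ambiguity codes, or unequal rates
among sites.
<PRE>
```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of aligned sites at which the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differing = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differing / len(seq_a)


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance d = -(3/4) ln(1 - (4/3) p).

    p is the observed fraction of differing sites.  The formula has no
    finite value once p reaches 3/4, so that case is reported as an error.
    """
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        raise ValueError("distance is undefined when p >= 3/4")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```
</PRE>
The corrected distance is always at least as large as the raw fraction of
differing sites, because it allows for multiple changes at the same site.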
<P>
<A NAME="running"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>Running the Programs</H2></DIV>
<P>
This section assumes that you have obtained PHYLIP as compiled executables
(for Windows, Mac OS X, or Linux), or else you have obtained the source code
and compiled it yourself (for Linux, Unix, Mac OS X, or Windows).
For the programs <A HREF="#drawpgms">Drawtree</A> and <A HREF="#drawpgms">Drawgram</A> you will also 
need a recent version of Java installed on your computer to run them interactively.
Note that for machines for
which compiled executables are available, there will usually be no need for
you to have a compiler or compile the programs yourself.  This section
describes how to run the programs.  Later in this document we will
discuss how to download and install PHYLIP (in case you are
reading this without yet having done that).  Normally you will only read
your copy of the documentation files after downloading and installing PHYLIP.
<P>
After describing the input files, we will describe how to run most of
the programs on
Windows, Mac OS X, Linux, and Unix systems. After that, we will give
special descriptions of the interactive Java interface for the tree-drawing
programs Drawgram and Drawtree, including how to run these interfaces on
Windows, Mac OS X, and Linux systems. These may require you to
download and install on your computer the most recent version of Oracle
Java, which is available from Oracle at no cost. We describe this below after
discussing input files.
<P>
<H3>A word about input files.</H3>
<P>
For all of these types of machines, it is
important to have the input files for the programs (typically data files)
prepared in advance.  They can be prepared in any editor, but it is important
that they be saved in Text Only ("flat ASCII") format, not in the format that
word processors such as Microsoft Word want to write (in Microsoft Word,
make sure that the data encoding used is "US ASCII", as using any of the
Unicode codings can cause trouble).  It is up to you to read
the PHYLIP documentation files which describe the file formats that are
needed.  There is a partial description in the next section of this document.
The input files can also be obtained by running a program that
produces output files in PHYLIP format (some of these programs do, and so do
programs by others, including sequence alignment programs such as ClustalW and
sequence format conversion programs such as Readseq).   There is <I>not</I> any
input file editor available in any program in PHYLIP (you should <I>not</I>
simply start running one of the programs and then expect to click a mouse
somewhere to start creating a data file).  
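<P>
If you are unsure whether your editor really saved a data file as plain ASCII,
a short script can check it for you.  The following Python sketch is not part
of PHYLIP, and the function name is made up for this example; it simply
reports where any byte outside the 7-bit ASCII range occurs.
<PRE>
```python
def find_non_ascii(data):
    """Return (line, column) positions of bytes outside 7-bit ASCII.

    `data` is the raw content of the file as a bytes object, for example
    the result of open(name, "rb").read().  Lines and columns count from 1.
    """
    hits = []
    for line_no, line in enumerate(data.splitlines(), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 127:
                hits.append((line_no, col))
    return hits
```
</PRE>
An empty list means that every byte is plain ASCII.  Any positions that are
reported are places where the editor wrote something else, such as an accented
letter or a Unicode byte-order mark.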
<P>
When they start running, the programs look first for input files with
particular names (such as <TT>infile</TT>, <TT>treefile</TT>,  <TT>intree</TT>, or <TT>fontfile</TT>).
Exactly which file names they look for varies a bit from program to program,
and you should read the documentation file for the particular program to
find out.  If you have files with those names the programs will use them
and not ask you for the file name.  If they do not find files of those
names, the programs will say that they cannot find a file of that name, and
ask you to type in the file name.
For example, if Dnaml looks
for the file <TT>infile</TT> and does not find one of that name,
it prints the message:
<P>
<TABLE><TR><TD BGCOLOR=white>
<TT>dnaml: can't find input file "infile"<BR>
Please enter a new file name></TT>
</TD></TR></TABLE>
<P><I>This does not mean that an error
has occurred.</I>  All you need to do is to type in the name of the file.
<P>
The program looks for the input files in the same folder that the
program is in (a folder is the same thing as a "directory").  In Windows, 
Mac OS X, Linux, or Unix, if you are asked for the
file name you can type in the path to the file, as part of the name (thus,
if the file is in the folder containing the current folder, you can type in
a file name such as <TT>../myfile.dna</TT>).   If folders and paths are
unfamiliar to you, it is worth asking someone to explain them, or reading
your operating system's documentation about them, before running the programs.
<P>
<H3>Running the programs on a Macintosh with Mac OS X</H3>
<P>
We have provided a Mac OS X version of the executables, in the form of
"universal binaries" that should run either on PowerMac or Intel iMac
systems (to ensure that they will run on both 32-bit and 64-bit
Mac OS X systems, we have made sure that we compiled the
executables as 32-bit executables).  The programs can be
run by clicking on their icons.  They open a Terminal window, and the
menu appears in it.  Note that after the program is finished, the Terminal
window remains open, and operations can be done in it.   You will have to
close the window yourself if you don't want it.  The programs can be
terminated by typing control-C (press down the "control" key in the
lower-left corner of the keyboard and type "c").
<P>
It is also possible to run the executables from within a
Terminal window by typing the program name, but this is a little harder.
You will find the Terminal utility available in the Utilities folder in the
Applications folder.
You do need to have links made in the <tt>exe</tt> folder to the
programs.  This can be done the first time you need them, by entering
the <tt>exe</tt> folder and opening a Terminal window, and then typing
<tt>source linkmac</tt>.  This creates the proper links, and thereafter
you do not need to do this again.  The programs can be run by typing
their names in a Terminal window whose current working directory is
<tt>exe</tt>.  The programs work well this way,
though the programs Drawgram and Drawtree may be slow to
open and close plotting windows.
The programs can be terminated by typing control-C or by
closing the Terminal window by using the red button in the upper-left
corner of the window.
<P>
One problem we have often encountered using Mac OS X is that it is possible for
data files to have the wrong kind of characters at the ends of their lines.
They may have carriage-return (ASCII/ISO 13 or control-M) characters at the
ends of their lines when they should instead have the Unix newline character
(ASCII/ISO 10 or control-J) there.  This can happen with files transferred
from other operating systems or files produced in some word processors.
It results in segmentation-fault or memory errors.  If you encounter these,
check this possibility carefully.
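<P>
The check described above can be sketched as a pair of shell functions for
a Terminal window.  This is an illustration only: the function names
<TT>has_cr</TT> and <TT>fix_line_endings</TT> are hypothetical and are not
part of PHYLIP, and the sketch assumes only the standard <TT>grep</TT>,
<TT>tr</TT>, and <TT>wc</TT> utilities.

```shell
# Hypothetical helpers, not part of PHYLIP.

# Succeeds (exit status 0) if the file named in $1 contains any
# carriage-return (ASCII 13) character.
has_cr() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Writes a copy of file $1 to file $2 with Unix newlines (ASCII 10).
fix_line_endings() {
    # Count the newline characters already present in the file.
    lf_count=$(( $(tr -cd '\n' < "$1" | wc -c) ))
    if [ "$lf_count" -gt 0 ]; then
        # Lines end in carriage-return plus newline: drop the carriage returns.
        tr -d '\r' < "$1" > "$2"
    else
        # Lines end in a carriage return alone: turn each one into a newline.
        tr '\r' '\n' < "$1" > "$2"
    fi
}
```

<P>
For example, <TT>fix_line_endings mydata.txt infile</TT> would leave
<TT>mydata.txt</TT> untouched and write the corrected copy to <TT>infile</TT>
(<TT>mydata.txt</TT> is a made-up name).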
<P>
If you normally run Mac OS X applications using <TT>open 
-a</TT>, you may need to use the command <TT>lsregister -f -r 
/your/path/to/apps</TT>. You can find it with the command 
<TT>locate lsregister</TT>.
<P>
<H3>Running the programs on a Unix or Linux system.</H3>
<P>
Type the name of the program
in lower-case letters (such as <TT>dnaml</TT>).  To terminate the program while
it is running, type Control-C (which means to press down on the <TT>Ctrl</TT> key
while typing the letter <TT>C</TT>).
<P>
On some systems you may need to type <TT>./</TT> before the program name,
so that in the above case it would be <TT>./dnaml</TT>.  This is mostly
needed if the user's PATH does not include their current directory, something
which is often done as a security precaution.
<P>
<H3>Running the programs on a Macintosh with Mac OS 8 or 9 (deprecated)</H3>
<P>
We no longer produce and distribute Mac OS 8 and Mac OS 9 executables
of the Phylip programs, as we no longer have access to these operating
systems to produce and test them.  As a last resort, only
if you do not have access to a system that will run the current 
distribution, you have two choices:
<ul>
<li>fetch and run <a href="http://evolution.gs.washington.edu/phylip/getme.html#os89">
out of date Mac OS 8 / OS 9 executables</a>, or
<li>attempt to <a href="#metrowerks">compile the current source with the Metrowerks compiler</a>.
</ul>
Once you have the executables, you may follow the directions below.
<ul>
<li>
Double-click on the icon for
the program.  A window should open.  Further dialog with the program occurs
by typing on the keyboard in response to what you see in the window.  The
programs can be terminated by using
the mouse to open the <TT>File</TT> menu in the upper-left corner of the program's
window area and then selecting <TT>Quit</TT>.  Alternatively, you can use the
Command-Q key combination.
<li>
When you use Quit, the program will ask you whether you want to save
a file whose name is the program name (often followed by <TT>.out</TT>) -- for
example, if you are using Dnaml it will ask you if you want to save file
<TT>Dnaml.out</TT>.  This file is simply a record of everything that was
displayed in the program window, and you usually will not want to save it.
Pressing the <TT>Enter</TT> key or selecting the Do Not Save button with
the mouse will keep this from being saved.
<li>
If you encounter memory limitations on a Mac OS 8 or 9 Macintosh,
and determine that
this is not due to a problem with the format of the input file, as it
often will be, you may be able to solve it by raising the limits of the
stack and heap sizes of the program.  To do this click on the program
and then select <TT>Get Info</TT> from the Finder <TT>File</TT> menu.
This will open a window which can be made to show the memory limits
of the program.  These can be changed by selecting them and typing in
larger numbers.  This may relieve nagging memory problems.  If it does
not, consult your local documentation and suspect problems with your
input file format.
</ul>
<P>
<A NAME="drawpgms"><HR><P></A>
<H3>Running the Drawgram and Drawtree Java interfaces</H3>
<P>
With version 3.695 we have released an interactive Java interface for 
the tree-drawing programs, Drawgram and Drawtree.  The reason 
is that the graphic interface language for Mac OS X has changed from the Carbon GUI
to the Cocoa GUI, which would require a lot of rewriting of code. 
The alternative X11 (X Windows) GUI machinery on Mac OS X has
been deprecated by Apple, and is showing its age on Linux systems.
<P>
Looking at available options, it seemed best to use Java to construct GUI
interfaces, as this could be done in a reasonably compatible way across all
three major platforms. There are disadvantages too -- to get full
compatibility we need to ask users to download the most recent available
Java from its maker, Oracle.  That is not difficult but is a tiresome extra
step.  Oracle owns Java, and Java is not public-source, but there seems to be
no sign that Oracle is going to make Java runtime machinery unavailable
or charge for it.
<P>
Not all Java implementations will run PHYLIP's Drawgram and Drawtree
GUIs. A reasonably compatible Java is distributed with Mac OS X, but no
Java is distributed along with Windows, and the Java distributed with
Linux distributions is unfortunately not compatible enough with our
Java GUI.  So for these two platforms you will need to download Oracle Java. We
will give you instructions for that below.
<P>
The new GUI for Drawgram and Drawtree is a testbed for a general set of GUI
interfaces for all our programs, which will be present in version 4.0 when
that is distributed, which will be soon.  The work you do to put a recent
version of Oracle Java on your system will make using version 4.0 easier.
<P>
For people who use Drawgram or Drawtree in a "pipeline" run by shell scripts,
there should be no interruption in your ability to do that.  
The current C code for those programs can either be called by the Java GUI
or be run from
a command line or a shellscript (for which see below). Almost all of the
features of Drawgram and Drawtree are available from their character-mode
menu when run that way, except for the interactive previewing of plots. We
hope that the shell scripts will still work and will not need modification
for this version of PHYLIP.
<P>
<H3>Running the Drawgram and Drawtree Java GUI interfaces in Windows</H3>
<P>
To run the Drawgram or Drawtree programs, you find the Drawgram.jar or
Drawtree.jar files, which are Java Archive files in our folder of
executable programs.  You can run them
by clicking on their icons. Detailed instructions for using the
interfaces are given in the general documentation file for tree-drawing
programs <a href="draw.html"><tt>draw.html</tt></a> (which you should read), 
and the documentation files for the
two programs <a href="drawgram.html"><tt>drawgram.html</tt></a>
and <a href="drawtree.html"><tt>drawtree.html</tt></a>.
<P>
<H3> Installing a recent version of Oracle Java</H3>
<P>
To run the interactive interfaces of the tree-drawing programs Drawgram and
Drawtree, you need to have an appropriate version of Java installed on your
computer. If you have Java installed, you should test whether it is an
appropriate version by trying to run Drawgram or Drawtree (for this you
will need an input tree file present as well).  Is it likely that you
have a compatible Java on your system?
<ul>
<li> On <b>Mac OS X</b> systems you are likely to have a compatible version
of Java.
<li> On <b>Windows</b> systems no Java implementation is installed by default. You can
download a recent Oracle Java on your Windows system by using <a
href="http://www.java.com/en/">this link</a> and following the instructions
there.
<li> On some <b>Linux</b> systems there are Java installations which are not
compatible with our Java interfaces.  This is the result of licensing
issues.  You can remedy the situation by downloading a recent Oracle Java
version and installing it:
<ul>
<li> On Debian-based Linux systems such as Ubuntu and Linux Mint, you
can download Java from <a href="http://www.java.com/en/download/help/linux_install.xml">this link</a> and install it. If you do not have administrator privileges on
the Linux system, you can install it in your own folders.
<li> On Linux systems that use the RPM package management system (including
Red Hat, Fedora, SUSE, and Mandriva) you can use <a
href="http://www.java.com/en/download/help/linux_install.xml#download">these
instructions from Oracle</a> to install Java, but you must have administrator
privileges on your Linux system.
</ul>
</ul>
Once a useable version of Java is installed, you do not have to repeat the
installation every time you run one of the programs Drawgram or Drawtree.
<P>
<H3>Running the programs on a Windows machine.</H3>
<P>
Double-click on the icon for
the program.  A window should open with a menu in it.  Further dialog with the
program occurs
by typing on the keyboard in response to what you see in the window.  The
programs can be terminated either by typing Control-C (which means to
press down on the <TT>Ctrl</TT> key while typing the letter <TT>C</TT>), or by using
the mouse to open the <TT>File</TT> menu in the upper-left corner of the program's
window area and then selecting <TT>Quit</TT>.  Other than this, most PHYLIP programs
make no use of the mouse.  The tree-drawing programs Drawtree and Drawgram
do allow use of the mouse to select some options.
<P>
The programs open a window for their menus.  This window may be too small for
your tastes. It can be resized by tugging on the lower-right corner of the
window.  In addition, the font may be too small.  On most versions of Windows,
you can click on the small C:\ icon
symbol at the upper-left corner of the window, and choose the
<TT>Properties</TT> menu choice there.  One of its tab options allows you
to change the font and size of
the print.  I prefer large font sizes such as 16x12.
<P>
The programs can also be run in a Command Prompt window under Windows, in much 
the same way as they were under the MSDOS operating system, which is what the
Command Prompt window emulates.   Command Prompt windows can be opened by
choosing that option in the Accessories menu which is in the All Programs menu.
Once in the Command Prompt window, make sure that you are in the
correct folder, using the <tt>cd</tt> command as needed to find the folder
where the executable PHYLIP programs are.  Then type the name of the program
that you want to use in lower-case letters (such as 
<TT>dnaml</TT>).  To terminate the program while
it is running, type Control-C (which means to press down on the <TT>Ctrl</TT> key
while typing the letter <TT>C</TT>).  
<P>
<H3>Running the programs in background or under control of a command file</H3>
<P>
In running the programs, you may sometimes want to put them in background
so you can proceed with other work.  On systems with a windowing environment
they can be put in their own window, and commands like the Unix and Linux
<TT>nice</TT> command used to make
them have lower priority so that they do not interfere with interactive
applications in other windows.  This part of the discussion will
assume either a Windows system or a Unix or Linux system.  I will
note when the commands work on one of these systems but not the other.
Mac OS X is actually Unix (surprise! surprise!) and you can
run PHYLIP programs in background on any Mac OS X system by simply following
the instructions for Unix, using a terminal window to do so if necessary.
(The Terminal utility can be found in the Utilities folder which is
inside the Applications folder).
<P>
If there is no windowing environment, or if you want to make PHYLIP programs
part of a larger workflow of some sort, on a Unix or Linux system you will 
want to use an
ampersand (<TT>&amp;</TT>) after the command file name when invoking it to put the
job in the background.  You will have to put all the responses to the
interactive menu of the program into a file and tell the background job
to take its input from that file (we cover this below).
<P>
On Windows systems there is no <TT>&amp;</TT> or <TT>nice</TT> command,
but input and output redirection and command files work fine in a Command Prompt
window.   A command file can either be invoked by clicking on its icon or
by typing its name from a Command Prompt window.  The file of commands must 
have a name ending in <TT>.bat</TT> or <tt>.cmd</tt>, such as 
<TT>foofile.bat</TT>.  You can 
run the batch file from a Command Prompt window by typing its name (such as
<tt>foofile</tt>) without the <tt>.bat</tt>.
<P>
Here are examples, for the different operating systems:
<P>
<H4>An example (Unix, Linux or Mac OS X)</H4>
<P>
Here is an example for Unix, Linux, or a Terminal window of
Mac OS X.   Below you will find a separate example for Windows.  If you
are using Windows you should read that section instead.
<P>
Suppose you want to run Dnaml in the background, taking its
input data from a file called <TT>sequences.dat</TT>, putting its interactive
output into a file called <TT>screenout</TT>, and using a file called <TT>input</TT> as
the place to store the interactive input.  The file <TT>input</TT> need only
contain two lines:
<P>
<TABLE><TR><TD bgcolor=white>
<PRE>
sequences.dat
Y
</PRE>
</TD></TR></TABLE>
<P>
which is what you would have typed to run the program interactively: the
first line in response to the program's request for an input file name if it did not
find a file named <TT>infile</TT>, and the second in response to the menu.
<P>
To run the program in background, in Unix or Linux you would simply give the command:
<P>
<TT>dnaml &lt; input &gt; screenout &amp;
</TT>
<P>
This runs the program with input responses coming from <TT>input</TT> and
interactive output being put into file <TT>screenout</TT>.  The usual output
file and tree file will also be created by this run (keep that in mind,
as if you run any other PHYLIP program from the same directory while
this one is running in background you may overwrite the output file from
one program with that from the other!).
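<P>
One way to avoid such collisions is to give each background run a folder of
its own.  The following is a minimal sketch, not part of PHYLIP: the function
name <TT>run_in_own_dir</TT> and the folder name <TT>run1</TT> are
hypothetical, and it assumes that the program can be found through your PATH.
Because the data file is copied in under the default name <TT>infile</TT>,
the file of keyboard responses then needs to hold only the menu responses
(here just <TT>Y</TT>), not a file name.

```shell
# Hypothetical helper, not part of PHYLIP: run one program in the background
# inside its own folder, so that its outfile and outtree cannot collide with
# those of another run.
#   $1 = folder for this run      $2 = program name (for example dnaml)
#   $3 = data file                $4 = file of keyboard responses
run_in_own_dir() {
    mkdir -p "$1" || return 1
    cp "$3" "$1/infile" || return 1
    cp "$4" "$1/input" || return 1
    # The parentheses start a subshell, so the caller's current folder is
    # unchanged.  2>&1 also sends error messages to screenout.
    ( cd "$1" && "$2" < input > screenout 2>&1 ) &
}

# Example, assuming dnaml is on your PATH:
#   run_in_own_dir run1 dnaml sequences.dat input
#   wait    # optional: pause until the background run has finished
```

<P>
The output file and tree file of that run then appear inside <TT>run1</TT>
rather than in the folder you started from.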
<P>
<H4>Subtleties (in Unix, Linux, or Mac OS X)</H4>
<P>
If you wanted to give the program lower priority, so that it would
not interfere with other work, and you have Berkeley Unix type job control
facilities in your Unix or Linux (and you usually do), you can use the
<TT>nice</TT> command:
<P>
<TT>nice +10 dnaml &lt; input &gt; screenout &amp;
</TT>
<P>
which lowers the priority of the run.  To also time the run and put the
timing at the end of <TT>screenout</TT>, you can do this:
<P>
<TT>nice +10 ( time dnapars &lt; input ) &gt;&amp; screenout &amp;
</TT>
<P>
which I will not attempt to explain.
<P>
On Unix or Linux systems
you may also want to explore putting the interactive output into the
null file <TT>/dev/null</TT> so as to not be bothered with it (but then you
cannot look at it to see why something went wrong).  If you have problems
with creating output files that are too large, you may want to
explore carefully the turning off of options in the programs you run.
<P>
If you are doing several runs in one, as for example when you do a
bootstrap analysis using Seqboot, Dnapars (say), and Consense, you
can use an editor to create a "command file" with these commands:
<P>
<TABLE><TR><TD bgcolor=white>
<PRE>
seqboot &lt; input1 &gt; screenout
mv outfile infile
dnapars &lt; input2 &gt;&gt; screenout
mv outtree intree
consense &lt; input3 &gt;&gt; screenout
</PRE>
</TD></TR></TABLE>
<P>
The command file might be named something like
<TT>foofile</TT>.
<P>
It must be given
execute permission by using the command  <TT>chmod +x foofile</TT>.
The job that <TT>foofile</TT> describes
can be run in background on Unix or Linux by giving the command
<P>
<TT>foofile &amp;</TT>
<P>
Note that you must also have the interactive input
commands for Seqboot (including the random number seed), Dnapars, and
Consense in the separate files <TT>input1</TT>, <TT>input2</TT>, and <TT>input3</TT>.
<P>
<h4>An example (Windows)</H4>
<P>
Suppose you have a Windows system and want to run Dnaml in the background, taking its
input data from a file called <TT>sequences.dat</TT>, putting its interactive
output into a file called <TT>screenout</TT>, and using a file called <TT>input</TT>
as
the place to store the interactive input.  The file <TT>input</TT> need only
contain two lines:
<P>
<TABLE><TR><TD bgcolor=white>
<PRE>
sequences.dat
Y
</PRE>
</TD></TR></TABLE>
<P>
which is what you would have typed to run the program interactively: the
first line in response to the program's request for an input file name if it did not
find a file named <TT>infile</TT>, and the second in response to the menu.
<P>
To run the program in background, you can place the command
<P>
<TT>dnaml &lt; input &gt; screenout
</TT>
<P>
in a file called something like <TT>foofile.bat</TT>.  This "batch file" that
has commands and has its name end in <tt>.bat</tt> or <tt>.cmd</tt>
can be run simply by double-clicking on the file icon, which will usually
have a picture of a gear.   A Command Prompt window (an MSDOS window) will then
open and the commands in the batch file will be run in it.   Alternatively,
you can open a Command Prompt window yourself.   It will be found in the
All Programs menu, as one of the options under Accessories.  Make sure that
after it opens, you tell it to change its working directory to the one that
has the batch file in it.
<P>
The batch file with this command runs the program with input responses coming 
from <TT>input</TT> and
interactive output being put into file <TT>screenout</TT>.  The usual output
file and tree file will also be created by this run (keep that in mind
as, if you run any other PHYLIP program from the same directory while
this one is running in background, you may overwrite the output file from
one program with that from the other!).
<P>
<h4>Testing for existence of files</H4>
<P>
Note also that when PHYLIP programs attempt to open a new output file (such as
<TT>outfile</TT>, <TT>outtree</TT>, or <TT>plotfile</TT>), if they see
a file of that name already in existence they will ask you if you want to
overwrite it, and offer alternatives including writing to another file,
appending information to that file, or quitting the program without writing to
the file.  This means that in writing batch files it is important to know
whether there will be a prompt of this sort.  You must know in advance
whether the file will exist.  You may want to put in your batch file a
command that tests for the existence of a pre-existing output file and
if so, removes it, such as these commands in Unix, Linux, or Mac OS X:
<P>
<TABLE><TR><TD bgcolor=white>
<PRE>
if test -e fubarfile
then
   rm fubarfile
fi
</PRE>
</TD></TR></TABLE>
<P>
You might even want to put in a command that creates a
file of that name, so that you can be sure it is there!  Either way,
you will then know whether to put into your file of keyboard responses the
proper response to the inquiry about overwriting that output file.
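<P>
The removal test shown above can be extended to cover all of the usual
output files at once.  This is a sketch only: the function name
<TT>clear_outputs</TT> is hypothetical, and the list of file names should be
adjusted to the output files that your programs actually write.

```shell
# Hypothetical helper, not part of PHYLIP: remove any leftover default
# output files from the current folder, so that the next program run will
# not stop to ask about overwriting them.
clear_outputs() {
    for f in outfile outtree plotfile; do
        if test -e "$f"; then
            rm "$f"
        fi
    done
}
```

<P>
Calling <TT>clear_outputs</TT> at the top of a command file means that the
file of keyboard responses does not need a response to the overwrite inquiry
for the first program that is run.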
<P>
In a Windows batch file, the corresponding test can be written on one line as
<TT>if exist fubarfile del fubarfile</TT>.
<P>
<H4>Prototyping keyboard response files</H4>
<P>
Making the proper files of keyboard responses for use with command
files is most easily done if you prototype the process by simply
running the program and keeping a careful record of the keyboard
responses that you need to give to get the program to run properly.
Then create a file in an editor and type those keyboard responses into
it.  Thus if the program requires that you answer a question about
what to do with the output file with a keyboard response of R,
then wants you to type a menu selection of U (to have it use a User tree),
then wants you to answer Y to end the menu, and another R to tell it to
replace the output file, you would have the file of keyboard responses
be
<P>
<TABLE><TR><TD bgcolor=white>
<TT>
R<BR>
U<BR>
Y<BR>
R
</TT>
</TD></TR></TABLE>
<P>
Since when you run the program interactively, each keyboard
response is ended by pressing the Enter key on your keyboard,
in the file of keyboard responses you must put each response
on a line of its own.
<P>
Testing the keyboard responses with an interactive run will
be essential to having batch runs succeed.
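<P>
Once the responses are known, the file can also be written by a shell
command instead of being typed in an editor.  The sketch below assumes a
Unix, Linux, or Mac OS X shell; the function name <TT>make_responses</TT> is
hypothetical and not part of PHYLIP.

```shell
# Hypothetical helper, not part of PHYLIP.
#   $1 = name of the keyboard-response file to write
#   remaining arguments = the responses, in the order they will be needed
make_responses() {
    responses_file=$1
    shift
    # printf repeats its format for every argument, so each response is
    # written on a line of its own, ended by a newline.
    printf '%s\n' "$@" > "$responses_file"
}

# Example matching the four responses listed above:
#   make_responses input R U Y R
```

<P>
Keeping this command in the same command file that runs the program keeps
the responses and the run together, which makes it easier to see later what
was actually done.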
<P>
<A NAME="inputfiles"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>Preparing Input Files</H2></DIV>
<P>
The input files for PHYLIP programs must be prepared separately - there is
no data editor within PHYLIP.  You can use a word processor (or text
editor) to prepare them yourself, or you can use a program that produces
a PHYLIP-format output.
<P>
With the 3.695 release of Phylip we have included a directory called <TT>TestData</TT> which
contains the data used to generate the examples shown in the individual program html pages
and the output files they produce. Within this TestData directory there is a subdirectory 
that has the name of the program (for example <TT>contrast</TT>) and within that there are the files
<TT>contrastinfile.txt</TT>, <TT>contrastintree.txt</TT> and <TT>contrastoutfile.txt</TT>. 
If you look at the <A HREF="contrast.html">Contrast</A> documentation you can see <TT>infile</TT>,
 <TT>intree</TT>, and <TT>outfile</TT> mentioned in the example. The <TT>testdata/contrast/*.txt</TT>
  files exactly match those in the example, so if you wish to experiment with Contrast
  you have both a good <TT>infile</TT> and a good <TT>intree</TT> and the <TT>outfile</TT> expected 
  from the example, if you set your conditions to match the example.
<P>
Sequence alignment programs such as ClustalW
commonly have an option to produce PHYLIP files as output, and some
other phylogeny programs, such as MacClade and TreeView, are capable of
producing a PHYLIP-format file.
<P>
It is very important that the input files be in "Text Only" or "ASCII" format.  This means that they contain only printable ASCII/ISO
characters, and not any unprintable characters.  Many word processors such
as Microsoft Word save their files in a format that contains unprintable
characters, unless you tell them not to.  In the Microsoft Word family of
word processors, the first time you edit a file, when you go to <tt>Save</tt>
in the <tt>File</tt> menu,
the program will instead do a <tt>Save As</tt> function, and ask you
in what format you want the file to be written.
<ul>
<LI> If you are using Microsoft Word, choose <tt>Plain Text</tt>.  A box will
open (or on the Mac OS X version of Word, an <tt>Option</tt> button will be
available) in which you can choose the setting <tt>US-ASCII</tt>.  The
settings that start with <tt>Western European</tt> also should work.  None of
the other encodings are likely to work.
<LI> If you are using WordPad, choose <tt>Text Document (*.txt)</tt>.  Do
<em>not</em> choose <tt>Unicode Text Document</tt>.
<LI> If you are using Notepad, choose <tt>Text Document</tt> and then choose
<tt>ANSI</tt> in the list of encodings, not <tt>Unicode</tt>
or <tt>UTF8</tt>.
<p>
<LI>If you are on Mac OS X and using its own document editor TextEdit,
you may need to use the Make Plain Text choice in the Format menu.
Once that is done, TextEdit also has a checkbox in the Save As
window that defaults to providing a <tt>.txt</tt> extension at the end of the
file name -- if you don't want that to happen, uncheck that box.
Save As also may have a check box that defaults to hiding the three-letter 
extension of the file, so
that when the file is saved as (say) <tt>foofile.txt</tt> its name will appear 
to not have the extension <tt>.txt</tt> at the end of the file name, even 
though it really is there.  It is best to uncheck that box.
</ul>
<P>
For these word processors, the next time you edit the same file, using 
<tt>Save</tt>, the program should use those settings without asking you.  If 
you have some trouble getting an input file that the programs can read, look 
into whether you properly set these options.  This can usually be done by 
using the <tt>Save As</tt> choice in the <tt>File</tt> menu and making the 
right settings.
<P>
Text editors such as the <TT>vi</TT> and <TT>emacs</TT> editors on
Unix and Linux (and available on Mac OS X too), or the <TT>pico</TT>
editor that comes with the <TT>pine</TT>
mailer program, produce their files in Text Only format and should not
cause any trouble.
<P>
The format of the input files is discussed below, and you should also
read the other PHYLIP documentation relevant to the particular type of
data that you are using, and the particular programs you want to run, as
there will be more details there.
<P>
<H3>Input and output files</H3>
<P>
For most of the PHYLIP programs, information comes from a series of
input files, and ends up in a series of output files:
<P>
<DIV ALIGN="CENTER">
<TABLE>
<TR><TD>
<PRE>
                   -------------------
                  |                   |
infile ---------> |                   |
                  |                   |
intree ---------> |                   | -----------> outfile
                  |                   |
weights --------> |      program      | -----------> outtree
                  |                   |
categories -----> |                   | -----------> plotfile
                  |                   |
fontfile -------> |                   |
                  |                   |
                   -------------------
</PRE>
</TD></TR>
</TABLE>
</DIV><P></P>

<P>
The programs interact with the user by presenting a menu.  Aside from the
user's choices from the menu, they read
all other input from files.  These files have default names.  The program
will try to find a file of that name - if it does not, it will ask the
user to supply the name of that file.
Input data such as DNA sequences
comes from a file whose default name is <TT>infile</TT>.  If the user
supplies a tree, this is in a file whose default name is <TT>intree</TT>.
Values of weights for the characters are in <TT>weights</TT>, and the
tree plotting programs need some digitized fonts which are supplied in
<TT>fontfile</TT> (all these are default names).
<P>
For example, if Dnaml looks
for the file <TT>infile</TT> and does not find one of that name,
it prints the message:
<P>
<TABLE><TR><TD BGCOLOR=white>
<TT>dnaml: can't find input file "infile"<BR>
Please enter a new file name></TT>
</TD></TR></TABLE>
<P>
This simply means that it wants you to type in the name of the
input file.
<P>
<h3>Where the files are</h3>
<P>
When you run a program, you are in a current folder.  If you run it by clicking
on an icon, the folder is the one that has the icon.  If you run it by
typing the name of the program, the folder is the current folder when
you do that.  The program will look for default files (such as <tt>infile</tt>
and <tt>intree</tt>) in that folder.  When it writes files, their
default locations are also in the current folder.
<P>
The program need not actually be in the current folder.  An icon can
sometimes be a link to a program located elsewhere.  A program name
typed by you can contain a &ldquo;path&rdquo;, so that if you type
<tt>/usr/local/phylip/dnaml</tt> the program run will be located in
folder <tt>/usr/local/phylip</tt>.  The operating system maintains a default
path for your account, which is a series of names of folders.  When you
type the name of a program, the
operating system will look in that series of folders until it finds the
program, and then run it.  But in all of these cases, the input and output
files will, by default, be in the current folder, even if the program
is located in some other folder.
<P>
Users can change where the input files are, or where the output files
go.  If no file called <tt>infile</tt> is found in the current folder,
you will be asked to type the name of the file.  In that case you
can type a filename with a path, such as <tt>foobar/mydata</tt>, and
in that case the program will look for file <tt>mydata</tt> in folder
<tt>foobar</tt> within the current folder.   A similar process occurs
when the program cannot find file <tt>intree</tt>.
<P>
When the program starts to write an output file, such as <tt>outfile</tt>,
a similar series of events happens, with one important difference.
It is when a file <tt>outfile</tt> already exists in the current folder
that the user will be asked what to do.  (In the case of input files,
it was when they did not exist that the user was asked what to do).
You will be given the opportunity to Replace the file, Append to the
file, write to a different File, or Quit.  If you choose the response F
you will be asked for the name of the different file, and that is when
you can give a filename with a path, such as <tt>foobar/myoutput.out</tt>,
and the file will be written in that folder instead of the current folder.
<P>
Understanding which folder is the current folder, and whether there are
files named <tt>infile</tt>, <tt>intree</tt>, <tt>outfile</tt>, or
<tt>outtree</tt> there, is crucial to successfully running PHYLIP
programs, and making sure that they analyze the correct data set and
write their files in the right place.
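<P>
Before starting a run from a Terminal window, the state of the current
folder can be checked with a few lines of shell.  This is a sketch under the
assumption of a Unix, Linux, or Mac OS X shell; the function name
<TT>report_default_files</TT> is hypothetical and not part of PHYLIP.

```shell
# Hypothetical helper, not part of PHYLIP: show which folder is current and
# which of the default input and output file names already exist in it.
report_default_files() {
    echo "Current folder: $(pwd)"
    for f in infile intree outfile outtree; do
        if test -e "$f"; then
            echo "$f: present"
        else
            echo "$f: absent"
        fi
    done
}
```

<P>
A line reading <TT>infile: absent</TT> means that the program will ask for
the name of the input file, and a line reading <TT>outfile: present</TT>
means that it will ask what to do about the existing output file.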
<P>
<H3>Data file format</H3>
<P>
I have tried to adhere to a rather stereotyped input and output
format.  For the parsimony, compatibility and maximum likelihood programs, 
excluding the distance matrix methods, the simplest version of the input
data file looks something like this:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
   6   13
Archaeopt CGATGCTTAC CGC
HesperorniCGTTACTCGT TGT
BaluchitheTAATGTTAAT TGT
B. virginiTAATGTTCGT TGT
BrontosaurCAAAACCCAT CAT
B.subtilisGGCAGCCAAT CAC
</PRE>
</TD></TR></TABLE>
<P>
The first line of the input file contains the number of species and the
number of characters (in this case sites).  These are in free format, separated
by blanks.  The information for each species follows, starting with a
ten-character species name (which can include blanks and some punctuation
marks), and continuing with the characters for that species.  The name should
be on the same line as the first character of the data for that species.
(I will use the term "species" for the tips of the trees, recognizing
that in some cases these will actually be populations or individual gene
sequences).
<P>
The name should be ten characters in length, and either terminated
by a Tab character or filled out to the full
ten characters by blanks if shorter.  Any printable ASCII/ISO character is
allowed in the name, except for parentheses ("<TT>(</TT>" and "<TT>)</TT>"), square
brackets ("<TT>[</TT>" and "<TT>]</TT>"), colon ("<TT>:</TT>"), semicolon ("<TT>;</TT>") and comma ("<TT>,</TT>").
If you forget to extend the names to ten characters in length by blanks,
and do not terminate them with a Tab character,
the program will get out of synchronization with the contents of the data
file, and an error message will result.  A Tab character that terminates
a name will not be taken as part of the name that is read; the name will then 
automatically be filled with blanks to a total length of 10 characters.
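<P>
The naming rules above can be checked mechanically before a data file is
given to a program.  The following Python sketch is only an illustration
and is not part of PHYLIP: the names <TT>format_name</TT> and
<TT>split_name</TT> are invented here, and the sketch assumes that the
rules are exactly the ones just stated (ten characters, filled out by blanks
or terminated by a Tab, with the seven listed punctuation characters
forbidden).
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
FORBIDDEN = set("()[]:;,")   # characters not allowed in a species name
NAME_WIDTH = 10

def format_name(name):
    """Fill a species name out with blanks to exactly ten characters."""
    bad = sorted(FORBIDDEN.intersection(name))
    if bad:
        raise ValueError("forbidden character(s) in name: " + " ".join(bad))
    if len(name) > NAME_WIDTH:
        raise ValueError("name longer than ten characters: " + repr(name))
    return name.ljust(NAME_WIDTH)

def split_name(line):
    """Split a data line into (name, rest), honoring a terminating Tab."""
    head = line[:NAME_WIDTH]
    if "\t" in head:
        # The Tab ends the name and is not part of it.
        name, rest = line.split("\t", 1)
        return name.ljust(NAME_WIDTH), rest
    return head.ljust(NAME_WIDTH), line[NAME_WIDTH:]
```
</PRE>
</TD></TR></TABLE>
<P>
In this sketch <TT>format_name</TT> is meant for writing a file and
<TT>split_name</TT> for reading one; a name that is neither filled out by
blanks nor terminated by a Tab would simply absorb the first few data characters,
which is the loss of synchronization described above.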
<P>
In the
discrete-character programs, DNA sequence programs and protein sequence
programs the characters are each a
single letter or digit, sometimes separated by blanks.  In
the continuous-characters programs they are real numbers with decimal points,
separated by blanks:
<P>
<TT>Latimeria  2.03  3.457  100.2  0.0  -3.7</TT>
<P>
The conventions about continuing the data beyond one line per species are
different between the molecular sequence programs and the others.  The 
molecular sequence programs can take the data in "aligned" or "interleaved"
format, in which we first have some lines giving the first part of each of the
sequences, then some
lines giving the next part of each, and so on.  Thus the sequences might
look like this:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
    6   39
Archaeopt CGATGCTTAC CGCCGATGCT
HesperorniCGTTACTCGT TGTCGTTACT
BaluchitheTAATGTTAAT TGTTAATGTT
B. virginiTAATGTTCGT TGTTAATGTT
BrontosaurCAAAACCCAT CATCAAAACC
B.subtilisGGCAGCCAAT CACGGCAGCC

TACCGCCGAT GCTTACCGC
CGTTGTCGTT ACTCGTTGT
AATTGTTAAT GTTAATTGT
CGTTGTTAAT GTTCGTTGT
CATCATCAAA ACCCATCAT
AATCACGGCA GCCAATCAC
</PRE>
</TD></TR></TABLE>
<P>
Note that in these sequences we have a blank every
ten sites to make them easier to read: any such blanks are allowed.  The blank
line which separates the two groups of lines (the ones
containing sites 1-20 and ones containing sites 21-39) may or may not
be present.  It is important that the number of sites in each
group be the same for all species (i.e., it will not be possible to run
the programs successfully if the first species line contains 20 bases, but
the first line for the second species contains 21 bases).
<P>
Alternatively, an option can be selected in the menu to take the data in
"sequential" format, with all of the data for the first species,
then all of the characters for the next species, and so on.  This is also
the way that the discrete characters programs and the gene frequencies
and quantitative characters programs want to read the data.  They do not
allow the interleaved format.
<P>
In the sequential format, the character data can run on to a new line at any
time (except in the middle of a species name or, in the case of the continuous
character and distance matrix programs, in the middle of a real number).
Thus it is legal to have:
<P>
<TT>Archaeopt 001100
<BR>
1101
<BR>
</TT>
<P>
or even:
<P>
<TT>Archaeopt 
<BR>
0011001101
<BR>
</TT>

<P>
though note that the <I>full</I> ten characters of the species name <I>must</I>
then be present: in the above case there must be a blank after the "t".  In all
cases it is possible to put internal blanks between any of the character
values, so that
<P>
<TT>Archaeopt 0011001101 0111011100
</TT>
<P>
is allowed.
<P>
Note that you can convert molecular sequence data between the interleaved
and the sequential data formats by using the Rewrite option of the J
menu item in Seqboot.
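<P>
To make the relation between the two layouts concrete, here is a minimal
Python sketch that reads interleaved data and writes it out sequentially.
It is not the code used by Seqboot, and the function names are invented for
this illustration.  It assumes that names are filled out to ten characters
by blanks (no Tab-terminated names) and that every group of lines lists the
species in the same order.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
def read_interleaved(text):
    """Return a list of (name, sequence) pairs read from interleaved text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]   # blank lines are optional
    nspecies, nsites = (int(tok) for tok in lines[0].split()[:2])
    names, seqs = [], []
    for i, line in enumerate(lines[1:]):
        if i in range(nspecies):
            # First group: ten-character name, then the first part of the data.
            names.append(line[:10])
            seqs.append("".join(line[10:].split()))
        else:
            # Later groups: data only, species in the same order as before.
            seqs[i % nspecies] += "".join(line.split())
    for name, seq in zip(names, seqs):
        if len(seq) != nsites:
            raise ValueError("wrong number of sites for " + name.strip())
    return list(zip(names, seqs))

def write_sequential(records):
    """Write (name, sequence) pairs with all data for a species on one line."""
    nsites = len(records[0][1])
    out = ["%5d %4d" % (len(records), nsites)]
    out.extend(name + seq for name, seq in records)
    return "\n".join(out) + "\n"
```
</PRE>
</TD></TR></TABLE>
<P>
The final length check in <TT>read_interleaved</TT> catches the most common
formatting slip, a species whose total number of sites does not match the
number given on the first line.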
<P>
If you make an error in the format of the input file, the programs can
sometimes detect that
they have been fed an illegal character or illegal numerical value and issue
an error message such as <TT>BAD CHARACTER STATE:</TT>, often printing out the 
bad value, and sometimes the number of the species and character in which it
occurred.  The program will then stop shortly after.  One of the things which
can lead to a bad value is the omission of something earlier in the file, or
the insertion of something superfluous, which cause the reading of the file to
get out of synchronization.  The program then starts reading things it
didn't expect, and concludes that they are in error.  So if you see this error
message, you may also want
to look for the earlier problem that may have led to the program becoming
confused about what it is reading.
<P>
Some options are described below, but you should also read the documentation
for the groups of the programs and for the individual programs.
<BR>
<P>
<A NAME="menu"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>The Menu</H2>
</DIV>
<P>
The menu is straightforward.  It typically looks like this (this one is for
Dnapars):
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
DNA parsimony algorithm, version 3.695

Setting for this run:
  U                 Search for best tree?  Yes
  S                        Search option?  More thorough search
  V              Number of trees to save?  10000
  J   Randomize input order of sequences?  No. Use input order
  O                        Outgroup root?  No, use as outgroup species  1
  T              Use Threshold parsimony?  No, use ordinary parsimony
  N           Use Transversion parsimony?  No, count all steps
  W                       Sites weighted?  No
  M           Analyze multiple data sets?  No
  I          Input sequences interleaved?  Yes
  0   Terminal type (IBM PC, ANSI, none)?  ANSI
  1    Print out the data at start of run  No
  2  Print indications of progress of run  Yes
  3                        Print out tree  Yes
  4          Print out steps in each site  No
  5  Print sequences at all nodes of tree  No
  6       Write out trees onto tree file?  Yes

  Y to accept these or type the letter for one to change
</PRE>
</TD></TR></TABLE>
<P>
If you want to accept the default settings (they are shown in the above case)
you can simply type <TT>Y</TT> followed by pressing on the <TT>Enter</TT> key.
If you want to change any of the options, you should type the letter
shown to the left of its entry in the menu.  For example, to set a threshold
type <TT>T</TT>.  Lower-case letters will also work.  For many of the options
the program will ask for supplementary information, such as the value of
the threshold. 
<P>
Note the <TT>Terminal type</TT> entry, which you will find on all menus.  It
allows you to specify which type of terminal your screen is.  The options
are an IBM PC screen, an ANSI standard terminal, or <TT>none</TT>.
Choosing zero (<TT>0</TT>) toggles
among these three options in cyclical order, changing each time the <TT>0</TT>
option is chosen.  If one of them is right for your terminal the screen will be
cleared before the menu is displayed.  If none works, the <TT>none</TT> option
should probably be chosen.  The programs should start with a terminal option
appropriate for your computer, but if they do not, you can change the
terminal type manually.  This is particularly important in program Retree
where a tree is displayed on the screen - if the terminal type is set to the
wrong value, the tree can look very strange.
<P>
The other numbered options control which information the program will
display on your screen or on the output files.  The option to <TT>Print
indications of progress of run</TT> will show information such as the names of
the species as they are successively added to the tree, and the
progress of rearrangements.  You will usually want to see these as
reassurance that the program is running and to help you estimate how long
it will take.  But if you are running the program "in background" as can be
done on multitasking and multiuser systems, and do not have the
program running in its own window, you may want to turn this option off so
that it does not disturb your use of the computer while the program is
running.  Note also menu option 3, "Print out tree".  This can be useful
when you are running many data sets, and will be using the resulting trees
from the output tree file.  It may be helpful to turn off the printing out
of the trees in that case, particularly if those files would be too big.
<P>
<A NAME="outputfile"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>The Output File</H2>
</DIV>
<P>
Most of the programs write their output onto a file called (usually) <TT>outfile</TT>, and a representation of the trees found onto a file called
<TT>outtree</TT>.
<P>
The exact contents of the output file vary from program to program and also
depend on which menu options you have selected.  For many programs, if you
select all possible output information, the output will consist of
(1) the name of the program and its
version number, (2) some of the input information printed out, and (3) a series of
phylogenies, some with associated information indicating how much change
there was in each character or on each part of the tree.  A typical rooted tree
looks like this:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
                                     +-------------------Gibbon
        +----------------------------2
        !                            !      +------------------Orang
        !                            +------4
        !                                   !  +---------Gorilla
  +-----3                                   +--6
  !     !                                      !    +---------Chimp
  !     !                                      +----5
--1     !                                           +-----Human
  !     !
  !     +-----------------------------------------------Mouse
  !
  +------------------------------------------------Bovine
</PRE>
</TD></TR></TABLE>
<P>
The interpretation of the tree is fairly straightforward: it "grows"
from left to right.  The numbers at the forks are arbitrary and are used (if
present) merely to identify the forks.  For many of the programs the tree 
produced is unrooted.   Rooted and unrooted trees are printed in nearly the
same form, but the unrooted ones are accompanied by the
warning message:
<P>
<TT>   remember: this is an unrooted tree!
</TT>
<P>
to indicate that this is an unrooted tree and to warn against
taking the position of its root too seriously.  (Mathematicians still call
an unrooted tree a tree, though some systematists unfortunately use the term
"network" for an unrooted tree.  This conflicts with standard mathematical
usage, which reserves the name "network" for a completely different kind of
graph).  The root of this tree could be anywhere, say on the line leading
immediately to <TT>Mouse</TT>.  As an exercise,
see if you can tell whether the following tree is or is not a different
one from the above:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
             +-----------------------------------------------Mouse
             !
   +---------4                                   +------------------Orang
   !         !                            +------3
   !         !                            !      !       +---------Chimp
---6         +----------------------------1      !  +----2
   !                                      !      +--5    +-----Human
   !                                      !         !
   !                                      !         +---------Gorilla
   !                                      !
   !                                      +-------------------Gibbon
   !
   +-------------------------------------------Bovine

   remember: this is an unrooted tree!
</PRE>
</TD></TR></TABLE>
<P>
(it is <I>not</I> different).  It is <I>important</I> also to realize that the
lengths of the segments of the printed tree may not be significant: some
may actually represent branches of zero length, in the sense that there is no
evidence that 
those branches are nonzero in length.  Some of the diagrams of trees attempt
to print branches approximately proportional to estimated
branch lengths, while in others the lengths are purely conventional and
are presented just to make the topology visible.  You will have to look closely 
at the documentation that accompanies each program to see what it presents
and what is known about the lengths of the branches on the tree.  The above
tree attempts to represent branch lengths approximately in the diagram.  But
even in those cases, some of the smaller branches are likely to be
artificially lengthened to make the tree topology clearer.  Here is what
a tree from Dnapars looks like, when no attempt is made to make the
lengths of branches in the diagram proportional to estimated branch
lengths:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
                 +--Human
              +--5
           +--4  +--Chimp
           !  !
        +--3  +-----Gorilla
        !  !
     +--2  +--------Orang
     !  !
  +--1  +-----------Gibbon
  !  !
--6  +--------------Mouse
  !
  +-----------------Bovine

  remember: this is an unrooted tree!
</PRE>
</TD></TR></TABLE>
<P>
When a tree has branch lengths, it will be accompanied by a table showing
for each branch the numbers (or names) of the nodes at each end of the
branch, and the length of that branch.  For the first tree shown above,
the corresponding table is:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
 Between        And            Length      Approx. Confidence Limits
 -------        ---            ------      ------- ---------- ------

    1          Bovine            0.90216     (  0.50346,     1.30086) **
    1          Mouse             0.79240     (  0.42191,     1.16297) **
    1             2              0.48553     (  0.16602,     0.80496) **
    2             3              0.12113     (     zero,     0.24676) *
    3             4              0.04895     (     zero,     0.12668)
    4             5              0.07459     (  0.00735,     0.14180) **
    5          Human             0.10563     (  0.04234,     0.16889) **
    5          Chimp             0.17158     (  0.09765,     0.24553) **
    4          Gorilla           0.15266     (  0.07468,     0.23069) **
    3          Orang             0.30368     (  0.18735,     0.41999) **
    2          Gibbon            0.33636     (  0.19264,     0.48009) **

      *  = significantly positive, P < 0.05
      ** = significantly positive, P < 0.01
</PRE>
</TD></TR></TABLE>
<P>
Ignoring the asterisks and the approximate confidence limits, which will be
described in the documentation file for Dnaml, we can see that the table
gives a more precise idea of what the lengths of all the branches are.
Similar tables exist in distance matrix and likelihood programs, as well
as in the parsimony programs Dnapars and Pars.
<P>
Some of the parsimony programs in the package can print out a table
of the number of steps that different characters (or sites) require on
the tree.  This table may not be obvious at first.  A typical example looks like
this:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
 steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       2   2   2   2   1   1   2   2   1
   10!   1   2   3   1   1   1   1   1   1   2
   20!   1   2   2   1   2   2   1   1   1   2
   30!   1   2   1   1   1   2   1   3   1   1
   40!   1
</PRE>
</TD></TR></TABLE>
<P>
The numbers across the top and down the side indicate which site
is being referred to.  Thus site 23 is column "3" of row "20"
and has 1 step in this case.
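<P>
The same lookup can be done by a short script when there are many sites.
The Python sketch below is an illustration only; <TT>parse_steps</TT> is a
name invented here, and the sketch assumes that every entry occupies a
fixed field of four characters after the "!", as in the example above.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
def parse_steps(table):
    """Map site number to number of steps from a 'steps in each site' table."""
    steps = {}
    for line in table.splitlines():
        if "!" not in line:
            continue                      # skip the title, header and rule lines
        label, cells = line.split("!", 1)
        base = int(label)                 # row label: 0, 10, 20, ...
        for col in range(0, len(cells), 4):
            field = cells[col:col + 4].strip()
            if field:                     # the slot for site 0 is left blank
                steps[base + col // 4] = int(field)
    return steps
```
</PRE>
</TD></TR></TABLE>
<P>
Applied to the table above, the result has one entry for each of sites 1
through 40, and the entry for site 23 is 1, in agreement with the reading
by row and column.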
<P>
There are many other kinds of information that can appear in the
output file.  They vary from program to program, and we leave their
description to the documentation files for the specific programs.
<P>
<A NAME="treefile"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>The Tree File</H2>
</DIV>
<P>
In output from most programs,
a representation of the tree is also written into the tree file
<TT>outtree</TT>.  The tree is specified by nested pairs
of parentheses, enclosing
names and separated by commas.  We will describe how this works
below.  If there are any blanks in the names,
these must be replaced by the underscore character "<TT>_</TT>".  Trailing blanks
in the name may be omitted.  The pattern of the parentheses indicates
the pattern of the tree by having each pair of parentheses enclose all
the members of a monophyletic group.  The tree file could look like this:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
((Mouse,Bovine),(Gibbon,(Orang,(Gorilla,(Chimp,Human)))));
</PRE>
</TD></TR></TABLE>
<P>
In this tree the first fork separates the lineage leading to
<TT>Mouse</TT> and <TT>Bovine</TT> from the lineage leading to the rest.  Within the 
latter group there is a fork separating <TT>Gibbon</TT> from the rest, and so on.
The entire tree is enclosed in an outermost pair of parentheses.  The tree ends
with a semicolon.  In some programs such as Dnaml, Fitch, and Contml,
the tree will be unrooted.  An unrooted tree should have a three-way
split at its bottommost fork, with three groups separated by two commas:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
(A,(B,(C,D)),(E,F));
</PRE>
</TD></TR></TABLE>
<P>
Here the three groups at the bottom node are <TT>A</TT>, <TT>(B,(C,D))</TT>, and
<TT>(E,F)</TT>.  The single three-way split corresponds to one of the interior
nodes of the unrooted tree (it can be any interior node of the tree).  The
remaining forks are encountered as you move out from that first node.
Some of the newer programs are able to tolerate these other forks being
multifurcations (multi-way splits).
You should check the documentation files
for the particular programs you are using to see which of these forms
you can expect the user tree to be in.  Note that many of the programs
that actually estimate an unrooted tree (such as Dnapars) produce trees in the
treefile in rooted form!  This is done for reasons of arbitrary internal bookkeeping.  The placement of the root is arbitrary.  We are working toward
having all programs be able to read all trees, whether rooted or unrooted,
multifurcating or bifurcating, and having them do the right thing with
them.  But this is a long-term goal and it is not yet achieved.
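<P>
Because the placement of the root is arbitrary, two tree files that look
different may describe the same unrooted tree.  One way to check is to
compare the sets of splits (partitions of the species into two groups)
implied by the interior branches.  The Python sketch below does this for
trees written as nested tuples; it is an illustration only, the function
names are invented here, and the two example trees are a transcription of
the Gibbon-to-Bovine diagrams shown in the section on the output file.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
def leaves(tree):
    """Return the frozenset of tip names below a nested-tuple node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def splits(tree):
    """Return the nontrivial splits of the tips, ignoring where the root is."""
    everything = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        below = leaves(node)
        other = everything - below
        if len(below) > 1 and len(other) > 1:
            # An unordered pair of groups does not depend on the rooting.
            found.add(frozenset([below, other]))
        for child in node:
            visit(child)

    visit(tree)
    return found

first = ((("Gibbon", ("Orang", ("Gorilla", ("Chimp", "Human")))), "Mouse"), "Bovine")
second = (("Mouse", (("Orang", (("Chimp", "Human"), "Gorilla")), "Gibbon")), "Bovine")
print(splits(first) == splits(second))   # prints True
```
</PRE>
</TD></TR></TABLE>
<P>
Both trees yield the same four splits, so as unrooted trees they are
identical.  The sketch compares tree shape only and ignores branch lengths.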
<P>
For programs that infer branch lengths, these are given in the trees in the
tree file as real numbers following a colon, and placed immediately
after the group descended from that branch.  Here is a typical tree
with branch lengths:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
((cat:47.14069,(weasel:18.87953,((dog:25.46154,(raccoon:19.19959,
bear:6.80041):0.84600):3.87382,(sea_lion:11.99700,
seal:12.00300):7.52973):2.09461):20.59201):25.0,monkey:75.85931);
</PRE>
</TD></TR></TABLE>
<P>
Note that the tree may continue to a new line at any time except in the
middle of a name or the middle of a branch length, although in trees
written to the tree file this will only be done after a comma.
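<P>
The parenthesis notation is simple enough to be read by a short recursive
routine.  The Python sketch below is one possible reader for the subset
described here (names, nested parentheses, commas, optional branch lengths
after a colon, and a final semicolon).  It is not the parser used in the
PHYLIP programs, the function names are invented for this illustration, and
it does not handle the quoted names or bracketed comments that the full
standard allows.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
def parse_newick(text):
    """Parse one tree into nested (name, length, children) tuples."""
    s = "".join(text.split())            # line breaks and blanks carry no meaning
    if not s.endswith(";"):
        raise ValueError("tree must end with a semicolon")
    pos = 0

    def node():
        nonlocal pos
        children = []
        if s[pos] == "(":
            pos += 1
            children.append(node())
            while s[pos] == ",":
                pos += 1
                children.append(node())
            if s[pos] != ")":
                raise ValueError("expected ')' at position %d" % pos)
            pos += 1
        start = pos
        while s[pos] not in "(),:;":
            pos += 1
        name = s[start:pos].replace("_", " ")   # underscores stand for blanks
        length = None
        if s[pos] == ":":
            pos += 1
            start = pos
            while s[pos] not in "(),:;":
                pos += 1
            length = float(s[start:pos])
        return (name, length, children)

    tree = node()
    if s[pos] != ";":
        raise ValueError("unbalanced parentheses at position %d" % pos)
    return tree

def tip_names(tree):
    """List the names at the tips, left to right."""
    name, _, children = tree
    if not children:
        return [name]
    return [tip for child in children for tip in tip_names(child)]
```
</PRE>
</TD></TR></TABLE>
<P>
Applied to the tree with branch lengths shown above, <TT>tip_names</TT>
returns the eight species in the order in which they are written, with
<TT>sea_lion</TT> converted back to a name containing a blank.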
<P>
These representations of trees are a subset of the standard adopted
on 24 June 1986 at the annual meetings of the Society for the Study of
Evolution by an informal committee (its final session in Newick's
lobster restaurant - hence its name, the Newick standard)
consisting of Wayne Maddison (author of MacClade), David Swofford (PAUP),
F. James Rohlf (NTSYS-PC), Chris Meacham (COMPROB and the original
PHYLIP tree drawing programs), James Archie,
William H.E. Day, and me.  This standard is a generalization of
PHYLIP's format, itself based on a well-known representation of trees in 
terms of parenthesis patterns which is due to the famous mathematician 
Arthur Cayley, and which has been around for over a century.  The
standard is now employed by most phylogeny computer programs but unfortunately
has yet to be described in a formal published description.  Other
descriptions by me and by Gary Olsen can be accessed using the Web at:
<P>
<DIV ALIGN="CENTER">
<FONT SIZE=+2><A HREF="http://evolution.gs.washington.edu/phylip/newicktree.html">
<TT>http://evolution.gs.washington.edu/phylip/newicktree.html</TT></A></FONT>
</DIV>
<P>
<A NAME="options"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>The Options and How To Invoke Them</H2>
</DIV>
<P>
Most of the programs allow various options that alter the amount of
information the program is provided or what is done with the
information.  Options are selected in the menu.
<P>
<H3>Common options in the menu</H3>
<P>
A number of the options from the menu, the <TT>U</TT> (User tree), <TT>G</TT> (Global),
<TT>J</TT> (Jumble), <TT>O</TT> (Outgroup), <TT>W</TT> (Weights),
<TT>T</TT> (Threshold), <TT>M</TT> (multiple data sets), and the tree output options, are used
so widely that it is best to discuss them in this document.
<P>
<B>The <TT>U</TT> (User tree) option.</B>  This option toggles between the default
setting, which allows the program to search for the best tree, and the
User tree setting, which reads a tree or trees ("user trees") from the input
tree file and evaluates them.  The input tree file's
default name is <TT>intree</TT>.  In many cases the programs will also
tolerate having the trees
be preceded by a line giving the number of trees.  A tree file without
that line looks like this:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
((Alligator,Bear),((Cow,(Dog,Elephant)),Ferret));
((Alligator,Bear),(((Cow,Dog),Elephant),Ferret));
((Alligator,Bear),((Cow,Dog),(Elephant,Ferret)));
</PRE>
</TD></TR></TABLE>
<p>
An initial line with the number of trees was formerly
required, but this now can be omitted.  Some programs require rooted
trees, some unrooted
trees, and some can handle multifurcating trees.  You should read
the documentation for the particular program to find out which it
requires.  Program Retree can be used to convert trees among
these forms (on saving a tree from Retree, you are asked whether
you want it to be rooted or unrooted).
<P>
In using the user tree option, check the pattern of parentheses
carefully.  The programs do not always detect
whether the tree makes sense, and if it does not there will probably be
a crash (hopefully, but not inevitably, with an error message indicating
the nature of the problem).  Trees written out by programs are
typically in the proper form.
<P>
<B>The <TT>G</TT> (Global) option.</B>  In  the programs which construct trees (except for
Neighbor, the "...penny" programs and Clique, and of course 
the "...move" programs where you construct the trees yourself),
after all species have been added to the tree a rearrangements phase
ensues.  In most of these programs the rearrangements are automatically
global, which in this case means that subtrees will be removed from the tree
and put back on in all possible ways so as to have a better chance of
finding a better tree.  Since this can be time consuming (it roughly
triples the time taken for a run) it is left as an option in some of the
programs, specifically Contml, Fitch, Dnaml and Proml.  In these programs
the G menu option toggles between the default of local rearrangement and
global rearrangement.  The rearrangements are explained more below.
<P>
<B>The <TT>J</TT> (Jumble) option.</B>  In most of the tree construction programs
(except for the "...penny" programs and Clique), the exact 
details of the search of different trees depend on the order of input of
species.  In these programs the <TT>J</TT> option enables you to tell the program to use
a random number 
generator to choose the input order of species.  This option is toggled on
and off by
selecting option <TT>J</TT> in the menu.  The program will then prompt you for
a "seed" for the random number generator.  The seed should be an integer
between 1 and 2<sup>32</sup>-3 (which is 4,294,967,293), and should be
of form 4<i>n</i>+1,
which means that it must give a remainder of 1 when divided by 4.  This can be
judged by looking at the last two digits of the number (for example, in the
upper limit given above, the last two digits are 93, which is of form 4<i>n</i>+1).  Each different seed 
leads to a different sequence of addition of species.  By simply changing the 
random number seed and re-running the programs one can look for other, and 
better trees.  If the seed entered is not odd, the program will not proceed,
but will prompt for another seed.
<P>
The Jumble option also causes the program to ask you how many times you
want to restart the process.  If you answer 10, the program will
try ten different orders of species in constructing the trees, and the
results printed out will reflect this entire search process (that is,
the best trees found among all 10 runs will be printed out, not the
best trees from each individual run).
<P>
Some people have asked what are good values of the random number seed.
The random number seed is used to start a process of choosing "random"
(actually pseudorandom) numbers, which behave as if they were
unpredictably randomly chosen between 0 and 2<SUP>32</SUP>-1 (which is
4,294,967,295).  You could put in the number 133 and find that the
next random number was 221,381,825.  As they are effectively
unpredictable, there is no such thing as a choice that is better than
any other, provided that the numbers are of the form 4<I>n</I>+1.  However
if you re-use a random number seed, the sequence of random numbers
that result will be the same as before, resulting in exactly the same
series of choices, which may not be what you want.
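<P>
The recommended form of the seed is easy to test before starting a run.
The Python sketch below is an illustration only, with function names
invented here.  It applies the 4<I>n</I>+1 recommendation and the range
given above, which is stricter than the check for an odd number that the
programs are described as making.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
MAX_SEED = 2**32 - 3    # 4,294,967,293, the upper limit quoted above

def is_valid_seed(seed):
    """True if seed lies between 1 and MAX_SEED and is of the form 4n+1."""
    return seed in range(1, MAX_SEED + 1) and seed % 4 == 1

def next_valid_seed(seed):
    """Round seed up to the nearest number of the form 4n+1."""
    return seed + (1 - seed) % 4
```
</PRE>
</TD></TR></TABLE>
<P>
The rule about the last two digits works because 100 is divisible by 4, so
a number and its last two digits leave the same remainder when divided by 4.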
<P>
<B>The <TT>O</TT> (Outgroup) option.</B>  This specifies which species is to
have the root of the tree be on the line leading to it.  For example, if the
outgroup is a species "Mouse" then the root of the tree will be placed in the
middle of the branch which is connected to this species, with Mouse branching
off on one side of the root and the lineage leading to the rest of the tree
on the other.  This option is toggled on and off by choosing <TT>O</TT> in the 
menu (the alphabetic character <TT>O</TT>, not the digit <TT>0</TT>).  When it 
is on, the program will then prompt for the
number of the outgroup (the species being taken in the numerical order that
they occur in the input file).  Responding by typing <TT>6</TT> and then an
<TT>Enter</TT> character indicates that the sixth species in the data
(the 6th in the first set of data if there are multiple data sets)
is taken as the outgroup.  Outgroup-rooting will not be attempted if the
data have already established a root for the tree from some other 
consideration, and may not be if it is a user-defined tree,
despite your invoking the option.  Thus programs such as Dollop that
produce only rooted trees do not allow the Outgroup option.  It is also
not available in Kitsch, Dnamlk, Promlk or Clique.  When it is used, the tree as
printed out is still listed as being an
unrooted tree, though the outgroup is connected to the bottommost node
so that it is easy to visually convert the tree into rooted form.
<P>
<B>The <TT>T</TT> (Threshold) option.</B>  This sets a threshold for the
parsimony programs such that if the
number of steps counted in a character is higher than the threshold, it
will be taken to be the threshold value rather than the actual number of
steps.  The default is a threshold so high that it will never be
surpassed (in which case the steps will simply be counted).  The <TT>T</TT>
menu option toggles on and off asking the user to
supply a threshold.  The use of thresholds to obtain methods intermediate
between parsimony and compatibility methods is described in my 1981b paper. 
When the T option is in force, the program
will prompt for the numerical threshold value.  This will be a positive
real number greater than 1.  In programs Mix, Move, Penny, Protpars,
Dnapars, Dnamove, and Dnapenny, do not use threshold values less
than or equal to 1.0, as they have no meaning and lead to a tree which
depends only on considerations such as the input order of species and not at 
all on the character state data!  In programs Dollop, Dolmove, and Dolpenny
the threshold should never be 0.0 or less, for the same
reason.  The <TT>T</TT> option is an 
important and underutilized one: it is, for example, the only way in this 
package (except for program Dnacomp) to do a compatibility analysis when there 
are missing data.   It is a method of de-weighting characters that evolve
rapidly.  I wish more people were aware of its properties.  
<P>
<B>The <TT>M</TT> (Multiple data sets) option.</B>  In menu programs there is an
<TT>M</TT> menu
option which allows one to toggle on the multiple data sets option.  The
program will ask you how many data sets it should expect.  The data sets
have the same format as the first data set.  Here is a (very small) input file
with two five-species data sets:
<P>
<TABLE><TR><TD bgcolor=white>
<PRE>
      5    6
Alpha     CCACCA
Beta      CCAAAA
Gamma     CAACCA
Delta     AACAAC
Epsilon   AACCCA
5    6
Alpha     CACACA
Beta      CCAACC
Gamma     CAACAC
Delta     GCCTGG
Epsilon   TGCAAT
</PRE>
</TD></TR></TABLE>
<P>
The main use of this option will be to allow all of the methods in these
programs to be bootstrapped.  Using the program Seqboot one can take any
DNA, protein, restriction sites, gene frequency or binary character data set and
make multiple data sets by bootstrapping.  Trees can be produced for all of
these using the <TT>M</TT> option.  They will be written on the tree output file if
that option is left in force.  Then the program Consense can be used with
that tree file as its input file.  The result is a majority rule consensus
tree which can be used to make confidence intervals.  The present version
of the package allows, with the use of Seqboot and Consense and the M option,
bootstrapping of many of the methods in the package.
<P>
Programs Dnaml, Dnapars and Pars can also take multiple weights
instead of multiple data sets.  They can then do bootstrapping by
reading in one data set, together with a file of weights that show how
the characters (or sites) are reweighted in each bootstrap sample.  Thus a
site that is omitted in a bootstrap sample has effectively been given
weight 0, while a site that has been duplicated has effectively been
given weight 2.  Seqboot has a menu selection to produce the file of
weights information automatically, instead of producing a file of
multiple data sets.  It can be renamed and used as the input weights file.
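<P>
The equivalence between a bootstrap sample and a set of weights can be
sketched as follows.  This Python fragment is an illustration only: it is
not how Seqboot is implemented, it does not reproduce the layout of
Seqboot's weights file, and the name <TT>bootstrap_weights</TT> is invented
here.
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
```python
import random
from collections import Counter

def bootstrap_weights(nsites, rng):
    """Draw nsites sites with replacement; return how often each was drawn."""
    counts = Counter(rng.randrange(nsites) for _ in range(nsites))
    # A site never drawn gets weight 0; a site drawn twice gets weight 2.
    return [counts[site] for site in range(nsites)]

rng = random.Random(12345)                # fixed seed so the run is repeatable
weights = bootstrap_weights(13, rng)
```
</PRE>
</TD></TR></TABLE>
<P>
The weights in one bootstrap sample always add up to the number of sites,
since each draw adds one to the weight of exactly one site.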
<P>
<B>The <TT>W</TT> (Weights) option</B>.  This signals the program that, in
addition to the data set, you want to read in a series of weights that
tell how many times each character is to be counted.  If the weight
for a character is zero (<TT>0</TT>) then that character is in effect to
be omitted when the tree is evaluated.  If it is one (<TT>1</TT>) the
character is to be counted once.  Some programs allow weights greater than
1 as well.  These have the effect that the character is counted as
if it were present that many times, so that a weight of 4 means that the
character is counted 4 times.
The values 0-9 give weights 0 through 9, and the
values A-Z give weights 10 through 35.  By use of the weights we can
give overwhelming weight to some characters, and drop others from the
analysis.  In the molecular sequence programs only two values of the
weights, 0 and 1, are allowed.
<P>
The weights are used to analyze subsets of the characters, and also can be
used for resampling of the data as in bootstrap and jackknife resampling.
For those programs that allow weights to be greater than 1, they can also
be used to emphasize information from some characters more strongly than
others.  Of course, you must have some rationale for doing this.
<P>
The weights are provided as a sequence of digits.  Thus they might be
<P>
<TT>10011111100010100011110001100</TT>
<P>
The weights are to be provided in an input file
whose default name is <TT>weights</TT>.   The weights in it are
a simple string of digits.  Blanks in the weightfile are skipped over and
ignored, and the weights can continue to a new line.  In programs such as
Seqboot
that can also output a file of weights, the input weights have a default
file name of <TT>inweights</TT>, and the output file name has a default
file name of <TT>outweights</TT>. 
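<P>
The encoding of weights described above can be sketched in a few lines of
Python.  This is only an illustration under stated assumptions, not the reader
used by the PHYLIP programs: the function name <TT>parse_weights</TT> is
hypothetical, and the sketch assumes that any symbol other than a digit, an
upper-case letter, or white space is an error.
<P>
<PRE>
```python
import string

# 0-9 stand for weights 0..9 and A-Z for weights 10..35, as described in the text.
WEIGHT_SYMBOLS = string.digits + string.ascii_uppercase

def parse_weights(text):
    """Return the integer weights encoded in text, skipping blanks and newlines."""
    weights = []
    for ch in text:
        if ch.isspace():
            continue
        value = WEIGHT_SYMBOLS.find(ch)
        if value == -1:
            # Assumption of this sketch: anything else is rejected.
            raise ValueError("not a weight symbol: %r" % ch)
        weights.append(value)
    return weights

print(parse_weights("1001 1A\nZ0"))
```
</PRE>
<P>
For the molecular sequence programs, which allow only 0 and 1, one would
additionally check that no weight exceeds 1.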
<P>
Weights can be used to analyze different subsets of characters (by weighting
the rest as zero).  Alternatively, in the discrete characters programs
they can be used to force a certain
group to appear on the phylogeny (in effect confining consideration to only
phylogenies containing that group).  This is done by adding an imaginary
character that has <TT>1</TT>'s for the members of the group, and <TT>0</TT>'s
for all the
other species.  That imaginary character is then given the highest weight
possible: the result will be that any phylogeny that does not contain that
group will be penalized by such a heavy amount that it will not (except in
the most unusual circumstances) be considered.  Of course, the new character
brings extra steps to the tree, but the number of these can be calculated
in advance and subtracted out of the total when reporting the results.  This 
use of weights is an important one, and one sadly ignored
by many users who could profit from it.  In the case of molecular sequences
we cannot use weights this way, so that to force a given group to appear we
have to add a large extra segment of sites to the molecule, with (say) A's
for that group and C's for every other species.
<P>
<B>The option to write out the trees into a tree file</B>.  This specifies that you
want the program to write
out the tree not only on its usual output, but also onto a file in
nested-parenthesis notation (as described above).  This option is sufficiently
useful that it is turned on by default in all programs that allow it.  You
can optionally turn it off if you wish, by typing the appropriate number
from the menu (it varies from program to program).  This option is useful for
creating tree files that can be directly read into the programs, including
the consensus tree and tree distance programs, and the tree plotting programs.
<P>
The output tree file has a default name of <TT>outtree</TT>.
<P>
<B>The (<TT>0</TT>) terminal type option</B>.  (This is the digit <TT>0</TT>, not
the alphabetic character <TT>O</TT>). The program will default to
one particular assumption about your terminal (ANSI in the case of Linux,
Unix, or Mac OS X, and IBM PC in the case of
Windows). You can
alternatively select it to be either an IBM PC, or nothing.
This affects the ability of the programs to clear the screen when they
display their menus, and the graphics characters used to display trees
in the programs Dnamove, Move, Dolmove, and Retree.  In the case of Windows,
the screen will clear properly with either the IBM PC or the ANSI settings,
but the graphics characters needed by Move, Dnamove, Dolmove, or Retree
will display correctly only with the IBM PC setting.
<P>
<A NAME="algorithm"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>The Algorithm for Constructing Trees</H2></DIV>
<P>
All of the programs except Factor, Dnadist, Gendist, Dnainvar, Seqboot,
Contrast, Retree, and the plotting and
consensus tree programs act to construct an estimate of a phylogeny.  Move, 
Dolmove, and Dnamove let you construct it yourself by hand.  All of 
the rest but Neighbor, the "...penny" programs and Clique make use of
a common approach involving additions and rearrangements.  They are
trying to minimize or maximize some quantity over the space of all
possible evolutionary trees.  Each program contains a part that, given
the topology of the tree, evaluates the quantity that is being minimized
or maximized.  The straightforward approach would be to evaluate all
possible tree topologies one after another and pick the one which,
according to the criterion being used, is best.  This would not be
possible for more than a small number of species, since the number of
possible tree topologies is enormous.  A review of the literature on the
counting of evolutionary trees will be found in one of my papers
(Felsenstein, 1978a) and in my book (Felsenstein, 2004, chapter 3).
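<P>
To give a feeling for how quickly the number of topologies grows, here is a
small Python sketch that counts rooted bifurcating trees with labeled tips.  It
uses the standard result that a rooted tree with k tips has 2k-1 places where
one more tip can be attached, so that the count for n species is
1 x 3 x 5 x ... x (2n-3).  The function name is made up for the example.
<P>
<PRE>
```python
def rooted_bifurcating_trees(n_species):
    """Count rooted bifurcating trees with labeled tips: 1 * 3 * 5 * ... * (2n - 3)."""
    count = 1
    # Adding a tip to a rooted tree that already has k tips can be done in 2k - 1 ways.
    for k in range(2, n_species):
        count *= 2 * k - 1
    return count

for n in (2, 3, 4, 10, 20):
    print(n, rooted_bifurcating_trees(n))
```
</PRE>
<P>
With 10 species there are already 34,459,425 rooted bifurcating trees.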
<P>
Since we cannot search all topologies, these programs are not
guaranteed to always find the best tree, although they seem to do quite
well in practice.  The strategy they employ is as follows: the species
are taken in the order in which they appear in the input file.  The
first two (in some programs the first three) are taken and a tree
constructed containing only those.  There is only one possible topology for
this tree.  Then the next species is taken, and we consider where it
might be added to the tree.  If the initial tree is (say) a rooted tree
with two species and we want the resulting three-species tree to be a
bifurcating tree, there are only three places where we could add the
third species.  Each of these is tried, and each time the resulting tree is
evaluated according to the criterion.  The best one is chosen to be the
basis for further operations.  Now we consider adding the fourth
species, again at each of the five possible places that would result in
a bifurcating tree.  Again, the best of these is accepted.  This is
usually known as the Sequential Addition strategy.
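<P>
The addition strategy just described can be sketched in Python as follows.
This is a toy illustration, not the code in the programs: a rooted tree is
represented as a tip name or a pair of subtrees, the names
<TT>add_everywhere</TT>, <TT>sequential_addition</TT> and <TT>depth</TT> are
hypothetical, and <TT>depth</TT> is a stand-in criterion (lower is better) used
only so that the example runs.  The rearrangements that the programs carry out
after each addition are left out.
<P>
<PRE>
```python
def add_everywhere(tree, new_tip):
    """Yield every rooted bifurcating tree made by attaching new_tip to one branch.

    A tree is a tip name (str) or a pair (left, right) of subtrees.
    """
    # Attach on the branch leading to this subtree (for the whole tree, at the root).
    yield (tree, new_tip)
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in add_everywhere(left, new_tip):
            yield (new_left, right)
        for new_right in add_everywhere(right, new_tip):
            yield (left, new_right)

def sequential_addition(species, score):
    """Add species in input order, keeping the best placement each time."""
    tree = (species[0], species[1])
    for tip in species[2:]:
        # min() keeps the first of several tied candidates.
        tree = min(add_everywhere(tree, tip), key=score)
    return tree

def depth(tree):
    """Toy criterion for the demonstration only: the height of the tree."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(tree[0]), depth(tree[1]))

print(len(list(add_everywhere(("A", "B"), "C"))))         # places for the third species
print(len(list(add_everywhere((("A", "B"), "C"), "D"))))  # places for the fourth species
print(sequential_addition(["A", "B", "C", "D"], depth))
```
</PRE>
<P>
The first two lines printed are 3 and 5, the numbers of places mentioned above.
A real program would replace <TT>depth</TT> by its own criterion, such as the
number of steps in a parsimony method.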
<P>
<H3>Local rearrangements</H3>
<P>
The process continues in this manner, with one important exception.  After 
each species is added, and before the next
is added, a number of rearrangements of the tree are tried, in an effort
to improve it.  The algorithms move through the tree, making all
possible local rearrangements of the tree.  A local rearrangement involves an
internal segment of the tree in the following manner.  Each internal
segment of the tree is of this form (where T1, T2, and T3 are subtrees
- parts of the tree that can contain further forks and tips):
<P>
<PRE>
            T1      T2       T3
             \      /        /
              \    /        /
               \  /        /
                \/        /
                 *       /
                  *     /
                   *   /
                    * /
                     *
                     !
                     !
</PRE>
<P>
the segment we are discussing being indicated by the asterisks.  A local
rearrangement consists of switching the subtrees T1 and T3 or T2 and T3,
so as to obtain one of the following:
<P>
<PRE>
          T3       T2      T1            T1       T3      T2
           \       /       /              \       /       /
            \     /       /                \     /       /
             \   /       /                  \   /       /
              \ /       /                    \ /       /
               \       /                      \       /
                \     /                        \     /
                 \   /                          \   /
                  \ /                            \ /
                   !                              !
                   !                              !
                   !                              !
</PRE>
<P>
Each time a local rearrangement is successful in finding a better tree,
the new arrangement is accepted.  The phase of local rearrangements does
not end until the program can traverse the entire tree, attempting local
rearrangements, without finding any that improve the tree.
<P>
This strategy of adding species and making local rearrangements will look
at about &nbsp;(n-1)x(2n-3)&nbsp; different topologies, though if
rearrangements are frequently successful the number may be larger.  I
have been describing the strategy when rooted trees are being
considered.  For unrooted trees there is a precisely similar strategy,
though the first tree constructed may be a three-species tree and the
rearrangements may not start until after the addition of the fifth
species.
<P>
These local rearrangements have come to be called Nearest Neighbor
Interchanges (NNIs) in the phylogeny literature.
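<P>
In the nested-pair representation of a tree, the two local rearrangements
around one internal segment can be written down directly.  The following
Python lines are only a sketch of the idea; the function name is hypothetical,
and the segment is assumed to be given in the form ((T1, T2), T3).
<P>
<PRE>
```python
def local_rearrangements(segment):
    """Return the two neighbors of ((T1, T2), T3): T3 swapped with T1, then with T2."""
    (t1, t2), t3 = segment
    return [((t3, t2), t1), ((t1, t3), t2)]

print(local_rearrangements((("T1", "T2"), "T3")))
```
</PRE>
<P>
A full search would apply this to every internal segment of the tree and keep
a rearranged tree only when it is better under the criterion.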
<P>
Though we are not guaranteed to have found the best tree topology,
we are guaranteed that no nearby topology (i.e., none accessible by a
single local rearrangement) is better.  In this sense we have reached a
local optimum of our criterion.  Note that the whole process is
dependent on the order in which the species are present in the input
file.  We can try to find a different and better solution by reordering
the species in the input file and running the program again (or, more
easily, by using the <TT>J</TT> option).  If none of
these attempts finds a better solution, then we have some indication
that we may have found the best topology, though we can never be certain
of this.
<P>
Note also that a new topology is never accepted unless it is better
than the previous one, so that the rearrangement process can never fall
into an endless loop.  This is also the way ties in our criterion are
resolved, namely by sticking with the tree found first.  However, the tree 
construction programs other than Clique, Contml, Fitch, 
and Dnaml do keep a record of all trees found that are tied with the best one 
found.  This gives you some immediate idea of which parts of the tree can be 
altered without affecting the quality of the result.
<P>
<H3>Global rearrangements</H3>
<P>
A feature of most of the programs, such as Protpars, Dnapars,
Dnacomp, Dnaml, Dnamlk, Restml, Kitsch, Fitch, Contml, Mix, and Dollop,
is "global" optimization of the tree.  In four of these (Contml,
Fitch, Dnaml and Dnamlk) this is an option, <TT>G</TT>.  In the others it
automatically applies.  When 
it is present there is an additional stage to the search for the best tree.  
Each possible subtree is removed from the tree and added back in 
all possible places.  This process continues until all subtrees can be removed 
and added again without any improvement in the tree.  The purpose of this 
extra rearrangement is to make it less likely that one or more species get 
"stuck" in a suboptimal region of the space of all possible trees.  The use of 
global optimization results in approximately a tripling (3 x ) of the run-time, 
which is why I have left it as an option in some of the slower programs.
<P>
What PHYLIP calls "global" rearrangements are more properly called
SPR (subtree pruning and regrafting) by Swofford et al. (1996) as distinct
from the NNI (nearest neighbor interchange) rearrangements that PHYLIP
also uses, and the TBR (tree bisection and reconnection) rearrangements
that it does not use.  My book (Felsenstein, 2004, chapter 4) contains
a review of work on these and other rearrangements and search methods.
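<P>
The set of trees examined in one round of "global" rearrangement can be
enumerated with a short Python sketch.  It is an illustration under stated
assumptions rather than the method used in the programs: trees are nested
pairs with distinct tip names, all function names are hypothetical, no
criterion is evaluated, and the list of candidates is allowed to contain the
starting topology and repeated topologies.
<P>
<PRE>
```python
def subtrees(tree):
    """Yield tree itself and every subtree below it."""
    yield tree
    if isinstance(tree, tuple):
        for child in tree:
            yield from subtrees(child)

def prune(tree, target):
    """Return tree with the subtree target removed, or None if tree is target."""
    if tree == target:
        return None
    if not isinstance(tree, tuple):
        return tree
    left, right = (prune(child, target) for child in tree)
    if left is None:
        return right
    if right is None:
        return left
    return (left, right)

def regraft(tree, piece):
    """Yield every tree made by attaching piece to one branch of tree."""
    yield (tree, piece)
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in regraft(left, piece):
            yield (new_left, right)
        for new_right in regraft(right, piece):
            yield (left, new_right)

def global_rearrangements(tree):
    """Yield all trees made by removing one subtree and adding it back somewhere."""
    for piece in subtrees(tree):
        if piece == tree:
            continue  # the whole tree cannot be removed from itself
        remainder = prune(tree, piece)
        for candidate in regraft(remainder, piece):
            yield candidate

start = (("A", "B"), ("C", "D"))
candidates = list(global_rearrangements(start))
print(len(candidates))
```
</PRE>
<P>
For this four-species tree the sketch produces 26 candidates.  A search would
score each one and restart the round whenever a better tree is found.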
<P>
The programs doing global optimization print out a dot "<TT>.</TT>" after each group is 
removed and re-added to the tree, to give the user some sign that the 
rearrangements are proceeding.  A new line of dots is started whenever a new 
round of global rearrangements is started following an improvement in the 
tree.  On the line before the dots are printed there is printed a bar of
the form "!---------------!" to show how many dots
to expect.  The dots will
not be printed out at a uniform rate, but the later dots, which represent
removal of larger groups from the tree and trying them consequently in fewer
places, will print out more quickly.  With some compilers each row of dots may
not be printed out until it is complete.
<P>
It should be noted that Penny, Dolpenny, Dnapenny and Clique use a more 
sophisticated strategy of "depth-first search" with a "branch and bound"
search method that guarantees that all
of the best trees will be found.  In the case 
of Penny, Dolpenny and Dnapenny there can be a considerable sacrifice of 
computer time if the number of species is greater than about ten: it is a 
matter for you to consider whether it is worth it for you to guarantee finding 
all the most parsimonious trees, and that depends on how much free computer 
time you have!  Clique finds all largest cliques, and does so without undue 
burning of computer time.   Although all of these problems that have been
investigated fall into the
category of "NP-hard" problems that in effect do not have a rapid solution,
the cases that cause this trouble for the largest-cliques algorithm in
Clique apparently are not biologically realistic and do not occur in actual
data.
<P>
<H3>Multiple jumbles</H3>
<P>
As just mentioned, for most of these programs the search depends on the order
in which the species are entered into the tree.  Using the <TT>J</TT> (Jumble)
option you can supply a random number seed which will allow the program to add
the species in a random order.  Jumbling can be
done multiple times.  For example, if you tell the program to do it
10 times, it will go through the tree-building process 10 times, each with a
different random order of adding species.  It will keep a record of the trees
tied for best over the whole process.  In other words, it does not just
record the best trees from each of the 10 runs, but records the best ones
overall.  Of course this is slow, taking 10 times longer than a single run.
But it does give us a much greater chance of finding all of the most
parsimonious trees.  In the terminology of Maddison (1991) it
can find different "islands" of trees.  The present algorithms do not
guarantee us to find all trees in a given "island" from a single run, so
multiple runs also help explore those "islands" that are found.
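<P>
The bookkeeping involved in multiple jumbles can be sketched in Python.  This
is not the code in the programs.  The names <TT>jumble_search</TT>,
<TT>toy_build</TT> and <TT>toy_score</TT> are hypothetical, the criterion is
assumed to be one where lower values are better (as with parsimony), and the
two toy functions are placeholders for a real tree-building step and a real
criterion.
<P>
<PRE>
```python
import random

def jumble_search(species, build_tree, score, times, seed):
    """Build a tree from each of several random orders; keep all trees tied for best."""
    rng = random.Random(seed)
    best_score = None
    best_trees = []
    for _ in range(times):
        order = list(species)
        rng.shuffle(order)
        tree = build_tree(order)
        value = score(tree)
        if best_score is None or best_score > value:
            # Strictly better than anything so far: discard the old ties.
            best_score, best_trees = value, [tree]
        elif value == best_score and tree not in best_trees:
            best_trees.append(tree)
    return best_score, best_trees

# Toy stand-ins so that the sketch runs; a real search would build and score trees.
def toy_build(order):
    return tuple(order)

def toy_score(tree):
    return tree.index("A")

best, trees = jumble_search("ABCD", toy_build, toy_score, times=10, seed=4321)
print(best, trees)
```
</PRE>
<P>
Note that the record of tied trees is kept over the whole process, not
separately for each of the runs.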
<P>
<H3>Saving multiple tied trees</H3>
<P>
For the parsimony and compatibility programs, one can have a perfect tie
between two or more trees.  In these programs these trees are all
saved.  For the newer parsimony programs such as Dnapars and Pars,
global rearrangement is carried out on all of these tied trees.  This can
be turned off in the menu.
<P>
For trees with criteria which are real numbers, such as the distance
matrix programs Fitch and Kitsch, and the likelihood programs Dnaml,
Dnamlk, Contml, and Restml, it is difficult to get an exact tie between
trees.  Consequently these programs save only the single best tree
(even though the others may be only a tiny bit worse).
<P>
<H3>Strategy for finding the best tree</H3>
<P>
In practice, <em>it is advisable to use the Jumble option and specify that it
be done many times</em>, so that many different orderings of the input species
are evaluated.  (This is usually not necessary when bootstrapping,
though the programs will then default to doing it once to avoid artifacts
caused by the order in which species are added to the tree.)
<P>
People who want a magic "black box" program whose results they do
not have to question (or think about) often are upset that these
programs give results that are dependent on the order in which the species
are entered in the data.  To me this property is an advantage, for it
permits you to try different searches for better trees, simply by
varying the input order of species.  If you do not use the multiple Jumble
option, but do multiple individual runs instead, you
can easily decide which to pay most attention to - the one or ones that 
are best according to the criterion employed (for example, with parsimony, 
the one out of the runs that results in the tree with the fewest changes).
<P>
In practice, in a single run, it usually seems best to put species that are
likely to be sources of confusion in the topology last, as by the time they are
added the arrangement of the earlier species will have stabilized into a
good configuration, and then the last few species will be fitted into
that topology.  There will be less chance this way of a poor initial
topology that would affect all subsequent parts of the search.  However,
a variety of arrangements of the input order of species should be tried,
as can be done if the <TT>J</TT> option is used,
and no species should be kept in a fixed place in the order of input.
Note that the results of the "...penny" programs and Clique
are not sensitive to the input order of species, and Neighbor is only
slightly sensitive to it, so that multiple Jumbling is not possible
with those programs.  Note also that with global search, which
is standard in many programs and in others is an 
option, each group (including
each individual species) will be removed and re-added in all possible
positions, so that a species causing confusion will have more chance of moving
to a new location than it would without global rearrangement.
<P>
<H3>Nixon's search strategy</H3>
<P>
An innovative search strategy was developed by Kevin Nixon (1999).  If you
use a manual rearrangement program such as Dnamove, Move, or Dolmove, and
look at the distribution of characters on the trees, you will see some
characters whose distributions appear to recommend alternative groupings.
One would want a program that automatically found such alternative
suggestions and used them to rearrange the tree so as to explore trees that
had those groups.  Nixon had the idea of using resampling methods to
do this.  Using either bootstrap or jackknife sampling, one can make data
sets that emphasize randomly sampled subsets of characters.  We
then search for trees that fit those data sets.  After finding them, we
revert to the initial data set and then search using those trees as
starting points.   This sampling allows us to explore parts of tree space
recommended by particular subsets of characters. (This is not exactly
Nixon's original strategy, which started the searches for each resampled
data set from the best tree found so far.  For each resampled data set we
instead start from scratch, doing sequential addition of taxa.)
<P>
Nixon's method has proven to be very effective in searching for most
parsimonious trees -- it is currently the state of the art for that.
Nixon called his method the "parsimony ratchet", but actually it can be
applied straightforwardly to any method of phylogeny inference that has an
optimality criterion, including likelihood and least squares distance methods.
Starting with version 3.7, PHYLIP programs have the ability to search by
rearranging a tree supplied to them by the user.  This makes it possible to
implement our variant of Nixon's strategy.  You need to do so in multiple steps:
<OL>
<LI> Use bootstrap sampling to make a number of resampled versions of the
data set.  You can also use jackknifing.  In either case, there may be
advantages to sampling a smaller fraction of the sites (Nixon recommends
sampling about 30-35%).
<LI> Take these replicates, and do quick estimates of the phylogeny for
each one.  This could be done with faster methods such as neighbor-joining
or parsimony.
<LI> Take the resulting trees, together with the original data set.  Using
the method of phylogeny estimation that you prefer, read the trees in
as multiple user-defined trees, choosing the choice in the U menu option
that uses these trees as the starting point for rearrangement.  The
program will report the best tree or trees found by rearranging all of those
input trees.  This accomplishes Nixon's search strategy.
</OL>
It will not necessarily be fast to do this, as the last step may be slow.
But the resampling will cause emphasis on different sets of characters in
the initial searches, allowing the process to explore regions of tree
space not usually examined by conventional rearrangement strategies.
<P>
There is some more information on how this may be done in the documentation
files for Seqboot and for the individual tree inference programs.
<P>
<A NAME="warning"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>A Warning on Interpreting Results</H2></DIV>
<P>
Probably the most important thing to keep in mind while running any of the
parsimony or compatibility programs is not
to overinterpret the result.  Some users treat the set of most parsimonious
trees as if it were a confidence interval.  If a group appears in all of the
most parsimonious trees then they treat it as well established.  Unfortunately
<I>the confidence interval on phylogenies appears to be much
larger than the set of all most parsimonious trees</I> (Felsenstein, 1985b).
Likewise, variation of result among different methods will not be a good
indicator of the size of the confidence interval.  Consider a simple data set
in which, out of 100 binary characters, 51 recommend the unrooted tree
<TT>((A,B),(C,D))</TT> and 49 the tree <TT>((A,D),(B,C))</TT>.  Many different
methods will all give the same result on
such a data set: they will estimate the tree as <TT>((A,B),(C,D))</TT>.
Nevertheless it is
clear that the 51:49 margin by which this tree is favored is not statistically
significantly different from 50:50.  So <I>consistency among different methods
is a poor guide to statistical significance</I>.
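<P>
The claim about the 51:49 margin can be checked with a two-sided sign test.
The Python sketch below is only an illustration: it assumes that the 100
characters are independent and that each favors either tree with probability
1/2 under the null hypothesis, and the function name is hypothetical.
<P>
<PRE>
```python
from math import comb

def two_sided_sign_test(k, n):
    """Two-sided binomial p-value for a k : (n - k) split under a 50:50 null."""
    k = max(k, n - k)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)

print(round(two_sided_sign_test(51, 100), 3))
```
</PRE>
<P>
The p-value comes out at about 0.92, so a 51:49 split gives no evidence at all
against 50:50.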
<P>
<A NAME="speed"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>Relative Speed of Different<BR>
Programs and Machines</H2></DIV>
<P>
<H3>Relative speed of the different programs</H3>
<P>
C compilers differ in efficiency of the code they generate,
and some deal with some features of the language better than with
others.  Thus a program which is unusually fast on one computer may be
unusually slow on another.  Nevertheless, as a rough guide to relative
execution speeds, I have tested the programs on three data sets, each of
which has 10 species and 40 characters.  The first is an imaginary one
in which all characters are compatible - ("The Willi Hennig Memorial
Data Set" as J. S. Farris once called ones like it).  The second is the binary
recoded form of the fossil horses data set of Camin and Sokal (1965).
The third data set has data that is completely random: 10 species and 20
characters that have a 50% chance that each character state is <TT>0</TT> or
<TT>1</TT> (or <TT>A</TT> or <TT>G</TT>).  The data sets thus range from a completely
compatible one in which there is no homoplasy (parallelism or convergence), 
through the horses data set, which requires 29 steps where the possible 
minimum number would be 20, to the random data set, which requires 49 steps.  
We can thus see how this increasing messiness of the data affects running 
times.  The three data sets have all had 20 sites of <TT>A</TT>'s added to the 
end of each sequence, so as to prevent likelihood or distance matrix programs 
from having infinite branch lengths (the test data sets used for timing
previous versions of PHYLIP were the same except that they lacked these
20 extra sites).
<P>
Here are the nucleotide sequence versions of the three data sets:
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
    10   40
A         CACACACAAAAAAAAAAACAAAAAAAAAAAAAAAAAAAAA
B         CACACAACAAAAAAAAAACAAAAAAAAAAAAAAAAAAAAA
C         CACAACAAAAAAAAAAAACAAAAAAAAAAAAAAAAAAAAA
D         CAACAAAACAAAAAAAAACAAAAAAAAAAAAAAAAAAAAA
E         CAACAAAAACAAAAAAAACAAAAAAAAAAAAAAAAAAAAA
F         ACAAAAAAAACACACAAAACAAAAAAAAAAAAAAAAAAAA
G         ACAAAAAAAACACAACAAACAAAAAAAAAAAAAAAAAAAA
H         ACAAAAAAAACAACAAAAACAAAAAAAAAAAAAAAAAAAA
I         ACAAAAAAAAACAAAACAACAAAAAAAAAAAAAAAAAAAA
J         ACAAAAAAAAACAAAAACACAAAAAAAAAAAAAAAAAAAA
</PRE>
</TD></TR></TABLE>
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
    10   40
MesohippusAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
HypohippusAAACCCCCCCAAAAAAAAACAAAAAAAAAAAAAAAAAAAA
ArchaeohipCAAAAAAAAAAAAAAAACACAAAAAAAAAAAAAAAAAAAA
ParahippusCAAACAACAACAAAAAAAACAAAAAAAAAAAAAAAAAAAA
MerychippuCCAACCACCACCCCACACCCAAAAAAAAAAAAAAAAAAAA
M. secunduCCAACCACCACCCACACCCCAAAAAAAAAAAAAAAAAAAA
Nannipus  CCAACCACAACCCCACACCCAAAAAAAAAAAAAAAAAAAA
NeohippariCCAACCCCCCCCCCACACCCAAAAAAAAAAAAAAAAAAAA
Calippus  CCAACCACAACCCACACCCCAAAAAAAAAAAAAAAAAAAA
PliohippusCCCACCCCCCCCCACACCCCAAAAAAAAAAAAAAAAAAAA
</PRE>
</TD></TR></TABLE>
<P>
<TABLE><TR><TD BGCOLOR=white>
<PRE>
    10   40
A         CACACAACCAAACAAACCACAAAAAAAAAAAAAAAAAAAA
B         AAACCACACACACAAACCCAAAAAAAAAAAAAAAAAAAAA
C         ACAAAACCAAACCACCCACAAAAAAAAAAAAAAAAAAAAA
D         AAAAACACAACACACCAAACAAAAAAAAAAAAAAAAAAAA
E         AAACAACCACACACAACCAAAAAAAAAAAAAAAAAAAAAA
F         CCCAAACACCCCCAAAAAACAAAAAAAAAAAAAAAAAAAA
G         ACACCCCCACACCCACCAACAAAAAAAAAAAAAAAAAAAA
H         AAAACAACAACCACCCCACCAAAAAAAAAAAAAAAAAAAA
I         ACACAACAACACAAACAACCAAAAAAAAAAAAAAAAAAAA
J         CCAAAAACACCCAACCCAACAAAAAAAAAAAAAAAAAAAA
</PRE>
</TD></TR></TABLE>
<P>
Here are the timings of many of the version 3.6 programs on these three data
sets as run after being compiled by Gnu C (version 3.2) and run on an
AMD Athlon XP 2200+ computer under Linux.
<P>
<DIV ALIGN="CENTER">
<TABLE CELLPADDING=3 BORDER="1">
<TR><TD ALIGN="LEFT">&nbsp;</TD>
<TD ALIGN="RIGHT">Hennigian Data</TD>
<TD ALIGN="RIGHT">Horses Data</TD>
<TD ALIGN="RIGHT">Random Data</TD>
</TR>
<TR><TD ALIGN="LEFT">Protpars</TD>
<TD ALIGN="RIGHT">0.00500</TD>
<TD ALIGN="RIGHT">0.00670</TD>
<TD ALIGN="RIGHT">0.01289</TD>
</TR>
<TR><TD ALIGN="LEFT">Dnapars</TD>
<TD ALIGN="RIGHT">0.01050</TD>
<TD ALIGN="RIGHT">0.00940</TD>
<TD ALIGN="RIGHT">0.00980</TD>
</TR>
<TR><TD ALIGN="LEFT">Dnapenny</TD>
<TD ALIGN="RIGHT">0.01400</TD>
<TD ALIGN="RIGHT">0.00860</TD>
<TD ALIGN="RIGHT">1.71100</TD>
</TR>
<TR><TD ALIGN="LEFT">Dnacomp</TD>
<TD ALIGN="RIGHT">0.00240</TD>
<TD ALIGN="RIGHT">0.00250</TD>
<TD ALIGN="RIGHT">0.00590</TD>
</TR>
<TR><TD ALIGN="LEFT">Dnaml</TD>
<TD ALIGN="RIGHT">0.17749</TD>
<TD ALIGN="RIGHT">0.23970</TD>
<TD ALIGN="RIGHT">0.21350</TD>
</TR>
<TR><TD ALIGN="LEFT">Dnamlk</TD>
<TD ALIGN="RIGHT">0.21740</TD>
<TD ALIGN="RIGHT">0.19450</TD>
<TD ALIGN="RIGHT">0.24400</TD>
</TR>
<TR><TD ALIGN="LEFT">Proml</TD>
<TD ALIGN="RIGHT">1.3527&nbsp;&nbsp;</TD>
<TD ALIGN="RIGHT">3.2085&nbsp;&nbsp;</TD>
<TD ALIGN="RIGHT">2.0055&nbsp;&nbsp;</TD>
</TR>
<TR><TD ALIGN="LEFT">Promlk</TD>
<TD ALIGN="RIGHT">3.3567&nbsp;&nbsp;</TD>
<TD ALIGN="RIGHT">8.6078&nbsp;&nbsp;</TD>
<TD ALIGN="RIGHT">4.4886&nbsp;&nbsp;</TD>
</TR>
<TR><TD ALIGN="LEFT">Dnainvar</TD>
<TD ALIGN="RIGHT">0.00020</TD>
<TD ALIGN="RIGHT">0.00020</TD>
<TD ALIGN="RIGHT">0.00020</TD>
</TR>
<TR><TD ALIGN="LEFT">Dnadist</TD>
<TD ALIGN="RIGHT">0.00140</TD>
<TD ALIGN="RIGHT">0.00080</TD>
<TD ALIGN="RIGHT">0.00150</TD>
</TR>
<TR><TD ALIGN="LEFT">Protdist</TD>
<TD ALIGN="RIGHT">0.09220</TD>
<TD ALIGN="RIGHT">0.09210</TD>
<TD ALIGN="RIGHT">0.09310</TD>
</TR>
<TR><TD ALIGN="LEFT">Restml</TD>
<TD ALIGN="RIGHT">0.14560</TD>
<TD ALIGN="RIGHT">0.28810</TD>
<TD ALIGN="RIGHT">0.21540</TD>
</TR>
<TR><TD ALIGN="LEFT">Restdist</TD>
<TD ALIGN="RIGHT">0.00110</TD>
<TD ALIGN="RIGHT">0.00090</TD>
<TD ALIGN="RIGHT">0.00080</TD>
</TR>
<TR><TD ALIGN="LEFT">Fitch</TD>
<TD ALIGN="RIGHT">0.00760</TD>
<TD ALIGN="RIGHT">0.01280</TD>
<TD ALIGN="RIGHT">0.00880</TD>
</TR>
<TR><TD ALIGN="LEFT">Kitsch</TD>
<TD ALIGN="RIGHT">0.00180</TD>
<TD ALIGN="RIGHT">0.00260</TD>
<TD ALIGN="RIGHT">0.00280</TD>
</TR>
<TR><TD ALIGN="LEFT">Neighbor</TD>
<TD ALIGN="RIGHT">0.00020</TD>
<TD ALIGN="RIGHT">0.00050</TD>
<TD ALIGN="RIGHT">0.00050</TD>
</TR>
<TR><TD ALIGN="LEFT">Contml</TD>
<TD ALIGN="RIGHT">0.01310</TD>
<TD ALIGN="RIGHT">0.01500</TD>
<TD ALIGN="RIGHT">0.01780</TD>
</TR>
<TR><TD ALIGN="LEFT">Gendist</TD>
<TD ALIGN="RIGHT">0.00070</TD>
<TD ALIGN="RIGHT">0.00070</TD>
<TD ALIGN="RIGHT">0.00070</TD>
</TR>
<TR><TD ALIGN="LEFT">Pars</TD>
<TD ALIGN="RIGHT">0.00780</TD>
<TD ALIGN="RIGHT">0.00610</TD>
<TD ALIGN="RIGHT">0.02930</TD>
</TR>
<TR><TD ALIGN="LEFT">Mix</TD>
<TD ALIGN="RIGHT">0.00360</TD>
<TD ALIGN="RIGHT">0.00410</TD>
<TD ALIGN="RIGHT">0.00610</TD>
</TR>
<TR><TD ALIGN="LEFT">Penny</TD>
<TD ALIGN="RIGHT">0.00190</TD>
<TD ALIGN="RIGHT">0.00470</TD>
<TD ALIGN="RIGHT">0.8060&nbsp;</TD>
</TR>
<TR><TD ALIGN="LEFT">Dollop</TD>
<TD ALIGN="RIGHT">0.00480</TD>
<TD ALIGN="RIGHT">0.00450</TD>
<TD ALIGN="RIGHT">0.00820</TD>
</TR>
<TR><TD ALIGN="LEFT">Dolpenny</TD>
<TD ALIGN="RIGHT">0.00200</TD>
<TD ALIGN="RIGHT">0.01060</TD>
<TD ALIGN="RIGHT">1.1270&nbsp;&nbsp;</TD>
</TR>
<TR><TD ALIGN="LEFT">Clique</TD>
<TD ALIGN="RIGHT">0.00100</TD>
<TD ALIGN="RIGHT">0.00070</TD>
<TD ALIGN="RIGHT">0.00130</TD>
</TR>
</TABLE>
</DIV>
<P>
<BR>

<P>
In all cases the programs were run under the default options with optimized 
compiler switches (<TT>-O3 -fomit-frame-pointer</TT>), except as
specified here.  
The data sets used for the discrete characters programs have <TT>0</TT>'s and <TT>1</TT>'s
instead of <TT>A</TT>'s and <TT>C</TT>'s.  For Contml the <TT>A</TT>'s and <TT>C</TT>'s
were made into <TT>0.0</TT>'s and <TT>1.0</TT>'s and considered as 40 2-allele loci.
For the distance programs 10  x  10 distance matrices were
computed from the three data sets.
For the restriction sites programs <TT>A</TT> and <TT>C</TT> were changed into
<TT>+</TT> and <TT>-</TT>.  It does not
make much sense to benchmark Move, Dolmove, or Dnamove, although when there
are many characters and many species the response time after each 
alteration of the tree should be proportional to the product of the number of
species and the number of characters.
For Dnaml, Dnamlk, and Dnadist the frequencies of the four bases were
set to be equal rather than determined empirically as is the default.  For
Restml the number of enzymes was set to 1.
<P>
In most cases, the benchmark was made more accurate by analyzing 100 data
sets using the <TT>M</TT> (Multiple data sets) option and dividing the resulting
time by 100.  Times were determined as user times using the Linux <TT>time</TT>
command.  Several patterns will be apparent from this.  The algorithms (Mix,
Dollop, Contml, Fitch, Kitsch, Protpars, Dnapars, Dnacomp,
Dnaml, Dnamlk, and Restml) that use the above-described addition strategy have
run times that do not depend strongly on the messiness of the data.  The only 
exception to this is that if a data set such as the Random data requires
extra rounds of global rearrangements it takes longer.  The
programs differ greatly in run time: the protein likelihood programs
Proml and Promlk are very slow, and the other likelihood programs
Restml, Dnaml and
Contml are slower than the rest of the programs.  The protein sequence parsimony
program, which has to do a considerable amount of bookkeeping to keep track of
which amino acids can mutate to each other, is also relatively slow.
<P>
Another class of algorithms includes Penny, Dolpenny, Dnapenny and Clique.
These are branch-and-bound methods: in principle they should have execution
times that rise exponentially with the number of species and/or
characters, and they might be much more sensitive to messy data.  This is
apparent with Penny, Dolpenny, and Dnapenny, which go from being reasonably
fast with clean data to very slow with messy data.  Dolpenny is particularly
slow on messy data - this is because this algorithm cannot make use of some of
the lower-bound calculations that are possible with Dnapenny and Penny.  Clique
is very fast on all
data sets.  Although in theory it should bog down if the number of cliques in
the data is very large, that does not happen with random data, which in
fact has few cliques and those small ones.  Apparently the "worst-case"
data sets that cause exponential run time are much rarer for Clique than for
the other branch-and-bound methods.
<P>
Neighbor is quite fast compared to Fitch and Kitsch, and should make it
possible to run much larger cases, although the results are expected to be
a bit rougher than with those programs.
<BR>
<P>
<H3>Speed with different numbers of species</H3>
<P>
How will the speed depend on the number of species and the number
of characters?  For the sequential-addition algorithms, the speed should
be proportional to somewhere between the cube of the number of species and
the square of the number of species, and to the number
of characters.  Thus a case that has, instead of 10 species and 20
characters, 20 species and 50 characters would take (in the cubic case)
2  x  2  x  2  x  2.5 = 20
times as long.  This implies that cases with more than 20 species will
be slow, and cases with more than 40 species <I>very</I> slow.  This places a
premium on working on small subproblems rather than just dumping a whole
large data set into the programs.
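<P>
As a rough illustration of this arithmetic, here is a minimal Python sketch
(it is not part of PHYLIP, and the function name "relative_run_time" is made up
for this example).  It assumes only that run time scales as a power of the
number of species times the number of characters, and evaluates both the cubic
and the quadratic extremes for the case discussed above.

```python
def relative_run_time(species_ratio, character_ratio, species_exponent=3):
    """Factor by which run time grows when the problem is scaled up.

    species_ratio and character_ratio are new size divided by old size;
    species_exponent is 3 for the cubic case and 2 for the quadratic case.
    """
    return species_ratio ** species_exponent * character_ratio


# 10 species and 20 characters versus 20 species and 50 characters
cubic = relative_run_time(20 / 10, 50 / 20, species_exponent=3)
quadratic = relative_run_time(20 / 10, 50 / 20, species_exponent=2)
print("cubic case:    ", cubic)      # 2 x 2 x 2 x 2.5 = 20
print("quadratic case:", quadratic)  # 2 x 2 x 2.5 = 10
```

The true factor for a given program and data set is expected to lie somewhere
between the two printed values.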
<P>
An exception to these rules will be some of the DNA programs that use an
aliasing device to save execution time.  In these programs execution time
will not necessarily increase in proportion to the number of sites,
as sites that show the same pattern of nucleotides will be detected
as identical and the calculations for them will be done only once, so that
the repeated sites add little execution time.  This is particularly
likely to happen with few species and many sites, or with data sets that have
small amounts of evolutionary divergence.
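<P>
The idea behind aliasing can be sketched in a few lines of Python.  This is
only an illustration of the principle, not the code used in the DNA programs;
the function name "site_patterns" and the tiny alignment are invented for the
example.  Each column of the alignment is treated as a pattern, and identical
columns are counted rather than stored separately, so that a calculation
would need to be done once per distinct pattern and weighted by its count.

```python
from collections import Counter


def site_patterns(sequences):
    """Count each distinct column (site pattern) in an alignment."""
    lengths = {len(s) for s in sequences}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    # zip(*sequences) walks the alignment column by column
    return Counter(zip(*sequences))


# Hypothetical alignment: 3 species, 8 sites
alignment = [
    "AACGTTAC",
    "AACGTTAC",
    "AATGTCAC",
]
patterns = site_patterns(alignment)
print(len(alignment[0]), "sites,", len(patterns), "distinct patterns")
for pattern, count in patterns.items():
    print("".join(pattern), "occurs", count, "time(s)")
```

With few species or little divergence most columns repeat, which is why the
saving is largest in those cases.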
<P>
For programs Fitch and Kitsch, the distance matrix is square, so
that when we double the number of species we also double the number of
"characters", so that running times will go up as the fourth power of
the number of species rather than the third power.  Thus a 20-species
case with Fitch is expected to run sixteen times more slowly than a 10-species
case.
<P>
For programs like Penny and Clique the run times will rise faster
than the cube of the number of species (in fact, they can rise faster
than any power since these algorithms are not guaranteed to work in
polynomial time).  In practice, Penny will frequently bog down above 11
species, while Clique easily deals with larger numbers.
<P>
For Neighbor the run time should vary only as the cube of the number of
species, so a case twice as large will take only eight times as long.  This
will make it an attractive alternative to Fitch and Kitsch for large data
sets.
<P>
<B>Suggestion:</B> If you are unsure of how long a program will take, try it first on
a few species, then work your way up until you get a feel for the speed
and for what size programs you can afford to run.
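<P>
One way to turn a few trial runs into an estimate is to fit a power law to
the measured times and extrapolate.  The Python sketch below assumes that run
time behaves like a constant times a power of the number of species, which is
reasonable for the sequential-addition and distance programs but not for
branch-and-bound programs such as Penny.  The function names and the timings
are hypothetical.

```python
import math


def fit_power_law(sizes, seconds):
    """Least-squares fit of log(seconds) = log(c) + k * log(size).

    Returns the pair (c, k).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in seconds]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    c = math.exp(mean_y - k * mean_x)
    return c, k


def predict_seconds(c, k, size):
    """Extrapolated run time for a problem of the given size."""
    return c * size ** k


# Made-up timings (in seconds) for trial runs with 5, 10 and 15 species
trial_sizes = [5, 10, 15]
trial_seconds = [0.5, 4.0, 13.5]
c, k = fit_power_law(trial_sizes, trial_seconds)
print("estimated exponent:", round(k, 2))
print("predicted seconds for 40 species:", round(predict_seconds(c, k, 40)))
```

The extrapolation is only as good as the assumption of a single exponent, so
it is worth adding a new trial point whenever a larger run finishes.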
<P>
Execution time is not the most important criterion for a program,
particularly as computer time gets much cheaper than your time or a
programmer's time.  With workstations on which background jobs can be run
all night, execution speed is not overwhelmingly relevant.  Some of us have been
conditioned by an earlier era of computing to consider execution speed
paramount.  But ease of use, ease of adaptation to your computer system,
and ease of modification are much more important in practice, and in
these respects I think these programs are adequate.  Only if you are
engaged in 1960s-style mainframe computing, or if you have very large
amounts of data, is minimization of execution
time paramount.  If you spent six months getting your data, it may not be
overwhelmingly important whether your run takes 10 seconds or 10 hours.
<P>
Nevertheless it would have been nice to have made the programs
faster.  The present speeds are a compromise between speed and
effectiveness: by making them slower and trying more rearrangements in the 
trees, or by enumerating all possible trees, I could have made the programs
more likely to find the best tree.  By trying fewer rearrangements I
could have speeded them up, but at the cost of finding worse trees.  I
could also have speeded them up by writing critical sections in assembly
language, but this would have sacrificed ease of distribution to new
computer systems.  There are also some options included in these programs that
make it 
harder to adopt some of the economies of bookkeeping that make other programs 
faster.  However, to some extent I have simply made the decision not to spend 
time trying to speed up program bookkeeping when there were new likelihood and 
statistical methods to be developed.
<BR>
<P>
<H3>Relative speed of different machines</H3>
<P>
It is interesting to compare different machines using Dnapars as the
standard task.  One can rate a machine on the Dnapars benchmark by summing the
times for all three of the data sets.  Here are relative total timings over 
all three data sets (done with various versions of Dnapars) for some machines,
taking an AMD Athlon 1.2 GHz computer running Linux with gcc as the
standard.  Benchmarks from versions 3.4 and 3.5 of the program are
also included (respectively the Pascal and C versions whose timings are in
parentheses).  They are compared only with each other and are scaled to the
rest of the timings using the joint runs on the 386SX and the Pentium MMX 266.
This use of separate standards is necessary not
because of different languages but because different versions of the package
are being compared.  Thus, the "Time" is the ratio of the Total to that for
the Athlon, adjusted by the scalings of machines using 3.4 and 3.5 when
appropriate.  The Relative Speed is the reciprocal of the Time.  For the
moment these benchmarks are for version 3.6; they will be updated when 3.7
is fully released.
<P>
<DIV ALIGN="CENTER">
<TABLE CELLPADDING=3 BORDER="1">
<TR><TD ALIGN="LEFT"><B>Machine</B></TD>
<TD ALIGN="LEFT"><B>Operating<BR>System</B></TD>
<TD ALIGN="LEFT"><B>Compiler</B></TD>
<TD ALIGN="LEFT"><B>Total</B></TD>
<TD ALIGN="LEFT"><B>Time</B></TD>
<TD ALIGN="LEFT"><B>Relative<BR>Speed</B></TD>
</TR>
<TR><TD ALIGN="LEFT">Toshiba T1100+</TD>
<TD ALIGN="LEFT">MSDOS</TD>
<TD ALIGN="LEFT">Turbo Pascal 3.01A</TD>
<TD ALIGN="LEFT">(269)</TD>
<TD ALIGN="LEFT">10542</TD>
<TD ALIGN="LEFT">0.00009486</TD>
</TR>
<TR><TD ALIGN="LEFT">Apple Mac Plus</TD>
<TD ALIGN="LEFT">Mac OS</TD>
<TD ALIGN="LEFT">Lightspeed Pascal 2</TD>
<TD ALIGN="LEFT">(175.84)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;6891</TD>
<TD ALIGN="LEFT">0.00014511</TD>
</TR>
<TR><TD ALIGN="LEFT">Toshiba T1100+</TD>
<TD ALIGN="LEFT">MSDOS</TD>
<TD ALIGN="LEFT">Turbo Pascal 5.0</TD>
<TD ALIGN="LEFT">(162)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;6349</TD>
<TD ALIGN="LEFT">0.00015750</TD>
</TR>
<TR><TD ALIGN="LEFT">Macintosh Classic</TD>
<TD ALIGN="LEFT">Mac OS</TD>
<TD ALIGN="LEFT">Think Pascal 3</TD>
<TD ALIGN="LEFT">(160)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;6271</TD>
<TD ALIGN="LEFT">0.00015947</TD>
</TR>
<TR><TD ALIGN="LEFT">Macintosh Classic</TD>
<TD ALIGN="LEFT">Mac OS</TD>
<TD ALIGN="LEFT">Think C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(43.0)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;4771</TD>
<TD ALIGN="LEFT">0.0002096</TD>
</TR>
<TR><TD ALIGN="LEFT">IBM PS2/60</TD>
<TD ALIGN="LEFT">MSDOS</TD>
<TD ALIGN="LEFT">Turbo Pascal 5.0</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(58.76)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;2303</TD>
<TD ALIGN="LEFT">0.0004343</TD>
</TR>
<TR><TD ALIGN="LEFT">80286 (12 MHz)</TD>
<TD ALIGN="LEFT">MSDOS</TD>
<TD ALIGN="LEFT">Turbo Pascal 5.0</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(47.09)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;1845.4</TD>
<TD ALIGN="LEFT">0.0005419</TD>
</TR>
<TR><TD ALIGN="LEFT">Apple Mac IIcx</TD>
<TD ALIGN="LEFT">Mac OS</TD>
<TD ALIGN="LEFT">Think Pascal 3</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(42)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;1645.5</TD>
<TD ALIGN="LEFT">0.0006077</TD>
</TR>
<TR><TD ALIGN="LEFT">Apple Mac SE/30</TD>
<TD ALIGN="LEFT">Mac OS</TD>
<TD ALIGN="LEFT">Think Pascal 3</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(42)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;1645.6</TD>
<TD ALIGN="LEFT">0.0006077</TD>
</TR>
<TR><TD ALIGN="LEFT">Apple Mac IIcx</TD>
<TD ALIGN="LEFT">Mac OS</TD>
<TD ALIGN="LEFT">Lightspeed Pascal 2</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(39.84)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;1561.6</TD>
<TD ALIGN="LEFT">0.0006404</TD>
</TR>
<TR><TD ALIGN="LEFT">Apple Mac IIcx</TD>
<TD ALIGN="LEFT">Mac OS</TD>
<TD ALIGN="LEFT">Lightspeed Pascal 2#</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(39.69)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;1555.0</TD>
<TD ALIGN="LEFT">0.0006431</TD>
</TR>
<TR><TD ALIGN="LEFT">Zenith Z386 (16MHz)</TD>
<TD ALIGN="LEFT">MSDOS</TD>
<TD ALIGN="LEFT">Turbo Pascal 5.0</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(38.27)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;1539.0</TD>
<TD ALIGN="LEFT">0.0006498</TD>
</TR>
<TR><TD ALIGN="LEFT">Macintosh SE/30</TD>
<TD ALIGN="LEFT">Mac OS</TD>
<TD ALIGN="LEFT">Think C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(13.6)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;1508.4</TD>
<TD ALIGN="LEFT">0.0006630</TD>
</TR>
<TR><TD ALIGN="LEFT">386SX (16 MHz)</TD>
<TD ALIGN="LEFT">MSDOS</TD>
<TD ALIGN="LEFT">Turbo Pascal 6.0</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(34)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;1333.6</TD>
<TD ALIGN="LEFT">0.0007498</TD>
</TR>
<TR><TD ALIGN="LEFT">386SX (16 MHz)</TD>
<TD ALIGN="LEFT">MSDOS</TD>
<TD ALIGN="LEFT">Microsoft Quick C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(12.01)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;1333.6</TD>
<TD ALIGN="LEFT">0.0007499</TD>
</TR>
<TR><TD ALIGN="LEFT">Sequent S-81</TD>
<TD ALIGN="LEFT">DYNIX</TD>
<TD ALIGN="LEFT">Silicon Valley Pascal</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(13.0)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;509.0</TD>
<TD ALIGN="LEFT">0.0019646</TD>
</TR>
<TR><TD ALIGN="LEFT">VAX 11/785</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">Berkeley Pascal</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(11.9)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;466.3</TD>
<TD ALIGN="LEFT">0.002144</TD>
</TR>
<TR><TD ALIGN="LEFT">80486-33</TD>
<TD ALIGN="LEFT">MSDOS</TD>
<TD ALIGN="LEFT">Turbo Pascal 6.0</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(11.46)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;449.0</TD>
<TD ALIGN="LEFT">0.002227</TD>
</TR>
<TR><TD ALIGN="LEFT">Sun 3/60</TD>
<TD ALIGN="LEFT">SunOS</TD>
<TD ALIGN="LEFT">Sun C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(3.93)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;435.7</TD>
<TD ALIGN="LEFT">0.002295</TD>
</TR>
<TR><TD ALIGN="LEFT">NeXT Cube (68030)</TD>
<TD ALIGN="LEFT">Mach</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(2.608)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;289.3</TD>
<TD ALIGN="LEFT">0.003456</TD>
</TR>
<TR><TD ALIGN="LEFT">Sequent S-81</TD>
<TD ALIGN="LEFT">DYNIX</TD>
<TD ALIGN="LEFT">Sequent Symmetry C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(2.604)&nbsp;</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;288.9</TD>
<TD ALIGN="LEFT">0.003461</TD>
</TR>
<TR><TD ALIGN="LEFT">VAXstation 3500</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">Berkeley Pascal</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(7.3)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;286.5</TD>
<TD ALIGN="LEFT">0.003491</TD>
</TR>
<TR><TD ALIGN="LEFT">Sequent S-81</TD>
<TD ALIGN="LEFT">DYNIX</TD>
<TD ALIGN="LEFT">Berkeley Pascal</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(5.6)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;219.5</TD>
<TD ALIGN="LEFT">0.004557</TD>
</TR>
<TR><TD ALIGN="LEFT">Unisys 7000/40</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">Berkeley Pascal</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(5.24)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;205.3</TD>
<TD ALIGN="LEFT">0.004870</TD>
</TR>
<TR><TD ALIGN="LEFT">VAX 8600</TD>
<TD ALIGN="LEFT">VMS</TD>
<TD ALIGN="LEFT">DEC VAX Pascal</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(3.96)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;155.23</TD>
<TD ALIGN="LEFT">0.006442</TD>
</TR>
<TR><TD ALIGN="LEFT">Sun SPARC IPX</TD>
<TD ALIGN="LEFT">SunOS</TD>
<TD ALIGN="LEFT">Gnu C version 2.1</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(1.28)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;142.04</TD>
<TD ALIGN="LEFT">0.007040</TD>
</TR>
<TR><TD ALIGN="LEFT">VAX 6000-530</TD>
<TD ALIGN="LEFT">VMS</TD>
<TD ALIGN="LEFT">DEC C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.858)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;95.14</TD>
<TD ALIGN="LEFT">0.010511</TD>
</TR>
<TR><TD ALIGN="LEFT">VAXstation 4000</TD>
<TD ALIGN="LEFT">VMS</TD>
<TD ALIGN="LEFT">DEC C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.809)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;89.81</TD>
<TD ALIGN="LEFT">0.011135</TD>
</TR>
<TR><TD ALIGN="LEFT">IBM RS/6000 540</TD>
<TD ALIGN="LEFT">AIX</TD>
<TD ALIGN="LEFT">XLP Pascal</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(2.276)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;89.14</TD>
<TD ALIGN="LEFT">0.011219</TD>
</TR>
<TR><TD ALIGN="LEFT">NeXTstation(040/25)</TD>
<TD ALIGN="LEFT">Mach</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.75)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;83.15</TD>
<TD ALIGN="LEFT">0.012027</TD>
</TR>
<TR><TD ALIGN="LEFT">Sun SPARC IPX</TD>
<TD ALIGN="LEFT">SunOS</TD>
<TD ALIGN="LEFT">Sun C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.68)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;75.43</TD>
<TD ALIGN="LEFT">0.01326</TD>
</TR>
<TR><TD ALIGN="LEFT">486DX (33 MHz)</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C #</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.63)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;69.95</TD>
<TD ALIGN="LEFT">0.01430</TD>
</TR>
<TR><TD ALIGN="LEFT">Sun SPARCstation-1</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">Sun Pascal</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(1.7)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;66.62</TD>
<TD ALIGN="LEFT">0.01501</TD>
</TR>
<TR><TD ALIGN="LEFT">DECstation 5000/200</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">DEC Ultrix C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.45)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;49.97</TD>
<TD ALIGN="LEFT">0.02001</TD>
</TR>
<TR><TD ALIGN="LEFT">Sun SPARC 1+</TD>
<TD ALIGN="LEFT">SunOS</TD>
<TD ALIGN="LEFT">Sun C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.40)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;44.37</TD>
<TD ALIGN="LEFT">0.02254</TD>
</TR>
<TR><TD ALIGN="LEFT">DECstation 3100</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">DEC Ultrix Pascal</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.77)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;30.11</TD>
<TD ALIGN="LEFT">0.03321</TD>
</TR>
<TR><TD ALIGN="LEFT">IBM 3090-300E</TD>
<TD ALIGN="LEFT">AIX</TD>
<TD ALIGN="LEFT">Metaware High C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.27)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;29.98</TD>
<TD ALIGN="LEFT">0.03336</TD>
</TR>
<TR><TD ALIGN="LEFT">DECstation 5000/125</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">DEC Ultrix C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.267)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;29.58</TD>
<TD ALIGN="LEFT">0.03381</TD>
</TR>
<TR><TD ALIGN="LEFT">DECstation 5000/200</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">DEC Ultrix C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.256)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;28.38</TD>
<TD ALIGN="LEFT">0.03524</TD>
</TR>
<TR><TD ALIGN="LEFT">Sun SPARC 4/50</TD>
<TD ALIGN="LEFT">SunOS</TD>
<TD ALIGN="LEFT">Sun C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.249)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;27.62</TD>
<TD ALIGN="LEFT">0.03621</TD>
</TR>
<TR><TD ALIGN="LEFT">DEC 3000/400 AXP</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">DEC C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.224)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;24.85</TD>
<TD ALIGN="LEFT">0.04024</TD>
</TR>
<TR><TD ALIGN="LEFT">DECstation 5000/240</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">DEC Ultrix C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.1889)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;20.96</TD>
<TD ALIGN="LEFT">0.04771</TD>
</TR>
<TR><TD ALIGN="LEFT">SGI Iris R4000</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">SGI C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.184)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;20.41</TD>
<TD ALIGN="LEFT">0.04898</TD>
</TR>
<TR><TD ALIGN="LEFT">IBM 3090-300E</TD>
<TD ALIGN="LEFT">VM</TD>
<TD ALIGN="LEFT">Pascal VS</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.464)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;18.12</TD>
<TD ALIGN="LEFT">0.05519</TD>
</TR>
<TR><TD ALIGN="LEFT">DECstation 5000/200</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">DEC Ultrix Pascal</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.39)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;15.188</TD>
<TD ALIGN="LEFT">0.06583</TD>
</TR>
<TR><TD ALIGN="LEFT">Pentium 120</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.848</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;11.953</TD>
<TD ALIGN="LEFT">0.08366</TD>
</TR>
<TR><TD ALIGN="LEFT">Pentium Pro 180</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.009</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;6.527</TD>
<TD ALIGN="LEFT">0.1532</TD>
</TR>
<TR><TD ALIGN="LEFT">Pentium 266 MMX</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C (PHYLIP 3.5)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(0.054)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;5.996</TD>
<TD ALIGN="LEFT">0.1668</TD>
</TR>
<TR><TD ALIGN="LEFT">Pentium 266 MMX</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.927</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;5.996</TD>
<TD ALIGN="LEFT">0.1668</TD>
</TR>
<TR><TD ALIGN="LEFT">Pentium 200</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.853</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;5.517</TD>
<TD ALIGN="LEFT">0.1812</TD>
</TR>
<TR><TD ALIGN="LEFT">SGI PowerChallenge</TD>
<TD ALIGN="LEFT">Irix</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.844</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;5.459</TD>
<TD ALIGN="LEFT">0.1832</TD>
</TR>
<TR><TD ALIGN="LEFT">DEC Alpha 400 4/233</TD>
<TD ALIGN="LEFT">DUNIX</TD>
<TD ALIGN="LEFT">Digital C (cc -fast)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.730</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;4.722</TD>
<TD ALIGN="LEFT">0.2118</TD>
</TR>
<TR><TD ALIGN="LEFT">Pentium II 500</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.368</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;2.380</TD>
<TD ALIGN="LEFT">0.4201</TD>
</TR>
<TR><TD ALIGN="LEFT">Dual 448/633 MHz Pentiums</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">gcc</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.3069</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.985</TD>
<TD ALIGN="LEFT">0.5037</TD>
</TR>
<TR><TD ALIGN="LEFT">Sun Ultra 10</TD>
<TD ALIGN="LEFT">Solaris 8</TD>
<TD ALIGN="LEFT">gcc</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.25848</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.672</TD>
<TD ALIGN="LEFT">0.5981</TD>
</TR>
<TR><TD ALIGN="LEFT">Macintosh G3 300 MHz</TD>
<TD ALIGN="LEFT">Mac OS X</TD>
<TD ALIGN="LEFT">Gnu C (-O 3)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.2330</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.5071</TD>
<TD ALIGN="LEFT">0.6635</TD>
</TR>
<TR><TD ALIGN="LEFT">Compaq/Digital Alpha 500au</TD>
<TD ALIGN="LEFT">DUNIX</TD>
<TD ALIGN="LEFT">Digital C (cc -fast)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.167</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.080</TD>
<TD ALIGN="LEFT">0.9257</TD>
</TR>
<TR><TD ALIGN="LEFT">AMD Athlon 1.2 GHz</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">gcc</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.1546</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.0</TD>
<TD ALIGN="LEFT">1.0</TD>
</TR>
<TR><TD ALIGN="LEFT">Intel Pentium 4 2.26 GHz</TD>
<TD ALIGN="LEFT">Windows XP</TD>
<TD ALIGN="LEFT">Cygwin gcc</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.1078</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.6973</TD>
<TD ALIGN="LEFT">1.434</TD>
</TR>
<TR><TD ALIGN="LEFT">Pentium 4 1700 MHz</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.10730</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.6940</TD>
<TD ALIGN="LEFT">1.441</TD>
</TR>
<TR><TD ALIGN="LEFT">SGI Fuel R16000/700MHz</TD>
<TD ALIGN="LEFT">IRIX 6.5.30</TD>
<TD ALIGN="LEFT">MipsPro 7.4.4</TD>
<TD ALIGN="RIGHT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.09</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.58</TD>
<TD ALIGN="LEFT">1.72</TD>
</TR>
<TR><TD ALIGN="LEFT">Macintosh G4 1.2GHz</TD>
<TD ALIGN="LEFT">Mac OS X</TD>
<TD ALIGN="LEFT">Gnu C (-O 3)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.0582</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.3765</TD>
<TD ALIGN="LEFT">2.656</TD>
</TR>
<TR><TD ALIGN="LEFT">AMD Athlon 2800 2.1 GHz</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">gcc (-O 3)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.0455</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.2943</TD>
<TD ALIGN="LEFT">3.398</TD>
</TR>
<TR><TD ALIGN="LEFT">iMac 2 GHz Intel Core Duo</TD>
<TD ALIGN="LEFT">Mac OS X</TD>
<TD ALIGN="LEFT">gcc (-O 3)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.0300</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.1940</TD>
<TD ALIGN="LEFT">5.153</TD>
</TR>
</TABLE>
</DIV>
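<P>
The way the Time and Relative Speed columns are derived can be checked with a
short calculation.  The Python sketch below is an illustration only: the
variable and function names are invented, and the totals are copied from the
table above.  It assumes that the version 3.5 totals are rescaled through the
Pentium 266 MMX, which was run under both 3.5 and the current version, and
that the version 3.4 totals are then rescaled through the 386SX, which was run
under both 3.4 and 3.5.

```python
ATHLON_TOTAL = 0.1546  # total for the standard machine, current version


def time_ratio(total, standard_total=ATHLON_TOTAL):
    """The "Time" column: a machine's total divided by the standard's total."""
    return total / standard_total


def relative_speed(time):
    """The "Relative Speed" column: the reciprocal of the Time."""
    return 1.0 / time


# Pentium 266 MMX was timed under the current version (0.927) and 3.5 (0.054)
pentium_time = time_ratio(0.927)
scale_35 = pentium_time / 0.054   # converts a 3.5 total into a Time

# 386SX was timed under 3.5 (Quick C, 12.01) and 3.4 (Turbo Pascal, 34)
sx386_time = 12.01 * scale_35
scale_34 = sx386_time / 34        # converts a 3.4 total into a Time

# Toshiba T1100+ with Turbo Pascal 3.01A was timed only under 3.4 (269)
toshiba_time = 269 * scale_34

print("Pentium 266 MMX:", round(pentium_time, 3), round(relative_speed(pentium_time), 4))
print("386SX:          ", round(sx386_time, 1), round(relative_speed(sx386_time), 7))
print("Toshiba T1100+: ", round(toshiba_time), round(relative_speed(toshiba_time), 8))
```

The results agree with the tabulated values to within the rounding of the
published totals; small differences in the last digits are to be expected.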
<P>
This benchmark reflects not only the integer performance of these machines
(as Dnapars has few floating-point operations) but also the efficiency
of the compilers.  Some of the machines (the DEC 3000/400 AXP
and the IBM RS/6000, in particular) are much faster than this benchmark
would indicate.  The numerical programs benchmark below gives them a
fairer test.  The speed of the Compaq/Digital Alpha 500au is exaggerated because,
although its compiles are optimized for that processor, some of the Pentium
compiles are not similarly optimized.
<P>
Note that parallel machines like the Sequent and the SGI PowerChallenge are not
really as slow as indicated by the data here, as these runs did nothing to take
advantage of their parallelism.
<P>
These benchmarks have now extended over 22 years (1986-2008), and in the Dnapars
benchmark they extend over a range of over 54,000-fold in speed!
The experience of our laboratory, which seems typical, is that
computer power grows by a factor of about 1.85 per year.  This is
roughly consistent with these benchmarks.
<P>
For a picture of speeds for a more numerically intensive program,
here are benchmarks using Dnaml, with an AMD Athlon 1.2 GHz Linux system 
as the standard.  Some of the timings, the ones in parentheses, are
using PHYLIP version 3.5, and those are compared to that version run on
the Pentium 266.  Runs using the PHYLIP 3.4 Pascal version are adjusted
using the 386SX timings where both were run.  Numbers are
total run times (total user time in the case of Unix) over all three data sets.
<P>
<DIV ALIGN="CENTER">
<TABLE CELLPADDING=3 BORDER="1">
<TR><TD ALIGN="LEFT"><B>Machine</B></TD>
<TD ALIGN="LEFT"><B>Operating<BR>System</B></TD>
<TD ALIGN="LEFT"><B>Compiler</B></TD>
<TD ALIGN="LEFT"><B>Seconds</B></TD>
<TD ALIGN="LEFT"><B>Time</B></TD>
<TD ALIGN="LEFT"><B>Relative<BR>Speed</B></TD>
</TR>
<TR><TD ALIGN="LEFT">386SX 16 MHz</TD>
<TD ALIGN="LEFT">PCDOS</TD>
<TD ALIGN="LEFT">Turbo Pascal 6</TD>
<TD ALIGN="LEFT">(7826)</TD>
<TD ALIGN="LEFT">1027.55</TD>
<TD ALIGN="LEFT">0.0009732</TD>
</TR>
<TR><TD ALIGN="LEFT">386SX 16 MHz</TD>
<TD ALIGN="LEFT">PCDOS</TD>
<TD ALIGN="LEFT">Quick C</TD>
<TD ALIGN="LEFT">(6549.79)</TD>
<TD ALIGN="LEFT">1027.55</TD>
<TD ALIGN="LEFT">0.0009732</TD>
</TR>
<TR><TD ALIGN="LEFT">Compudyne 486DX/33</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">(1599.9)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;251.0</TD>
<TD ALIGN="LEFT">0.003984</TD>
</TR>
<TR><TD ALIGN="LEFT">SUN Sparcstation 1+</TD>
<TD ALIGN="LEFT">SunOS</TD>
<TD ALIGN="LEFT">Sun C</TD>
<TD ALIGN="LEFT">(1402.8)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;220.1</TD>
<TD ALIGN="LEFT">0.004543</TD>
</TR>
<TR><TD ALIGN="LEFT">Everex STEP 386/20</TD>
<TD ALIGN="LEFT">PCDOS</TD>
<TD ALIGN="LEFT">Turbo Pascal 5.5</TD>
<TD ALIGN="LEFT">(1440.8)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;189.17</TD>
<TD ALIGN="LEFT">0.005286</TD>
</TR>
<TR><TD ALIGN="LEFT">486DX/33</TD>
<TD ALIGN="LEFT">PCDOS</TD>
<TD ALIGN="LEFT">Turbo C++</TD>
<TD ALIGN="LEFT">(1107.2)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;173.70</TD>
<TD ALIGN="LEFT">0.005757</TD>
</TR>
<TR><TD ALIGN="LEFT">Compudyne 486DX/33</TD>
<TD ALIGN="LEFT">PCDOS</TD>
<TD ALIGN="LEFT">Waterloo C/386</TD>
<TD ALIGN="LEFT">(1045.78)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;164.07</TD>
<TD ALIGN="LEFT">0.006094</TD>
</TR>
<TR><TD ALIGN="LEFT">Sun SPARCstation IPX</TD>
<TD ALIGN="LEFT">SunOS</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(960.2)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;150.64</TD>
<TD ALIGN="LEFT">0.006638</TD>
</TR>
<TR><TD ALIGN="LEFT">NeXTstation(68040/25)</TD>
<TD ALIGN="LEFT">Mach</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(916.6)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;143.80</TD>
<TD ALIGN="LEFT">0.006954</TD>
</TR>
<TR><TD ALIGN="LEFT">486DX/33</TD>
<TD ALIGN="LEFT">PCDOS</TD>
<TD ALIGN="LEFT">Waterloo C/386</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(861.0)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;135.08</TD>
<TD ALIGN="LEFT">0.007403</TD>
</TR>
<TR><TD ALIGN="LEFT">Sun SPARCstation IPX</TD>
<TD ALIGN="LEFT">SunOS</TD>
<TD ALIGN="LEFT">Sun C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(787.7)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;123.58</TD>
<TD ALIGN="LEFT">0.008091</TD>
</TR>
<TR><TD ALIGN="LEFT">486DX/33</TD>
<TD ALIGN="LEFT">PCDOS</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(650.9)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;102.12</TD>
<TD ALIGN="LEFT">0.009792</TD>
</TR>
<TR><TD ALIGN="LEFT">VAX 6000-530</TD>
<TD ALIGN="LEFT">VMS</TD>
<TD ALIGN="LEFT">DEC C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(637.0)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;99.94</TD>
<TD ALIGN="LEFT">0.01001</TD>
</TR>
<TR><TD ALIGN="LEFT">DECstation 5000/200</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">DEC Ultrix RISC C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(423.3)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;66.41</TD>
<TD ALIGN="LEFT">0.01506</TD>
</TR>
<TR><TD ALIGN="LEFT">IBM 3090-300E</TD>
<TD ALIGN="LEFT">AIX</TD>
<TD ALIGN="LEFT">Metaware High C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(201.8)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;31.65</TD>
<TD ALIGN="LEFT">0.03159</TD>
</TR>
<TR><TD ALIGN="LEFT">Convex C240/1024</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;(101.6)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;15.940</TD>
<TD ALIGN="LEFT">0.06274</TD>
</TR>
<TR><TD ALIGN="LEFT">DEC 3000/400 AXP</TD>
<TD ALIGN="LEFT">Unix</TD>
<TD ALIGN="LEFT">DEC C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;(98.29)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;15.42</TD>
<TD ALIGN="LEFT">0.06485</TD>
</TR>
<TR><TD ALIGN="LEFT">Pentium 120</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;25.26</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;19.230</TD>
<TD ALIGN="LEFT">0.05200</TD>
</TR>
<TR><TD ALIGN="LEFT">Pentium Pro 180</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;18.88</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;14.372</TD>
<TD ALIGN="LEFT">0.06957</TD>
</TR>
<TR><TD ALIGN="LEFT">Pentium 200</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;16.51</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;12.569</TD>
<TD ALIGN="LEFT">0.07956</TD>
</TR>
<TR><TD ALIGN="LEFT">SGI PowerChallenge</TD>
<TD ALIGN="LEFT">IRIX</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;12.446</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;9.475</TD>
<TD ALIGN="LEFT">0.10554</TD>
</TR>
<TR><TD ALIGN="LEFT">DEC Alpha 400 4/233</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C (cc -fast)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;8.0418</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;6.122</TD>
<TD ALIGN="LEFT">0.16335</TD>
</TR>
<TR><TD ALIGN="LEFT">Pentium MMX 266</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C (PHYLIP 3.5)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(36.15)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;5.671</TD>
<TD ALIGN="LEFT">0.17632</TD>
</TR>
<TR><TD ALIGN="LEFT">Pentium MMX 266</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;7.45</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;5.671</TD>
<TD ALIGN="LEFT">0.17632</TD>
</TR>
<TR><TD ALIGN="LEFT">Pentium II 500</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;6.02</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;4.583</TD>
<TD ALIGN="LEFT">0.2182</TD>
</TR>
<TR><TD ALIGN="LEFT">Dual 448/633 MHz Pentiums</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;3.7225</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;2.834</TD>
<TD ALIGN="LEFT">0.3529</TD>
</TR>
<TR><TD ALIGN="LEFT">Sun Ultra 10</TD>
<TD ALIGN="LEFT">Solaris 8</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;3.7101</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;2.824</TD>
<TD ALIGN="LEFT">0.3541</TD>
</TR>
<TR><TD ALIGN="LEFT">Pentium 4 1.7 GHz</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;2.0668</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.5734</TD>
<TD ALIGN="LEFT">0.6356</TD>
</TR>
<TR><TD ALIGN="LEFT">Macintosh G3 300 MHz</TD>
<TD ALIGN="LEFT">Mac OS X</TD>
<TD ALIGN="LEFT">Gnu C (-O 3)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.805</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.3741</TD>
<TD ALIGN="LEFT">0.7278</TD>
</TR>
<TR><TD ALIGN="LEFT">Intel Pentium 4 2.26 GHz</TD>
<TD ALIGN="LEFT">Windows XP</TD>
<TD ALIGN="LEFT">Cygwin gcc</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.55457</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.1834</TD>
<TD ALIGN="LEFT">0.8450</TD>
</TR>
<TR><TD ALIGN="LEFT">AMD Athlon 1.2 GHz</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.3136</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.0</TD>
<TD ALIGN="LEFT">1.0</TD>
</TR>
<TR><TD ALIGN="LEFT">Compaq/Digital Alpha 500au</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">Gnu C (cc -fast)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.9383</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.7143</TD>
<TD ALIGN="LEFT">1.4000</TD>
</TR>
<TR><TD ALIGN="LEFT">Macintosh G4 1.2 GHz</TD>
<TD ALIGN="LEFT">Mac OS X</TD>
<TD ALIGN="LEFT">Gnu C (-O 3)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.7080</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.5390</TD>
<TD ALIGN="LEFT">1.8554</TD>
</TR>
<TR><TD ALIGN="LEFT">SGI Fuel R16000/700MHz</TD>
<TD ALIGN="LEFT">IRIX 6.5.30</TD>
<TD ALIGN="LEFT">MipsPro 7.4.4</TD>
<TD ALIGN="RIGHT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.55</TD>
<TD ALIGN="CENTER">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.41</TD>
<TD ALIGN="LEFT">2.43</TD>
</TR>
<TR><TD ALIGN="LEFT">AMD Athlon 2800 2.1 GHz</TD>
<TD ALIGN="LEFT">Linux</TD>
<TD ALIGN="LEFT">gcc (-O 3)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.3065</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.2333</TD>
<TD ALIGN="LEFT">4.286</TD>
</TR>
<TR><TD ALIGN="LEFT">iMac 2 GHz Intel Core Duo</TD>
<TD ALIGN="LEFT">Mac OS X</TD>
<TD ALIGN="LEFT">gcc (-O 3)</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.2535</TD>
<TD ALIGN="LEFT">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.1930</TD>
<TD ALIGN="LEFT">5.182</TD>
</TR>
</TABLE>
</DIV>
<P>
As before, the parallel machines such as the Convex and the SGI PowerChallenge
were only run using one processor, which does not take into account the
gain that could be obtained by parallelizing the programs.  The speed of the
Compaq/Digital Alpha 500au is exaggerated because it was compiled in a way
optimized for its processor, while some of the Pentium compiles were not.
<P>
You are invited to send me figures for your machine for
inclusion in future tables.  Use the data sets above and compute the total
times for Dnapars and for Dnaml for the three data sets (setting the
frequencies of the four bases to 0.25 each for the Dnaml runs).  Be sure to
tell me the name and version of your compiler, and the version of PHYLIP you
tested.
If the times are too small to be measured accurately, obtain the times
for 10 or 100 data sets (the Multiple data sets option) and divide by 10 or
100.
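<P>
As a rough illustration of the arithmetic involved, the following C sketch
averages a total time over several data sets and expresses a time relative to
a reference machine.  The function names are hypothetical and are not part of
PHYLIP.  The sketch also assumes that the second and third numeric columns of
the table above are the time relative to the AMD Athlon 1.2 GHz row and the
reciprocal of that ratio, which is consistent with the rows shown.

```c
#include <assert.h>

/* Hypothetical helpers for preparing timing figures; not part of PHYLIP. */

/* Average time per data set when one run processed n_datasets data sets
   (for example 10 or 100, using the Multiple data sets option). */
double time_per_dataset(double total_seconds, int n_datasets)
{
    assert(n_datasets > 0);
    return total_seconds / n_datasets;
}

/* Time relative to a reference machine (assumed here to be the
   AMD Athlon 1.2 GHz row, whose relative time is 1.0). */
double relative_time(double seconds, double reference_seconds)
{
    assert(reference_seconds > 0.0);
    return seconds / reference_seconds;
}

/* Relative speed, taken as the reciprocal of the relative time. */
double relative_speed(double seconds, double reference_seconds)
{
    assert(seconds > 0.0);
    return reference_seconds / seconds;
}
```

<P>
Under these assumptions, the Compaq/Digital Alpha time of 0.9383 against the
reference time of 1.3136 gives a relative time of about 0.7143 and a relative
speed of about 1.4000, matching the figures in that row.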
<P>
<A NAME="comments"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>General Comments on Adapting<BR>
the Package to Different Computer Systems</H2></DIV>
<P>
In the sections following you will find instructions on how to adapt the 
programs to different computers and compilers.  The programs should compile
without alteration on most versions of C.  They use the "malloc"
or "calloc" library functions to allocate memory, so that the upper limits on
how many species, sites, or characters they can handle are set by the system
memory available to those memory-allocation functions.
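<P>
A minimal sketch of this style of allocation is given below.  It is not taken
from the PHYLIP sources, and the names <TT>alloc_table</TT> and
<TT>free_table</TT> are hypothetical.  The point it illustrates is that the
numbers of species and sites are run-time values, so the only limit is whether
<TT>calloc</TT> can supply the memory requested.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical example, not PHYLIP code: allocate a zero-filled table
   with one row per species and one character per site.  Returns NULL
   if the system cannot supply the memory. */
char **alloc_table(long species, long sites)
{
    long i;
    char **table;

    assert(species > 0 && sites > 0);
    table = (char **)calloc((size_t)species, sizeof(char *));
    if (table == NULL)
        return NULL;
    for (i = 0; i < species; i++) {
        table[i] = (char *)calloc((size_t)sites, sizeof(char));
        if (table[i] == NULL) {
            /* Release the rows obtained so far before giving up. */
            while (i > 0)
                free(table[--i]);
            free(table);
            return NULL;
        }
    }
    return table;
}

/* Release a table obtained from alloc_table. */
void free_table(char **table, long species)
{
    long i;

    if (table == NULL)
        return;
    for (i = 0; i < species; i++)
        free(table[i]);
    free(table);
}
```

<P>
A caller would check the result for <TT>NULL</TT> and report that memory ran
out, rather than comparing the data set size against a compiled-in maximum.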
<P>
In the document file for each program, I have supplied a small
input example, and the output it produces, to help you check whether the
programs are running properly.
<P>
<DIV ALIGN=CENTER>
<A NAME="compiling"><HR><P></A>
<H2>Compiling the programs</H2>
</DIV>
<P>
If you have not been able to get executables for PHYLIP, you should be
able to make your own.  This can be easy under Linux and Unix, but more
difficult if you have a Macintosh or a Windows system.  If you have the
latter, we strongly recommend you download and use the Macintosh and
Windows executables that we distribute.  If you do that, you will not need
to have any compiler or to do any compiling.  I get a certain number of
inquiries each year from confused users who are not sure what a compiler
is but think they need one.  After downloading the executables they
contact me and complain that they did not find a compiler included in the
package, and would I please e-mail them the compiler.  What they really
need to do is use the executables and forget about compiling them.
<P>
Some users may also need to compile the programs in order to modify them.
The instructions below will help with this.
<P>
I will discuss how to compile PHYLIP using one of a number of widely-used
compilers.  After these I will comment on compiling PHYLIP on other, less
widely-used systems.
<P>
<A NAME="unix"><H3>Unix and Linux</H3></a>
<P>
For Unix and Linux (which is Unix in all important functional respects, if
not in all legal respects) you must compile PHYLIP yourself.
This is usually easy to do.
Unix (and Linux)
systems generally have a C compiler and have the <TT>make</TT> utility.  We
distribute with the PHYLIP source code a Unix-compatible <TT>Makefile</TT>.
We use <a href="http://www.gnu.org/">GNU</a>'s 
<a href="http://www.gnu.org/software/make/">make utility</a>,
which might be installed on your system as &quot;make&quot; or as
&quot;gmake&quot;.
<P>
However, note that some popular Linux distributions do not include a C
compiler in their default configuration.  For example, in RedHat Linux version
8, the "Personal Workstation" installation that is the default does not
include the C compiler or the X Windows libraries needed to compile
PHYLIP.  These are available, and can be loaded from the CDROMs in the
distribution.  The following instructions assume that you have the
C compiler and X libraries.  If you cannot easily configure your system to
include them, you should look into using the RedHat RPM binary
distribution, mentioned on the PHYLIP 3.6 web page.
<P>
As is mentioned below (under Macintoshes) the Mac OS X operating system is
a Unix, and if the X windows windowing system is installed, these Unix
instructions will work for it.
<P>
After you have finished unpacking the Documentation and Source Code
archive, you will find that you have created a folder <TT>phylip-3.6</TT>
in which there are three
folders, called <TT>exe</TT>, <TT>src</TT>, and <TT>doc</TT>.  
There is also an HTML web page, <TT>phylip.html</TT>.  The <TT>exe</TT>
folder
will be empty, and <TT>src</TT> contains the source code files, including the
<TT>Makefile</TT>.  Directory <TT>doc</TT> contains the documentation files.
<P>
Enter the <TT>src</TT> folder.  Before you compile, you will want to
look at the <tt>Makefile</tt> and see whether you want to alter the compilation
command.  By default, no C compiler flags are set.  If you
have modified the programs, you might want to use the debugging flag
"-g".  On the other hand, if you are trying to make a fast executable using
the GCC compiler, you may want to use the setting labeled "An optimized one
for gcc".  In either case, remove the "#" from the front of that CFLAGS line, and
place a "#" at the front of the CFLAGS line that was previously in use.
There are careful instructions on this in the Makefile.
Once you have set up the CFLAGS and DFLAGS statements to be the way you
want, to compile all the programs just type:
<P>
<TT>make install</TT>
<P>
You will then see the compiling commands as they happen, with
occasional warning messages.  If these are warnings, rather than errors,
they are not too serious.  A typical warning would be like this:
<P>
<TT>dnaml.c:1204: warning: static declaration for re_move follows non-static</TT>
<P>
After a time the compiler will finish compiling.  If you have done a
<TT>make install</TT> the system will then move the executables into the
<TT>exe</TT> folder and also save space by erasing all the relocatable
object files that were produced in the process.  You should be left with
useable executables in the <TT>exe</TT> folder, and the <TT>src</TT>
folder should be as before.   To run the executables, go into the
<TT>exe</TT> folder and type the program name (say <TT>dnaml</TT>, which
you may or may not have to precede by a dot and a slash, <tt>./</tt>).
The names of the
executables will be the same as the names of the C programs, but without the
<TT>.c</TT> suffix.  Thus <TT>dnaml.c</TT> compiles to make an executable called <TT>dnaml</TT>.
<P>
Our two tree-drawing programs, Drawgram and Drawtree, require an
X Windows installation including the Athena Widgets. These are provided 
with most X Windows installations.  
<P>
If you see messages that the compilation could not find "Xlib.h" and other,
similar header files, this means that some part of the X Windows development
environment is not installed on your system, or is not installed in the
default location.
Similarly, if you get error messages saying that some files with "Xaw"
in the name cannot be found, this means that the Athena Widgets
are not installed on your system, or are not installed in the
default location.
<P>
In either case, you will need to make sure that they are installed properly.
If they are there but not found during the compile, change the 
<tt>DFLAGS</tt> and <tt>DLIBS</tt> variables in the Makefile to
point to the locations of the header files and libraries, respectively.
<P>
Another possible problem is that the usual Linux C compiler is the Gnu GCC compiler.
In some Linux systems it is not invoked by the command <TT>cc</TT> but
by <TT>gcc</TT>.  You would then need to edit the Makefile to reflect this
(see below for comments on that process).
<P>
A typical Unix or Linux installation would put the directory <TT>phylip-3.6</TT>
in <TT>/usr/local</TT>.  The name of the executables directory <TT>EXEDIR</TT>
could be changed to be <TT>/usr/local/bin</TT>, so that the <TT>make install</TT>
command puts the executables there.  If the users have <TT>/usr/local/bin</TT>
in their paths, the programs would be found when their names are typed.
The font files <TT>font1</TT> through <TT>font6</TT> could also be
placed there.  A batch script containing the lines
<P>
<PRE>
      ln -s /usr/local/bin/font1 font1
      ln -s /usr/local/bin/font2 font2
      ln -s /usr/local/bin/font3 font3
      ln -s /usr/local/bin/font4 font4
      ln -s /usr/local/bin/font5 font5
      ln -s /usr/local/bin/font6 font6
</PRE>
<P>
could be used to establish links in the user's working directory so that
Drawtree and Drawgram would find these font files when users
type a name such as <TT>font1</TT> when the program asks
them for a font file name.  The
documentation web pages are in subdirectory <TT>doc</TT> of the
main PHYLIP directory, except for one, <TT>phylip.html</TT> which is
in the main PHYLIP directory.  It has a table of all of the documentation
pages, including this one.  If users create a bookmark to that page
it can be used to access all of the other documentation pages.
<P>
To compile just one program, such as Dnaml, type:
<P>
<TT>make dnaml</TT>
<P>
After this compilation, <TT>dnaml</TT> will be in the <TT>src</TT>
subdirectory.  So will some relocatable object code files that
were used to create the executable.  These have names ending in
<TT>.o</TT> - they can safely be deleted.
<P>
If you have problems with the compilation command, you can edit the
<TT>Makefile</TT>.  It has careful explanations at its front of how you
might want to do so.  For example, you might want to change the C
compiler name <TT>cc</TT> to the name of the Gnu C compiler, <TT>gcc</TT>.
This can be done by removing the comment character <TT>#</TT> from the
front of one line, and placing it at the front of a nearby line.
How to do so should be clear from the material at the beginning of the
<TT>Makefile</TT>.  We have included sample lines for using the <TT>gcc</TT>
compiler and for using the Cygwin Gnu C++ environment on Windows, as
well as the default of <TT>cc</TT>.
<P>
We have encountered some problems with the Gnu C Compiler (gcc)
on 64-bit Itanium processors, in our code for generating random numbers,
when it is compiled at the <TT>-O3</TT> optimization level.
<P>
Some older C compilers (notably the Berkeley C compiler which is
included free with some Sun systems) do not adhere to the ANSI C
standard (because they were written before it was set down).
They have trouble with the function prototypes which are in
our programs.  We have included an <TT>#ifndef</TT> preprocessor
command to eliminate the problem, if you use the switch <TT>-DOLDC</TT>
when compiling.  Thus with these compilers you need only use this in
your C flags (in the Makefile) and compilers such as Berkeley C
will cause no trouble. 
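<P>
The pattern can be sketched as follows.  This is an illustration only, not an
excerpt from the PHYLIP sources: the function <TT>count_gaps</TT> is
hypothetical, and the old-style definition in the <TT>#else</TT> branch is an
assumption about how one might accommodate a pre-ANSI compiler.  What it shows
is that the prototype is seen only when <TT>OLDC</TT> is not defined.

```c
#include <assert.h>

#ifndef OLDC
/* Function prototypes: skipped when compiling with -DOLDC. */
long count_gaps(const char *sequence);   /* hypothetical function */
#endif

/* Count the gap characters ('-') in a sequence.  The ANSI-style
   header is used by default; the old-style header is used only
   when the compiler is invoked with -DOLDC. */
#ifndef OLDC
long count_gaps(const char *sequence)
#else
long count_gaps(sequence)
char *sequence;
#endif
{
    long gaps = 0;

    assert(sequence != 0);
    while (*sequence != '\0') {
        if (*sequence == '-')
            gaps++;
        sequence++;
    }
    return gaps;
}
```

<P>
With an ANSI compiler nothing special is needed.  With an older compiler, the
same file would be compiled with <TT>-DOLDC</TT> among the C flags so that the
prototype is never seen.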
<P>
<A NAME="win"><H3>Windows systems</H3></a>
<P>
We distribute Windows executables, and most likely you can use these and
do not need to recompile them.  The following instructions will only be
necessary if you want to modify the programs and need to recompile them.
They are given for several different compilers available on Windows systems.
Another major compiler is the Intel compiler -- we do not have information yet
on how to use it, but expect that PHYLIP will compile on it.
<P>
<A NAME="cyg"><B>Compiling with Cygnus Gnu C++</B></a>
<P>
Cygnus Solutions (now a part of Red Hat, Inc.) has adapted the Gnu C compiler
to Windows systems and
provided an environment, CygWin, which mimics Unix for compiling.
Currently, this is the compiler that we use to prepare the Windows
executables.
Cygwin is available for purchase, and they also make it
available to be downloaded for free.  The download is large.  To get it, go
to <A HREF="http://www.cygwin.com/">the Cygwin web site</A> at
<CODE>http://www.cygwin.com</CODE> and follow the
instructions there.  To download it you need to download
their <TT>setup.exe</TT> program and then it will download the rest
when it is run.  You will need a lot of disk space for it (about a gigabyte).
<P>
When installing Cygwin it is important to install gcc and make.  During
the course of the setup program Setup will ask you to select packages.
Expand the Devel Category by clicking on it.  Scroll down to gcc and
check if the &quot;New&quot; column says &quot;Skip&quot;. If it does,
click on "skip".  "Skip" will change to the current version of gcc.
Scroll down to the <tt>make</tt> package, and if it has &quot;Skip&quot;
click on "Skip". These two
programs are necessary to install PHYLIP.
<P> 
Once you have
installed the free CygWin environment and the associated Gnu C compiler
on your Windows system, compiling PHYLIP is closely similar to
what one does for Unix or Linux:
<P>
<UL>
<LI> Enter the Cygwin environment (which you can do using the Windows
<TT>Start</TT> menu and its <TT>All Programs</TT> menu item.  There should be
a <TT>Cygnus</TT> menu choice within that submenu, which you can use to
start the Cygnus environment.  This puts you in an imitation of a Unix
shell.)  Alternatively you may have a CygWin icon on your desktop and you
can enter the environment by clicking on that.
<P>
<LI> On entering the CygWin environment you will find yourself in one of the
folders within the CygWin folder.  Change to the folder where the
PHYLIP programs have been put.  For example, if you are user
"fred" on a Windows XP system, and have extracted our self-extracting
archive of the Sources and Documentation files onto your desktop,
you will want to change folders by typing the command
 <tt>&nbsp;&nbsp;cd C:/"Documents and Settings"/fred/phylip3.6&nbsp;&nbsp;</tt>
<P>
<LI> Make sure that there is a folder there called <tt>exe</tt> as well as
one called <tt>src</tt>.  The former is where the executables will be
copied if you do a full recompile of the package.  If you have our
existing executables in <tt>exe</tt> you might want to save them by
copying them elsewhere at this point.  If <tt>exe</tt> does not exist
then you should create it (<tt>mkdir exe</tt>), and also the folder <tt>doc</tt> if that does
not exist (<tt>mkdir doc</tt>).
Go into the folder <tt>src</tt> by typing the command <tt>cd src</tt>.  There should be a folder <tt>icons</tt>
within this folder as well.
<P>
<li> Make sure that the file <tt>Makefile</tt> has been renamed as <tt>Makefile.unix</tt> and that a copy of <tt>Makefile.cyg</tt> has been made and renamed
as <tt>Makefile</tt>.  We will have done this in the copy of the sources that
comes for the Windows platform; if you have instead obtained the sources
from those in another form of our distribution you may need to do these
renamings and copies yourself.
<P>
<li> You should then be able to compile PHYLIP
by issuing the appropriate make command, such as <TT>make install</TT>.
If you have modified one of our source code files such as <TT>dnaml.c</TT>,
it would be wise to
have saved the original version of it first as, say, <TT>dnaml.c0</TT>.
<P>
<li> Our Makefile will automatically associate the appropriate icon from
the folder <tt>icons</tt> with the executables.  To associate an icon with
a program (say Dnaml), we have an icon
file (say <TT>dna.ico</TT>) which contains the icon in standard format.
There is also a file called <TT>dnaml.rc</TT> which contains the single
line:
<P>
<TT>dnaml ICON "dna.ico"</TT>
<P>
We have provided a folder <TT>icons</TT> in the <TT>src</TT>
folder, containing a full set of icons and a full set of resource
files (<TT>*.rc</TT>) so you will not have to do this yourself.
<P>
<li>The executables will now be in the folder <tt>exe</tt>.  You can run
them by clicking on their icons, or by using a Command Prompt window or a CygWin
window and typing their names (<tt>dnaml</tt>, or <tt>dnaml.exe</tt>, or
<tt>./dnaml.exe</tt>).
</ul>
<P>
<A NAME="c++"><B>Compiling with Microsoft Visual C++</B> </a>
<P>
We have had success in the past compiling PHYLIP with Microsoft Visual
C++ (the compiler in the Microsoft .NET package), although the
Windows executables that we distribute are built
using the Cygwin GCC compiler.  The following instructions are the ones we
have used for Visual C++ for the .NET 2008 version with Visual C++ version 9.0.
Microsoft also makes a free download version of their C++ compiler
from 2005 available as Visual C++ Express Edition.  That version has a somewhat
different content, and these instructions will not work with it.  If you
figure out how to get the compiler and Makefiles to work together, please
let us know -- we don't have the energy to figure this out for all possible
configurations of the Microsoft C++ compiler.
<P>
The instructions use the <tt>nmake</tt> command that uses a Makefile which is
called <tt>Makefile.msvc</tt> in our distribution.
At the end of this section we have some comments on how to
compile the programs with Visual C++ version 7.0, which also has a somewhat
different file folder structure.
<P>
With Microsoft Visual C++, you can compile using a Makefile.  We have supplied
this in the source code distribution as <TT>Makefile.msvc</TT>.
You may wish to preserve the Unix Makefile by renaming <TT>Makefile</TT> to,
say, <TT>Makefile.unix</TT>, then make a copy of <TT>Makefile.msvc</TT>
and call it <TT>Makefile</TT>.   (You may have to change your Windows desktop
settings to make the three-letter extensions visible, or you could use
the <TT>RENAME</TT> command in the Command tool).
<UL>
<LI><B>Setting the path.</B>
Before using <TT>nmake</TT> you will need to have the paths
set properly.  For this, first use the Start menu to open a Command Prompt
(found under Accessories).  To set the path type<BR>
<PRE>
set MSVC=Path
</PRE>
where Path is where Microsoft Visual Studio is installed. 
On our Windows XP system, "Path" is
<TT>C:\Program Files\Microsoft Visual Studio 9.0</TT>
<P>
Once you have set MSVC, type
<PRE>
PATH=%PATH%;%MSVC%\VC\bin;%MSVC%\Common7\IDE
</PRE>
(The "7" is correct here; it is not a typo.)
<P>
The Makefile has some paths in it, which are, I hope, in
the correct form for Visual C++ 9.0 on your system.
If not, the statement
<PRE>
MSVCPATH="C:\Program Files\Microsoft Visual Studio 9.0\VC"
</PRE>
will need to be changed so that 
it points to wherever Microsoft Visual Studio is installed, followed by
 <TT>\VC</TT> (for Visual Studio 9.0).
<P>
<LI><B>Using the Makefile</B>. The Makefile is invoked using the

<TT>nmake</TT> command.  To compile and install all
programs type <TT>nmake install</TT>.  We have supplied all the
support files and icons needed for the compilations.  They are
in folder <TT>msvc</TT> in the main source code folder.  If you
simply type <TT>nmake</TT> you
will get a list of possible <TT>make</TT> commands.  For example,
to compile a single program such as <TT>Dnaml</TT> but not
install it, type <TT>nmake dnaml</TT>.
</UL>
<P>
If instead you have an earlier version of Visual Studio .NET which
has the Visual C++ 7.0 compiler, you should proceed as above, but
instead, set MSVC to
<TT>C:\Program Files\Microsoft Visual Studio .NET</TT>,
and then type
<P>
<PRE>
PATH=%PATH%;%MSVC%\Vc7\bin;%MSVC%\Common7\IDE
</PRE>
<P>
You will also need to edit the line in the Makefile that
defines the variable MSVCPATH.  You should change this to
<PRE>
MSVCPATH="C:\Program Files\Microsoft Visual Studio .NET\Vc7"
</PRE>
If this does not work with your Visual C++ 7.0 compiler,
then the most likely reason is that your installation
was not placed into the folder <TT>C:\Program Files</TT>,
or has a name that is not exactly identical to
<TT>Microsoft Visual Studio .NET</TT>.  In that case,
you will need to find the correct path to the Visual C++
7.0 installation on your system, and supply this in the
MSVC variable above, and also in the Makefile.  (Note
that in the Makefile, you will need to follow this path
with <TT>\Vc7</TT>.)
<P>
<B>Compiling with Borland C++</B> 
<P>
Borland C++ can be downloaded for free.  It is a compiler that was released
in 2000 and is now owned by Embarcadero Technologies, Inc.
(see their site
<A HREF="http://www.codegear.com/downloads/free/cppbuilder">
http://www.codegear.com/downloads/free/cppbuilder</A>).  To download
it you need to register with them.
It has a somewhat restrictive license, so we cannot use it for the
widely-distributed executables.
<p>
You should download the compiler as it includes all the utilities needed 
to compile PHYLIP.  It can compile using a Makefile. We have supplied 
this in the source code distribution as <tt>Makefile.bcc</tt>. You will need to 
preserve the Unix Makefile by renaming it to, say, <tt>Makefile.unix</tt>, then 
make a copy of <tt>Makefile.bcc</tt> and call it <tt>Makefile</tt>. The
<tt>Makefile</tt> is invoked using the make command.
<P>
You will first need to create an <tt>ilink32.cfg</tt> and a <tt>bcc32.cfg</tt>
file and put the files into the <tt>src</tt> folder.  These files are text files
and their contents are described in the <tt>readme.txt</tt> that comes with the 
Borland tools.  If the Borland tools are in the default location the 
contents of <tt>ilink32.cfg</tt> would be
<P>
<PRE>
-L"c:\Borland\Bcc55\lib"
</PRE>
<P>
and the contents of <tt>bcc32.cfg</tt> would be
<P>
<PRE>
-I"c:\Borland\Bcc55\include"
-L"c:\Borland\Bcc55\lib"
</PRE>
<P>
These files can be created in a text editor such as Notepad or Wordpad.
<P>
To invoke the <tt>make</tt> command you will first need to open a command
prompt window.  Then set the path appropriately.  To set the path, type
<P>
<PRE>
set BORLAND=Path
</PRE>
<P>
where "Path" is where Borland is installed, such as <tt>C:\Borland\BCC55</tt>.
Then type
<P>
<PRE>
PATH=%PATH%;%BORLAND%\Bin
</PRE>
<P>
If you simply type <tt>make</tt> you will get a list of possible make commands. 
For example, to compile a single program such as Dnaml but not install 
it, type <tt>make dnaml</tt>. To compile and install all programs type
<tt>make install</tt>. We have supplied all the support files and icons
needed for 
the compilations. They are in folder <tt>bcc</tt> of the main PHYLIP
source code folder. 
We have had to supply a complete second set of the resource files with 
names <tt>*.brc</tt> because Borland resource files have a minor
incompatibility with Microsoft Visual C++ resource files.
<P>
<A NAME="mac"><H3>Macintosh</H3></a>
<P>
<A NAME="gcc"><B>Compiling with GCC on Mac OS X with our Makefile</B></a>
<P>
The executables distributed by us for Mac OS X are currently compiled
using the GCC compiler that is distributed with Mac OS X.
We are distributing 32-bit "universal binaries" that work on both PowerMac and
Intel iMac.
You may not need to recompile unless you need to make a version of the
executables more closely adapted to your system, or unless you want to
modify the programs.  One reason to recompile might be if you want
64-bit executables, which you might need to address large amounts of
memory.
<P>
If you do want to recompile, consider the following:
<UL>
<LI> Make sure you have the GCC compiler installed.  If you are using a
Terminal window, you can check this by typing the command <tt>gcc</tt>.
If the response is that the gcc compiler cannot find a program to compile,
then it is there.  If it is not there, you need to install it.
<LI> <tt>Makefile.osx</tt> is needed to create clickable executables and store
them in the proper directories.
Go into the file <tt>Makefile.osx</tt> and change the
compiler options by commenting in or out the appropriate
<tt>CFLAGS</tt> lines.  By default it is set up to use the one that
says it will make non-universal binaries for your machine.
<LI> If you are recompiling the Drawgram and/or Drawtree programs,
in <tt>Makefile.osx</tt> make sure that the proper one of the <tt>DFLAGS</tt>
lines is active, and the other commented out.  By default it is
set to compile for Intel processors.
<LI> Make sure that all the icon files
(for Dnaml, for example, the icon file is <tt>dnaml.icns</tt>)
are present in the <tt>mac</tt> folder which is inside the
<tt>src</tt> folder, and so are <tt>command.in</tt>, 
and <tt>Info.plist.in</tt>.
<LI> If you need to put the executables somewhere else other than the
<tt>exe</tt> folder, edit the <tt>Makefile</tt>.
<LI> Then type
<P>
<tt>make install -f Makefile.osx</tt>
<P>
<LI>You will see compiling remarks with occasional warning
messages, which should not be too serious. When done, the <tt>make install</tt>
command should have moved the executables to the <tt>exe</tt> subdirectory.
Note that this
<tt>Makefile.osx</tt> also creates command line executables.
<LI>To run a program, open the <tt>exe</tt> directory (or wherever
you decided to put them) and click on 
the icon of the program. A Terminal window should appear, and you can type in
your choices from the menus that appear.   
<LI> Alternatively, you may want to execute a program from within a Terminal
window by typing the program name.  We have provided in the <tt>exe</tt>
folder a file that makes links to the executables to enable this.  Go to the
executables folder and type the command <tt>source linkmac</tt> to create
a set of links.  These will run the program when the program name is typed
in a Terminal window which has been set so that its current
folder is the <tt>exe</tt> folder (but note that Drawgram and Drawtree run
this way may be slow to open and close plotting windows).
<LI> Note that an alternative way to get command-line executables is to
compile for X11, as described in the next section.
<LI>If you want to compile just one program, say Dnaml, and move it into
the executables directory use the command
<P>
<tt>make dnaml.install</tt>
<P>
<LI>This compiles the executable <tt>dnaml.app</tt> and then moves
it to the <tt>exe</tt>
directory, and deletes the unnecessary associated files <tt>dnaml.o</tt>
and <tt>dnaml</tt> from the <tt>src</tt> directory.  
<LI> <tt>Makefile.osx</tt> is currently set up to make executables for the
processor of the machine you compile on.  Thus if you compile on a PowerMac,
you get PowerMac executables, and if you compile on an Intel iMac, you get
iMac executables.  If you want to make universal executables, you need to
look at <tt>Makefile.osx</tt> and find the appropriate lines and change
which ones are commented out.  If you have not done so already, you will
need to install the appropriate Xcode SDKs to complete a universal binary
build.
</UL>
<P>
<a name="osx-x11"><B>Compiling with GCC on Mac OS X with X Windows</B></a>
<P>
On Mac OS X systems you can also use the GCC compiler and X Windows to
compile a version of the executables that runs from the command line
in native mode.
To do that, you must have the GCC compiler and the X11 windows
development kit materials installed.
X Windows is an optional install on the Mac OS X distribution disks for
versions 10.3 (Panther) and 10.4 (Tiger), and is part
of the default distribution from 10.5 (Leopard) on.
You can search for the latest X11 release for your system by 
looking at results after
<a href="http://www.apple.com/search/downloads/?q=x11">searching for &quot;x11&quot;</a>
on the <a href="http://www.apple.com/downloads/">Apple Downloads page</a>.
It is easy to download and install on a Mac OS X system.
<P>
If you have the GCC compiler and the X11 libraries installed, you can use a
Terminal window (which you
will find available in the Utilities folder in the Applications folder) and
compile PHYLIP by treating it as a Unix or Linux application and following the
instructions given above under "Unix and Linux".  Basically you just get
into the folder that contains the PHYLIP source code and type
<P>
<tt>make install</tt>
<P>
This uses the ordinary Unix/Linux Makefile, which works
in creating programs using X11 for
Mac OS X with the gcc compiler.  Note that to run the 
programs <tt>drawgram</tt> and <tt>drawtree</tt> that actually use the X Windows, you
will need to 
<ul>
<li>start the X11 program, 
<li>get an Xterm terminal running within that, and 
<li>run the programs from within the Xterm.
</ul>
<P>
<B><a name="metrowerks">What about the Metrowerks Codewarrior compiler?</a></B>
<P>
We previously also supported the Metrowerks Codewarrior compiler, for
both Mac OS 9 and Mac OS X (and even for producing Windows executables).
Codewarrior required that one maintain "projects" for each program, and we
distributed the projects as well as the source code.  As Metrowerks was
bought out by Freescale and has retargeted its compilers for building
embedded applications, we are ceasing this rather cumbersome support.  That
means that we do not at present give you a way to recompile our programs
if you have Mac OS 9.  We may not make a set of executables for Mac OS 9
ourselves.  If you absolutely need to obtain compilation support routines and
projects for Metrowerks Codewarrior, contact us and we will send you what we
have.  Of course, for Mac OS X the GCC compiler is available, and we describe
above how to compile the programs with it.
<P>
<A NAME="vax"><H3>VMS VAX systems</H3></a>
<P>
VMS VAX systems have almost disappeared, so
we have not tried to compile version 3.695 on an OpenVMS system.  The
following instructions should work.
On the OpenVMS operating system with DEC VAX VMS C the programs will compile
without alteration.  The commands for compiling a typical program
(Dnapars, which depends on the separately compiled files <TT>phylip.c</TT>
and <TT>seq.c</TT>) are:
<P>
<TT>$ DEFINE LNK$LIBRARY SYS$LIBRARY:VAXCRTL
<BR>
$ CC DNAPARS.C
<BR>
$ CC PHYLIP.C
<BR>
$ CC SEQ.C
<BR>
$ LINK DNAPARS,PHYLIP,SEQ
<BR>
</TT>
<P>
Once you use this <TT>$ DEFINE</TT> statement during a given interactive session,
you need not repeat it again as the symbol <TT>LNK$LIBRARY</TT> is thereafter
properly defined.  The compilation process leaves a file <TT>DNAPARS.OBJ</TT>
in your directory: this can
be discarded.  The executable program is named <TT>DNAPARS.EXE</TT>.  To run the program
one then uses the command:
<P>
<TT>$ R DNAPARS</TT>
<P>
The compiled program defaults to the filenames <TT>INFILE.</TT>, <TT>OUTFILE.</TT>, and
<TT>TREEFILE.</TT>.
If the input file <TT>INFILE.</TT> does not exist the program will prompt you to
type in its name.  Note that some commands on VMS such as <TT>TYPE OUTFILE</TT>
will fail because the name of the file that they will attempt to type out will be not
<TT>OUTFILE.</TT> but <TT>OUTFILE.LIS</TT>.  To get VMS to type the right file you
would have to instead issue the command <TT>TYPE OUTFILE.</TT>.
<P>
When you are
using the interactive previewing feature of Drawgram (or Drawtree) on
a Tektronix or DEC ReGIS compatible terminal, you will want before
running the program to have issued the command:
<P>
<TT>$ SET TERM/NOWRAP/ESCAPE</TT>
<P>
so that you do not run into trouble from the VMS line length limit of
255 characters or the filtering of escape characters.
<P>
To know which files to compile together, look at the entries in the
<TT>Makefile</TT>.
<P>
<A NAME="parallel"><H3>Parallel computers</H3></a>
<P>
As parallel computers become more common, the issue of how to compile
PHYLIP for them has become more pressing.  People have been compiling
PHYLIP for vector machines and parallel machines for many years.  We
have not made a version for parallel machines because there is still
no standard parallel programming environment on such machines (or rather,
there are many standards, so that one cannot find one that makes
a parallel execution version of PHYLIP widely distributable).  However
symmetric multiprocessing using the
MPI Message Passing Interface is spreading rapidly, and we will
probably support it in future versions of PHYLIP.
<P>
Although the underlying algorithms of most programs,
which treat sites independently, should be amenable to vector and
parallel processors,
there are details of the code which might best be changed.
In certain of the programs (<TT>Dnaml</TT>, <TT>Dnamlk</TT>,
<TT>Proml</TT>, <TT>Promlk</TT>) I have put a special
comment statement next to the loops in the program where
the program will spend most of its time, and which are the places
most likely to benefit from parallelization.  This comment statement is:<BR>
<PRE>
           /* parallelize here */
</PRE>
In particular
within these innermost loops of the programs there are often scalar quantities
that are used for temporary bookkeeping.  These quantities (such as
<TT>sum1, sum2, zz, z1, yy, y1, aa, bb, cc, sum,</TT> and <TT>denom</TT> in procedure <TT>makenewv</TT>
of <TT>Dnaml</TT> and similar quantities in procedure <TT>nuview</TT>) are there to
minimize the number of array references.  For vectorizing and parallelizing
compilers it will 
be better to replace them by arrays so that processing can occur
simultaneously. 
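<P>
The replacement of scalar temporaries by arrays can be illustrated with a small
sketch.  It is written in Python for brevity rather than in PHYLIP's C, and the
arithmetic and the function names are invented for illustration; it is not the
code of <TT>makenewv</TT> or <TT>nuview</TT>.  The point is only that once each
temporary has its own per-site slot, no iteration depends on a value left behind
by another.

```python
# Illustrative only: the arithmetic is made up and is not PHYLIP's makenewv.

def per_site_terms_scalar(x, y):
    """Loop with scalar temporaries that are reused on every iteration."""
    total = 0.0
    for i in range(len(x)):
        zz = x[i] * y[i]        # scalar temporary, overwritten on each pass
        denom = 1.0 + zz        # depends on the scalar above
        total += zz / denom
    return total


def per_site_terms_arrays(x, y):
    """Same computation with each temporary promoted to a per-site array."""
    n = len(x)
    zz = [x[i] * y[i] for i in range(n)]         # independent for each site
    denom = [1.0 + zz[i] for i in range(n)]      # independent for each site
    terms = [zz[i] / denom[i] for i in range(n)]
    return sum(terms)                            # one reduction at the end
```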
<P>
If you succeed in making a parallel version of PHYLIP we would like to
know how you did it.  In particular, if you can prepare a web page which
describes how to do it for your computer system, we would like to use
material from it
in our PHYLIP web pages.  Please e-mail it to me.  We hope to
have a set of pages that give detailed instructions on how to make parallel
versions of PHYLIP on various kinds of machines.  Alternatively, if we
were given your modified version of the program we might be able to
figure out how to make modifications to our source code to allow
users to compile the program in a way which makes those modifications.
<P>
<A NAME="other"><H3>Other computer systems</H3></a>
<P>
As you can see from the variety of different systems on which these
programs have been successfully run, there are no serious
incompatibility problems with most computer systems.  PHYLIP in various
past Pascal versions has also been compiled on 8080 and Z80 CP/M Systems, Apple
II systems running UCSD Pascal, a variety of minicomputer systems such as
DEC PDP-11's and HP 1000's, on 1970's era mainframes such as CDC
Cyber systems, and so on.  In a later era
it was also compiled on IBM 370 mainframes, and of course on DOS and
Windows systems and on Macintosh systems.
We have gradually
accumulated experience on a wider variety of C compilers.  If you succeed in
compiling the C version of PHYLIP on a different machine or a different
compiler, I would like to
hear the details so that I can consider including the instructions in a future version
of this manual.
<P>
<A NAME="java"><H3>Compiling the Java interfaces</H3></a>
<P>
The ONLY reason you should do this is if you want to add or modify 
functionality on the Java interface.  In all other cases, the .jar files that 
already exist in the <TT>javajars</TT> folder will run on your Mac / MS / 
Linux / Unix system and you should not be here.
<P>
Welcome to a fairly complex process. Unless you are an experienced
object-oriented programmer, you will find that Java has a steep learning curve and will 
cause you headaches.
<P>
The general overview is that there is a Java interface that gathers and 
validates input from the user, there is a call from the Java code to a dynamic 
C library that contains the Phylip functionality, and there is feedback to the 
user from the Java interface as to the status of the underlying C code. 
Because one has two very different kinds of software running, the feedback is 
not as elegant as one would expect from a single integrated environment.
<P>
Now for the specifics. We have developed these Java interfaces using the <A 
HREF="http://www.eclipse.org">Eclipse</a>  environment (available from 
<tt>www.eclipse.org</tt>).  Go there and download the version of the Java 
development environment appropriate to your operating system. 
<P> 
In the distribution there is a <TT>javasrc</TT> folder which contains folders 
that match the programs. These folders contain the program's Java interfaces. 
For example folder <TT>drawgram</TT> contains <TT>DrawgramInterface.java</TT> 
and <TT>DrawgramUserInterface.java</TT>. The former does the interaction with 
the compiled C library, the latter contains the user interface. 
</p>
<p>If you want to modify the Java interface for Drawgram, open the Eclipse 
Java development environment, create a project called Phylip3.695, and create a 
folder under it called src. Under that create a project called 
<tt>drawgram</tt>.  Now import the two Drawgram associated java files 
(<tt>DrawgramUserInterface.java</tt> and <tt>DrawgramInterface.java</tt>) into 
that project. You will also need to create a project called <tt>util</tt> and 
import all the items in the <TT>javasrc/util</TT> directory.  Open 
<tt>DrawgramUserInterface.java</tt> with the Eclipse WindowBuilderEditor and 
you can edit it however you want. Remember that you'll need to add 
ActionListeners (described in Java manuals) to anything that changes things on 
the screen. There are plenty of examples of them in 
<tt>DrawgramUserInterface.java</tt>, for example, <tt>TreeGrowToggle</tt> 
which handles the toggling "Tree grows:" between "Horizontal" and "Vertical" 
using Radio buttons.  Most of the pieces you'll need are in the existing code. 
You can clone them and edit to fit.  Beyond that, "Google is your friend".
<P>
Once you have added new functionality or changed existing functionality in the 
user interface, you will need to pass the information it collects from the 
user to the underlying C code. This is a bit tricky because C and Java are 
very different kinds of languages. Luckily the Java Native Interface (JNI) 
from Sun and the Java Native Access (JNA) library built on top of it exist to take care of this. We 
used JNA (which calls JNI) because it is simpler to use and our needs were 
basic enough that we could live within its confines. In order to use it you will 
also need to get two public jars off the web (do a Google search for these as 
they keep moving around): 
<ul>
<li> jna.jar - the Java Native Access tools to hook the Java interface to the C code
<li> platform.jar - some tools that jna.jar needs
</ul>
<p>
JNA passes everything via an enormous list of variables. This is simple to 
program but very hard to keep track of, as you have to keep things exactly 
parallel in the Java and C code and there is no debugger that will help you. 
We have found it best to build a public class in Java that contains everything 
that is going to the C code and create an instance of it when the user is 
finished with data entry and decides to execute the process (in the 
<tt>Drawgram</tt> case, selects <tt>Preview</tt>).  We then copy all the data 
from the screen into the members of the class, and pass these directly into 
the JNA call to the underlying C code (look in <TT>DrawgramInterface.java</TT> 
for an example). 
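<P>
The bookkeeping idea of one class whose members are handed over in a fixed order
can be sketched as follows.  The sketch is in Python rather than Java, and the
class name and its fields are hypothetical; the real argument list is the one in
<TT>DrawgramInterface.java</TT>.

```python
from dataclasses import dataclass, astuple


@dataclass
class DrawgramArgs:
    """Hypothetical argument bundle; the field names are illustrative only."""
    intree: str
    plotfile: str
    grows_horizontal: bool
    use_lengths: bool


def to_call_arguments(args):
    """Return the values in declaration order, as they would be passed on."""
    return astuple(args)
```

<P>
Keeping a single declaration as the source of the ordering means there is only
one place to compare against the C function's parameter list.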
<p>
In the underlying C code (which must be compiled as a library so that Java can 
access it), there is an entry point that is the name of the program (for 
example the function <TT>drawgram</TT> in <tt>drawgram.c</tt>) containing as 
arguments every one of the variables that were passed by the Java interface, 
in the same order.  If you have weird bugs, most likely you messed this up. 
Make a copy of the Java class definition, paste it into the C code and check 
everything. Another wrinkle that can bite you is that booleans come through as 
integers and Java and C do not agree as to what that means. False is 0 in both 
languages. True is "not 0" in Java and often set to all bits on (which is 
-1 when read as a signed integer in C). C often has problems with this. Each compiler 
is different and there are environment variables that affect this also. It is 
safest to explicitly fix things before you execute any C code. There are a lot 
of other odd quirks, but you have two working examples (<tt>drawgram.c</tt> 
and <tt>drawtree.c</tt>), so you can probably figure them out.
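<P>
The advice to fix boolean values explicitly can be sketched as below.  This is a
Python illustration with hypothetical function names, not code from the PHYLIP
interfaces: every flag is collapsed to exactly 0 or 1 before it crosses the
language boundary, so the receiving side never has to interpret any other
"true" value.

```python
def to_c_flag(value):
    """Collapse any truthy value (True, 1, -1, 255, ...) to exactly 1, else 0."""
    return 1 if value else 0


def normalize_flags(values):
    """Apply to_c_flag to every item before the values are handed over."""
    return [to_c_flag(v) for v in values]
```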
<p>
Feedback from C to Java can be difficult. In Drawgram and Drawtree it is 
fairly easy, as the plotting is done (to the file <TT>JavaPreview.ps</TT> in 
case you need to know) and the program returns. The Java interface waits until 
the C code completes and returns, then reads <tt>JavaPreview.ps</tt> and 
displays the preview.  In cases where one needs progress indicators, one needs 
to multithread the Java code and display a continually updating progress 
file.  Phylip 3.695 has no need of multithreading but it will be implemented 
in Phylip 4.0.
<P>
<DIV ALIGN="CENTER">
<A NAME="FAQ"><HR><P></A>
<H2>Frequently Asked Questions</H2></DIV>
<P>
This set of Frequently Asked Questions, and their answers, is from the
PHYLIP web site.  A more up-to-date version can be found there, at:
<P>
<DIV ALIGN="CENTER">
<A HREF="http://evolution.gs.washington.edu/phylip/faq.html">
<TT>http://evolution.gs.washington.edu/phylip/faq.html</TT></A></DIV>
<P>
<DIV ALIGN=CENTER>
<H3>Problems that are encountered</H3>
</DIV>
<DL>
<DT><STRONG>"It doesn't work! <I>It doesn't work!!</I> It says <TT>can't find infile.</TT>"</STRONG>
<DD>Actually, it's working just fine.  Many of the programs look for an input file called <TT>infile</TT>,
and if one of that name is not present in the current folder, they then ask
you to type in the name of the input file.  That's all that it's doing. This
is done so that
you can get the program to read the file without you having to type in its
name, by making a copy of your input file and calling it <TT>infile</TT>.
If you don't do that, then the program issues this message.  It looks
alarming, but really all that it is trying to do is to get you to type in
the name of the input file.  Try giving it the name of the input file.
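<P>
The lookup behavior described above amounts to the following logic.  This is a
Python sketch of the behavior, not the programs' own C code; the function name
and the <TT>ask</TT> callback (standing in for the interactive prompt) are
assumptions made for the example.

```python
import os


def resolve_input_file(folder, ask, default_name="infile"):
    """Return the default file if present, otherwise the name that ask() gives.

    ask is a caller-supplied function standing in for the interactive prompt.
    """
    default_path = os.path.join(folder, default_name)
    if os.path.isfile(default_path):
        return default_path
    return os.path.join(folder, ask())
```

<P>
Note that a file called <TT>infile</TT> always wins when it exists, whatever
other data files are in the folder.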
<DT><STRONG>"The program reads my data file and then says it has
a memory allocation error!"</STRONG>
<DD>This is what tends to happen if there is a problem with the format of the data
file, so that the programs get confused and think they need to set aside memory
for 1,000,000 species or so.  The result is a "memory allocation error" (the
error message may say that "the function asked for an inappropriate amount of memory").  Check the data file format against the documentation:
make sure that the data files have <I>not</I> been saved in the format of
your word processor (such as Microsoft Word) but in a "flat ASCII" or "text only"
mode.  Note that adding memory to your computer is <I>not</I> the
way to solve this problem -- you probably have plenty of memory
to run the program once the data file is in the correct format.
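<P>
A quick sanity check of a data file before running a program might look like
the sketch below.  It assumes, as the input format sections of the
documentation describe for most programs, that the first line starts with two
whole numbers (the number of species and the number of characters or sites);
the function name and the messages are invented, and it does not validate the
rest of the file.

```python
def check_data_file(raw):
    """Return a list of problems found in the raw bytes of a data file."""
    try:
        text = raw.decode("ascii")
    except UnicodeDecodeError:
        return ["file contains non-ASCII bytes; was it saved as plain text?"]
    lines = text.splitlines()
    fields = lines[0].split() if lines else []
    problems = []
    # The first two fields of the first line should both be whole numbers.
    if len(fields[:2]) != 2 or not all(f.isdigit() for f in fields[:2]):
        problems.append("first line should start with two whole numbers")
    return problems
```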
<DT><STRONG>"I opened the program but I don't see where to create
a data file!"</STRONG>
<DD>The programs (there is more than one) use data
files that have been created outside of the program.  They do not have any
data editor within them.  You can create a data file by using an editor,
such as Microsoft Word, Emacs, vi, TextEdit, Notepad, etc.  But be sure
<I>not</I> to save the file in Microsoft Word's own format.  It should be 
saved in Text Only format (in Mac OS X TextEdit you need to use the Make Plain
Text menu choice in the Format menu).  You can use the 
documentation files, including the examples
at the end of those files, to figure out the format of the input file.
Documentation files such as <TT>main.html</TT>, <TT>sequence.html</TT>,
<TT>distance.html</TT> and many others should be consulted.  Many users
create their data files by having their alignment program (such as
ClustalW) output its alignments in PHYLIP format.  Many alignment programs
have options to do that.
<DT><STRONG>"There is an error message saying that there is already
a file named <tt>outfile</tt>!"</STRONG>
<dd> This is perfectly normal.  When any PHYLIP program opens
a file to write its output to, it tries to open a file
called "outfile".  If there is already an output file of that name, it
asks you whether you want to replace it, or whether you want to append
to it, or whether you want to open instead a file of a new name, or
whether you just want to quit.  Choose one of these.  If you do not
need the information that is in the old "outfile", just tell it to
overwrite (replace) the file by typing the letter R and then pressing the
Enter key.  The program will proceed normally after that.
There are also options available to you to Append your output to
"outfile" or to have the output written to a new File whose name you
provide.  (Of course,
it is good practice to rename any output file called "outfile" that
contains results that you want to keep, to prevent that file from being
overwritten).
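<P>
The four choices can be summarized as a small lookup table.  This is a Python
sketch with an invented function name; only the letter R is stated in the text
above, and the letters A, F and Q used here for append, new file and quit are
assumptions.

```python
def open_mode_for_choice(choice):
    """Map a menu letter to (action, file mode); returns None if unrecognized.

    Only R is confirmed by the text; A, F and Q are assumed letters.
    """
    table = {
        "R": ("replace", "w"),    # overwrite the existing outfile
        "A": ("append", "a"),     # add new output after the old contents
        "F": ("new file", "w"),   # write to a differently named file
        "Q": ("quit", None),      # stop without writing anything
    }
    return table.get(choice.strip().upper())
```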
<DT><STRONG>"The program ran but it analyzed the wrong data set!"</STRONG>
<DD> This can happen if you put a data set in the current folder,
perhaps as a file named <tt>myfile.dna</tt>, and intend to have the
program analyze that.  But you fail to notice that the folder already
has another data file in it, named <tt>infile</tt>.  The programs will
always try to find a file named <tt>infile</tt>, and they will
read that file if they find it.  You should either copy your file into
file <tt>infile</tt>, or delete file <tt>infile</tt> so that when the
program does not find it, it will ask you for the name of the input file.
<DT><STRONG>"I ran PHYLIP, and all it did was say it was extracting a bunch of files!"</STRONG>
<DD>
There is no executable program
named <TT>PHYLIP</TT> in the PHYLIP package!  But in some cases
(especially the Windows distribution) there is a file called
<TT>phylip-3.695.exe</TT>.
That file is an archive of documentation and source code.  Once you have
run it and extracted the files in it, so that they are in the folder,
running it again will just do the extraction again, which is unnecessary.
<DT><STRONG>"One program makes an output file and then the next program crashes while reading it!"</STRONG>
<DD>Did you rename the file?  If a program makes a file called <TT>outfile</TT>, and then the
next program is told to use <TT>outfile</TT> as its input file, things
can get confusing.  The second program first tries to open <TT>outfile</TT>
as an output file, and since it finds one of that name already there, it
asks you whether to overwrite that file.  If you say to do that,
the program overwrites the file, thus
erasing it.  When it then also tries to read from this empty <TT>outfile</TT>
a psychological
crisis can ensue.  The solution is simply to rename <TT>outfile</TT> before
trying to use it as an input file.
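<P>
If you drive the programs from a script, the renaming step could be done as in
this Python sketch.  The function name and the choice of <TT>infile</TT> as the
new name are only an example; any name other than <TT>outfile</TT> will do.

```python
import os


def promote_output(folder, produced="outfile", new_name="infile"):
    """Rename a program's output so the next program can read it as input."""
    src = os.path.join(folder, produced)
    dst = os.path.join(folder, new_name)
    os.replace(src, dst)    # overwrites any existing file called new_name
    return dst
```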
<DT><STRONG>"I make a file called infile and then the program can't find it!"</STRONG>
<DD>Let me guess.  You are using Windows, right?  You made your file in Word or
in Notepad or WordPad, right?  If you made a file in one of these editors, and
saved it, not in Word format, but in Text Only format, then you were doing the
right thing.  But when you told the operating system to save the file as
<TT>infile</TT>, it actually didn't.  It saved it as
<TT>infile.txt</TT>. Then just to make
life harder for you, the operating system is set up by default to not show
that three-letter extension to the file name.  Next to its icon it will show
the name <TT>infile</TT>.  So you think, quite reasonably, that
there is a file called <TT>infile</TT>.  But there isn't a file of that
name, so the program, quite reasonably, can't find a file called
<TT>infile</TT>.  If you want to check what the actual file name is, use
the <TT>Properties</TT>
menu item of the <TT>File</TT> item on your folder.
If you are annoyed at not seeing the full file name, with the
three-letter extensions, then you can set the operating system to
show them by choosing in the folder's Tools menu (at the top of its
window) the Folder Options and then the View tab, and setting the "Hide
extensions for known file types" to not be selected.
In any case, you should be able to get the program to work by telling it that the file name
is <TT>infile.txt</TT>.
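<P>
Another way to see the real file names is to list the folder from a script.
The following Python sketch (the function name is invented) reports files whose
name is <TT>infile</TT> followed by some extension, which is the usual sign of
this problem.

```python
import os


def find_hidden_extension(folder, wanted="infile"):
    """List files whose name is wanted plus an extension, e.g. infile.txt."""
    matches = []
    for name in sorted(os.listdir(folder)):
        stem, ext = os.path.splitext(name)
        if stem == wanted and ext:
            matches.append(name)
    return matches
```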
<DT><STRONG>"Consense gives weird branch lengths! How do I
get more reasonable ones?"</STRONG> 
<DD>Consense gives branch lengths which are simply the numbers of replicates
that support the branch.  This is not a good reflection of how long those
branches are estimated to be.  The best way to put better branch lengths on a
consensus tree is to use it as a User Tree in a program that will estimate
branch lengths for it, such as Dnaml.  You may need to convert it to being an unrooted tree,
using Retree, first.  If the original program you were using was a
program that does not estimate branch lengths, you may instead have to
use one that does.  You can use a likelihood program, or make
some distances between your species (using, for example, Dnadist) and use
Fitch to put branch lengths on the user tree.  Here is the sequence of
steps you should go through:
<OL>
<LI>Take the tree and use Retree to make sure it is Unrooted (just
read it into Retree and then save it, specifying Unrooted)
<LI>Use the unrooted tree as a User Tree (option <TT>U</TT>) in one of
our programs (such as Dnaml or Fitch).   If you use Fitch, you also
first need to use one of the distance programs such as Dnadist to
compute a set of distances to serve as its input.
<LI>Specify that the branch lengths
of the tree are not to be used but should be re-estimated.  This
is actually the default.
</OL>
<DT><STRONG>"I looked at the tree printed in the output file <tt>outfile</tt>
and it looked weird.  Do I always need to look at it in Drawgram?"</STRONG>
<DD>It's possible you are using the wrong font for looking at the tree in
the output file.  The tree is drawn with dashes and exclamation points.  If
a proportional font such as Times Roman or Helvetica is used, the tree lines
may not connect.  Try selecting the whole tree and setting the font to
a fixed-width one such as Courier.  You may be astounded how much clearer
the tree has become.
<DT><STRONG>"Drawtree (or Drawgram) doesn't work: it can't find the font file!"</STRONG>
<DD>Six font files, called <TT>font1</TT> through <TT>font6</TT>, are
distributed with the executables
(and with the source code too).  The program looks for a copy of one of them
called <TT>fontfile</TT>.  If you haven't made such a copy called
<TT>fontfile</TT> it then asks
you for the name of the font file.  If they are in the current folder, just
type one of <TT>font1</TT> through <TT>font6</TT>.  The reason for
having the program look for <TT>fontfile</TT>
is so that you can copy your favorite font file, call the copy
<TT>fontfile</TT>,
and then it will be found automatically without you having to type the name of
the font file each time.
<DT><STRONG>"Can Drawgram draw a scale beside the tree? Print the branch lengths as numbers?"</STRONG>
<DD>It can't do either of these.  Doing so would make the program more complex, and
it is not obvious how to fit the branch length numbers into a tree that has
many very short internal branches.  If you want these scales or numbers,
choose an output plot file format (such as Postscript, PICT or PCX) that can be read by
a drawing program such as Adobe Illustrator, Freehand, Canvas, CorelDraw,
or MacDraw.
Then you can add the scales and branch length numbers yourself by hand.  Note
the menu option in Drawtree and Drawgram that specifies the tree size to be
a given number of centimeters per unit branch length.
<DT><STRONG>"How can I get Drawgram or Drawtree to print the bootstrap values
next to the branches?"</STRONG>
<DD>When you do bootstrapping and use Consense, it prints the bootstrap
values in its output file (both in a table of sets, and on the diagram
of the tree which it makes).  These are also in the output tree file of
Consense.  There they are in place of branch lengths.  So to get them to
be on the output of Drawgram or Drawtree, you must write the tree in the
format of a drawing program and use it to put the values in by hand, as
mentioned in the answer to the previous question.
<DT><STRONG>"Dnaml won't read the treefile that is produced by Dnapars!"</STRONG>
<DD>That's because the Dnapars tree file is a rooted tree, and Dnaml wants an
unrooted tree.  Try using Retree to change the file to be an unrooted tree
file. Our most recent versions of the programs usually automatically
convert a rooted tree into an unrooted one as needed.  But the programs
such as Dnamlk or Dollop that need a rooted tree won't be able to use an
unrooted tree.
<DT><STRONG>"What is a good value for the random number seed?"</STRONG>
<DD> The random number seed is used to start a process of choosing "random"
(actually pseudorandom) numbers, which behave as if they were
unpredictably randomly chosen between 0 and 2<SUP>32</SUP>-1 (which is
4,294,967,295).  You could put in the number 133 and find that the
next random number was 221,381,825.  As they are effectively
unpredictable, there is no such thing as a choice that is better than
any other, provided that the numbers are of the form 4<I>n</I>+1
(this can be judged from the last two digits of the number: for
example if they are 37 it is of this form as 37=4*9+1).  However
if you re-use a random number seed, the sequence of random numbers
that result will be the same as before, resulting in exactly the same
series of choices, which may not be what you want.
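<P>
The 4<I>n</I>+1 rule is easy to check mechanically.  The Python sketch below
uses invented function names and is not part of PHYLIP.  The last-two-digits
shortcut works because 100 is divisible by 4, so only the final two digits
affect the remainder.

```python
def is_valid_seed(seed):
    """True if seed has the form 4n+1 for a non-negative integer n."""
    return seed > 0 and seed % 4 == 1


def next_valid_seed(seed):
    """Smallest number of the form 4n+1 that is not below seed."""
    while not is_valid_seed(seed):
        seed += 1
    return seed
```

<P>
For example, 133 and 37 both pass the check, while 38 does not; the next
acceptable value after 38 is 41.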
<DT><STRONG>"In bootstrapping, Seqboot makes too large a file"</STRONG>
<DD>If there are 1000 bootstrap replicates, it will make a file
1000 times as long as your original data set.  But for many methods
there is another way that uses much less file space.  You can use
Seqboot to make a file of multiple sets of weights, and use those
together with the original data set to do bootstrapping.
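<P>
To see why weights can stand in for whole resampled data sets, note that a
bootstrap replicate is fully described by how many times each site was drawn.
The Python sketch below illustrates that idea only; the function name is
invented, and it reproduces neither Seqboot's random number generator nor its
weights file format.

```python
import random
from collections import Counter


def bootstrap_weights(n_sites, rng):
    """Sample n_sites sites with replacement; return how often each was drawn."""
    draws = Counter(rng.randrange(n_sites) for _ in range(n_sites))
    return [draws[i] for i in range(n_sites)]
```

<P>
Each replicate then costs one short list of counts instead of a full copy of
the data; the counts always add up to the number of sites.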
<DT><STRONG>"In bootstrapping, the output file gets too big."</STRONG>
<DD> When running a program such as Neighbor or Dnapars with multiple data
sets (or multiple weights) for purposes of bootstrapping,
the output file is usually not needed, as it
is the output tree file that is used next.  You can use the menu
of the program to turn off the writing of trees into the
output file.  The trees will still be written into the output tree file.
<DT><STRONG>"Why don't your programs correctly read the sequence alignment
files produced by ClustalW?"</STRONG>
<DD>They do read them correctly if you make the right kind.  Files from
ClustalV or ClustalW whose names end in <TT>".aln"</TT> are not in PHYLIP
format, but in Clustal's own format which will not work in PHYLIP.
You need to find the option to output PHYLIP format files, which ClustalW and
ClustalV usually assign the extension <TT>.phy</TT>.
<DT><STRONG>"Why doesn't Neighbor read my DNA sequences correctly?"</STRONG>
<DD>Because it  wants
to  have as input a distance matrix, not sequences.  You have to use Dnadist to
make the distance matrix first.
<DT><STRONG>"On our Mac OS 9 system, larger data files fail to run."</STRONG>
<DD>We have set the memory allowances on the Mac OS 9 executables
to be generous, but not too big.  You therefore may need to
increase them.  Use the <TT>Get Info</TT> item on the Finder <TT>File</TT> menu.
<P>
<DIV ALIGN=CENTER>
<H3>How to make it do various things</H3>
</DIV>
<P>
<DT><STRONG>"How do I bootstrap?"</STRONG>
<DD>The general method of bootstrapping
involves  running  Seqboot  to make multiple bootstrapped data sets out of your
one data set, renaming the output file, then running one of the tree-making 
programs  with  the  Multiple data sets option to analyze them all, renaming 
the output tree file, then finally running Consense to make a majority
rule consensus tree from the resulting tree file.  Read  the  documentation  of
Seqboot to get further information.  With this system almost any of the
tree-making  methods  in the package can be bootstrapped.  It is somewhat
tedious but you will find it generally useable.
<DT><STRONG>"How do I specify a multi-species outgroup
with your parsimony  programs?"</STRONG> 
<DD>
<DIV>
It's  not  a  feature  but  is  not too hard to do in many of the programs.  In
parsimony programs like Mix, for which the W (Weights) and A (Ancestral states)
options are available, and weights can be larger than 1, all you need to do is:
<TABLE ALIGN=LEFT>
<TR><TD><STRONG>(a)</STRONG><BR>&nbsp;<BR></TD><TD>In Mix, make up an extra character with states 0 for  all  the  outgroups
and  1  for all the ingroups.  If using<BR> Dnapars the ingroup can have (say)
<TT>G</TT> and the outgroup <TT>A</TT>.</TD></TR>
<TR><TD><STRONG>(b)</STRONG><BR>&nbsp;<BR></TD><TD>Assign this character an enormous weight
(such as <TT>Z</TT> for 35) using  the  W
option,<BR> all other characters getting weight 1, or whatever weight they had
before.</TD></TR>
<TR><TD><STRONG>(c)</STRONG><BR>&nbsp;<BR></TD><TD>If it is available, use the A (Ancestral
states) option to designate that
for  that  new  character the state found in the<BR> outgroup is the ancestral
state.</TD></TR>
<TR><TD><STRONG>(d)</STRONG></TD><TD>In Mix do not use the O (Outgroup) option.
</TD></TR>
<TR><TD><STRONG>(e)</STRONG><BR>&nbsp;<BR>&nbsp;</TD><TD>After the tree is found, the designated
ingroup  should  have  been  held
together  by the fake character.  The tree will be<BR> rooted somewhere in the
outgroup (the program may or may not have a preference for  one  place  in
the  outgroup  over  another).<BR>  Make sure that you subtract from the total
number of steps on the tree all steps in the new character.</TD></TR>
<TR><TD COLSPAN=2>In programs like Dnapars, you cannot use this method as weights  of  sites
cannot  be  greater  than  1.   But you can do an analogous trick, by adding a
largish number of extra sites to the data, with one nucleotide state ("A")
for the ingroup and another ("G") for the outgroup.  You will then have to
use Retree to manually reroot the tree in the desired place.</TD></TR>
</TABLE>
</DIV>
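<P>
Steps (a) and (b) can be prepared with a few lines of scripting.  The Python
sketch below assumes a hypothetical in-memory layout (a dictionary from species
name to a string of 0/1 states) and invented function names; writing the result
out in the input file format is left aside.

```python
def add_outgroup_character(data, outgroup):
    """Append a 0/1 character: 0 for outgroup species, 1 for the ingroup.

    data maps species name to a string of 0/1 states (hypothetical layout).
    """
    return {
        name: states + ("0" if name in outgroup else "1")
        for name, states in data.items()
    }


def weights_line(n_original):
    """Weight 1 for each original character, then Z (35) for the new one."""
    return "1" * n_original + "Z"
```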
<DT><STRONG>"How do I force certain groups to remain  monophyletic in your
parsimony programs?"</STRONG> 
<DD>By  the same method as in the previous question, using multiple fake characters, any number of
groups of species can be forced to be  monophyletic.   In  Move,  Dolmove,  and
Dnamove  you  can  specify  whatever  outgroups  you want without going to this
trouble.
<DT><STRONG>"How can I reroot one of the trees written out by PHYLIP?"</STRONG>
<DD>Use the program
Retree.  But keep in mind whether the tree inferred by the original program was
already rooted, or whether you are free to reroot it without changing its
meaning.
<DT><STRONG>"What do I do  about  deletions  and  insertions  in  my  sequences?"</STRONG>
<DD>The
molecular sequence programs will accept sequences that have gaps (the "<TT>-</TT>"
character).  They do various things with them, mostly not optimal.
Programs such as Dnaml and Dnadist count gaps as equivalent to unknown
nucleotides (or unknown amino acids) on the grounds that we don't know what
would be there if something were there.  This completely leaves out the
information from the presence or absence of the gap itself, but does not bias
the gapped sequence to be close to or far from other gapped or ungapped
sequences.  Sequences that share a gap at a site do not tend to cluster
together on the tree.  So it is not necessary to remove gapped regions from your
sequences, unless the presence of gaps indicates that the region is
badly aligned.  An exception to this is Dnapars, which
counts "gap" as if it were a fifth nucleotide state (in addition to A, C, G,
and T).  Each site counts one change when a gap arises or disappears.  The
disadvantage of this treatment is that a long gap will be overweighted, with
one event per gapped site.  So a gap of 10 nucleotides will count as being as
much evidence as 10 single site nucleotide substitutions.  If there are not
overlapping gaps, one way to correct this is to recode the first site in the
gap  as "<TT>-</TT>" but make all the others be "<TT>?</TT>" so the gap only counts as one event.
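<P>
The recoding just described can be automated.  The Python sketch below (the
function name is invented) treats each sequence on its own, so it is only
appropriate when gaps in different sequences do not overlap, as noted above.

```python
def recode_gaps(seq):
    """Keep the first '-' of each run of gaps and turn the rest into '?'."""
    out = []
    in_gap = False
    for ch in seq:
        if ch == "-":
            out.append("?" if in_gap else "-")
            in_gap = True
        else:
            out.append(ch)
            in_gap = False
    return "".join(out)
```

<P>
With this recoding, the sequence <TT>AC----GT-A</TT> becomes
<TT>AC-???GT-A</TT>.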
<DT><STRONG>"How can I produce distances for my data set which
has 0's and 1's?"</STRONG> 
<DD>You can't do it in a simple and general
way, for a straightforward reason.  Distance methods must correct the
distances for superimposed changes.  Unless we know specifically how to
do this for your particular characters, we cannot accomplish the
correction.  There are many formulas we could use, but we can't choose
among them without much more information.  There are issues of superimposed
changes, as well as heterogeneity of rates of change in different
characters.  Thus we have not provided a distance program for 0/1 data.
It is up to you to figure out what is an appropriate stochastic model
for your data and to find the right distance formulas.
If the 0's and 1's
are presences and absences of restriction sites or restriction fragments,
you can use program Restdist to compute appropriate distances.
<DT><STRONG>"I have RFLP fragment data: which programs should I
use?"</STRONG>
<DD>This is a more difficult question than you may imagine.
Here is a quick tour of the issues:
<UL><LI>You can code fragments as 0 and 1 and use a parsimony program.  It is
not obvious in advance whether 0 or 1 is ancestral, though it is likely that
change in one direction is more likely than change in the other for each
fragment.  One can use either Wagner parsimony (programs <TT>Pars</TT>,
<TT>Mix</TT>, <TT>Penny</TT> or <TT>Move</TT>) or use Dollo parsimony
(<TT>Dollop, Dolpenny</TT> or <TT>Dolmove</TT>)
with the ancestral states all set as unknown ("<TT>?</TT>").
<LI>You can use a distance matrix method using the RFLP distance of Nei and
Li (1979).  Their restriction fragment distance is available in our
program Restdist.
<LI>You should be very hesitant to bootstrap RFLP's.  The individual
fragments do not evolve independently: a single nucleotide substitution
can eliminate one fragment and create two (or vice versa).
</UL>
For restriction <I>sites</I> (rather than fragments) life is a bit
easier: they evolve nearly independently so bootstrapping is possible
and <TT>Restml</TT> can be used, as well as restriction sites distances
computed in <TT>Restdist</TT>.  Also directionality of change
is less ambiguous when parsimony is used.  A more complete tour of the
issues for restriction sites and restriction fragments is given in chapter
15 of my book (Felsenstein, 2004).
<DT><STRONG>"Why don't your parsimony programs  print  out  branch  lengths?"</STRONG>
<DD>Well, Dnapars and Pars can.  The others have not yet been upgraded to the
same level.  The longer answer is that it is because
there are problems defining the branch lengths.  If you look closely at the
reconstructions of the states of the hypothetical ancestral nodes for almost
any data set and almost any parsimony method you will find some ambiguous
states on those nodes.  There is then usually an ambiguity as to which branch
the change is actually on.  Other parsimony programs resolve this in one or
another arbitrary fashion, sometimes with the user specifying how (for example,
methods that push the changes up the tree as far as possible or down it as far
as possible).  Our older programs leave it to the user to do this.  In
Dnapars and Pars we use an algorithm discovered by Hochbaum and Pathria (1997)
(and independently by Wayne Maddison) to compute branch lengths that average
over all possible placements of the changes.  But these branch lengths, as
nice as they are, do not correct for multiple superimposed changes.  Few
programs  available  from  others  currently correct the branch lengths for
multiple changes of state that may have overlain each other.  One possible way
to get branch lengths with nucleotide sequence data is to take the tree
topology that you got, use Retree to convert it to be unrooted, prepare a
distance matrix from your data using Dnadist, and then use Fitch with that tree
as User Tree and see what branch lengths it estimates.
<DT><STRONG>"Why can't your programs handle unordered multistate characters?"</STRONG>
<DD>There is a program Pars which does parsimony for
unordered multistate characters with up to 8 states, plus <TT>?</TT>.  The
other discrete characters parsimony programs can only handle two states,
<TT>0</TT> and <TT>1</TT>.
This is mostly because I have not yet had time to modify them to do so  -  the
modifications would have to be extensive.  Ultimately I hope to get these done.
If you have four or fewer states and need a feature that is not in Pars,
you could  recode your states to look like nucleotides
and use the parsimony programs in the molecular sequence section of PHYLIP, or
you could use one of the excellent parsimony programs produced by others.
<P>
<DIV ALIGN=CENTER>
<H3>Background information needed:</H3>
</DIV>
<P>
<DT><STRONG>"What file format do I use for the sequences?"<BR>
"How do I use the programs?  I can't find any documentation!"</STRONG>
<DD>These are discussed in the documentation files.  Do you have them?  If you
have a copy of this page you probably do.  They are distributed
in the same archive as the rest of the package.  Input file formats
are discussed in <TT>main.html</TT>, in <TT>sequence.html</TT>, <TT>distance.html</TT>,
<TT>contchar.html</TT>, <TT>discrete.html</TT>, and the documentation files for the
individual programs.
<DT><STRONG>"Where can I find out how to infer
phylogenies?"</STRONG>
<DD>There are now a few books.  For molecular data you could use one of these:
<P>
At the upper-undergraduate level:
<UL>
<LI> Graur, D. and W.-H. Li.  2000.  <EM>Fundamentals of Molecular
     Evolution.</EM> Sinauer Associates, Sunderland, Massachusetts. (or the earlier edition
     by Li and Graur).
<LI> Page, R. D. M. and E. C. Holmes.  1998.  <EM>Molecular Evolution: 
     A Phylogenetic Approach.</EM>  Blackwell, Oxford.
</UL>
<P>
and as graduate-level texts:
<UL>
<LI> Nei, M. and S. Kumar.  2000.  <EM>Molecular Evolution and
     Phylogenetics.</EM> Oxford University Press, Oxford.
<LI> Li, W.-H.  1999.  <EM>Molecular Evolution.</EM>  Sinauer Associates,
     Sunderland,   Massachusetts.
</UL>
<P>
For more mathematically-oriented readers, there is the book
<UL>
<LI>Semple, C., and M. Steel. 2003. <EM>Phylogenetics.</EM> Oxford Lecture
Series in Mathematics and Its Applications, volume 24. Oxford
University Press, Oxford.
</UL>
<P>
Best of all is of course my own book on phylogenies, which
covers the subject for many data types, at a
graduate course level:<BR>
<UL>
<LI>Felsenstein, J.  2004.  <EM>Inferring Phylogenies</EM>.
Sinauer Associates, Sunderland, Massachusetts.
</UL>
<P>
There are also some recent books that take a more practical hands-on approach,
and give some detailed information on how to use programs, including PHYLIP
programs.  These include:
<UL>
<LI>Lemey, P., Salemi, M., and A.-M. Vandamme (eds.) 2009. <em>The 
Phylogenetic Handbook.  A Practical Approach to Phylogenetic Analysis and 
Hypothesis Testing,
</em> 2nd edition.  Cambridge University
Press, Cambridge.
<LI>Hall, B. G.  2007.  <em>Phylogenetic Trees Made Easy: A How-To Manual</em>, 3rd edition.
Sinauer Associates, Sunderland, Massachusetts.  (The Second Edition contained
some information on using PHYLIP, but most of that has been dropped from this
third edition).
</UL>
<P>
In addition, one of these three review articles may help:
<UL>
<LI>Swofford, D. L., G. J. Olsen, P. J. Waddell, and D. M. Hillis.  1996.
Phylogenetic inference.  pp. 407-514 in <I>Molecular Systematics</I>, 2nd ed.,
ed.  D. M. Hillis, C. Moritz, and B. K. Mable.  Sinauer Associates, Sunderland,
Massachusetts.
<LI>Felsenstein, J. 1988. Phylogenies from molecular sequences: inference and
reliability.  <I>Annual Review of Genetics</I> <B>22:</B> 521-565.
<LI>Felsenstein, J. 1988. Phylogenies and quantitative
characters. <I>Annual Review of Ecology and Systematics</I> <B>19:</B> 445-471.
</UL>
<P>
A useful article introducing the inference of phylogenies at a more
elementary level is:
<UL>
<LI>Baldauf, S. L.  2003.  Phylogeny for the faint of heart: a tutorial. <em>Trends
in Genetics</em> <b>19:</b> 345-351.
</UL>
<P>
I have already mentioned above that there is an excellent guide to using PHYLIP 3.6
for molecular analyses available.  It is by Jarno Tuimala:
<UL>
<LI>Tuimala, J.  2004. <em>A Primer to Phylogenetic Analysis using Phylip Package.
</em> 2nd edition.  Center for Scientific Computing, Espoo, Finland.
</UL>
and it is
available as a PDF <A HREF="http://koti.mbnet.fi/tuimala/oppaat/phylip2.pdf">here</A>.
<P>
<DIV ALIGN=CENTER>
<H3>Questions about distribution and citation:</H3>
</DIV>
<P>
<DT><STRONG>"If I copied PHYLIP from a friend without you knowing, should I try
to keep you from finding out?"</STRONG>
<DD>No.  It is to your advantage and mine for you to
let me know.  If you did not get PHYLIP "officially" from me or from someone
authorized by me, but copied a friend's version, you are not in my database of
users.  You may also have an old version which has since been
substantially improved.  I don't mind you "bootlegging"
PHYLIP (it's free anyway), but
you should realize that you may have copied an outdated version.  If you are reading this
Web page, you can get the latest version just as quickly over the Internet.
It will help both of us if you get
onto my mailing list.  If you are on it, then I will give your name to other
users near you when they ask for local contacts, and they are urged to contact you and
update your copy.  (I benefit by getting a better feel for how many
distributions there have been, and by having a better mailing list to use to give
other users local people to contact.)  Use the registration form, which
can be accessed through our web site's registration page.
<DT><STRONG>"How do I make a citation to the PHYLIP package in the paper I am
writing?"</STRONG>
<DD>One way is like this:
<P>
Felsenstein, J.  2009.  PHYLIP (Phylogeny Inference Package) version 3.695.
<I>Distributed by the author.  Department of Genome Sciences, University of
Washington, Seattle.</I>
<P>
or, if the editor for whom you are writing insists that the citation must be to
a printed publication, you could cite a notice for version 3.2 published in
<I>Cladistics</I>:
<P>
Felsenstein, J.  1989.  PHYLIP - Phylogeny Inference Package (Version 3.2).
<I>Cladistics</I> <B>5:</B> 164-166.
<P>
(This citation has been so commonly made that this is the most-cited paper
ever in the journal <i>Cladistics</i>, I am the most-cited author ever in that
journal, and these citations are responsible for more than 15% of the
impact factor of that journal!).
<P>
For a while a printed version of the PHYLIP documentation was available and one
could cite that.  This is no longer true.  Other than that, this is difficult,
because I have never written a paper announcing PHYLIP!  My 1985b paper in
<I>Evolution</I> on the bootstrap method contains a
one-paragraph Appendix describing the availability of this package, and that
can also be cited as a reference for the package, although the package has been
distributed since 1980 while the bootstrap paper dates from 1985.  A paper on PHYLIP
is needed mostly to give people something to cite, as word-of-mouth, references
in other people's papers, and electronic newsgroup postings have spread the
word about PHYLIP's existence quite effectively.
<DT><STRONG>"Can I make copies of PHYLIP available to the students in
my class?"</STRONG>
<DD>Generally, yes.  Read the Copyright notice near the front of
this main documentation page.  If you charge money for PHYLIP,
other than a minimal charge to cover cost of distribution,
or you use it in a service for which you charge money, you will need
to negotiate a royalty.  But you can make it freely available
and you do not need to get any special permission from us to do so.
<DT><STRONG>"How many copies of PHYLIP have been distributed?"</STRONG>
<DD> We have about 28,000 registrations for PHYLIP.  The number is not
exact, since it does not count repeat registrations by the same person,
and these are not always easy to detect (this number is an estimate
based on a carefully examined sample of the registrations, to find out
how many of them were re-registrations).  Of course there are
many more people who have got copies from friends, or who downloaded it
without registering it.  PHYLIP is probably the most widely distributed 
phylogeny package.  In recent years magnetic tape distribution, diskette 
distribution and e-mail distribution of PHYLIP have disappeared (as I insist 
people use the Web distribution).  But all this has been more than
offset by, first, an explosion of distributions by anonymous ftp over
the Internet, and then a bigger explosion of Web distributions and
registrations (about 6 registrations per day at the moment).
<DT><STRONG>"Isn't it great that PHYLIP is the most widely-used
package of phylogeny programs?"</STRONG>
<DD> It would be great if that were true, but I suspect that it is not true.
Developers of other packages usually do not give out numbers of distributions 
or numbers of registrations of their package.  Probably the best indication of 
level of use is the number of citations to these packages in the scientific 
literature.  Doing a search using the Web of Science, I find that PHYLIP is
either third or fourth, the order of packages being PAUP*, MrBayes, and then 
either PHYLIP or PHYML.  PHYLIP gets about 1,000 literature citations per 
year, PAUP* and MrBayes each get 2-3 times as many as that.  As for uses 
rather than citations, that is very hard to assess.   PHYLIP is widely used in 
teaching, which would account for many runs, but I do not know of a way to 
count these.
<P>
<DIV ALIGN=CENTER>
<H3>Questions about documentation</H3>
</DIV>
<P>
<DT><STRONG>"Where can I get a printed version of the PHYLIP documents?"</STRONG>
<DD>For the
moment, you can only get a printed version by printing it yourself.  For
versions 3.1 to 3.3 a printed version was sold by Christopher Meacham and Tom
Duncan, then at the University Herbarium of the University of California at
Berkeley.  But they have had to discontinue this as it was too much work.  You
should be able to print out the documentation files on almost any printer and
make yourself a printed version of whichever of them you need.
<DT><STRONG>"Why have I been dropped from your newsletter mailing list?"</STRONG>
<DD>You haven't.
The newsletter was dropped.  It simply was too hard to mail it out to such a
large mailing list.  The last issue of the newsletter was Number 9 in May,
1987.  The Listserver News Bulletins that we tried for a while have also been dropped
as too hard to keep up to date.  I am hoping that our World Wide Web site will take their place.
</DL>
<P>
<DIV ALIGN="CENTER">
<H3>Additional Frequently Asked Questions, or:<BR>
"Why didn't it occur to you to ...</H3></DIV>
<DL>
<DT><STRONG>... allow the options to be set on the command line?"</STRONG>
<DD>We could in Unix and Linux, or somewhat differently in Windows.  But
there are so many options that this would be difficult, especially
when the options require additional information to be supplied such as
rates of evolution for many categories of sites.  You may be asking this
question because you want to automate the operation of PHYLIP programs
using batch files (command files) to run in background.  If that is the
issue, see the section of this main documentation page on
"Running the programs in background or under control of a command file".
It explains how to set the options using input redirection and a file
that has the menu responses as keystrokes.
<DT><STRONG>... write these programs in Java?"</STRONG>
<DD>Well, we might.  It is not completely clear which of two contenders,
C++ and Java, will become more widespread, and which one will gradually
fade away.  Whichever one is more successful, we will probably want to use
for future versions of PHYLIP.  As the C compilers that are used to
compile PHYLIP are usually also able to compile C++, we will be moving in
that direction, but with constant worrying about whether to convert PHYLIP
to Java instead.
<DT><STRONG>... forget about all those inferior systems and just develop PHYLIP for Unix?"</STRONG>
<DD>This is self-answering, since the same people first said I should 
just develop it for Apple II's, then just for CP/M Z-80's, then just
for IBM PCDOS, then just for Macintoshes or for Sun 
workstations, and then for Windows.  If I had listened to them and done any one of these, I would 
have had a very hard time adapting the package to any of the other ones once 
these folks changed their mind (and most of them did)!
<DT><STRONG>... write these programs in Pascal?"</STRONG>
<DD>These programs started out
in Pascal in 1980.  In 1993 we released both Pascal and C versions.  The
present version (3.6) and
future versions will be C-only.  I make fewer mistakes in Pascal and do
like the language better than C, but C has overtaken Pascal and Pascal
compilers are starting to be hard to find on some machines.  Also C is a
bit better standardized which makes the number of modifications a user
has to make to adapt the programs to their system much less.
<DT><STRONG>... write these programs in PROLOG
(or Ada, or Modula-2, or SIMULA, or BCPL, or PL/I, or APL, or LISP)?"</STRONG>
<DD>These are all languages I have considered.  All
have advantages, but they are not really widespread (as are C, C++, and Java).
<DT><STRONG>... include in the package a program to do the Distance Wagner method (or
successive approximations character weighting)?"</STRONG>
<DD>In most cases where I have not
included other methods, it is because I decided that they had no substantial
advantages over methods that were included (such as the programs Fitch, 
Kitsch, Neighbor, the <TT>T</TT> option of Mix and Dollop, and the "<TT>?</TT>" ancestral
states option of the discrete characters parsimony programs).
<DT><STRONG>... include in the package ordination methods and more
clustering algorithms?"</STRONG>
<DD>Because this is <I>not</I> a clustering package, it's a
package for phylogeny estimation.  Those are different tasks with different
objectives and mostly different methods.  Mary Kuhner and Jon Yamato have,
however,
included in Neighbor an option for UPGMA clustering, which will be very
similar to Kitsch in results.
<DT><STRONG>... include in the package a program to do nucleotide sequence 
alignment?"</STRONG>
<DD>Well, yes, I should
have, and this is scheduled to be in future releases.  But multiple sequence
alignment programs, in the era after Sankoff, Morel, and Cedergren's 1973
classic paper, need to use substantial computer horsepower to estimate the
alignment and the tree together (but see Karl Nicholas's program
<TT>GeneDoc</TT> or Ward Wheeler and David Gladstein's <TT>MALIGN</TT>, as
well as more approximate methods of tree-based alignment used in
<TT>ClustalW</TT>, <TT>TreeAlign</TT>, or <TT>POY</TT>).
</DL>
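<P>
Returning to the question above about setting options on the command line:
the input-redirection approach described there can be scripted.  What follows
is a minimal sketch, not part of PHYLIP, written in Python 3.  The helper
names <TT>menu_responses</TT> and <TT>run_with_responses</TT> are
hypothetical, and the keystrokes in the example are placeholders; the
responses a particular program needs depend on its menu, so check that
program's documentation file before relying on any of this.

```python
import subprocess


def menu_responses(keystrokes):
    """Join menu keystrokes into the text a program would read from stdin.

    Each keystroke (or typed value) goes on its own line, exactly as if
    it had been typed at the menu and followed by the Enter key.
    """
    return "".join(str(key) + "\n" for key in keystrokes)


def run_with_responses(program, keystrokes):
    """Run `program`, feeding it the menu responses on standard input.

    This plays the same role as redirecting input from a response file.
    `program` is whatever executable name or path you supply.
    """
    return subprocess.run(
        [program],
        input=menu_responses(keystrokes),
        text=True,
        capture_output=True,
        check=False,
    )


if __name__ == "__main__":
    # Placeholder responses: toggle one menu option, then accept with Y.
    print(menu_responses(["U", "Y"]), end="")
```

Writing the string returned by <TT>menu_responses</TT> to a file gives a
response file of the kind that section describes, which can then be used
with ordinary input redirection instead of the Python wrapper.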
<P>
<DIV ALIGN="CENTER">
<H3>(Fortunately) obsolete questions</H3></DIV>
<P>
(The following four questions, once
common, have finally disappeared, I am pleased to report.  I include them to
give you some idea of what kinds of requests I had to cope with.)
<H4>"Why didn't it occur to you to ...</H4>
<DL>
<DT><STRONG>... let me log in to your computer in Seattle
and copy the files out over a phone line?"</STRONG>
<DD>No thanks.  It would cost you for a lot of
long-distance telephone time, plus a half hour of my time and yours in which
I had to explain to you how to log in and do the copying.
<DT><STRONG>... send me a listing of your program?"</STRONG>
<DD>Damn it, it's not "a program",
it's 37 programs, in a great many files.  What were you
thinking of doing, having 1800-line programs typed in by slaves at your
end?  If you were going to go to all that trouble why not try network
transfer?  If you have the files then you can print out all the
listings you want to and add them to the huge stack of printed output in
the corner of your office.
<DT><STRONG>... write a magnetic tape in our computer center's favorite format
(inverted Ruritanian EBCDIC at 998 bpi)?"</STRONG>
<DD>Because the ANSI standard
format is the most widely used one, and even though your computer center
may pretend it can't read a tape written this way, if you sniff around
you will find a utility to read it.  It's just a <I>lot</I> easier for me to
let you do that work.  If I tried to put the tape into your format, I
would probably get it wrong anyway.
<DT><STRONG>... give us a version of these in FORTRAN?"</STRONG>
<DD>Because the
programs are <I>far</I> easier to write and debug in C or Pascal, and cannot
easily be
rewritten into FORTRAN (they make extensive use of recursive calls and
of records and pointers).  In any case, C is widely available.  If you don't
have a C compiler or don't know
how to use it, you are going to have to learn a language like C or
Pascal sooner or later, and the sooner the better.
</DL>
<P>
<A NAME="newfeatures"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>New Features in This Version</H2></DIV>
<P>
Version 3.6 has many new features:
<UL><LI> Faster (well, less slow) likelihood programs.
<LI> The DNA and protein likelihood and distance programs allow
for rate variation between sites using a gamma distribution of
rates among sites, or using a gamma distribution plus a given
fraction of sites which are assumed invariant.
<LI> A new multistate discrete characters parsimony program, Pars, that
handles unordered multistate characters.
<LI> The Dnapars and Pars parsimony programs can infer multifurcating
trees, which sensibly reduces the number of tied trees they find.
<LI> A new protein sequence likelihood program, Proml,
and also a version, Promlk, which assumes a molecular clock.
<LI> A new restriction sites and restriction fragments distance program,
Restdist, that can also be used to compute distances for RAPD and
AFLP data.  It also allows for gamma-distributed rate variation among
DNA sites.
<LI> In the DNA likelihood programs, you can now specify different
categories of rates of change (such as rates for first, second, and
third positions of a coding sequence) and assign them to specific sites.
This is in addition to the ability of the program to use the Hidden Markov
Model mechanism to allow rates of change to vary across sites in a way that
does not ask you to assign which rate goes with which site.
<LI> The input files for many of the programs are now
simpler, in that they do not contain options information such as specification
of weights and categories.  That information is now provided in separate
files with default names such as <TT>weights</TT> and <TT>categories</TT>.
<LI> The DNA likelihood programs can now evaluate multifurcating
user trees (option <TT>U</TT>).
<LI> All programs that read in user-defined trees now do so from a separate
file, whose default name is <TT>intree</TT>, rather than requiring them to
be in the input file as before.
<LI> The DNA likelihood programs can infer the sequence at ancestral
nodes in the interior of the tree.
<LI> Dnapars can now do transversion parsimony.
<LI> The bootstrapping program Seqboot can now be asked, instead of producing a
large file containing multiple data sets,
to produce a weights file with multiple sets of weights.  Many
programs in this release can analyze those multiple weights together with
the original data set, which saves disk space.
<LI> The bootstrapping program Seqboot can pass weights and categories
information through to a multiple weights file or a multiple categories
file.
<LI> Seqboot can also convert sequence files from Interleaved to
Sequential form, or back.
<LI> Seqboot can convert a PHYLIP molecular sequences or discrete characters
morphology data file into the NEXUS format, which is used by a number of
other phylogeny programs such as MacClade, MrBayes and PAUP*.
<LI> Seqboot can also carry out a number of different methods of
permuting the order of characters in a data set.  This could be used to
carry out the Incongruence Length Difference (or Partition Homogeneity)
method of testing homogeneity of data sets.
<LI> Seqboot can also write a sequence data file into one version of
an XML format for sequence alignments,
for use by programs that need XML input
(none of the current PHYLIP programs can yet use this format, but it
may be useful in the future).
<LI> Retree can now write trees out into a preliminary version of a new XML tree
file format which is in the process of being defined.
<LI> The Kishino-Hasegawa-Templeton (KHT) test which compares user-defined
trees (option U) is now joined by the Shimodaira-Hasegawa (SH) test
(Shimodaira and Hasegawa, 1999) which corrects for comparisons among
multiple tests.  This avoids a statistical problem with multiple user trees.
<LI> Contrast can now carry out an analysis that takes into account
within-species variation, according to a model similar (but not
identical) to that introduced by Michael Lynch (1990).  This enables
analysis of individuals sampled from the species, in a way that properly
takes sampling error into account.
<LI> A new program, Treedist, computes the Symmetric
Difference distance among trees.  This measures the number of branches in
the trees that are present in one but not the other.
It also can compute the Branch Score distance defined by Kuhner and Felsenstein
(1994) and a distance by Robinson and Foulds, both of which take
branch lengths into account.
<LI> Fitch and Kitsch now have an option to make trees by the
minimum evolution distance matrix method.
<LI> The protein parsimony program Protpars now allows you to choose among
a number of different genetic codes such as mitochondrial codes.
<LI> The consensus tree program Consense
can compute the M<SUB>l</SUB> family of consensus tree methods, which
generalize the Majority Rule consensus tree method. It can
also compute our extended Majority Rule consensus (which is
Majority Rule with some additional groups added to resolve the
tree more completely), and it can also compute the original
Majority Rule consensus tree method which does not add these
extra groups.  It can also
compute the Strict consensus.
<LI> The tree-drawing programs Drawgram and Drawtree have a number of new
options of kinds of file they can produce, including Windows Bitmap files,
files for the Idraw and FIG X windows drawing programs, the POV ray-tracer,
and even VRML Virtual Reality Markup Language files that will enable you
to wander around the tree using a VRML plugin for your browser, such as
Cosmo Player or Cortona.
<LI> Drawtree now uses my new Equal Daylight Algorithm to draw unrooted
trees.  This gives a much better-looking tree.  Of course, competing programs
such as TREEVIEW and PAUP draw trees that look just as good - because they
too have started to use my method (with my encouragement).  Drawtree also
can use another algorithm, the n-body method.
<LI> The tree-drawing programs can now produce trees across multiple
pages, which is handy for looking at trees with very large numbers
of tips, and for producing giant diagrams by pasting together
multiple sheets of paper.
</UL>
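<P>
The Seqboot item above says that a bootstrap replicate can be stored as a
set of weights rather than as a whole new data set.  The reason this works
is that sampling sites with replacement only determines how many times each
original site is drawn.  The following Python sketch illustrates the idea
under stated assumptions: the function names are hypothetical, the data are
toy values, and nothing here reproduces Seqboot's actual file formats or
its random number generator.

```python
import random
from collections import Counter


def bootstrap_weights(n_sites, rng):
    """Draw n_sites sites with replacement and return per-site counts.

    The count for a site is the weight that site gets in this replicate.
    """
    draws = Counter(rng.randrange(n_sites) for _ in range(n_sites))
    return [draws[site] for site in range(n_sites)]


def resampled_columns(columns, weights):
    """Expand weights back into an explicit resampled list of columns."""
    expanded = []
    for column, weight in zip(columns, weights):
        expanded.extend([column] * weight)
    return expanded


if __name__ == "__main__":
    rng = random.Random(2024)          # fixed seed, for repeatability only
    weights = bootstrap_weights(8, rng)
    print(weights, sum(weights))       # the weights always sum to 8
```

One set of weights per replicate carries the same information as the
expanded data set, apart from the order of the columns, which is why the
weights file is so much smaller.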
<P>
There are many more, lesser features added as well.
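<P>
As an illustration of the Symmetric Difference distance mentioned in the
Treedist item above, here is a small Python sketch that counts the branches
(partitions of the tips) found in one tree but not the other.  It is not
Treedist's code.  Representing trees as nested tuples of tip names, and the
function names used, are assumptions made only for this example, and branch
lengths are ignored.

```python
def tips(tree):
    """Return the frozenset of tip names below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(tips(child) for child in tree))


def splits(tree):
    """Collect the partitions of the tips induced by interior branches.

    Each partition is stored as the side that does not contain a fixed
    reference tip, so the result does not depend on where the tree is rooted.
    """
    all_tips = tips(tree)
    reference = min(all_tips)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        below = tips(node)
        side = all_tips - below if reference in below else below
        # Ignore the root and branches leading to a single tip.
        if len(side) not in (0, 1, len(all_tips) - 1):
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found


def symmetric_difference(tree_a, tree_b):
    """Count partitions present in exactly one of the two trees."""
    return len(splits(tree_a) ^ splits(tree_b))


if __name__ == "__main__":
    first = (("A", "B"), ("C", "D"))
    second = (("A", "C"), ("B", "D"))
    print(symmetric_difference(first, second))
```

Because each partition is recorded relative to a reference tip, two
differently rooted drawings of the same unrooted tree give a distance of zero.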
<P>
Version 3.7 has some new features:
<UL>
<LI> It is possible in the programs that do a heuristic search for the
best tree to use a user-defined tree as its starting point, rearranging
the resulting tree.  If this is done on multiple user-defined trees, the
program reports the best tree or trees found among all the results.  Using
this feature and the bootstrapping tool Seqboot, one can then implement the
search strategy of Nixon (1999).  This can be done for all methods that
have an optimality criterion.  Rearranging a user-defined tree can also be used
to start a likelihood search from the tree found by a neighbor-joining
analysis.
<LI> A new program, Threshml, has been added which uses the threshold
model of quantitative genetics to model the evolution of discrete 0/1
characters and infer the covariances between evolution of the continuous
underlying "liability" characters.  This is a Markov Chain Monte Carlo
model which can make inferences for numbers of characters less than twice the
number of species.
</UL>
<P>
<A NAME="future"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>Coming Attractions, Future Plans</H2></DIV>
<P>
There are some obvious deficiencies in this version.  Some of these
holes will be filled in the next few releases (leading to version
4.0).  They include:
<OL>
<LI> Obviously we need to start thinking about a more visual mouse/windows
interface, but only if that can be used on X windows, Macintoshes, and
Windows.
<LI> Program Penny and its relatives will be improved so as to run faster
and find all most parsimonious trees more quickly.
<LI>An "evolutionary clock" version of Contml will be done, and the same 
may also be done for Restml.
<LI> We are gradually generalizing the tree structures in the programs to
infer multifurcating trees as well as bifurcating ones.
We should be able to have any program read any tree and know what to do
with it, without the user having to fret about whether an unrooted tree was
fed to a program that needs a rooted tree.
<LI> In general, we need more support for protein sequences, including a
codon model of change, allowing for different rates for synonymous
and nonsynonymous changes.
<LI> We also need more support for combining runs from multiple loci,
allowing for different rates of evolution at the different loci.
<LI> We will be expanding our use and production of XML data set files and
XML tree files.
<LI> A program to align molecular sequences on a predefined User Tree may
ultimately be included.  This will allow alignment and phylogeny
reconstruction to proceed iteratively by successive runs of two programs, one
aligning on a tree and the other finding a better tree based on that alignment.
In the shorter run a simple two-sequence alignment program may be included.
<LI> An interactive "likelihood explorer" for DNA sequences will be written.
This will allow, either with or without the assumption of a molecular
clock, trees to be varied interactively so that the user can get a much
better feel for the shape of the likelihood surface.  Likelihood will be
able to be plotted against branch lengths for any branch.
<LI> If possible we will allow use of Hidden Markov Models for correcting for
purine/pyrimidine richness variations among species, within the framework of
the maximum likelihood programs.  That the maximum likelihood programs do not
allow for base composition variation is their major limitation at the moment.
<LI> The Hidden Markov Model (regional rates) option of Dnaml and Dnamlk will
be generalized to allow
for rates at sites to gradually change as one moves along the tree,
in an attempt to implement Fitch and Markowitz's (1970) notion of "covarions".
<LI> A more sophisticated compatibility program should be included, if I can
find one.
<LI> We are economizing on the size of the source code, and enforcing some
standardization of it, by putting frequently used routines in separate
files which can be linked into various programs.  This will enforce
a rather complete standardization of our code.
<LI> We will move our code to an object-oriented
language, most likely C++.  One could describe the language that version
3.4 was written in as "Pascal", version 3.5 as "Pascal written in C",
version 3.6 as "C written in C", version 3.7 as "C++ written
in C" and then 4.0 as "C++ written in C++".  At least that scenario
is one possibility.
</OL>
<P>
There will also be many future developments in the programs that treat
continuously-measured data (quantitative characters) and morphological
or behavioral data with discrete states, as I have new ideas for
analyzing these data in ways that connect to within-species
quantitative genetic analyses.  This will compete with parsimony
analysis.
<P>
<A NAME="endorsements"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>Endorsements</H2></DIV>
<P>
Here are some comments people have made in print about PHYLIP.  Explanatory
material in square brackets is my own.  They fall naturally into three groups:
<P>
<H3>From the pages of <I>Cladistics</I>:</H3>
<P>
<BLOCKQUOTE>
"Under no circumstances can we recommend PHYLIP/WAG [their name for the
Wagner parsimony option of Mix]."
<DIV ALIGN="RIGHT">
Luckow, M. and R. A. Pimentel (1985)
</DIV>
</BLOCKQUOTE>
<P>
<BLOCKQUOTE>
"PHYLIP has not proven very effective in implementing parsimony (Luckow and
Pimentel, 1985)."
<DIV ALIGN="RIGHT">
J. Carpenter (1987a)
</DIV>
</BLOCKQUOTE>
<P>
<BLOCKQUOTE>
"... PHYLIP.  This is the computer program where every newsletter concerning
it is mostly bug-catching, some of which have been put there by previous
corrections.  As Platnick (1987) documents, through dint of much labor useful
results may be attained with this program, but I would suggest an
easier way: FORMAT b:"
<DIV ALIGN="RIGHT">
J. Carpenter (1987b)
</DIV>
</BLOCKQUOTE>
<P>
<BLOCKQUOTE>
"PHYLIP is bug-infested and both less effective and orders of
magnitude slower than other programs ...."
<DIV ALIGN="RIGHT">
"T. N. Nayenizgani" [J. S. Farris] (1990)
</DIV>
</BLOCKQUOTE>
<P>
<BLOCKQUOTE>
"Hennig86 [by J. S. Farris] provides such substantial improvements over
previously available programs (for both mainframes and microcomputers) that
it should now become the tool of choice for practising systematists."
<DIV ALIGN="RIGHT">
N. Platnick (1989)
</DIV>
</BLOCKQUOTE>
<P>
<H3>... in the pages of other journals:</H3>
<P>
<BLOCKQUOTE>
"The availability, within PHYLIP of distance, compatibility, maximum likelihood,
and generalized `invariants' algorithms (Cavender and Felsenstein, 1987) sets
it apart from other packages .... One of the strengths of PHYLIP is its
documentation ...."
<DIV ALIGN="RIGHT">
Michael J. Sanderson (1990)
</DIV>
<EM>(Sanderson also criticizes PHYLIP for slowness and inflexibility of its
parsimony algorithms, and compliments other packages on their strengths).</EM>
</BLOCKQUOTE>
<P>
<BLOCKQUOTE>
"This package of programs has gradually become a basic necessity to anyone
working seriously on various aspects of phylogenetic inference .... The package
includes more programs than any other known phylogeny package.  But it is not
just a collection of cladistic and related programs.  The package has great
value added to the whole, and for this it is unique and of extreme
importance .... its various strengths are in the great array of methods
provided ...."
<DIV ALIGN="RIGHT">
Bernard R. Baum (1989)
</DIV>
<P>
(note also W. Fink's critical remarks (1986) on version 2.8 of PHYLIP).
<P>
</BLOCKQUOTE>
<P>
<H3>... and in the comments made by users when they register:</H3>
<P>
<BLOCKQUOTE>
"a program on phylogeny --
PHYLOGENY INTERFERENCE PACKAGE (PHYLIP).  We would therefore like to ask ..."
</BLOCKQUOTE>
<DIV ALIGN="RIGHT">
[names withheld] (in 1994)
</DIV>
<P>
<BLOCKQUOTE>
"I am struglling with your clever programs."
</BLOCKQUOTE>
<DIV ALIGN="RIGHT">
[name withheld] (in 1995)
</DIV>
<P>
<BLOCKQUOTE>
"I'm famously computer illiterate - I look forward to many frustrating hours trying to run this program"
</BLOCKQUOTE>
<DIV ALIGN="RIGHT">
Desmond Maxwell (in 1998)
</DIV>
<P>
<BLOCKQUOTE>
"I am a brave man.  PHYLIP is a brave program.  We'll do fine together."
</BLOCKQUOTE>
<DIV ALIGN="RIGHT">
Christopher Winchell (in 2000)
</DIV>
<P>
<BLOCKQUOTE>
"The Mahabarata of phylogenetics looks better than ever."
</BLOCKQUOTE>
<DIV ALIGN="RIGHT">
Ross Crozier (in 2001)
</DIV>
<BLOCKQUOTE>
"I love phylip.  Tastes great and less filling!"
</BLOCKQUOTE>
<DIV ALIGN="RIGHT">
Byron Adams (in 2002)
</DIV>
<A NAME="references"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>References for the Documentation Files</H2></DIV>
<P>
In the documentation files that follow I frequently refer to papers
in the literature.  In order to centralize the references they are given
in this section.  If you want to find further papers beyond these, my
book (Felsenstein, 2004) lists more than 1,000 further references.
<P>
Adams, E. N.  1972.  Consensus techniques and the comparison of
taxonomic trees.  <I>Systematic Zoology</I> <B>21:</B> 390-397.
<P>
Adams, E. N.  1986.  N-trees as nestings: complexity, similarity, and
consensus.  <I>Journal of Classification</I> <B>3:</B> 299-317.
<P>
Archie, J. W.  1989.  A randomization test for phylogenetic information in
systematic data.  <I>Systematic Zoology</I> <B>38:</B> 239-252.
<P>
Backeljau, T., L. De Bruyn, H. De Wolf, K. Jordaens, S. Van Dongen,
and B. Winnepenninckx.
1996. Multiple UPGMA and neighbor-joining trees and the performance of some
computer packages. <i>Molecular Biology and Evolution</i> <b>13:</b> 309-313.
<P>
Barry, D., and J. A. Hartigan.  1987.  Statistical analysis of hominoid
molecular evolution.  <I>Statistical Science</I>  <B>2:</B> 191-210.
<P>
Baum, B. R.  1989.  PHYLIP: Phylogeny Inference Package. Version 3.2. (Software
review).  <I>Quarterly Review of Biology</I> <B>64:</B> 539-541.
<P>
Bourque, M.  1978.  <i>Arbres de Steiner et r&eacute;seaux dont certains sommets sont
&agrave; localisation variable.</i> Ph. D. Dissertation, Universit&eacute; de
Montr&eacute;al, Quebec.
<P>
Bron, C., and J. Kerbosch.  1973.  Algorithm 457: Finding all cliques
of an undirected graph.  <I>Communications of the Association for Computing Machinery</I> <B>16:</B> 575-577.
<P>
Camin, J. H., and R. R. Sokal.  1965.  A method for deducing branching
sequences in phylogeny.  <I>Evolution</I> <B>19:</B> 311-326.
<P>
Carpenter, J.  1987a.  A report on the Society for the Study of Evolution
workshop "Computer Programs for Inferring Phylogenies".  <I>Cladistics</I> <B>3:</B>
363-375.
<P>
Carpenter, J.  1987b.  Cladistics of cladists.  <I>Cladistics</I> <B>3:</B> 363-375.
<P>
Cavalli-Sforza, L. L., and A. W. F. Edwards.  1967.  Phylogenetic
analysis: models and estimation procedures.  <I>Evolution</I> <B>21:</B> 550-570
(also <I>American Journal of Human Genetics</I> <B>19:</B> 233-257).
<P>
Cavender, J. A. and J. Felsenstein.  1987.  Invariants of phylogenies in a
simple case with discrete states.  <I>Journal of Classification</I> <B>4:</B> 57-71.
<P>
Churchill, G.A.  1989.  Stochastic models for heterogeneous DNA sequences.
<I>Bulletin of Mathematical Biology</I> <B>51:</B> 79-94.
<P>
Conn, E. E. and P. K. Stumpf.  1963.  <I>Outlines of Biochemistry.</I>  John Wiley
and Sons, New York.
<P>
Day, W. H. E.  1983.  Computationally difficult parsimony problems in
phylogenetic systematics.  <I>Journal of Theoretical Biology</I> <B>103:</B>
429-438.
<P>
Dayhoff, M. O. and R. V. Eck.  1968.  <I>Atlas of Protein Sequence
and Structure 1967-1968.</I>  National Biomedical Research Foundation,
Silver Spring, Maryland.
<P>
Dayhoff, M. O., R. M. Schwartz, and B. C. Orcutt.  1979.  A model of
evolutionary change in proteins.  pp. 345-352 in <I>Atlas of
Protein Sequence and Structure, volume 5, supplement 3, 1978,</I> ed.
M. O. Dayhoff.  National Biomedical Research Foundation, Silver Spring, Maryland.
<P>
Dayhoff, M. O.  1979.  <I>Atlas of Protein Sequence and Structure, Volume 5,
Supplement 3, 1978.</I>  National Biomedical Research Foundation, Washington, D.C.
<P>
DeBry, R. W. and N. A. Slade.  1985.  Cladistic analysis of restriction
endonuclease cleavage maps within a maximum-likelihood framework.
<I>Systematic Zoology</I> <B>34:</B>  21-34.
<P>
Dempster, A. P., N. M. Laird, and D. B. Rubin.  1977.  Maximum
likelihood from incomplete data via the EM algorithm.  <I>Journal of the Royal Statistical Society B</I> <B>39:</B> 1-38.
<P>
Eck, R. V., and M. O. Dayhoff.  1966.  <I>Atlas of Protein Sequence and
Structure 1966.</I>  National Biomedical Research Foundation, Silver
Spring, Maryland.
<P>
Edwards, A. W. F., and L. L. Cavalli-Sforza.  1964.  Reconstruction of
evolutionary trees.  pp. 67-76 in <I>Phenetic and Phylogenetic
Classification,</I> ed. V. H. Heywood and J. McNeill. Systematics
Association Volume No. 6. Systematics Association, London.
<P>
Estabrook, G. F., C. S. Johnson, Jr., and F. R. McMorris.  1976a.  A
mathematical foundation for the analysis of character
compatibility.  <I>Mathematical Biosciences</I> <B>23:</B> 181-187.
<P>
Estabrook, G. F., C. S. Johnson, Jr., and F. R. McMorris.  1976b.  An
algebraic analysis of cladistic characters.  <I>Discrete Mathematics</I> <B>16:</B> 141-147.
<P>
Estabrook, G. F., F. R. McMorris, and C. A. Meacham.  1985.  Comparison of
undirected phylogenetic trees based on subtrees of four evolutionary units.
<I>Systematic Zoology</I> <B>34:</B> 193-200.
<P>
Faith, D. P.  1990.  Chance marsupial relationships.  <I>Nature</I> <B>345:</B> 393-394.
<P>
Faith, D. P. and P. S. Cranston.  1991.  Could a cladogram this short have
arisen by chance alone?: On permutation tests for cladistic
structure.  <I>Cladistics</I> <B>7:</B> 1-28.
<P>
Farris, J. S.  1977.  Phylogenetic analysis under Dollo's Law.  <I>Systematic Zoology</I> <B>26:</B> 77-88.
<P>
Farris, J. S.  1978a.  Inferring phylogenetic trees from chromosome
inversion data.  <I>Systematic Zoology</I> <B>27:</B> 275-284.
<P>
Farris, J. S.  1981.  Distance data in phylogenetic analysis.  pp. 3-23
in <I>Advances in Cladistics: Proceedings of the first meeting of the
Willi Hennig Society,</I> ed. V. A. Funk and D. R. Brooks.  New York
Botanical Garden, Bronx, New York.
<P>
Farris, J. S.  1983.  The logical basis of phylogenetic analysis.  pp. 1-47
in <I>Advances in Cladistics, Volume 2, Proceedings of the Second Meeting of
the Willi Hennig Society.</I>  ed. Norman I. Platnick and V. A. Funk.  Columbia
University Press, New York.
<P>
Farris, J. S.  1985.  Distance data revisited.  <I>Cladistics</I> <B>1:</B> 67-85.
<P>
Farris, J. S.  1986.  Distances and statistics.  <I>Cladistics</I> <B>2:</B> 144-157. 
<P>
Farris, J. S. [&ldquo;T. N. Nayenizgani&rdquo;].  1990.  The systematics association
enters its golden years (review of <I>Prospects in Systematics</I>, ed. D.
Hawksworth).  <I>Cladistics</I> <B>6:</B> 307-314.
<P>
Farris, J. S., V. A. Albert, M. K&auml;llersj&ouml;, D.
Lipscomb, and A. G. Kluge.  1996.  Parsimony jackknifing outperforms
neighbor-joining.  <i>Cladistics</i> <b>12:</b> 99-124.
<P>
Felsenstein, J.  1973a.  Maximum likelihood and minimum-steps methods
for estimating evolutionary trees from data on discrete characters.
<I>Systematic Zoology</I> <B>22:</B> 240-249.
<P>
Felsenstein, J.  1973b.  Maximum-likelihood estimation of evolutionary
trees from continuous characters.  <I>American Journal of Human Genetics</I> <B>25:</B>
471-492.
<P>
Felsenstein, J.  1978a.  The number of evolutionary trees.  <I>Systematic Zoology</I> <B>27:</B> 27-33.
<P>
Felsenstein, J.  1978b.  Cases in which parsimony and compatibility
methods will be positively misleading.  <I>Systematic Zoology</I> <B>27:</B>
401-410.
<P>
Felsenstein, J.  1979.  Alternative methods of phylogenetic inference
and their interrelationship.  <I>Systematic Zoology</I> <B>28:</B> 49-62.
<P>
Felsenstein, J.  1981a.  Evolutionary trees from DNA sequences: a
maximum likelihood approach.  <I>Journal of Molecular Evolution</I> <B>17:</B> 368-376.
<P>
Felsenstein, J.  1981b.  A likelihood approach to character weighting
and what it tells us about parsimony and compatibility.  <I>Biological Journal of the Linnean Society</I> <B>16:</B> 183-196.
<P>
Felsenstein, J.  1981c.  Evolutionary trees from gene frequencies and
quantitative characters: finding maximum likelihood estimates.
<I>Evolution</I> <B>35:</B> 1229-1242.
<P>
Felsenstein, J.  1982.  Numerical methods for inferring evolutionary
trees.  <I>Quarterly Review of Biology</I> <B>57:</B> 379-404.
<P>
Felsenstein, J.  1983b.  Parsimony in systematics: biological and
statistical issues. <I>Annual Review of Ecology and Systematics</I> <B>14:</B> 313-333.
<P>
Felsenstein, J. 1984a.  Distance methods for inferring phylogenies: a
justification. <I>Evolution</I> <B>38:</B> 16-24.
<P>
Felsenstein, J.  1984b.  The statistical approach to inferring
evolutionary trees and what it tells us about parsimony and
compatibility.  pp. 169-191 in: <I>Cladistics: Perspectives in the
Reconstruction of Evolutionary History,</I> edited by T. Duncan and T. F.
Stuessy.  Columbia University Press, New York.
<P>
Felsenstein, J.  1985a.  Confidence limits on phylogenies with a molecular
clock.  <I>Systematic Zoology</I> <B>34:</B> 152-161.
<P>
Felsenstein, J.  1985b.  Confidence limits on phylogenies: an approach
using the bootstrap.  <I>Evolution</I> <B>39:</B> 783-791. 
<P>
Felsenstein, J.  1985c.  Phylogenies from gene frequencies: a statistical
problem.  <I>Systematic Zoology</I> <B>34:</B> 300-311.
<P>
Felsenstein, J.  1985d.  Phylogenies and the comparative method.  <I>American Naturalist</I> <B>125:</B> 1-12.
<P>
Felsenstein, J.  1986.  Distance methods: a reply to Farris.  <I>Cladistics</I> <B>2:</B>
130-144.
<P>
Felsenstein, J.  and E. Sober.  1986.  Parsimony and likelihood: an
exchange.  <I>Systematic Zoology</I> <B>35:</B> 617-626.
<P>
Felsenstein, J.  1988a.  Phylogenies and quantitative characters.  <I>Annual Review of Ecology and Systematics</I> <B>19:</B> 445-471.
<P>
Felsenstein, J.  1988b.  Phylogenies from molecular sequences: inference and 
reliability.   <I>Annual Review of Genetics</I> <B>22:</B> 521-565.
<P>
Felsenstein, J.  1992.  Phylogenies from restriction sites, a
maximum likelihood approach.  <I>Evolution</I> <B>46:</B> 159-173.
<P>
Felsenstein, J. and G. A. Churchill. 1996.
A hidden Markov model approach to variation among sites in rate of evolution.
<I>Molecular Biology and Evolution</I> <B>13:</B> 93-104.
<P>
Felsenstein, J. 2004. <em>Inferring Phylogenies.</em> Sinauer Associates,
Sunderland, Massachusetts.
<P>
Felsenstein, J. 2005. Using the threshold model of quantitative genetics
for inferences within and between species.  <em>Philosophical Transactions
of the Royal Society of London, Series B</em> <b>360:</b> 1427-1434.
<P>
Felsenstein, J. 2008. Comparative methods with sampling error and within-species
variation: contrasts revisited and revised. <em>American Naturalist</em> <b>171:</b> 713-725.
<P>
Fink, W. L.  1986.  Microcomputers and phylogenetic analysis.  <I>Science</I> <B>234:</B> 1135-1139.
<P>
Fitch, W. M., and E. Markowitz.  1970.  An improved method for determining
codon variability in a gene and its application to the rate of fixation of
mutations in evolution.  <I>Biochemical Genetics</I> <B>4:</B> 579-593.
<P>
Fitch, W. M., and E. Margoliash.  1967.  Construction of phylogenetic
trees.  <I>Science</I> <B>155:</B> 279-284.
<P>
Fitch, W. M.  1971.  Toward defining the course of evolution: minimum
change for a specified tree topology.  <I>Systematic Zoology</I> <B>20:</B> 406-416.
<P>
Fitch, W. M.  1975.  Toward finding the tree of maximum parsimony.  pp. 189-230
in <EM>Proceedings of the Eighth International Conference on Numerical Taxonomy,</EM>
ed. G. F. Estabrook.  W. H. Freeman, San Francisco.
<P>
George, D. G.,  L. T. Hunt, and W. C. Barker.  1988.  Current methods in
sequence comparison and analysis.  pp. 127-149 in <I>Macromolecular Sequencing
and Synthesis,</I> ed. D. H. Schlesinger.  Alan R. Liss, New York.
<P>
Gilmour, R.  2000.  Taxonomic markup language: applying XML to systematic data.
<EM>Bioinformatics</EM> <B>16:</B> 406-407.
<P>
Goldman, N., and Z. Yang.  1994.
A codon-based model of nucleotide substitution for protein-coding DNA
sequences.
<I>Molecular Biology and Evolution</I> <B>11:</B> 725-736.
<P>
Goldstein, D. B., A. Ru&iacute;z-Linares, M. Feldman, and L. L. Cavalli-Sforza.
1995. Genetic absolute dating based on microsatellites and the origin of
modern humans. <em>Proceedings of the National Academy of Sciences USA</em>
<b>92:</b> 6720-6727.
<P>
Gomberg, D.  1968.  "Bayesian" post-diction in an evolution process.
Unpublished manuscript, University of Pavia, Italy.
<P>
Graham, R. L., and L. R. Foulds.  1982.  Unlikelihood that minimal
phylogenies for a realistic biological study can be constructed in
reasonable computational time.  <I>Mathematical Biosciences</I> <B>60:</B> 133-142.
<P>
Hasegawa, M. and T. Yano.  1984a.  Maximum likelihood method of phylogenetic
inference from DNA sequence data.  <I>Bulletin of the Biometric Society of Japan</I>  No. 5:  1-7.
<P>
Hasegawa, M.  and T. Yano.  1984b.  Phylogeny and classification of
Hominoidea as inferred from DNA sequence data.  <I>Proceedings of the Japan Academy</I> <B>60 B:</B> 389-392.
<P>
Hasegawa, M., Y. Iida, T. Yano, F. Takaiwa, and M. Iwabuchi.  1985a.
Phylogenetic relationships among eukaryotic kingdoms as inferred from
ribosomal RNA sequences.  <I>Journal of Molecular Evolution</I> <B>22:</B> 32-38.
<P>
Hasegawa, M., H. Kishino, and T. Yano.  1985b.  Dating of the human-ape
splitting by a molecular clock of mitochondrial DNA.  <I>Journal of Molecular
Evolution</I> <B>22:</B> 160-174.
<P>
Hendy, M. D., and D. Penny.  1982.  Branch and bound algorithms to
determine minimal evolutionary trees.  <I>Mathematical Biosciences</I> <B>59:</B> 277-290.
<P>
Higgins, D. G. and P. M. Sharp.  1989.  Fast and sensitive
multiple sequence alignments on a microcomputer.  <I>Computer Applications in the Biological Sciences (CABIOS)</I> <B>5:</B> 151-153.
<P>
Hochbaum, D. S. and A. Pathria.  1997.  Path costs in evolutionary
tree reconstruction.  <I>Journal of Computational Biology</I> <B>4:</B> 163-175.
<P>
Holmquist, R., M. M. Miyamoto, and M. Goodman.  1988.  Higher-primate
phylogeny - why can't we decide?  <I>Molecular Biology and Evolution</I> <B>5:</B> 201-216.
<P>
Inger, R. F.  1967.  The development of a phylogeny of frogs. 
<I>Evolution</I> <B>21:</B> 369-384.
<P>
Jin, L. and M. Nei.  1990.  Limitations of the evolutionary parsimony method
of phylogenetic analysis.  <I>Molecular Biology and Evolution</I> <B>7:</B> 82-102.
<P>
Jones, D. T., W. R. Taylor and J. M. Thornton. 1992. The rapid generation of
mutation data matrices from protein sequences. <I>Computer Applications
in the Biosciences (CABIOS)</I> <B>8:</B> 275-282.
<P>
Jukes, T. H. and C. R. Cantor.  1969.  Evolution of protein molecules.  pp. 
21-132 in <I>Mammalian Protein Metabolism,</I> ed. H. N. Munro.  Academic Press, New
York.
<P>
Kidd, K. K. and L. A. Sgaramella-Zonta.  1971.  Phylogenetic analysis: concepts
and methods.  <I>American Journal of Human Genetics</I> <B>23:</B> 235-252.
<P>
Kim, J.  and M. A. Burgman.  1988.  Accuracy of phylogenetic-estimation 
methods using simulated allele-frequency data.  <I>Evolution</I> <B>42:</B> 596-602.
<P>
Kimura, M.  1980.  A simple model for estimating evolutionary rates of base 
substitutions through comparative studies of nucleotide sequences.  <I>Journal of Molecular Evolution</I> <B>16:</B> 111-120.
<P>
Kimura, M.  1983.  <I>The Neutral Theory of Molecular Evolution.</I>  Cambridge
University Press, Cambridge.
<P>
Kishino, H. and M. Hasegawa.  1989. Evaluation of the maximum likelihood
estimate of the evolutionary tree topologies from DNA sequence data, and the
branching order in Hominoidea.  <I>Journal of Molecular Evolution</I> <B>29:</B> 170-179.
<P>
Kluge, A. G., and J. S. Farris.  1969.  Quantitative phyletics and the
evolution of anurans.  <I>Systematic Zoology</I> <B>18:</B> 1-32.
<P>
Kosiol, C., and N. Goldman.  2005.  Different versions of the Dayhoff rate
matrix.  <em>Molecular Biology and Evolution</em> <B>22:</B> 193-199.
<P>
Kuhner, M. K. and J. Felsenstein.  1994.  A simulation comparison of
phylogeny algorithms under equal and unequal evolutionary rates.
<I>Molecular Biology and Evolution</I> <B>11:</B> 459-468 (Erratum <B>12:</B> 525 &nbsp;1995).
<P>
K&uuml;nsch, H. R.  1989.  The jackknife and the bootstrap for general stationary
observations.  <I>Annals of Statistics</I> <B>17:</B> 1217-1241.
<P>
Lake, J. A.  1987.  A rate-independent technique for analysis of nucleic acid
sequences: evolutionary parsimony.  <I>Molecular Biology and Evolution</I> <B>4:</B> 167-191.
<P>
Lake, J. A.  1994.  Reconstructing evolutionary trees from DNA and protein
sequences: paralinear distances.
<I>Proceedings of the National Academy of Sciences, USA</I> <B>91:</B> 1455-1459.
<P>
Le Quesne, W. J.  1969.  A method of selection of characters in
numerical taxonomy.  <I>Systematic Zoology</I> <B>18:</B> 201-205.
<P>
Le Quesne, W. J.  1974.  The uniquely evolved character concept and its
cladistic application.  <I>Systematic Zoology</I> <B>23:</B> 513-517.
<P>
Lewis, H. R., and C. H. Papadimitriou.  1978.  The efficiency of
algorithms.  <I>Scientific American</I> <B>238:</B> 96-109 (January issue).
<P>
Lockhart, P. J., M. A. Steel, M. D. Hendy, and D. Penny.  1994.
Recovering evolutionary trees under a more realistic model of sequence
evolution.  <I>Molecular Biology and Evolution</I> <B>11:</B> 605-612.
<P>
Luckow, M.  and D. Pimentel.  1985.  An empirical comparison of
numerical Wagner computer programs.  <I>Cladistics</I> <B>1:</B> 47-66.
<P>
Lynch, M.  1990.  Methods for the analysis of comparative data in evolutionary
biology.  <I>Evolution</I> <B>45:</B> 1065-1080.
<P>
Maddison, D. R.  1991.  The discovery and importance of multiple islands of
most-parsimonious trees.  <I>Systematic Zoology</I> <B>40:</B> 315-328.
<P>
Margush, T. and F. R. McMorris.  1981.  Consensus n-trees.  <I>Bulletin of Mathematical Biology</I> <B>43:</B> 239-244.
<P>
Muse, S. V. and B. S. Gaut. 1994.
A likelihood approach for comparing synonymous and nonsynonymous nucleotide
substitution rates, with application to the chloroplast genome.
<I>Molecular Biology and Evolution</I> <B>11:</B> 715-724.
<P>
Nelson, G.  1979.  Cladistic analysis and synthesis: principles and definitions,
with a historical note on Adanson's <I>Familles des Plantes</I>
(1763-1764).  <I>Systematic Zoology</I> <B>28:</B> 1-21.
<P>
Nei, M.  1972.  Genetic distance between populations.  <I>American Naturalist</I> <B>106:</B> 283-292.
<P>
Nei, M.  and W.-H. Li.  1979.  Mathematical model for studying genetic variation
in terms of restriction endonucleases.  <I>Proceedings of the National Academy of Sciences, USA</I> <B>76:</B> 5269-5273.
<P>
Nei, M. and T. Gojobori. 1986. Simple methods for estimating the numbers of
synonymous and nonsynonymous nucleotide substitutions. <I>Molecular Biology and
Evolution</I> <B>3:</B> 418-426. 
<P>
Nielsen, R., and Z. Yang. 1998. Likelihood models for detecting positively
selected amino acid sites and applications to the HIV-1 envelope gene.
<I>Genetics</I> <B>148:</B> 929-936.
<P>
Nixon, K. C. 1999. The parsimony ratchet, a new method for rapid parsimony
analysis. <em>Cladistics</em> <b>15:</b> 407-414.
<P>
Page, R. D. M.  1989.  Comments on component-compatibility in historical
biogeography.  <I>Cladistics</I> <B>5:</B> 167-182.
<P>
Penny, D. and M. D. Hendy.  1985.  Testing methods of evolutionary tree 
construction.  <I>Cladistics</I> <B>1:</B> 266-278.
<P>
Platnick, N.  1987.   An empirical comparison of microcomputer parsimony
programs.  <I>Cladistics</I> <B>3:</B> 121-144.
<P>
Platnick, N.  1989.  An empirical comparison of microcomputer parsimony
programs. II.  <I>Cladistics</I> <B>5:</B> 145-161.
<P>
Reynolds, J. B., B. S. Weir, and C. C. Cockerham.  1983.  Estimation of the 
coancestry coefficient: basis for a short-term genetic 
distance.  <I>Genetics</I> <B>105:</B> 767-779.
<P>
Robinson, D. F. and L. R. Foulds.  1979.  Comparison of weighted
labelled trees.  pp. 119-126 in <em>Combinatorial Mathematics VI.  Proceedings
of the Sixth Australian Conference on Combinatorial Mathematics, Armidale,
Australia, August, 1978,</em>  ed. A. F. Horadam and W. D. Wallis.  Lecture Notes in Mathematics, No. 748.  Springer-Verlag, Berlin.
<P>
Robinson, D. F. and L. R. Foulds.  1981.  Comparison of phylogenetic trees.
<I>Mathematical Biosciences</I> <B>53:</B> 131-147.
<P>
Rohlf, F. J.  and M. C. Wooten.  1988.  Evaluation of the restricted maximum 
likelihood method for estimating phylogenetic trees using simulated allele-
frequency data.  <I>Evolution</I> <B>42:</B> 581-595.
<P>
Rzhetsky, A., and M. Nei.  1992.  Statistical properties of the ordinary
least-squares, generalized least-squares, and minimum-evolution methods
of phylogenetic inference. <I>Journal of Molecular Evolution</I> <B>35:</B>
367-375.
<P>
Saitou, N., and M. Nei.  1987.  The neighbor-joining method: a new method for
reconstructing phylogenetic trees.  <I>Molecular Biology and Evolution</I> <B>4:</B> 406-425.
<P>
Sanderson, M. J.  1990.  Flexible phylogeny reconstruction: a review of
phylogenetic inference packages using parsimony.  <I>Systematic Zoology</I> <B>39:</B> 414-420.
<P>
Sankoff, D. D., C. Morel, and R. J. Cedergren.  1973.  Evolution of 5S RNA and
the nonrandomness of base replacement.  <I>Nature New Biology</I> <B>245:</B> 232-234.
<P>
Shimodaira, H. and M. Hasegawa.  1999.  Multiple comparisons of log-likelihoods
with applications to phylogenetic inference.  <EM>Molecular Biology and
Evolution</EM> <B>16:</B> 1114-1116.
<P>
Shimodaira, H.  2002.  An approximately unbiased test of phylogenetic
tree selection.  <EM>Systematic Biology</EM> <B>51:</B> 492-508.
<P>
Smouse, P. E. and W.-H. Li.  1987.  Likelihood analysis of mitochondrial
restriction-cleavage patterns for the human-chimpanzee-gorilla trichotomy.
<I>Evolution</I> <B>41:</B> 1162-1176.
<P>
Sober, E.  1983a.  Parsimony in systematics: philosophical issues.  <I>Annual Review of Ecology and Systematics</I> <B>14:</B> 335-357.
<P>
Sober, E.  1983b.  A likelihood justification of parsimony.  <I>Cladistics</I> <B>1:</B> 209-233.
<P>
Sober, E.  1988.  <I>Reconstructing the Past: Parsimony, Evolution,
and Inference.</I>  MIT Press, Cambridge, Massachusetts.
<P>
Sokal, R. R., and P. H. A. Sneath.  1963.  <I>Principles of Numerical
Taxonomy.</I>  W. H. Freeman, San Francisco.
<P>
Steel, M. A., P. J. Lockhart, and D. Penny.  1993.  Confidence in evolutionary trees from biological sequence data.  <EM>Nature</EM> <B>364:</B> 440-442.
<P>
Steel, M. A.  1994.  Recovering a tree from the Markov leaf colourations
it generates under a Markov model.  <I>Applied Mathematics Letters</I>
<B>7:</B> 19-23.
<P>
Studier, J. A.  and K. J. Keppler.  1988.  A note on the neighbor-joining
algorithm of Saitou and Nei.  <I>Molecular Biology and Evolution</I> <B>5:</B> 729-731.
<P>
Swofford, D. L. and G. J. Olsen.  1990.  Phylogeny reconstruction.  Chapter
11, pages 411-501 in <I>Molecular Systematics,</I> ed. D. M. Hillis and C. Moritz.
Sinauer Associates, Sunderland, Massachusetts.
<P>
Swofford, D. L., G. J. Olsen, P. J. Waddell, and D. M. Hillis.  1996.
Phylogenetic inference.  pp. 407-514 in <I>Molecular Systematics</I>, 2nd ed.,
ed.  D. M. Hillis, C. Moritz, and B. K. Mable.  Sinauer Associates, Sunderland,
Massachusetts.
<P>
Templeton, A. R.  1983.  Phylogenetic inference from restriction endonuclease
cleavage site maps with particular reference to the evolution of humans and the
apes. <I>Evolution</I> <B>37:</B> 221-244.
<P>
Thompson, E. A.  1975.  <I>Human Evolutionary Trees.</I>  Cambridge University
Press, Cambridge.
<P>
Veerassamy, S., A. Smith and E. R. M. Tillier.  2003.
A transition probability model for amino acid substitutions from Blocks.
<I>Journal of Computational Biology</I> <B>10:</B> 997-1010.
<P>
Wright, S.  1934.  An analysis of variability in number of digits in an inbred
strain of guinea pigs. <em>Genetics</em> <b>19:</b> 506-536.
<P>
Wu, C. F. J.  1986.  Jackknife, bootstrap and other resampling plans in 
regression analysis.  <I>Annals of Statistics</I> <B>14:</B> 1261-1295.
<P>
Yang, Z. 1993.  Maximum-likelihood estimation of phylogeny from DNA sequences
when substitution rates differ over sites.  <I>Molecular Biology and
Evolution</I> <B>10:</B> 1396-1401.
<P>
Yang, Z. 1994.  Maximum likelihood phylogenetic estimation from DNA sequences
with variable rates over sites: approximate methods.  <I>Journal of Molecular
Evolution</I> <B>39:</B> 306-314.
<P>
Yang, Z.  1995.  A space-time process model for the evolution of DNA sequences.
<I>Genetics</I> <B>139:</B> 993-1005.
<P>
Yang, Z. 1998. Likelihood ratio tests for detecting positive selection and
application to primate lysozyme evolution. <I>Molecular Biology and
Evolution</I> <B>15:</B> 568-573.
<P>
Yang, Z., and R. Nielsen. 1998. Synonymous and nonsynonymous rate variation in
nuclear genes of mammals. <I>Journal of Molecular Evolution</I> <B>46:</B> 409-418.
<P>
Yang, Z. 2006. <i>Computational Molecular Evolution.</i> Oxford University
Press, Oxford.
<P>
Zharkikh, A. and W.-H. Li.  1995.  Estimation of confidence in phylogeny:
the complete-and-partial bootstrap technique.  <I>Molecular Phylogenetics and
Evolution</I> <B>4:</B> 44-63.

<P>
<DIV ALIGN="CENTER">
<A NAME="credits"><HR><P></A>
<H2>Credits</H2></DIV>
<P>
Over the years various granting agencies have contributed to the
support of the PHYLIP project (at first without knowing it).  They are:
<P>
<TABLE CELLPADDING=3 BORDER="1">
<TR><TD ALIGN="LEFT">Years</TD>
<TD ALIGN="LEFT">Agency</TD>
<TD ALIGN="LEFT">Grant or Contract Number</TD>
</TR>
<TR><TD ALIGN="LEFT">2005-2009</TD>
<TD ALIGN="LEFT">NIH NIGMS</TD>
<TD ALIGN="LEFT">R01 GM071639</TD>
</TR>
<TR><TD ALIGN="LEFT">2003-2007</TD>
<TD ALIGN="LEFT">NIH NIGMS</TD>
<TD ALIGN="LEFT">R01 GM51929-05 (PI: Mary Kuhner)</TD>
</TR>
<TR><TD ALIGN="LEFT">1999-2003</TD>
<TD ALIGN="LEFT">NSF</TD>
<TD ALIGN="LEFT">BIR-9527687</TD>
</TR>
<TR><TD ALIGN="LEFT">1999-2002</TD>
<TD ALIGN="LEFT">NIH NIGMS</TD>
<TD ALIGN="LEFT">R01 GM51929-04</TD>
</TR>
<TR><TD ALIGN="LEFT">1999-2001</TD>
<TD ALIGN="LEFT">NIH NIMH</TD>
<TD ALIGN="LEFT">R01 HG01989-01</TD>
</TR>
<TR><TD ALIGN="LEFT">1995-1999</TD>
<TD ALIGN="LEFT">NIH NIGMS</TD>
<TD ALIGN="LEFT">R01 GM51929-01</TD>
</TR>
<TR><TD ALIGN="LEFT">1992-1995</TD>
<TD ALIGN="LEFT">National Science Foundation</TD>
<TD ALIGN="LEFT">DEB-9207558</TD>
</TR>
<TR><TD ALIGN="LEFT">1992-1994</TD>
<TD ALIGN="LEFT">NIH NIGMS Shannon Award</TD>
<TD ALIGN="LEFT">2 R55 GM41716-04</TD>
</TR>
<TR><TD ALIGN="LEFT">
1989-1992</TD>
<TD ALIGN="LEFT">NIH NIGMS</TD>
<TD ALIGN="LEFT">1 R01-GM41716-01</TD>
</TR>
<TR><TD ALIGN="LEFT">
1990-1992</TD>
<TD ALIGN="LEFT">National Science Foundation</TD>
<TD ALIGN="LEFT">BSR-8918333</TD>
</TR>
<TR><TD ALIGN="LEFT">
1987-1990</TD>
<TD ALIGN="LEFT">National Science Foundation</TD>
<TD ALIGN="LEFT">BSR-8614807</TD>
</TR>
<TR><TD ALIGN="LEFT">1979-1987</TD>
<TD ALIGN="LEFT">U.S. Department of Energy</TD>
<TD ALIGN="LEFT">DE-AM06-76RLO2225 TA DE-AT06-76EV71005</TD>
</TR>
</TABLE>
<P>
However, starting in April, 2009 there is no grant support for
PHYLIP.
<P>
I am particularly grateful to past program administrators William Moore,
Irene Eckstrand, Peter Arzberger, and Conrad Istock, who have
gone beyond the call of duty to make sure that PHYLIP continued.
<P>
Booby prizes for funding are awarded to:
<UL><LI>The people at the U.S. Department of Energy who, in 1987, decided they
were "not interested in phylogenies",
<LI>The members of the Systematics Panel of NSF who twice (in 1989 and 1992)
positively recommended that my applications <I>not</I> be funded.  I am very
grateful to program director William Moore for courageously overruling
their decision the first time.  The 1992 NSF Systematics Panel could claim
no credit for PHYLIP whatsoever.
<LI>The members of the 1992 Genetics Study Section of NIH who rated my
proposal in the 53rd percentile (I don't know if that's 53rd from
the top or the bottom, but does it matter?), thus denying it funding.  I am,
however, grateful to the NIGMS administrators, especially Irene Eckstrand,
who supported giving me
a "Shannon award" partially funding my work for a period in spite of this
rating.
<LI>The reviewers at NSF Population Biology in early 2003 who gave my 
proposal too anemic a rating: (<em>"The work has the potential to define the field for many years to come .... All agreed that the proposal is somewhat vague. There was also some concern that the proposed work is too ambitious."</em>) 
<LI>The reviewers at the Genetics Study Section of the NIH in early 2004
who said such nice things about my proposal (<em>"One is likely to look back
in 20 years and marvel that these questions were not actively pursued by more
theoreticians"</em> and <em>"The first inclination is to give this PI
all the money he wants!"</em>), but then
proceeded to give my proposal a 30.2 percentile rating.  Next time you can
tone down the praise as long as you improve the ratings, OK? [They did].
<LI> The NIH program on Continued Development and Maintenance of Software, 2008 who rated my
proposal for continued development of PHYLIP at a percentile of 50.6
(" ...  the application is not considered significant given the
state of the art today and what is being proposed .... Panel comments on the lack of software
engineering expertise. Overall, the application is considered average in its present form ....") 
<LI> The NSF Assembling the Tree of Life (AToL) Spring 2008 Advisory Panel who decided that a grant
proposal by me and Fred Bookstein to develop Brownian Motion models for morphological characters,
and methods of applying them to fossil data and to use morphometrics, was Meritorious but was Not
Fundable. ("However, there were significant
weaknesses in the proposal would have been enhanced by some proof-of-concept or other preliminary
results or examples, more specific information regarding approaches to statistical issues, and a
better overall integration of the two main components of the proposal.").  Um, if I could
show the methods working in a test case, they would essentially be complete already!
<LI> The NSF Assembling the Tree of Life (AToL) Spring 2009 Advisory Panel
who did basically the same thing again to a revised application.  Thanks
folks, PHYLIP is now basically out of grant money.
</UL>
<P>
The original Camin-Sokal parsimony program and the polymorphism parsimony 
program were written by me in 1977 and 1978.  They were Pascal versions of 
earlier FORTRAN programs I wrote in 1966 and 1967 using the same algorithm to 
infer phylogenies under the Camin-Sokal and polymorphism parsimony 
criteria.  Harvey Motulsky worked for me as a programmer in 1971 and wrote 
FORTRAN programs to carry out the Camin-Sokal, Dollo, and polymorphism 
methods (he is better-known these days as the author of the scientific
data analysis package GraphPad).  But most of the early work on PHYLIP other
than my own was by Jerry 
Shurman and Mark Moehring.  Jerry Shurman worked for me in the summers of 
1979 and 1980, and Mark Moehring worked for me in the summers of 1980 and 
1981.  Both wrote original versions of many of the other programs, based on 
the original versions of my Camin-Sokal parsimony program and my polymorphism
parsimony program.  These 
formed the basis of Version 1 of the Package, first distributed in October, 
1980. 
<P>
Version 2, released in the spring of 1982, involved a fairly complete rewrite 
by me of many of those programs.  For version 3.3, Hisashi Horino
reworked some parts of the programs Clique and Consense
to make their output more comprehensible, and added some code to the
tree-drawing programs Drawgram and Drawtree as well.  He also worked on
some of the Drawtree and Drawgram driver code.
<P>
Later programmers Akiko Fuseki, Sean Lamont,
Andrew Keeffe, Daniel Yek, Dan Fineman, Patrick Colacurcio,
Mike Palczewski, Doug Buxton, Ian Robertson, Marissa LaMadrid, Eric Rynes,
and Elizabeth Walkup gave
me substantial help with the 3.6 releases, and their excellent work is
greatly appreciated.  Akiko, in over 10 years of excellent work, did much of the
hard work of adding
new features and changing old ones in the 3.4 and 3.5 releases,
centralized many of the C routines in support files, and is responsible for the
new versions of Dnapars and Pars. Andrew
prepared the Macintosh version, wrote Retree, added the ray-tracing
and PICT code to the Draw... programs and has since done much other work.  Sean
was central to the conversion to
C, and tested it extensively.  Mike Palczewski reorganized the code and
centralized routines, bringing us closer to object-oriented structure.
My (then) postdoctoral fellow
Mary Kuhner and her associate Jon Yamato created Neighbor, the
neighbor-joining and UPGMA program, for the current release, for which I am
also grateful (Naruya Saitou and Li Jin kindly encouraged us to use some of the
code from their own implementation of this method).  Lucas Mix created
the protein likelihood programs Protml and Protmlk.  Elisabeth Tillier
provided the code for her PMB amino acid model.  My current programmers
Jim McGill and Bob Giansiracusa have made a great contribution to
getting the current version working.
<P>
I am very grateful to over 400
users for algorithmic suggestions, complaints about features (or lack of
features), and information about the behavior of their operating systems
and compilers.  A list of some of their names will be found at
<A HREF="http://evolution.gs.washington.edu/phylip/credits.html">the
credits page on the PHYLIP web site</A> which is at
<CODE>http://evolution.gs.washington.edu/phylip/credits.html</CODE>.
<P>
A major contribution to this package has been made by others
writing programs or parts of programs.  Chris Meacham contributed the
important program Factor, long demanded by users, and the even more
important ones PLOTREE and PLOTGRAM.  Important parts of the code in
Drawgram and Drawtree were taken over from those two programs.
Kent Fiala wrote
the function "reroot" to do outgroup-rooting, which was an essential part of many
programs in earlier versions.  Someone at the Western Australia Institute of
Technology suggested the name PHYLIP (by writing it on the label on the
outside of a magnetic tape).  Probably it was the late Julian Ford
(I've lost the relevant letter).
<P>
The distribution of the package also owes much to Buz Wilson and Willem Ellis, 
who put a lot of effort into the early distributions of the PCDOS and
Macintosh versions respectively.  Christopher Meacham and Tom Duncan for three
versions distributed a printed version of these documentation files (they could
not continue to do so), and I am
very grateful to them for those efforts.  William H.E. Day and F. James Rohlf
were very helpful in setting up the listserver news bulletin service which
succeeded the PHYLIP newsletter for a time.
<P>
I also wish to thank the people who have made computer resources available to
me, mostly through the loan of microcomputers.  These include Jeremy
Field, Clem Furlong, Rick Garber, Dan Jacobson, Rochelle Kochin, Monty Slatkin,
Jim Archie, Jim Thomas, and George Gilchrist.
<P>
I should also note the computers used to develop this package.
These include a CDC 6400, two DECSystem 1090s, my trusty old SOL-20, my
old Osborne-1, a VAX 11/780, a VAX 8600, a MicroVAX I, a DECstation
3100, my old Toshiba 1100+, my 
DECstation 5000/200, a DECstation 5000/125, a Compudyne 486DX/33, a
Trinity Genesis 386SX, a Zenith Z386, a Mac Classic, a DEC Alphastation 400
4/233, a Pentium 120, a Pentium 200, a PowerMac 6100, and a Macintosh G3.
(One of the reasons
we have been successful in achieving compatibility between different computer
systems is that I have had to run them myself under so many different operating
systems and compilers.)
<P>
<A NAME="otherprograms"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>Other Phylogeny Programs Available Elsewhere</H2></DIV>
<P>
A comprehensive list of phylogeny programs is maintained at the PHYLIP
web site on the Phylogeny Programs pages:
<P>
<DIV ALIGN="CENTER">
<FONT SIZE=+2><A HREF="http://evolution.gs.washington.edu/phylip/software.html">
<TT>http://evolution.gs.washington.edu/phylip/software.html</TT></A></FONT></DIV>
<P>
Here we will simply mention some of the major general-purpose programs.  For
many more and much more, see those web pages.
<P>
<B>PAUP*</B>&nbsp;&nbsp;  A comprehensive program with parsimony, likelihood, and
distance matrix methods.  It competes with PHYLIP to be responsible for
the most trees published.  Written by David Swofford, now of Duke
University, and distributed by
Sinauer Associates of Sunderland, Massachusetts.
It is described in
<A HREF="http://www.sinauer.com/detail.php?id=8060">a web page</A>
at <TT>http://www.sinauer.com/detail.php?id=8060</TT>.
Current prices are $100 for the Macintosh version, $85 for the
Windows version, and $150 for Unix versions for many kinds of workstations.
<P>
<B>MrBayes</B>&nbsp;&nbsp; The leading program for Bayesian inference of
phylogenies.  It uses Markov Chain Monte Carlo inference to assess
support for clades and to infer posterior distributions of parameters.
Produced by John Huelsenbeck and Fredrik Ronquist, it is available at
<A HREF="http://mrbayes.net">its web site</A> at
<CODE>http://mrbayes.net</CODE> as a Mac OS X or Windows
executable, or in source code in C.
<P>
<B>MEGA</B>&nbsp;&nbsp;  A program by Sudhir Kumar of Arizona State University
(written together with Koichiro Tamura, Joel Dudley and Masatoshi Nei).
It can carry out parsimony and distance matrix methods
for DNA sequence data.  Version 4 for Windows, Macintosh, and Linux
can be downloaded from <A HREF="http://www.megasoftware.net">
the MEGA web site</A>
at <TT>http://www.megasoftware.net</TT>.
<P>
<B>PAML</B>&nbsp;&nbsp;  Ziheng Yang of the Department of Genetics and Biometry at
University College, London has written this package of programs to
carry out likelihood analysis of DNA and protein sequence data.  
It is one of the few packages able to use the codon model for protein
sequence data which takes the genetic code reasonably fully into account.
PAML is particularly strong in the options for coping with variability of rates
of evolution from site to site, though it is less able than some other
packages to search effectively for the best tree.  It is available as
C source code and as Mac OS X and Windows executables from its web site at
<A HREF="http://abacus.gene.ucl.ac.uk/software/paml.html">
<TT>http://abacus.gene.ucl.ac.uk/software/paml.html</TT></A>.
<P>
<B>Phyml</B>&nbsp;&nbsp;
Stephane Guindon, currently of the University of Auckland, New Zealand,
has written Phyml, a fast likelihood program for molecular sequence data.
It is available as binaries from
<a href="http://www.atgc-montpellier.fr/phyml/binaries.php">its web page at the ATGC site</a>
at the Universit&eacute; de Montpellier in
France.  Source code for Phyml, including later developments of the program,
is available at <a href="http://code.google.com/p/phyml">its site at Google Code</a>.
<P>
<B>RAxML</B>&nbsp;&nbsp;  Alexandros Stamatakis, of the Exelexis Lab at the
Technische Universit&auml;t M&uuml;nchen, has written RAxML, a very fast
likelihood program for molecular sequences.  It is available from
the <a href="http://sco.h-its.org/exelexis/software.html">
Exelexis Lab software web page</a>.  Source code is available too. RAxML
seems to be the fastest implementation of likelihood for molecular data.
<P>
<B>TNT</B>&nbsp;&nbsp; This program, by Pablo Goloboff, J. S. Farris, and Kevin Nixon,
is for searching large data sets for most parsimonious trees. 
The authors are respectively at the Instituto Miguel Lillo in Tucum&aacute;n,
Argentina, the Naturhistoriska Riksmuseet in Stockholm, Sweden, and the
Hortorium, Cornell University, Ithaca, New York.
TNT is described
as faster than other methods, though not faster than NONA for small to
medium data sets.  It is distributed as
Windows, Linux, and Mac OS X executables (the latter two
require the PVM Parallel Virtual Machine library to be installed).
The program and some support files including documentation are
available from <A HREF="http://www.zmuc.dk/public/phylogeny/tnt">its download
area</A> at <TT>http://www.zmuc.dk/public/phylogeny/tnt</TT>
(see the ReadMe! web page there).
It is free, provided you agree to a license with some reasonable limitations.
<P>
<B>DAMBE</B> &nbsp;&nbsp; A package written by Xuhua Xia of the
Department of Biology of the University of Ottawa.
Its initials stand for Data Analysis in Molecular Biology and Evolution.
DAMBE is a general-purpose package for DNA and protein sequence phylogenies.
It can read and
convert a number of file formats, has many features for
descriptive statistics, and can compute a number of commonly-used
distance matrix measures and infer phylogenies by parsimony, distance,
or likelihood methods, including bootstrapping and jackknifing.  There are
a number of kinds of statistical tests of trees available and it
can also display phylogenies.  DAMBE includes a copy of ClustalW as well.
DAMBE consists of Windows executables.  It is available from its
web site at <A HREF="http://dambe.bio.uottawa.ca/dambe.asp"><CODE>
http://dambe.bio.uottawa.ca/dambe.asp</CODE></A>.
<P>
These are only a few of the over 380 different phylogeny packages that
are now available (as of July, 2010; the number keeps increasing).  The
others are described (and web links provided) at my
Phylogeny Programs web pages at the address given above.
<P>
<A NAME="helpme"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>How You Can Help Me</H2></DIV>
<P>
Simply let me know of any problems you have had adapting the
programs to your computer.  I can often make "transparent" changes that, by
making the code avoid the wilder, woolier, and less standard parts of
C, not only help others who have your machine but even improve the
chance of the programs functioning on new machines.  I would like fairly
detailed information on what gave trouble, on what operating system,
machine, and (if relevant) compiler, and what had to be done to make the
programs work.  I am sometimes able to do some over-the-telephone
trouble-shooting, particularly
if I don't have to pay for the call, but electronic mail is the best
way for me to be asked about problems, as you can include your
input and output files so I can see what is going on (please do <EM>not</EM>
send them as Attachments, but as part of the body of a message).  I'd really
like these programs to be
able to run with only routine changes on <I>absolutely everything</I>, down to
and possibly including the Amana Touchmatic Radarange Microwave Oven
which was an Intel 8080 system (in fact, early versions of this package did
run successfully on Intel 8080 systems running the CP/M operating system).
A PalmPilot version was contemplated too.
<P>
I would also like to know timings of programs from the package, when
run on the three test input files provided above, for various computer and
compiler combinations, so that I can provide this information in the
section on speeds of this document.
<P>
For the phylogeny plotting programs Drawgram and Drawtree,
I am particularly interested in knowing what has to be done
to adapt them for other graphic file formats.
<P>
You can also be helpful to PHYLIP users in your part of the world by
helping them get the latest version of PHYLIP from our web site
and by helping them with any
problems they may have in getting PHYLIP working on their data.
<P>
Your help is appreciated.  I am always happy to hear suggestions
for features and programs that ought to be incorporated in the package,
but please do not be upset if I turn out to have already considered the
particular possibility you suggest and decided against it.
<P>
<A NAME="trouble"><HR><P></A>
<DIV ALIGN="CENTER">
<H2>In Case of Trouble</H2></DIV>
<P>
<I>Read The (documentation) Files Meticulously</I> ("RTFM").  If that doesn't solve the
problem, please check the Frequently Asked Questions web page at the
PHYLIP web site:
<P>
<FONT SIZE=+2>
<TT><A HREF="http://evolution.gs.washington.edu/phylip/faq.html">
http://evolution.gs.washington.edu/phylip/faq.html</A></TT></FONT>
<P>
and the PHYLIP Bugs web page at that site:
<P>
<FONT SIZE=+2>
<TT><A HREF="http://evolution.gs.washington.edu/phylip/bugs.html">
http://evolution.gs.washington.edu/phylip/bugs.html</A></TT></FONT>
<P>
If none of these answers your question, get in touch with me.  My email address
is given below.  If you do ask about a problem, please specify the program
name, version of the package, and computer operating system, and
send me your data file so I can test the problem.  Also it will help if you 
have the relevant output and documentation files so that you
can refer to them in any correspondence.  I can also be reached by telephone
by calling me in my office: 
+1-(206)-543-0150, or at home: +1-(206)-526-9057 (how's <I>that</I> for user
support!).  If I cannot be reached at either place, a message can be left at
the office of
the Department of Genome Sciences, +1-(206)-221-7377, but I strongly prefer that I not
call you, as in any phone consultation the least you can do is pay the phone
bill.  Better yet, use email.
<P>
Particularly if you are in a part of the world distant from me, you may also 
want to try to get in touch with other users of PHYLIP nearby.  I can also,
if requested, provide a list of nearby users.
<P>
<DIV ALIGN="RIGHT">
<TABLE><TR><TD ALIGN=LEFT>
Joe Felsenstein<BR>
Department of Genome Sciences<BR>
University of Washington<BR>
Box 355065<BR>
Seattle, Washington 98195-5065, U.S.A.
</TD></TR></TABLE>
</DIV>
<P>
Electronic mail address: &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<TT>joe<!deathtospam>&nbsp;(at)&nbsp;<!deathtospam>gs.washington.edu</TT>
<BR><HR>
</BODY>
</HTML>