Monday, July 27, 2015


It's that time of year again!! Just when you've finished ripping your hair out instailling BioPerl on Mavericks, Mac decides that it is not pissing off its users enough and decides to upgrade its operating system. And just when you can't take the notifications any longer, you are faced with that critical decision: jeapardize the security of your system for the sake of never having to reinstall bioperl, or update the OSX and know that the impending doom of reinstalling bioperl lingers--waiting to ruin your afternoon.

Well, it did just that! After hundreds of attempts to try to find the missing YAML library, I had to resort to installing perlbrew, which I did once, and it didn't work for me, but then I tried a different aspect of implementing the implementation in perlbrew several hours later, and it happened to work. If you are having serious trouble from the suggested cpan installation, give this a try:

curl -L http://install.perlbrew.pl | bash
source ~/perl5/perlbrew/etc/bashrc #and put this in your .bash_profile

#now we are going to pretend like we are in cpan, but we have to give ourselves permission first
#change your user name from mine (in pink)
sudo chown loloyohe /Users/loloyohe/.cpanm

#install cpanm
perlbrew install-cpanm

#install the builder from CPAN--oh, NOW it knows where YAML is!!!! >:|
sudo cpanm install Module::Build

#nothing really prints to screen, but it works!
#visit https://www.cpan.org/authors/id/C/CJ/CJFIELDS/ to get the latest version
sudo cpanm install CJFIELDS/BioPerl-1.6.923.tar.gz

#test to see if it works!
perl -MBio::Seq -e 0

Naturally, you will likely not have any of the same issues as me but a completely unique set of horrific issues specific to you! 

Thursday, July 23, 2015

PAML lab

Exercise 1:The objective of this activity is to use CODEML to evaluate the likelihood of the GstD1 sequences for a variety of ω values. Plot log-likelihood scores against the values of ω and determine the maximum likelihood estimate of ω. Check your finding by running CODEML’s hill-climbing algorithm.
#run this command to execute:

codeml ex1_codeml.ctl

This is what the control file looks like: 
      seqfile = seqfile.txt         * sequence data filename
      outfile = results_0.010.txt   * main result file name

        noisy = 9      * 0,1,2,3,9: how much rubbish on the screen
      verbose = 1      * 1:detailed output
      runmode = -2     * -2:pairwise

      seqtype = 1      * 1:codons
    CodonFreq = 3      * 0:equal, 1:F1X4, 2:F3X4, 3:F61
        model = 0      *
      NSsites = 0      *
        icode = 0      * 0:universal code

    fix_kappa = 0      * 1:kappa fixed, 0:kappa to be estimated
        kappa = 2      * initial or fixed kappa

    fix_omega = 1      * 1:omega fixed, 0:omega to be estimated
       * omega = 0.001  * 1st fixed omega value [change this]

       * EXCERCISE 1
       *alternate fixed omega values
       *omega = 0.005  * 2nd fixed value
       *omega = 0.01   * 3rd fixed value
       *omega = 0.05   * 4th fixed value
       *omega = 0.10   * 5th fixed value
       *omega = 0.20   * 6th fixed value
       *omega = 0.40   * 7th fixed value
       *omega = 0.80   * 8th fixed value
       *omega = 1.60   * 9th fixed value
       omega = 2.00   * 10th fixed value

The MLE of ω is 0.04.

Now we are going to have PAML estimate ω. In the control file, set fix_omega=0.
------------------
2 (Sim) ... 1 (Mel)
lnL = -756.568200
  0.12527  2.53119  0.06698


t= 0.1253  S=    45.2  N=   554.8  dN/dS= 0.0670  dN= 0.0204  dS= 0.3041
---------------------

Exercise 2: In this exercise you will investigate the sensitivity of your estimate of ω to the transition/transversion ratio (κ), and to the assumed model for codon frequencies (π’s). After you collect the required data you will determine which assumptions yield the largest and smallest values of S, and what effect it as on the value of ω.

Here is the control file:
      seqfile = seqfile.txt   * sequence data filename
      outfile = results.txt   * main result file name

        noisy = 9      * 0,1,2,3,9: how much rubbish on the screen
      verbose = 1      * 1:detailed output
      runmode = -2     * -2:pairwise

      seqtype = 1      * 1:codons
    CodonFreq = 0      * 0:equal, 1:F1X4, 2:F3X4, 3:F61          [change this]
        model = 0      *
      NSsites = 0      *
        icode = 0      * 0:universal code

    fix_kappa = 1      * 1:kappa fixed, 0:kappa to be estimated  [change this]
        kappa = 1      * initial or fixed kappa

    fix_omega = 0      * 1:omega fixed, 0:omega to be estimated
        omega = 0.5    * initial omega value


                       * EXCERCISE 2
                       * Codon bias = none; Ts/Tv bias = none
                       * Codon bias = none; Ts/Tv bias = Yes (ML)

                       * Codon bias = yes (F3x4); Ts/Tv bias = none
                       * Codon bias = yes (F3x4); Ts/Tv bias = Yes (ML)

                       * Codon bias = yes (F61); Ts/Tv bias = none

                       * Codon bias = yes (F61); Ts/Tv bias = Yes (ML)

Test#1  Codon bias = none; Ts/Tv bias = none
CodonFreq = 0      * 0:equal, 
fix_kappa = 1      * 1:kappa fixed,

2 (Sim) ... 1 (Mel)
lnL = -927.178200
  0.10683  0.27421


t= 0.1068  S=   152.9  N=   447.1  dN/dS= 0.2742  dN= 0.0213  dS= 0.0776
Test#2 Codon bias = none; Ts/Tv bias = Yes (ML)
*Highlighted is the ML estimate of of sequences kappa (ts/tv) 
CodonFreq = 0      * 0:equal, 
fix_kappa = 0      * 1:kappa fixed0:kappa to be estimated

2 (Sim) ... 1 (Mel)
lnL = -926.280300
  0.10533  1.88371  0.32004 


t= 0.1053  S=   165.8  N=   434.2  dN/dS= 0.3200  dN= 0.0221  dS= 0.0691

Test#3 Codon bias = yes (F3x4); Ts/Tv bias = none
CodonFreq = 2      * 2 F3X4, 
fix_kappa = 1      * 1:kappa fixed,

2 (Sim) ... 1 (Mel)
lnL = -844.507294
  0.10684  0.11807


t= 0.1068  S=    70.6  N=   529.4  dN/dS= 0.1181  dN= 0.0189  dS= 0.1605

Test#4 Codon bias = yes (F3x4); Ts/Tv bias = Yes (ML)
CodonFreq = 2      * 2 F3X4, 
fix_kappa = 0      * 0:kappa to be estimated

2 (Sim) ... 1 (Mel)
lnL = -842.214727
  0.10686  2.71034  0.12661


t= 0.1069  S=    73.4  N=   526.6  dN/dS= 0.1266  dN= 0.0193  dS= 0.1526

Test #5 Codon bias = yes (F61); Ts/Tv bias = none
CodonFreq = 3      * 3 F61

fix_kappa = 1      * 1:kappa fixed,

2 (Sim) ... 1 (Mel)
lnL = -758.551863
  0.12089  0.06278

t= 0.1209  S=    40.5  N=   559.5  dN/dS= 0.0628  dN= 0.0201  dS= 0.3198

Test #6 Codon bias = yes (F61); Ts/Tv bias = Yes (ML)
CodonFreq = 3      * 3 F61


fix_kappa = 0      * 0:kappa to be estimated
2 (Sim) ... 1 (Mel)
lnL = -756.568200
  0.12527  2.53121  0.06698


t= 0.1253  S=    45.2  N=   554.8  dN/dS= 0.0670  dN= 0.0204  dS= 0.3040

Summarized in a table


Assumptions K S N dS dN omega lnL
Fequal + k=1 1 152.9 447.1 0.021 0.078 0.27 -927.1782
Fequal + k=estimated 1.88 165.8 434.2 0.07 0.022 0.32 -926.2803
F3X4 + k=1 1 70.6 529.4 0.16 0.019 0.118 -844.507294
F3X4 + k=estimated 2.71 73.4 526.6 0.15 0.02 0.127 -842.214727
F61 + k=1 1 40.5 559.5 0.32 0.02 0.063 -758.551863
F61 + k=estimated 2.53 45.2 554.8 0.3 0.02 0.067 -756.5682

The assumption that yields the smallest value of S is F61 + k=1. It changes the value of omega by an order of magnitude.

Exercise 3: The objective of this exercise is to use three LRTs to evaluate the following hypotheses: (1) the mutation rate of Ldh-C has increased relative to Ldh-A, (2) a burst of positive selection for functional divergence occurred following the duplication event that gave rise to Ldh-C, and (3) there was a long term shift in selective constraints following the duplication event that gave rise to Ldh-C.

Here is the control file:
      seqfile = seqfile.txt   * sequence data filename
     treefile = treeH0.txt      * tree structure file name
      outfile = results.txt   * main result file name

        noisy = 9      * 0,1,2,3,9: how much rubbish on the screen
      verbose = 1      * 1:detailed output
      runmode = 0      * 0:user defined tree

      seqtype = 1      * 1:codons
    CodonFreq = 2      * 0:equal, 1:F1X4, 2:F3X4, 3:F61

        model = 0      * 0:one omega ratio for all branches
                       * 1:separate omega for each branch
                       * 2:user specified dN/dS ratios for branches

      NSsites = 0      *

        icode = 0      * 0:universal code

    fix_kappa = 0      * 1:kappa fixed, 0:kappa to be estimated
        kappa = 2      * initial or fixed kappa

    fix_omega = 0      * 1:omega fixed, 0:omega to be estimated
        omega = 0.2    * initial omega

                       * comments:
                       * H0 in Table 3: model = 0
                       * H1 in Table 3: model = 2
                       * H2 in Table 3: model = 2

                       * H3 in Table 3: model = 2

H0: homogeneous selection pressure over the tree
(X02152Hom,U07178Sus,(M22585rab,((NM017025Rat,U13687Mus),(((AF070995C,(X04752Mus,U07177Rat)),(U95378Sus,U13680Hom)),(X53828OG1,U28410OG2)))));


Here is the output (lnL highlighted in yellow):
TREE #  1:  (1, 2, (5, ((3, 4), (((6, (8, 9)), (7, 10)), (11, 12)))));   MP score: -1
lnL(ntime: 21  np: 23):  -6018.633010     +0.000000
  13..1    13..2    13..14   14..5    14..15   15..16   16..3    16..4    15..17   17..18   18..19   19..6    19..20   20..8    20..9    18..21   21..7    21..10   17..22   22..11   22..12 
 0.165986 0.145030 0.060255 0.252090 0.030239 0.230093 0.133935 0.101768 0.125411 0.692354 0.249468 0.205980 0.294515 0.131299 0.213038 0.174959 0.155890 0.172753 0.321380 0.290395 0.356759 2.401783 0.136656

Note: Branch length is defined as number of nucleotide substitutions per codon (not per neucleotide site).

tree length =   4.50360

(1: 0.165986, 2: 0.145030, (5: 0.252090, ((3: 0.133935, 4: 0.101768): 0.230093, (((6: 0.205980, (8: 0.131299, 9: 0.213038): 0.294515): 0.249468, (7: 0.155890, 10: 0.172753): 0.174959): 0.692354, (11: 0.290395, 12: 0.356759): 0.321380): 0.125411): 0.030239): 0.060255);

(X02152Hom: 0.165986, U07178Sus: 0.145030, (M22585rab: 0.252090, ((NM017025Rat: 0.133935, U13687Mus: 0.101768): 0.230093, (((AF070995C: 0.205980, (X04752Mus: 0.131299, U07177Rat: 0.213038): 0.294515): 0.249468, (U95378Sus: 0.155890, U13680Hom: 0.172753): 0.174959): 0.692354, (X53828OG1: 0.290395, U28410OG2: 0.356759): 0.321380): 0.125411): 0.030239): 0.060255);

Detailed output identifying parameters

kappa (ts/tv) =  2.40178

omega (dN/dS) =  0.13666

dN & dS for each branch

 branch           t        N        S    dN/dS       dN       dS   N*dN   S*dS

  13..1       0.166    729.3    269.7   0.1367   0.0205   0.1497   14.9   40.4
  13..2       0.145    729.3    269.7   0.1367   0.0179   0.1308   13.0   35.3
  13..14      0.060    729.3    269.7   0.1367   0.0074   0.0543    5.4   14.7
  14..5       0.252    729.3    269.7   0.1367   0.0311   0.2273   22.7   61.3
  14..15      0.030    729.3    269.7   0.1367   0.0037   0.0273    2.7    7.4
  15..16      0.230    729.3    269.7   0.1367   0.0283   0.2075   20.7   55.9
  16..3       0.134    729.3    269.7   0.1367   0.0165   0.1208   12.0   32.6
  16..4       0.102    729.3    269.7   0.1367   0.0125   0.0918    9.1   24.7
  15..17      0.125    729.3    269.7   0.1367   0.0155   0.1131   11.3   30.5
  17..18      0.692    729.3    269.7   0.1367   0.0853   0.6242   62.2  168.3
  18..19      0.249    729.3    269.7   0.1367   0.0307   0.2249   22.4   60.7
  19..6       0.206    729.3    269.7   0.1367   0.0254   0.1857   18.5   50.1
  19..20      0.295    729.3    269.7   0.1367   0.0363   0.2655   26.5   71.6
  20..8       0.131    729.3    269.7   0.1367   0.0162   0.1184   11.8   31.9
  20..9       0.213    729.3    269.7   0.1367   0.0262   0.1921   19.1   51.8
  18..21      0.175    729.3    269.7   0.1367   0.0216   0.1577   15.7   42.5
  21..7       0.156    729.3    269.7   0.1367   0.0192   0.1406   14.0   37.9
  21..10      0.173    729.3    269.7   0.1367   0.0213   0.1558   15.5   42.0
  17..22      0.321    729.3    269.7   0.1367   0.0396   0.2898   28.9   78.1
  22..11      0.290    729.3    269.7   0.1367   0.0358   0.2618   26.1   70.6
  22..12      0.357    729.3    269.7   0.1367   0.0440   0.3217   32.1   86.7

tree length for dN:      0.55489
tree length for dS:      4.06048

H1: episodic change in selection pressure in Ldh-C (only in the branch that immediately follows the gene duplication event).
Here is the tree file for this:
(X02152Hom,U07178Sus,(M22585rab,((NM017025Rat,U13687Mus),(((AF070995C,(X04752Mus,U07177Rat)),(U95378Sus,U13680Hom))#1,(X53828OG1,U28410OG2)))));

Output:
TREE #  1:  (1, 2, (5, ((3, 4), (((6, (8, 9)), (7, 10)), (11, 12)))));   MP score: -1
lnL(ntime: 21  np: 24):  -6017.572460     +0.000000
  13..1    13..2    13..14   14..5    14..15   15..16   16..3    16..4    15..17   17..18   18..19   19..6    19..20   20..8    20..9    18..21   21..7    21..10   17..22   22..11   22..12 
 0.166139 0.145654 0.060220 0.253769 0.027868 0.234240 0.134451 0.101620 0.136088 0.615641 0.250544 0.206999 0.295643 0.131224 0.213679 0.176223 0.155609 0.173597 0.319034 0.291976 0.357768 2.397123 0.131877 0.192033

Note: Branch length is defined as number of nucleotide substitutions per codon (not per nucleotide site).

tree length =   4.44798

(1: 0.166139, 2: 0.145654, (5: 0.253769, ((3: 0.134451, 4: 0.101620): 0.234240, (((6: 0.206999, (8: 0.131224, 9: 0.213679): 0.295643): 0.250544, (7: 0.155609, 10: 0.173597): 0.176223): 0.615641, (11: 0.291976, 12: 0.357768): 0.319034): 0.136088): 0.027868): 0.060220);

(X02152Hom: 0.166139, U07178Sus: 0.145654, (M22585rab: 0.253769, ((NM017025Rat: 0.134451, U13687Mus: 0.101620): 0.234240, (((AF070995C: 0.206999, (X04752Mus: 0.131224, U07177Rat: 0.213679): 0.295643): 0.250544, (U95378Sus: 0.155609, U13680Hom: 0.173597): 0.176223): 0.615641, (X53828OG1: 0.291976, U28410OG2: 0.357768): 0.319034): 0.136088): 0.027868): 0.060220);

Detailed output identifying parameters

kappa (ts/tv) =  2.39712

dN & dS for each branch

 branch           t        N        S    dN/dS       dN       dS   N*dN   S*dS

  13..1       0.166    729.4    269.6   0.1319   0.0199   0.1512   14.5   40.8
  13..2       0.146    729.4    269.6   0.1319   0.0175   0.1326   12.8   35.7
  13..14      0.060    729.4    269.6   0.1319   0.0072   0.0548    5.3   14.8
  14..5       0.254    729.4    269.6   0.1319   0.0305   0.2310   22.2   62.3
  14..15      0.028    729.4    269.6   0.1319   0.0033   0.0254    2.4    6.8
  15..16      0.234    729.4    269.6   0.1319   0.0281   0.2132   20.5   57.5
  16..3       0.134    729.4    269.6   0.1319   0.0161   0.1224   11.8   33.0
  16..4       0.102    729.4    269.6   0.1319   0.0122   0.0925    8.9   24.9
  15..17      0.136    729.4    269.6   0.1319   0.0163   0.1239   11.9   33.4
  17..18      0.616    729.4    269.6   0.1920   0.0961   0.5004   70.1  134.9
  18..19      0.251    729.4    269.6   0.1319   0.0301   0.2281   21.9   61.5
  19..6       0.207    729.4    269.6   0.1319   0.0249   0.1884   18.1   50.8
  19..20      0.296    729.4    269.6   0.1319   0.0355   0.2691   25.9   72.6
  20..8       0.131    729.4    269.6   0.1319   0.0158   0.1195   11.5   32.2
  20..9       0.214    729.4    269.6   0.1319   0.0257   0.1945   18.7   52.4
  18..21      0.176    729.4    269.6   0.1319   0.0212   0.1604   15.4   43.3
  21..7       0.156    729.4    269.6   0.1319   0.0187   0.1417   13.6   38.2
  21..10      0.174    729.4    269.6   0.1319   0.0208   0.1580   15.2   42.6
  17..22      0.319    729.4    269.6   0.1319   0.0383   0.2904   27.9   78.3
  22..11      0.292    729.4    269.6   0.1319   0.0351   0.2658   25.6   71.7
  22..12      0.358    729.4    269.6   0.1319   0.0430   0.3257   31.3   87.8

tree length for dN:      0.55619
tree length for dS:      3.98922

dS tree:
(X02152Hom: 0.151246, U07178Sus: 0.132597, (M22585rab: 0.231020, ((NM017025Rat: 0.122398, U13687Mus: 0.092510): 0.213241, (((AF070995C: 0.188443, (X04752Mus: 0.119460, U07177Rat: 0.194523): 0.269141): 0.228084, (U95378Sus: 0.141659, U13680Hom: 0.158035): 0.160426): 0.500425, (X53828OG1: 0.265802, U28410OG2: 0.325696): 0.290434): 0.123889): 0.025370): 0.054822);
dN tree:
(X02152Hom: 0.019946, U07178Sus: 0.017487, (M22585rab: 0.030466, ((NM017025Rat: 0.016142, U13687Mus: 0.012200): 0.028122, (((AF070995C: 0.024851, (X04752Mus: 0.015754, U07177Rat: 0.025653): 0.035494): 0.030079, (U95378Sus: 0.018682, U13680Hom: 0.020841): 0.021157): 0.096098, (X53828OG1: 0.035053, U28410OG2: 0.042952): 0.038302): 0.016338): 0.003346): 0.007230);

w ratios as labels for TreeView:
(X02152Hom #0.1319 : 0.019946, U07178Sus #0.1319 : 0.017487, (M22585rab #0.1319 : 0.030466, ((NM017025Rat #0.1319 : 0.016142, U13687Mus #0.1319 : 0.012200) #0.1319 : 0.028122, (((AF070995C #0.1319 : 0.024851, (X04752Mus #0.1319 : 0.015754, U07177Rat #0.1319 : 0.025653) #0.1319 : 0.035494) #0.1319 : 0.030079, (U95378Sus #0.1319 : 0.018682, U13680Hom #0.1319 : 0.020841) #0.1319 : 0.021157)#1 #0.1920 : 0.096098, (X53828OG1 #0.1319 : 0.035053, U28410OG2 #0.1319 : 0.042952) #0.1319 : 0.038302) #0.1319 : 0.016338) #0.1319 : 0.003346) #0.1319 : 0.007230);

H2: Long term shift in selection pressure in Ldh-C only; Ldh-C has a permanent change in selection pressure (as compared to its ancestors) whereas Ldh-A remains subject to the ancestral level of selection pressure.
(X02152Hom,U07178Sus,(M22585rab,((NM017025Rat,U13687Mus),(((AF070995C #1,(X04752Mus #1,U07177Rat #1)#1)#1,(U95378Sus #1,U13680Hom #1)#1)#1,(X53828OG1,U28410OG2)))));

Output:
TREE #  1:  (1, 2, (5, ((3, 4), (((6, (8, 9)), (7, 10)), (11, 12)))));   MP score: -1
lnL(ntime: 21  np: 24):  -5985.635237     +0.000000
  13..1    13..2    13..14   14..5    14..15   15..16   16..3    16..4    15..17   17..18   18..19   19..6    19..20   20..8    20..9    18..21   21..7    21..10   17..22   22..11   22..12 
 0.171224 0.151409 0.058980 0.269203 0.017888 0.262039 0.138108 0.103579 0.168914 0.560187 0.231662 0.203731 0.272797 0.130747 0.204423 0.162430 0.151155 0.167095 0.366793 0.312139 0.396584 2.403414 0.075516 0.239425

Note: Branch length is defined as number of nucleotide substitutions per codon (not per neucleotide site).

tree length =   4.50108

(1: 0.171224, 2: 0.151409, (5: 0.269203, ((3: 0.138108, 4: 0.103579): 0.262039, (((6: 0.203731, (8: 0.130747, 9: 0.204423): 0.272797): 0.231662, (7: 0.151155, 10: 0.167095): 0.162430): 0.560187, (11: 0.312139, 12: 0.396584): 0.366793): 0.168914): 0.017888): 0.058980);

(X02152Hom: 0.171224, U07178Sus: 0.151409, (M22585rab: 0.269203, ((NM017025Rat: 0.138108, U13687Mus: 0.103579): 0.262039, (((AF070995C: 0.203731, (X04752Mus: 0.130747, U07177Rat: 0.204423): 0.272797): 0.231662, (U95378Sus: 0.151155, U13680Hom: 0.167095): 0.162430): 0.560187, (X53828OG1: 0.312139, U28410OG2: 0.396584): 0.366793): 0.168914): 0.017888): 0.058980);

Detailed output identifying parameters

kappa (ts/tv) =  2.40341

dN & dS for each branch

 branch           t        N        S    dN/dS       dN       dS   N*dN   S*dS

  13..1       0.171    729.3    269.7   0.0755   0.0133   0.1756    9.7   47.3
  13..2       0.151    729.3    269.7   0.0755   0.0117   0.1552    8.5   41.9
  13..14      0.059    729.3    269.7   0.0755   0.0046   0.0605    3.3   16.3
  14..5       0.269    729.3    269.7   0.0755   0.0208   0.2760   15.2   74.4
  14..15      0.018    729.3    269.7   0.0755   0.0014   0.0183    1.0    4.9
  15..16      0.262    729.3    269.7   0.0755   0.0203   0.2687   14.8   72.5
  16..3       0.138    729.3    269.7   0.0755   0.0107   0.1416    7.8   38.2
  16..4       0.104    729.3    269.7   0.0755   0.0080   0.1062    5.8   28.6
  15..17      0.169    729.3    269.7   0.0755   0.0131   0.1732    9.5   46.7
  17..18      0.560    729.3    269.7   0.2394   0.1005   0.4198   73.3  113.2
  18..19      0.232    729.3    269.7   0.2394   0.0416   0.1736   30.3   46.8
  19..6       0.204    729.3    269.7   0.2394   0.0366   0.1527   26.7   41.2
  19..20      0.273    729.3    269.7   0.2394   0.0490   0.2045   35.7   55.1
  20..8       0.131    729.3    269.7   0.2394   0.0235   0.0980   17.1   26.4
  20..9       0.204    729.3    269.7   0.2394   0.0367   0.1532   26.8   41.3
  18..21      0.162    729.3    269.7   0.2394   0.0291   0.1217   21.3   32.8
  21..7       0.151    729.3    269.7   0.2394   0.0271   0.1133   19.8   30.6
  21..10      0.167    729.3    269.7   0.2394   0.0300   0.1252   21.9   33.8
  17..22      0.367    729.3    269.7   0.0755   0.0284   0.3761   20.7  101.4
  22..11      0.312    729.3    269.7   0.0755   0.0242   0.3200   17.6   86.3
  22..12      0.397    729.3    269.7   0.0755   0.0307   0.4066   22.4  109.7

tree length for dN:      0.56113
tree length for dS:      4.04015

dS tree:
(X02152Hom: 0.175561, U07178Sus: 0.155244, (M22585rab: 0.276022, ((NM017025Rat: 0.141606, U13687Mus: 0.106203): 0.268676, (((AF070995C: 0.152691, (X04752Mus: 0.097991, U07177Rat: 0.153209): 0.204454): 0.173624, (U95378Sus: 0.113287, U13680Hom: 0.125233): 0.121737): 0.419845, (X53828OG1: 0.320046, U28410OG2: 0.406630): 0.376085): 0.173193): 0.018341): 0.060474);
dN tree:
(X02152Hom: 0.013258, U07178Sus: 0.011723, (M22585rab: 0.020844, ((NM017025Rat: 0.010693, U13687Mus: 0.008020): 0.020289, (((AF070995C: 0.036558, (X04752Mus: 0.023462, U07177Rat: 0.036682): 0.048951): 0.041570, (U95378Sus: 0.027124, U13680Hom: 0.029984): 0.029147): 0.100521, (X53828OG1: 0.024169, U28410OG2: 0.030707): 0.028400): 0.013079): 0.001385): 0.004567);

w ratios as labels for TreeView:
(X02152Hom #0.0755 : 0.013258, U07178Sus #0.0755 : 0.011723, (M22585rab #0.0755 : 0.020844, ((NM017025Rat #0.0755 : 0.010693, U13687Mus #0.0755 : 0.008020) #0.0755 : 0.020289, (((AF070995C#1 #0.2394 : 0.036558, (X04752Mus#1 #0.2394 : 0.023462, U07177Rat#1 #0.2394 : 0.036682)#1 #0.2394 : 0.048951)#1 #0.2394 : 0.041570, (U95378Sus#1 #0.2394 : 0.027124, U13680Hom#1 #0.2394 : 0.029984)#1 #0.2394 : 0.029147)#1 #0.2394 : 0.100521, (X53828OG1 #0.0755 : 0.024169, U28410OG2 #0.0755 : 0.030707) #0.0755 : 0.028400) #0.0755 : 0.013079) #0.0755 : 0.001385) #0.0755 : 0.004567);


Time used:  0:56
H3: Long term shift in selection on both Ldh-C and Ldh-A; those lineages are subject to selection pressures different from each other and from the ancestor.
Tree file:
(X02152Hom,U07178Sus,(M22585rab,((NM017025Rat,U13687Mus),(((AF070995C #1,(X04752Mus #1,U07177Rat #1)#1)#1,(U95378Sus #1,U13680Hom #1)#1)#1,(X53828OG1 #2,U28410OG2 #2)#2))));

Output:
TREE #  1:  (1, 2, (5, ((3, 4), (((6, (8, 9)), (7, 10)), (11, 12)))));   MP score: -1
lnL(ntime: 21  np: 25):  -5984.111015     +0.000000
  13..1    13..2    13..14   14..5    14..15   15..16   16..3    16..4    15..17   17..18   18..19   19..6    19..20   20..8    20..9    18..21   21..7    21..10   17..22   22..11   22..12 
 0.173367 0.152089 0.058499 0.273176 0.018609 0.265853 0.138699 0.104431 0.178874 0.556050 0.231889 0.203607 0.272766 0.130870 0.204220 0.161922 0.150629 0.167529 0.341576 0.302793 0.379370 2.399088 0.064218 0.240341 0.094898

Note: Branch length is defined as number of nucleotide substitutions per codon (not per neucleotide site).

tree length =   4.46682

(1: 0.173367, 2: 0.152089, (5: 0.273176, ((3: 0.138699, 4: 0.104431): 0.265853, (((6: 0.203607, (8: 0.130870, 9: 0.204220): 0.272766): 0.231889, (7: 0.150629, 10: 0.167529): 0.161922): 0.556050, (11: 0.302793, 12: 0.379370): 0.341576): 0.178874): 0.018609): 0.058499);

(X02152Hom: 0.173367, U07178Sus: 0.152089, (M22585rab: 0.273176, ((NM017025Rat: 0.138699, U13687Mus: 0.104431): 0.265853, (((AF070995C: 0.203607, (X04752Mus: 0.130870, U07177Rat: 0.204220): 0.272766): 0.231889, (U95378Sus: 0.150629, U13680Hom: 0.167529): 0.161922): 0.556050, (X53828OG1: 0.302793, U28410OG2: 0.379370): 0.341576): 0.178874): 0.018609): 0.058499);

Detailed output identifying parameters

kappa (ts/tv) =  2.39909

dN & dS for each branch

 branch           t        N        S    dN/dS       dN       dS   N*dN   S*dS

  13..1       0.173    729.4    269.6   0.0642   0.0117   0.1824    8.5   49.2
  13..2       0.152    729.4    269.6   0.0642   0.0103   0.1600    7.5   43.1
  13..14      0.058    729.4    269.6   0.0642   0.0040   0.0616    2.9   16.6
  14..5       0.273    729.4    269.6   0.0642   0.0185   0.2874   13.5   77.5
  14..15      0.019    729.4    269.6   0.0642   0.0013   0.0196    0.9    5.3
  15..16      0.266    729.4    269.6   0.0642   0.0180   0.2797   13.1   75.4
  16..3       0.139    729.4    269.6   0.0642   0.0094   0.1459    6.8   39.4
  16..4       0.104    729.4    269.6   0.0642   0.0071   0.1099    5.1   29.6
  15..17      0.179    729.4    269.6   0.0642   0.0121   0.1882    8.8   50.7
  17..18      0.556    729.4    269.6   0.2403   0.1000   0.4162   73.0  112.2
  18..19      0.232    729.4    269.6   0.2403   0.0417   0.1736   30.4   46.8
  19..6       0.204    729.4    269.6   0.2403   0.0366   0.1524   26.7   41.1
  19..20      0.273    729.4    269.6   0.2403   0.0491   0.2041   35.8   55.0
  20..8       0.131    729.4    269.6   0.2403   0.0235   0.0979   17.2   26.4
  20..9       0.204    729.4    269.6   0.2403   0.0367   0.1528   26.8   41.2
  18..21      0.162    729.4    269.6   0.2403   0.0291   0.1212   21.2   32.7
  21..7       0.151    729.4    269.6   0.2403   0.0271   0.1127   19.8   30.4
  21..10      0.168    729.4    269.6   0.2403   0.0301   0.1254   22.0   33.8
  17..22      0.342    729.4    269.6   0.0949   0.0319   0.3357   23.2   90.5
  22..11      0.303    729.4    269.6   0.0949   0.0282   0.2976   20.6   80.2
  22..12      0.379    729.4    269.6   0.0949   0.0354   0.3728   25.8  100.5

tree length for dN:      0.56167
tree length for dS:      3.99725

dS tree:
(X02152Hom: 0.182422, U07178Sus: 0.160033, (M22585rab: 0.287445, ((NM017025Rat: 0.145944, U13687Mus: 0.109885): 0.279740, (((AF070995C: 0.152387, (X04752Mus: 0.097947, U07177Rat: 0.152845): 0.204148): 0.173554, (U95378Sus: 0.112736, U13680Hom: 0.125384): 0.121188): 0.416167, (X53828OG1: 0.297568, U28410OG2: 0.372824): 0.335682): 0.188217): 0.019581): 0.061554);
dN tree:
(X02152Hom: 0.011715, U07178Sus: 0.010277, (M22585rab: 0.018459, ((NM017025Rat: 0.009372, U13687Mus: 0.007057): 0.017964, (((AF070995C: 0.036625, (X04752Mus: 0.023541, U07177Rat: 0.036735): 0.049065): 0.041712, (U95378Sus: 0.027095, U13680Hom: 0.030135): 0.029127): 0.100022, (X53828OG1: 0.028239, U28410OG2: 0.035380): 0.031856): 0.012087): 0.001257): 0.003953);

w ratios as labels for TreeView:
(X02152Hom #0.0642 : 0.011715, U07178Sus #0.0642 : 0.010277, (M22585rab #0.0642 : 0.018459, ((NM017025Rat #0.0642 : 0.009372, U13687Mus #0.0642 : 0.007057) #0.0642 : 0.017964, (((AF070995C#1 #0.2403 : 0.036625, (X04752Mus#1 #0.2403 : 0.023541, U07177Rat#1 #0.2403 : 0.036735)#1 #0.2403 : 0.049065)#1 #0.2403 : 0.041712, (U95378Sus#1 #0.2403 : 0.027095, U13680Hom#1 #0.2403 : 0.030135)#1 #0.2403 : 0.029127)#1 #0.2403 : 0.100022, (X53828OG1#2 #0.0949 : 0.028239, U28410OG2#2 #0.0949 : 0.035380)#2 #0.0949 : 0.031856) #0.0642 : 0.012087) #0.0642 : 0.001257) #0.0642 : 0.003953);


Time used:  1:10

In addition, carry out likelihood ratio tests (LRT) of the hypotheses below. See the lecture notes for additional details about LRTs. Use 1 degree of freedom to obtain the P-value for each LRT.

Summary table of hypotheses:
Model ω_A0 ω_A1 ω_C1 ω_C0 lnL LRT
H0 0.1367 0.1367 0.1367 0.1367 -6018.633
H1 0.1319 0.1319 0.1319 0.192 -6017.572
H2 0.0755 0.0755 0.2394 0.2394 -5985.635
H3 0.0949 0.0642 0.2403 0.2403 -5984.111

You should change the name of the main result file (via outfile= in the control file) or you will overwrite your previous results.

Exercise 4The objective of this exercise is to use a series of LRTs to test for sites evolving under positive selection in the nef gene. If you find significant evidence for positive selection, then identify the involved sites by using empirical Bayes methods.

       seqfile = seqfile.txt          * sequence data filename

    * treefile = treefile_M0.txt      * SET THIS for tree file with ML branch lengths under M0
    * treefile = treefile_M1.txt      * SET THIS for tree file with ML branch lengths under M1
    * treefile = treefile_M2.txt      * SET THIS for tree file with ML branch lengths under M2
    * treefile = treefile_M3.txt      * SET THIS for tree file with ML branch lengths under M3
    * treefile = treefile_M7.txt      * SET THIS for tree file with ML branch lengths under M7
    * treefile = treefile_M8.txt      * SET THIS for tree file with ML branch lengths under M8

      outfile = results.txt           * main result file name
        noisy = 9                     * lots of rubbish on the screen
      verbose = 1                     * detailed output
      runmode = 0                     * user defined tree
      seqtype = 1                     * codons
    CodonFreq = 2                     * F3X4 for codon ferquencies
        model = 0                     * one omega ratio for all branches

    * NSsites = 0                     * SET THIS for M0
    * NSsites = 1                     * SET THIS for M1
    * NSsites = 2                     * SET THIS for M2
    * NSsites = 3                     * SET THIS for M3
    * NSsites = 7                     * SET THIS for M7
    * NSsites = 8                     * SET THIS for M8

        icode = 0                     * universal code
    fix_kappa = 1                     * kappa fixed
      * kappa = 4.43491               * SET THIS to fix kappa at MLE under M0
      * kappa = 4.39117               * SET THIS to fix kappa at MLE under M1
      * kappa = 5.08964               * SET THIS to fix kappa at MLE under M2
      * kappa = 4.89033               * SET THIS to fix kappa at MLE under M3
      * kappa = 4.22750               * SET THIS to fix kappa at MLE under M7
      * kappa = 4.87827               * SET THIS to fix kappa at MLE under M8

    fix_omega = 0                     * omega to be estimated 
        omega = 5                     * initial omega

      * ncatG = 3                     * SET THIS for 3 site categories under M3         
      * ncatG = 10                    * SET THIS for 10 of site categories under M7 and M8

  fix_blength = 2                     * fixed branch lengths from tree file
sdsf