Laurel's Lab Notebook

Thursday, July 23, 2015

PAML lab

Exercise 1:The objective of this activity is to use CODEML to evaluate the likelihood of the GstD1 sequences for a variety of ω values. Plot log-likelihood scores against the values of ω and determine the maximum likelihood estimate of ω. Check your finding by running CODEML’s hill-climbing algorithm.
#run this command to execute:

codeml ex1_codeml.ctl

This is what the control file looks like:

seqfile = seqfile.txt * sequence data filename

outfile = results_0.010.txt * main result file name

noisy = 9 * 0,1,2,3,9: how much rubbish on the screen

verbose = 1 * 1:detailed output

runmode = -2 * -2:pairwise

seqtype = 1 * 1:codons

CodonFreq = 3 * 0:equal, 1:F1X4, 2:F3X4, 3:F61

model = 0 *

NSsites = 0 *

icode = 0 * 0:universal code

fix_kappa = 0 * 1:kappa fixed, 0:kappa to be estimated

kappa = 2 * initial or fixed kappa

fix_omega = 1 * 1:omega fixed, 0:omega to be estimated

* omega = 0.001 * 1st fixed omega value [change this]

* EXCERCISE 1

*alternate fixed omega values

*omega = 0.005 * 2nd fixed value

*omega = 0.01 * 3rd fixed value

*omega = 0.05 * 4th fixed value

*omega = 0.10 * 5th fixed value

*omega = 0.20 * 6th fixed value

*omega = 0.40 * 7th fixed value

*omega = 0.80 * 8th fixed value

*omega = 1.60 * 9th fixed value

omega = 2.00 * 10th fixed value

The MLE of ω is 0.04.

Now we are going to have PAML estimate ω. In the control file, set fix_omega=0.
------------------
2 (Sim) ... 1 (Mel)
lnL = -756.568200
0.12527 2.53119 0.06698

t= 0.1253 S= 45.2 N= 554.8 dN/dS= 0.0670 dN= 0.0204 dS= 0.3041
---------------------

Exercise 2: In this exercise you will investigate the sensitivity of your estimate of ω to the transition/transversion ratio (κ), and to the assumed model for codon frequencies (π’s). After you collect the required data you will determine which assumptions yield the largest and smallest values of S, and what effect it as on the value of ω.

Here is the control file:

seqfile = seqfile.txt * sequence data filename

outfile = results.txt * main result file name

noisy = 9 * 0,1,2,3,9: how much rubbish on the screen

verbose = 1 * 1:detailed output

runmode = -2 * -2:pairwise

seqtype = 1 * 1:codons

CodonFreq = 0 * 0:equal, 1:F1X4, 2:F3X4, 3:F61 [change this]

model = 0 *

NSsites = 0 *

icode = 0 * 0:universal code

fix_kappa = 1 * 1:kappa fixed, 0:kappa to be estimated [change this]

kappa = 1 * initial or fixed kappa

fix_omega = 0 * 1:omega fixed, 0:omega to be estimated

omega = 0.5 * initial omega value

* EXCERCISE 2

* Codon bias = none; Ts/Tv bias = none

* Codon bias = none; Ts/Tv bias = Yes (ML)

* Codon bias = yes (F3x4); Ts/Tv bias = none

* Codon bias = yes (F3x4); Ts/Tv bias = Yes (ML)

* Codon bias = yes (F61); Ts/Tv bias = none

* Codon bias = yes (F61); Ts/Tv bias = Yes (ML)

Test#1  Codon bias = none; Ts/Tv bias = none
CodonFreq = 0   * 0:equal,
fix_kappa = 1   * 1:kappa fixed,

2 (Sim) ... 1 (Mel)
lnL = -927.178200
0.10683 0.27421

t= 0.1068 S= 152.9 N= 447.1 dN/dS= 0.2742 dN= 0.0213 dS= 0.0776
Test#2 Codon bias = none; Ts/Tv bias = Yes (ML)
*Highlighted is the ML estimate of of sequences kappa (ts/tv)
CodonFreq = 0   * 0:equal,
fix_kappa = 0   * 1:kappa fixed, 0:kappa to be estimated

2 (Sim) ... 1 (Mel)
lnL = -926.280300
0.10533 1.88371 0.32004

t= 0.1053 S= 165.8 N= 434.2 dN/dS= 0.3200 dN= 0.0221 dS= 0.0691

Test#3 Codon bias = yes (F3x4); Ts/Tv bias = none
CodonFreq = 2   * 2 F3X4,
fix_kappa = 1   * 1:kappa fixed,

2 (Sim) ... 1 (Mel)
lnL = -844.507294
0.10684 0.11807

t= 0.1068 S= 70.6 N= 529.4 dN/dS= 0.1181 dN= 0.0189 dS= 0.1605

Test#4 Codon bias = yes (F3x4); Ts/Tv bias = Yes (ML)
CodonFreq = 2   * 2 F3X4,
fix_kappa = 0   * 0:kappa to be estimated

2 (Sim) ... 1 (Mel)
lnL = -842.214727
0.10686 2.71034 0.12661

t= 0.1069 S= 73.4 N= 526.6 dN/dS= 0.1266 dN= 0.0193 dS= 0.1526

Test #5 Codon bias = yes (F61); Ts/Tv bias = none
CodonFreq = 3   * 3 F61

fix_kappa = 1   * 1:kappa fixed,

2 (Sim) ... 1 (Mel)
lnL = -758.551863
0.12089 0.06278

t= 0.1209 S= 40.5 N= 559.5 dN/dS= 0.0628 dN= 0.0201 dS= 0.3198

Test #6 Codon bias = yes (F61); Ts/Tv bias = Yes (ML)
CodonFreq = 3 * 3 F61

fix_kappa = 0 * 0:kappa to be estimated
2 (Sim) ... 1 (Mel)
lnL = -756.568200
0.12527 2.53121 0.06698

t= 0.1253 S= 45.2 N= 554.8 dN/dS= 0.0670 dN= 0.0204 dS= 0.3040

Summarized in a table

Assumptions	K	S	N	dS	dN	omega	lnL
Fequal + k=1	1	152.9	447.1	0.021	0.078	0.27	-927.1782
Fequal + k=estimated	1.88	165.8	434.2	0.07	0.022	0.32	-926.2803
F3X4 + k=1	1	70.6	529.4	0.16	0.019	0.118	-844.507294
F3X4 + k=estimated	2.71	73.4	526.6	0.15	0.02	0.127	-842.214727
F61 + k=1	1	40.5	559.5	0.32	0.02	0.063	-758.551863
F61 + k=estimated	2.53	45.2	554.8	0.3	0.02	0.067	-756.5682

The assumption that yields the smallest value of S is F61 + k=1. It changes the value of omega by an order of magnitude.

Exercise 3: The objective of this exercise is to use three LRTs to evaluate the following hypotheses: (1) the mutation rate of Ldh-C has increased relative to Ldh-A, (2) a burst of positive selection for functional divergence occurred following the duplication event that gave rise to Ldh-C, and (3) there was a long term shift in selective constraints following the duplication event that gave rise to Ldh-C.

Here is the control file:

seqfile = seqfile.txt * sequence data filename

treefile = treeH0.txt * tree structure file name

outfile = results.txt * main result file name

noisy = 9 * 0,1,2,3,9: how much rubbish on the screen

verbose = 1 * 1:detailed output

runmode = 0 * 0:user defined tree

seqtype = 1 * 1:codons

CodonFreq = 2 * 0:equal, 1:F1X4, 2:F3X4, 3:F61

model = 0 * 0:one omega ratio for all branches

* 1:separate omega for each branch

* 2:user specified dN/dS ratios for branches

NSsites = 0 *

icode = 0 * 0:universal code

fix_kappa = 0 * 1:kappa fixed, 0:kappa to be estimated

kappa = 2 * initial or fixed kappa

fix_omega = 0 * 1:omega fixed, 0:omega to be estimated

omega = 0.2 * initial omega

* comments:

* H0 in Table 3: model = 0

* H1 in Table 3: model = 2

* H2 in Table 3: model = 2

* H3 in Table 3: model = 2

H0: homogeneous selection pressure over the tree

(X02152Hom,U07178Sus,(M22585rab,((NM017025Rat,U13687Mus),(((AF070995C,(X04752Mus,U07177Rat)),(U95378Sus,U13680Hom)),(X53828OG1,U28410OG2)))));

Here is the output (lnL highlighted in yellow):

TREE # 1: (1, 2, (5, ((3, 4), (((6, (8, 9)), (7, 10)), (11, 12))))); MP score: -1

lnL(ntime: 21 np: 23): -6018.633010 +0.000000

13..1 13..2 13..14 14..5 14..15 15..16 16..3 16..4 15..17 17..18 18..19 19..6 19..20 20..8 20..9 18..21 21..7 21..10 17..22 22..11 22..12

0.165986 0.145030 0.060255 0.252090 0.030239 0.230093 0.133935 0.101768 0.125411 0.692354 0.249468 0.205980 0.294515 0.131299 0.213038 0.174959 0.155890 0.172753 0.321380 0.290395 0.356759 2.401783 0.136656

Note: Branch length is defined as number of nucleotide substitutions per codon (not per neucleotide site).

tree length = 4.50360

(1: 0.165986, 2: 0.145030, (5: 0.252090, ((3: 0.133935, 4: 0.101768): 0.230093, (((6: 0.205980, (8: 0.131299, 9: 0.213038): 0.294515): 0.249468, (7: 0.155890, 10: 0.172753): 0.174959): 0.692354, (11: 0.290395, 12: 0.356759): 0.321380): 0.125411): 0.030239): 0.060255);

(X02152Hom: 0.165986, U07178Sus: 0.145030, (M22585rab: 0.252090, ((NM017025Rat: 0.133935, U13687Mus: 0.101768): 0.230093, (((AF070995C: 0.205980, (X04752Mus: 0.131299, U07177Rat: 0.213038): 0.294515): 0.249468, (U95378Sus: 0.155890, U13680Hom: 0.172753): 0.174959): 0.692354, (X53828OG1: 0.290395, U28410OG2: 0.356759): 0.321380): 0.125411): 0.030239): 0.060255);

Detailed output identifying parameters

kappa (ts/tv) = 2.40178

omega (dN/dS) = 0.13666

dN & dS for each branch

branch t N S dN/dS dN dS N*dN S*dS

13..1 0.166 729.3 269.7 0.1367 0.0205 0.1497 14.9 40.4

13..2 0.145 729.3 269.7 0.1367 0.0179 0.1308 13.0 35.3

13..14 0.060 729.3 269.7 0.1367 0.0074 0.0543 5.4 14.7

14..5 0.252 729.3 269.7 0.1367 0.0311 0.2273 22.7 61.3

14..15 0.030 729.3 269.7 0.1367 0.0037 0.0273 2.7 7.4

15..16 0.230 729.3 269.7 0.1367 0.0283 0.2075 20.7 55.9

16..3 0.134 729.3 269.7 0.1367 0.0165 0.1208 12.0 32.6

16..4 0.102 729.3 269.7 0.1367 0.0125 0.0918 9.1 24.7

15..17 0.125 729.3 269.7 0.1367 0.0155 0.1131 11.3 30.5

17..18 0.692 729.3 269.7 0.1367 0.0853 0.6242 62.2 168.3

18..19 0.249 729.3 269.7 0.1367 0.0307 0.2249 22.4 60.7

19..6 0.206 729.3 269.7 0.1367 0.0254 0.1857 18.5 50.1

19..20 0.295 729.3 269.7 0.1367 0.0363 0.2655 26.5 71.6

20..8 0.131 729.3 269.7 0.1367 0.0162 0.1184 11.8 31.9

20..9 0.213 729.3 269.7 0.1367 0.0262 0.1921 19.1 51.8

18..21 0.175 729.3 269.7 0.1367 0.0216 0.1577 15.7 42.5

21..7 0.156 729.3 269.7 0.1367 0.0192 0.1406 14.0 37.9

21..10 0.173 729.3 269.7 0.1367 0.0213 0.1558 15.5 42.0

17..22 0.321 729.3 269.7 0.1367 0.0396 0.2898 28.9 78.1

22..11 0.290 729.3 269.7 0.1367 0.0358 0.2618 26.1 70.6

22..12 0.357 729.3 269.7 0.1367 0.0440 0.3217 32.1 86.7

tree length for dN: 0.55489

tree length for dS: 4.06048

H1: episodic change in selection pressure in Ldh-C (only in the branch that immediately follows the gene duplication event).

Here is the tree file for this:

(X02152Hom,U07178Sus,(M22585rab,((NM017025Rat,U13687Mus),(((AF070995C,(X04752Mus,U07177Rat)),(U95378Sus,U13680Hom))#1,(X53828OG1,U28410OG2)))));

Output:

TREE # 1: (1, 2, (5, ((3, 4), (((6, (8, 9)), (7, 10)), (11, 12))))); MP score: -1

lnL(ntime: 21 np: 24): -6017.572460 +0.000000

13..1 13..2 13..14 14..5 14..15 15..16 16..3 16..4 15..17 17..18 18..19 19..6 19..20 20..8 20..9 18..21 21..7 21..10 17..22 22..11 22..12

0.166139 0.145654 0.060220 0.253769 0.027868 0.234240 0.134451 0.101620 0.136088 0.615641 0.250544 0.206999 0.295643 0.131224 0.213679 0.176223 0.155609 0.173597 0.319034 0.291976 0.357768 2.397123 0.131877 0.192033

Note: Branch length is defined as number of nucleotide substitutions per codon (not per nucleotide site).

tree length = 4.44798

(1: 0.166139, 2: 0.145654, (5: 0.253769, ((3: 0.134451, 4: 0.101620): 0.234240, (((6: 0.206999, (8: 0.131224, 9: 0.213679): 0.295643): 0.250544, (7: 0.155609, 10: 0.173597): 0.176223): 0.615641, (11: 0.291976, 12: 0.357768): 0.319034): 0.136088): 0.027868): 0.060220);

(X02152Hom: 0.166139, U07178Sus: 0.145654, (M22585rab: 0.253769, ((NM017025Rat: 0.134451, U13687Mus: 0.101620): 0.234240, (((AF070995C: 0.206999, (X04752Mus: 0.131224, U07177Rat: 0.213679): 0.295643): 0.250544, (U95378Sus: 0.155609, U13680Hom: 0.173597): 0.176223): 0.615641, (X53828OG1: 0.291976, U28410OG2: 0.357768): 0.319034): 0.136088): 0.027868): 0.060220);

Detailed output identifying parameters

kappa (ts/tv) = 2.39712

dN & dS for each branch

branch t N S dN/dS dN dS N*dN S*dS

13..1 0.166 729.4 269.6 0.1319 0.0199 0.1512 14.5 40.8

13..2 0.146 729.4 269.6 0.1319 0.0175 0.1326 12.8 35.7

13..14 0.060 729.4 269.6 0.1319 0.0072 0.0548 5.3 14.8

14..5 0.254 729.4 269.6 0.1319 0.0305 0.2310 22.2 62.3

14..15 0.028 729.4 269.6 0.1319 0.0033 0.0254 2.4 6.8

15..16 0.234 729.4 269.6 0.1319 0.0281 0.2132 20.5 57.5

16..3 0.134 729.4 269.6 0.1319 0.0161 0.1224 11.8 33.0

16..4 0.102 729.4 269.6 0.1319 0.0122 0.0925 8.9 24.9

15..17 0.136 729.4 269.6 0.1319 0.0163 0.1239 11.9 33.4

17..18 0.616 729.4 269.6 0.1920 0.0961 0.5004 70.1 134.9

18..19 0.251 729.4 269.6 0.1319 0.0301 0.2281 21.9 61.5

19..6 0.207 729.4 269.6 0.1319 0.0249 0.1884 18.1 50.8

19..20 0.296 729.4 269.6 0.1319 0.0355 0.2691 25.9 72.6

20..8 0.131 729.4 269.6 0.1319 0.0158 0.1195 11.5 32.2

20..9 0.214 729.4 269.6 0.1319 0.0257 0.1945 18.7 52.4

18..21 0.176 729.4 269.6 0.1319 0.0212 0.1604 15.4 43.3

21..7 0.156 729.4 269.6 0.1319 0.0187 0.1417 13.6 38.2

21..10 0.174 729.4 269.6 0.1319 0.0208 0.1580 15.2 42.6

17..22 0.319 729.4 269.6 0.1319 0.0383 0.2904 27.9 78.3

22..11 0.292 729.4 269.6 0.1319 0.0351 0.2658 25.6 71.7

22..12 0.358 729.4 269.6 0.1319 0.0430 0.3257 31.3 87.8

tree length for dN: 0.55619

tree length for dS: 3.98922

dS tree:

(X02152Hom: 0.151246, U07178Sus: 0.132597, (M22585rab: 0.231020, ((NM017025Rat: 0.122398, U13687Mus: 0.092510): 0.213241, (((AF070995C: 0.188443, (X04752Mus: 0.119460, U07177Rat: 0.194523): 0.269141): 0.228084, (U95378Sus: 0.141659, U13680Hom: 0.158035): 0.160426): 0.500425, (X53828OG1: 0.265802, U28410OG2: 0.325696): 0.290434): 0.123889): 0.025370): 0.054822);

dN tree:

(X02152Hom: 0.019946, U07178Sus: 0.017487, (M22585rab: 0.030466, ((NM017025Rat: 0.016142, U13687Mus: 0.012200): 0.028122, (((AF070995C: 0.024851, (X04752Mus: 0.015754, U07177Rat: 0.025653): 0.035494): 0.030079, (U95378Sus: 0.018682, U13680Hom: 0.020841): 0.021157): 0.096098, (X53828OG1: 0.035053, U28410OG2: 0.042952): 0.038302): 0.016338): 0.003346): 0.007230);

w ratios as labels for TreeView:

(X02152Hom #0.1319 : 0.019946, U07178Sus #0.1319 : 0.017487, (M22585rab #0.1319 : 0.030466, ((NM017025Rat #0.1319 : 0.016142, U13687Mus #0.1319 : 0.012200) #0.1319 : 0.028122, (((AF070995C #0.1319 : 0.024851, (X04752Mus #0.1319 : 0.015754, U07177Rat #0.1319 : 0.025653) #0.1319 : 0.035494) #0.1319 : 0.030079, (U95378Sus #0.1319 : 0.018682, U13680Hom #0.1319 : 0.020841) #0.1319 : 0.021157)#1 #0.1920 : 0.096098, (X53828OG1 #0.1319 : 0.035053, U28410OG2 #0.1319 : 0.042952) #0.1319 : 0.038302) #0.1319 : 0.016338) #0.1319 : 0.003346) #0.1319 : 0.007230);

H2: Long term shift in selection pressure in Ldh-C only; Ldh-C has a permanent change in selection pressure (as compared to its ancestors) whereas Ldh-A remains subject to the ancestral level of selection pressure.

(X02152Hom,U07178Sus,(M22585rab,((NM017025Rat,U13687Mus),(((AF070995C #1,(X04752Mus #1,U07177Rat #1)#1)#1,(U95378Sus #1,U13680Hom #1)#1)#1,(X53828OG1,U28410OG2)))));

Output:

TREE # 1: (1, 2, (5, ((3, 4), (((6, (8, 9)), (7, 10)), (11, 12))))); MP score: -1

lnL(ntime: 21 np: 24): -5985.635237 +0.000000

13..1 13..2 13..14 14..5 14..15 15..16 16..3 16..4 15..17 17..18 18..19 19..6 19..20 20..8 20..9 18..21 21..7 21..10 17..22 22..11 22..12

0.171224 0.151409 0.058980 0.269203 0.017888 0.262039 0.138108 0.103579 0.168914 0.560187 0.231662 0.203731 0.272797 0.130747 0.204423 0.162430 0.151155 0.167095 0.366793 0.312139 0.396584 2.403414 0.075516 0.239425

Note: Branch length is defined as number of nucleotide substitutions per codon (not per neucleotide site).

tree length = 4.50108

(1: 0.171224, 2: 0.151409, (5: 0.269203, ((3: 0.138108, 4: 0.103579): 0.262039, (((6: 0.203731, (8: 0.130747, 9: 0.204423): 0.272797): 0.231662, (7: 0.151155, 10: 0.167095): 0.162430): 0.560187, (11: 0.312139, 12: 0.396584): 0.366793): 0.168914): 0.017888): 0.058980);

(X02152Hom: 0.171224, U07178Sus: 0.151409, (M22585rab: 0.269203, ((NM017025Rat: 0.138108, U13687Mus: 0.103579): 0.262039, (((AF070995C: 0.203731, (X04752Mus: 0.130747, U07177Rat: 0.204423): 0.272797): 0.231662, (U95378Sus: 0.151155, U13680Hom: 0.167095): 0.162430): 0.560187, (X53828OG1: 0.312139, U28410OG2: 0.396584): 0.366793): 0.168914): 0.017888): 0.058980);

Detailed output identifying parameters

kappa (ts/tv) = 2.40341

dN & dS for each branch

branch t N S dN/dS dN dS N*dN S*dS

13..1 0.171 729.3 269.7 0.0755 0.0133 0.1756 9.7 47.3

13..2 0.151 729.3 269.7 0.0755 0.0117 0.1552 8.5 41.9

13..14 0.059 729.3 269.7 0.0755 0.0046 0.0605 3.3 16.3

14..5 0.269 729.3 269.7 0.0755 0.0208 0.2760 15.2 74.4

14..15 0.018 729.3 269.7 0.0755 0.0014 0.0183 1.0 4.9

15..16 0.262 729.3 269.7 0.0755 0.0203 0.2687 14.8 72.5

16..3 0.138 729.3 269.7 0.0755 0.0107 0.1416 7.8 38.2

16..4 0.104 729.3 269.7 0.0755 0.0080 0.1062 5.8 28.6

15..17 0.169 729.3 269.7 0.0755 0.0131 0.1732 9.5 46.7

17..18 0.560 729.3 269.7 0.2394 0.1005 0.4198 73.3 113.2

18..19 0.232 729.3 269.7 0.2394 0.0416 0.1736 30.3 46.8

19..6 0.204 729.3 269.7 0.2394 0.0366 0.1527 26.7 41.2

19..20 0.273 729.3 269.7 0.2394 0.0490 0.2045 35.7 55.1

20..8 0.131 729.3 269.7 0.2394 0.0235 0.0980 17.1 26.4

20..9 0.204 729.3 269.7 0.2394 0.0367 0.1532 26.8 41.3

18..21 0.162 729.3 269.7 0.2394 0.0291 0.1217 21.3 32.8

21..7 0.151 729.3 269.7 0.2394 0.0271 0.1133 19.8 30.6

21..10 0.167 729.3 269.7 0.2394 0.0300 0.1252 21.9 33.8

17..22 0.367 729.3 269.7 0.0755 0.0284 0.3761 20.7 101.4

22..11 0.312 729.3 269.7 0.0755 0.0242 0.3200 17.6 86.3

22..12 0.397 729.3 269.7 0.0755 0.0307 0.4066 22.4 109.7

tree length for dN: 0.56113

tree length for dS: 4.04015

dS tree:

(X02152Hom: 0.175561, U07178Sus: 0.155244, (M22585rab: 0.276022, ((NM017025Rat: 0.141606, U13687Mus: 0.106203): 0.268676, (((AF070995C: 0.152691, (X04752Mus: 0.097991, U07177Rat: 0.153209): 0.204454): 0.173624, (U95378Sus: 0.113287, U13680Hom: 0.125233): 0.121737): 0.419845, (X53828OG1: 0.320046, U28410OG2: 0.406630): 0.376085): 0.173193): 0.018341): 0.060474);

dN tree:

(X02152Hom: 0.013258, U07178Sus: 0.011723, (M22585rab: 0.020844, ((NM017025Rat: 0.010693, U13687Mus: 0.008020): 0.020289, (((AF070995C: 0.036558, (X04752Mus: 0.023462, U07177Rat: 0.036682): 0.048951): 0.041570, (U95378Sus: 0.027124, U13680Hom: 0.029984): 0.029147): 0.100521, (X53828OG1: 0.024169, U28410OG2: 0.030707): 0.028400): 0.013079): 0.001385): 0.004567);

w ratios as labels for TreeView:

(X02152Hom #0.0755 : 0.013258, U07178Sus #0.0755 : 0.011723, (M22585rab #0.0755 : 0.020844, ((NM017025Rat #0.0755 : 0.010693, U13687Mus #0.0755 : 0.008020) #0.0755 : 0.020289, (((AF070995C#1 #0.2394 : 0.036558, (X04752Mus#1 #0.2394 : 0.023462, U07177Rat#1 #0.2394 : 0.036682)#1 #0.2394 : 0.048951)#1 #0.2394 : 0.041570, (U95378Sus#1 #0.2394 : 0.027124, U13680Hom#1 #0.2394 : 0.029984)#1 #0.2394 : 0.029147)#1 #0.2394 : 0.100521, (X53828OG1 #0.0755 : 0.024169, U28410OG2 #0.0755 : 0.030707) #0.0755 : 0.028400) #0.0755 : 0.013079) #0.0755 : 0.001385) #0.0755 : 0.004567);

Time used: 0:56

H3: Long term shift in selection on both Ldh-C and Ldh-A; those lineages are subject to selection pressures different from each other and from the ancestor.

Tree file:

(X02152Hom,U07178Sus,(M22585rab,((NM017025Rat,U13687Mus),(((AF070995C #1,(X04752Mus #1,U07177Rat #1)#1)#1,(U95378Sus #1,U13680Hom #1)#1)#1,(X53828OG1 #2,U28410OG2 #2)#2))));

Output:

TREE # 1: (1, 2, (5, ((3, 4), (((6, (8, 9)), (7, 10)), (11, 12))))); MP score: -1

lnL(ntime: 21 np: 25): -5984.111015 +0.000000

13..1 13..2 13..14 14..5 14..15 15..16 16..3 16..4 15..17 17..18 18..19 19..6 19..20 20..8 20..9 18..21 21..7 21..10 17..22 22..11 22..12

0.173367 0.152089 0.058499 0.273176 0.018609 0.265853 0.138699 0.104431 0.178874 0.556050 0.231889 0.203607 0.272766 0.130870 0.204220 0.161922 0.150629 0.167529 0.341576 0.302793 0.379370 2.399088 0.064218 0.240341 0.094898

Note: Branch length is defined as number of nucleotide substitutions per codon (not per neucleotide site).

tree length = 4.46682

(1: 0.173367, 2: 0.152089, (5: 0.273176, ((3: 0.138699, 4: 0.104431): 0.265853, (((6: 0.203607, (8: 0.130870, 9: 0.204220): 0.272766): 0.231889, (7: 0.150629, 10: 0.167529): 0.161922): 0.556050, (11: 0.302793, 12: 0.379370): 0.341576): 0.178874): 0.018609): 0.058499);

(X02152Hom: 0.173367, U07178Sus: 0.152089, (M22585rab: 0.273176, ((NM017025Rat: 0.138699, U13687Mus: 0.104431): 0.265853, (((AF070995C: 0.203607, (X04752Mus: 0.130870, U07177Rat: 0.204220): 0.272766): 0.231889, (U95378Sus: 0.150629, U13680Hom: 0.167529): 0.161922): 0.556050, (X53828OG1: 0.302793, U28410OG2: 0.379370): 0.341576): 0.178874): 0.018609): 0.058499);

Detailed output identifying parameters

kappa (ts/tv) = 2.39909

dN & dS for each branch

branch t N S dN/dS dN dS N*dN S*dS

13..1 0.173 729.4 269.6 0.0642 0.0117 0.1824 8.5 49.2

13..2 0.152 729.4 269.6 0.0642 0.0103 0.1600 7.5 43.1

13..14 0.058 729.4 269.6 0.0642 0.0040 0.0616 2.9 16.6

14..5 0.273 729.4 269.6 0.0642 0.0185 0.2874 13.5 77.5

14..15 0.019 729.4 269.6 0.0642 0.0013 0.0196 0.9 5.3

15..16 0.266 729.4 269.6 0.0642 0.0180 0.2797 13.1 75.4

16..3 0.139 729.4 269.6 0.0642 0.0094 0.1459 6.8 39.4

16..4 0.104 729.4 269.6 0.0642 0.0071 0.1099 5.1 29.6

15..17 0.179 729.4 269.6 0.0642 0.0121 0.1882 8.8 50.7

17..18 0.556 729.4 269.6 0.2403 0.1000 0.4162 73.0 112.2

18..19 0.232 729.4 269.6 0.2403 0.0417 0.1736 30.4 46.8

19..6 0.204 729.4 269.6 0.2403 0.0366 0.1524 26.7 41.1

19..20 0.273 729.4 269.6 0.2403 0.0491 0.2041 35.8 55.0

20..8 0.131 729.4 269.6 0.2403 0.0235 0.0979 17.2 26.4

20..9 0.204 729.4 269.6 0.2403 0.0367 0.1528 26.8 41.2

18..21 0.162 729.4 269.6 0.2403 0.0291 0.1212 21.2 32.7

21..7 0.151 729.4 269.6 0.2403 0.0271 0.1127 19.8 30.4

21..10 0.168 729.4 269.6 0.2403 0.0301 0.1254 22.0 33.8

17..22 0.342 729.4 269.6 0.0949 0.0319 0.3357 23.2 90.5

22..11 0.303 729.4 269.6 0.0949 0.0282 0.2976 20.6 80.2

22..12 0.379 729.4 269.6 0.0949 0.0354 0.3728 25.8 100.5

tree length for dN: 0.56167

tree length for dS: 3.99725

dS tree:

(X02152Hom: 0.182422, U07178Sus: 0.160033, (M22585rab: 0.287445, ((NM017025Rat: 0.145944, U13687Mus: 0.109885): 0.279740, (((AF070995C: 0.152387, (X04752Mus: 0.097947, U07177Rat: 0.152845): 0.204148): 0.173554, (U95378Sus: 0.112736, U13680Hom: 0.125384): 0.121188): 0.416167, (X53828OG1: 0.297568, U28410OG2: 0.372824): 0.335682): 0.188217): 0.019581): 0.061554);

dN tree:

(X02152Hom: 0.011715, U07178Sus: 0.010277, (M22585rab: 0.018459, ((NM017025Rat: 0.009372, U13687Mus: 0.007057): 0.017964, (((AF070995C: 0.036625, (X04752Mus: 0.023541, U07177Rat: 0.036735): 0.049065): 0.041712, (U95378Sus: 0.027095, U13680Hom: 0.030135): 0.029127): 0.100022, (X53828OG1: 0.028239, U28410OG2: 0.035380): 0.031856): 0.012087): 0.001257): 0.003953);

w ratios as labels for TreeView:

(X02152Hom #0.0642 : 0.011715, U07178Sus #0.0642 : 0.010277, (M22585rab #0.0642 : 0.018459, ((NM017025Rat #0.0642 : 0.009372, U13687Mus #0.0642 : 0.007057) #0.0642 : 0.017964, (((AF070995C#1 #0.2403 : 0.036625, (X04752Mus#1 #0.2403 : 0.023541, U07177Rat#1 #0.2403 : 0.036735)#1 #0.2403 : 0.049065)#1 #0.2403 : 0.041712, (U95378Sus#1 #0.2403 : 0.027095, U13680Hom#1 #0.2403 : 0.030135)#1 #0.2403 : 0.029127)#1 #0.2403 : 0.100022, (X53828OG1#2 #0.0949 : 0.028239, U28410OG2#2 #0.0949 : 0.035380)#2 #0.0949 : 0.031856) #0.0642 : 0.012087) #0.0642 : 0.001257) #0.0642 : 0.003953);

Time used: 1:10

In addition, carry out likelihood ratio tests (LRT) of the hypotheses below. See the lecture notes for additional details about LRTs. Use 1 degree of freedom to obtain the P-value for each LRT.

Summary table of hypotheses:

Model	ω_A0	ω_A1	ω_C1	ω_C0	lnL	LRT
H0	0.1367	0.1367	0.1367	0.1367	-6018.633
H1	0.1319	0.1319	0.1319	0.192	-6017.572
H2	0.0755	0.0755	0.2394	0.2394	-5985.635
H3	0.0949	0.0642	0.2403	0.2403	-5984.111

You should change the name of the main result file (via outfile= in the control file) or you will overwrite your previous results.

Exercise 4: The objective of this exercise is to use a series of LRTs to test for sites evolving under positive selection in the nef gene. If you find significant evidence for positive selection, then identify the involved sites by using empirical Bayes methods.

       seqfile = seqfile.txt          * sequence data filename

    * treefile = treefile_M0.txt      * SET THIS for tree file with ML branch lengths under M0
    * treefile = treefile_M1.txt      * SET THIS for tree file with ML branch lengths under M1
    * treefile = treefile_M2.txt      * SET THIS for tree file with ML branch lengths under M2
    * treefile = treefile_M3.txt      * SET THIS for tree file with ML branch lengths under M3
    * treefile = treefile_M7.txt      * SET THIS for tree file with ML branch lengths under M7
    * treefile = treefile_M8.txt      * SET THIS for tree file with ML branch lengths under M8

      outfile = results.txt           * main result file name
        noisy = 9                     * lots of rubbish on the screen
      verbose = 1                     * detailed output
      runmode = 0                     * user defined tree
      seqtype = 1                     * codons
    CodonFreq = 2                     * F3X4 for codon ferquencies
        model = 0                     * one omega ratio for all branches

    * NSsites = 0                     * SET THIS for M0
    * NSsites = 1                     * SET THIS for M1
    * NSsites = 2                     * SET THIS for M2
    * NSsites = 3                     * SET THIS for M3
    * NSsites = 7                     * SET THIS for M7
    * NSsites = 8                     * SET THIS for M8

        icode = 0                     * universal code
    fix_kappa = 1                     * kappa fixed
      * kappa = 4.43491               * SET THIS to fix kappa at MLE under M0
      * kappa = 4.39117               * SET THIS to fix kappa at MLE under M1
      * kappa = 5.08964               * SET THIS to fix kappa at MLE under M2
      * kappa = 4.89033               * SET THIS to fix kappa at MLE under M3
      * kappa = 4.22750               * SET THIS to fix kappa at MLE under M7
      * kappa = 4.87827               * SET THIS to fix kappa at MLE under M8

    fix_omega = 0                     * omega to be estimated 
        omega = 5                     * initial omega

      * ncatG = 3                     * SET THIS for 3 site categories under M3         
      * ncatG = 10                    * SET THIS for 10 of site categories under M7 and M8

  fix_blength = 2                     * fixed branch lengths from tree file

sdsf

PAUP* advice from David Swafford:

-------------------------

Things that you would examine with your own analyses:

The information that you really want from the program are the best trees found in each search replicate and the globally best across all replicates. After each individual replicate finishes, the best trees from all of the replicates completed thus far are written to the .best.all.tre file. When all replicates have finished, the best tree across all replicates is written to the .best.tre file.

The config files used here are set up to use a feature of the program that collapses internal branches that have an MLE length of zero. This may result in final trees that have polytomies. This is generally the behavior that one would want. Note that the likelihoods of the trees will be identical whether or not the branches are collapsed.

When the two search replicates have completed, we can more closely examine the results.

First, take a look at the end of the .screen.log file. You will see a report of the scores of the final tree from each search replicate, an indication of whether they are the same topology, and a table comparing the parameter values estimated on each final tree.

There are two possibilities:

The search replicates found the same best tree. You should see essentially identical lnL values and parameter estimates. The screen.log file should indicate that the trees are identical.
The search replicates found two different trees. This is almost certainly because one or both were trapped in local topological optima. You will notice that the lnL values are somewhat different and the parameter estimates will be similar but not identical. The search settings may influence whether searches are entrapped in local optima, but generally the default settings are appropriate.

We can also evaluate and compare the results of our two search replicates in PAUP*. Being able to open the results of one program in another for further analysis is a good skill to have.

Monday, July 20, 2015

 wget https://molevol.mbl.edu/images/3/3c/MSAlab.zip

*Most of this text is from the MBL tutorial for MSA. I am recording what I find and important points, in case the site gets lost over time.

Alignments with nucleotides: MUSCLE V. mafft

mafft --retree 2 1ped.fasta > mafft_dna.fasta

muscle -verbose -log muscle_dna.log -in 1ped.fasta -out muscle_dna.fasta

Compare the alignments resulting from A steps and B steps. Are they different? How many columns are in each the MAFFT or the MUSCLE alignment? What may be wrong with both? (Hint: these are protein coding genes).

They were not well-aligned, there were indels that were not multiples of three and they are protein coding genes!

Do the topologies and/or branch lengths differ? (Hint: look up some species names to get an idea of the expected topology!) Yes--very much so.

Bottom line: Don't do nucleotide alignments on protein-based sequences!

Comparing what happens with gaps:

muscle -verbose -log muscle_gap-20.log -in 1ped_aa.fasta -out muscle_aa_gap-20.fasta -gapopen -20

muscle -verbose -log muscle_gap-1.log -in 1ped_aa.fasta -out muscle_aa_gap-1.fasta -gapopen -1

Compare the modified gap penalty alignments to the default one. How do they differ? Which has the most gaps? Can you guess the order of magnitude of the default gap penalty? Can you find in the log files of MUSCLE the gap penalty use?

The higher gap penalty makes for a much more compressed alignment and helps to take care of this over "indel" issue. The default for MUSCLE is -2.9.

Alignment extension

   papara -t example.tre -s aln.phy -q sequence1.fasta -n run1

   papara -t example.tre -s aln.phy -q sequence2.fasta -n run2

How does the alignment look? Any concerns? How does the requirement for testing for sequence homology before site alignment apply here?

There are big tails at the ends of the new sequence. The alignment may have been trimmed previously, or the previous group used a different set of primers. The middle part is probably homologous, but a lot of times the ends may not line up perfectly. Always check your alignment.

   pagan --ref-seqfile aln.fasta -t example.tre -q sequence1.fasta -o pagan1

  pagan --ref-seqfile aln.fasta -t example.tre -q sequence2.fasta -o pagan2

Pagan requires input alignments in fasta rather than phylip format, but we have provided that in aln.fasta Try typing only 'pagan' to see the full explanation of parameters Compare the pagan alignments for the two sequences you downloaded.

What happens when you try to align the second sequence? HINT: Look at the output to screen. Why?

Adding the second sequence fails. Pagan is helping us to prevent alignments of completely random sequence! Thanks Pagan!

Both papara and pagan can align short reads as well as full length sequences, but caution about homology is particularly appropriate in these cases.

Codon Alignments

This is not part of the exercises, it's just for your future information.

As you now know, it is not appropriate to align nucleotides of protein coding regions. In the exercises above, you translated the nucleotides to amino acids which you could use to infer trees. But sometimes you want to analyze nucleotides that have been aligned by codon. So how do you go from an amino acid alignment back to codon-aligned nucleotides? We suggest the Pal2Nal server. You will upload your protein alignment and nucleotide sequences, and it will spit out the codon alignment. Please be aware that your nucleotides must be multiples of three (i.e. a full open reading frame).

FASTA tutorial

Identifying homologs and non-homologs; effects of scoring matrices and algorithms; using domain annotations

1. Use the FASTA search page [pgm] to compare Honey bee glutathione transferase D1 NP_001171499/ H9KLY5_APIME [seq] (gi|295842263) to the PIR1 Annotated protein sequence database. Be sure to press, not .

Take a look at the output.
1. How long is the query sequence?
2. How many sequences are in the PIR1 database?
3. What scoring matrix was used?
4. What were the gap penalties? (what is the penalty for a one-residue gap? two residues?)
5. What are each of the numbers after the description of the library sequence? Which one is best for inferring homology?
6. How similar is the highest scoring sequence? What is the difference between %_id and %_sim? Why is there no 100% identity match?
7. Looking at an alignment, where are the boundaries of the alignment (the best local region)? How many gaps are in the best alignment? The second best?
What is the highest scoring non-homolog? (The non-homolog with the highest alignment score, or the lowest E()-value.) How would you confirm that your candidate non-homolog was truly unrelated? (Hint - compare your candidate non-homolog with SwissProt for a more comprehensive test.)

Highest scoring unrelated sequence should have an E() value of 1 (or around 1). Can do "general re-search of this sequence and make sure it does not find your originally queried value.

Homology (and non-homology) can also be inferred by domain relationships. Try the same Honey bee GSTD1 search [pgm] search using the Annotate Query and Database Annotations: set to show .
1. Does the domain display change your mind about the highest scoring non-homolog?
2. There are three parts to the domain display, the domain structure of the query (top) sequence (if available), the domain structure of the library (bottom) sequence, and the domain alignment boundaries in the middle (inside the alignment box). The boundaries and color of the alignment domain coloring match the Region: sub-alignment scores.
3. Note that the alignment of Honey bee GSTD1 and SSPA_ECO57 includes portions of both the N-terminal and C-terminal domains, but neither domain is completely aligned. Why do you think the alignments do not include the complete domains?
4. Is your explanation for the partial domain alignment consistent the the argument that domains have a characteristic length? How might you test whether a complete domain is present?
5. The FASTA programs can partition an alignment score based domain boundaries, and report the amount of score associated with a domain. In the SSPA_ECO57 alignment, how much of the score comes from the GST-N terminal domain? The GST-C terminal domain? Does this alignment provide strong support for the presence of a GST-C terminal domain? How might you test for the presence of the domain on GSTD1? On SSPA_ECO57?In the subalignment scores, the Q value is -10 * log(p) for the sub-alignment score, so Q=30.0 means p < 0.001.
Repeat the GSTD1 search [pgm] using the BLASTP62/-11/-1 scoring matrix that BLAST uses. Re-examine the GSTD1:SSPA_ECO57 alignment. Are both Glutathione transferase domains present? Look at the alignments to the homologs above and below SSPA_ECO57. Based on those aligments, do you think the Glutathione-S-Trfase C-like domain is really missing? Why did the alignment become so much shorter?

SKIP

One of the candidate non-homologs is sp|Q9SI20|EF1D2_ARATH, with an E()-value of 0.11.

Does the domain structure of EF1D2_ARATH suggest that it could be a glutatione transferase homolog?

No, it looks very different
Only residues 8-60 overlapped

Use the General Research to explore the domains contained in EF1D2_ARATH homologs found in SwissProt.
Does this secondary search support homology or non-homology?

Focusing on the NODOM, it actually just isn't annotated! Scorlling down you can see that a glutathion homolog eventaully does show up in the annotation!!

Trust annotations when they say something, but don't trust them if they don't say something.

2. Exploring domains and alignment over-extension -- cortactin (SRC8_HUMAN)

Compare SRC8_HUMAN [pgm] (human cortactin) to the SwissProt protein sequence database.

Looking at the top five alignments, how many cortactin orthologs do you see? (ortholog, same protein, different species).

Four orthologs

In the SRC8 HUMAN:CHICK alignment, both the query and the subject (library) sequences align seven cortactin domains and an SH3 domain. In addition, two regions (one before the cortactin domain cluster and one after) are well conserved, but do not have annotated domains (NODOM). Are these non-domain (NODOM) regions as well conserved as the annotated domains?

Look at the SRC8_HUMAN:HCLS1_MOUSE alignment. How many cortactin domains does HCLS1_MOUSE contain? How much score does the NODOM spanning the region between cortactin domains and the SH3 domain contribute? Why is it included in the alignment? Is it likely to be homologous?

It is a homolog, but not an ortholog--it is a paralog! We already found mouse domain; only 4 shared domain too.

Is the NODOM between the cortactin domains and the SH3 domain likely to be homologous in the SRC8_HUMAN:DBNLB_XENLA alignment?

No. The domain score of the second, non annotated domain has a poor score (even negative!). Still needs to be included to include the SH3 domain. No SH3 domain, the alignment would stop.

In the SRC8_HUMAN:LASP1_HUMAN alignment, the alignment extends to include several Nebulin_repeat domains. Do you think there is a Nebulin_repeat domain in SRC8_HUMAN? Why do you think those domains are aligning?
What scoring matrix should be used to reduce over-extension from the SH3 domain?

3. Exploring domains and over-extension with local alignments -- death associated protein kinase (DAPK1_HUMAN)

Look up the domain structure of DAPK1_HUMAN at Pfam [pgm].
1. What are the major (PfamA) domain regions on the protein?
2. Which of the domains is repeated?
3. In a local (LALIGN) alignment, where would you expect to see overlapping domains like those in Calmodulin (CALM_HUMAN) and Cortactin (DAPK1_HUMAN)?
Use lalign/plalign [pgm] to examine local similarities between DAPK1_HUMAN and itself. Check the options to "annotate sequence 1 domains" and "annotate sequence 2 domains". Do you see the domains you expected from Pfam? Do they map in the same places?

Saturday, July 18, 2015

After running GARLI, I wanted to compare
OR137:
OR213: lots of disagreements; redo with 100
OR4: lots of disagreements; redo with 100
OR589: lot
OR6: all agree after 8 runs
OR10: all agree after 8 runs
OR11: tree2 disagrees, 7 others agree, use this as main gene tree
OR51: lots of disagreements; redo with 100
OR52: lots of disagreements; redo with 100

100 is a lot of tree to compare, so I wrote a small script doing so:

setwd("/Volumes/Yango/Hayden_batORs/garli/OR51/trees")

#tell where to open up all of your downloaded files from CIPRES
my.files<-list.files("/Volumes/Yango/Hayden_batORs/garli/OR51/trees/")
#read in all of these trees to a list using read.nexus function
my.trees<-lapply(my.files, read.nexus)

#create a matrix to store all of the tree comparisons
tree.matrix<-matrix(nrow = length(my.trees), ncol = length(my.trees))

for(i in 1:length(my.trees)){ #for every tree in the file
this.tree<-as.phylo(my.trees[[i]]) #look at that tree
#compare the topology to all the other trees
tree.matrix[i,]<-sapply(my.trees, function(my.trees) dist.topo(this.tree,my.trees))
}
tree.matrix

#print out the ones that agree with each other
for(i in 1:length(tree.matrix[1,])){
this.list<-as.numeric(which(tree.matrix[i,]==0))
print(this.list)
}
write.csv(tree.matrix, "tree.matrix.csv")

For OR51, are things looking better?

[1] 1

[1] 2

[1] 3

[1] 4 84

[1] 5

[1] 6

[1] 7

[1] 8 15 41 79

[1] 9 14

[1] 10 54

[1] 11

[1] 12

[1] 13 73

[1] 9 14

[1] 8 15 41 79

[1] 16

[1] 17 20 75

[1] 18

[1] 19

[1] 17 20 75

[1] 21 51 68

[1] 22 37

[1] 23 50 71

[1] 24

[1] 25

[1] 26

[1] 27

[1] 28 69 88

[1] 29

[1] 30

[1] 31

[1] 32 46

[1] 33

[1] 34 70

[1] 35 53

[1] 36

[1] 22 37

[1] 38 59

[1] 39 47 78

[1] 40

[1] 8 15 41 79

[1] 42

[1] 43

[1] 44

[1] 45

[1] 32 46

[1] 39 47 78

[1] 48

[1] 49

[1] 23 50 71

[1] 21 51 68

[1] 52

[1] 35 53

[1] 10 54

[1] 55

[1] 56

[1] 57

[1] 58

[1] 38 59

[1] 60

[1] 61

[1] 62 86 87

[1] 63

[1] 64

[1] 65

[1] 66 95

[1] 67

[1] 21 51 68

[1] 28 69 88

[1] 34 70

[1] 23 50 71

[1] 72

[1] 13 73

[1] 74

[1] 17 20 75

[1] 76

[1] 77

[1] 39 47 78

[1] 8 15 41 79

[1] 80

[1] 81

[1] 82

[1] 83

[1] 4 84

[1] 85

[1] 62 86 87

[1] 28 69 88

[1] 89

[1] 90

[1] 91

[1] 92

[1] 93

[1] 94

[1] 66 95

[1] 96

[1] 97

[1] 98

Womp, womp...not really--only a maximum of 4 trees out of 100 are agreeing.

OR4s looking much better:
[1] 1 22 28 42 75
[1] 2 99
[1] 3 14 18 83 89
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 5
[1] 6
[1] 7 15 98
[1] 8
[1] 9 19 38 80
[1] 10 50
[1] 11 67
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 13 86
[1] 3 14 18 83 89
[1] 7 15 98
[1] 16
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 3 14 18 83 89
[1] 9 19 38 80
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 21 31 44 62 84
[1] 1 22 28 42 75
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 25
[1] 26
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 1 22 28 42 75
[1] 29
[1] 30 33
[1] 21 31 44 62 84
[1] 32 37 61 68 81
[1] 30 33
[1] 34 100
[1] 35
[1] 36
[1] 32 37 61 68 81
[1] 9 19 38 80
[1] 39 71
[1] 40 49
[1] 41
[1] 1 22 28 42 75
[1] 43
[1] 21 31 44 62 84
[1] 45
[1] 46
[1] 47
[1] 48
[1] 40 49
[1] 10 50
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 52
[1] 53
[1] 54
[1] 55
[1] 56 90
[1] 57
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 60
[1] 32 37 61 68 81
[1] 21 31 44 62 84
[1] 63
[1] 64
[1] 65
[1] 66
[1] 11 67
[1] 32 37 61 68 81
[1] 69
[1] 70
[1] 39 71
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 73
[1] 74
[1] 1 22 28 42 75
[1] 76
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 78
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 9 19 38 80
[1] 32 37 61 68 81
[1] 82
[1] 3 14 18 83 89
[1] 21 31 44 62 84
[1] 85
[1] 13 86
[1] 87
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 3 14 18 83 89
[1] 56 90
[1] 91
[1] 92
[1] 93
[1] 94
[1] 95
[1] 4 12 17 20 23 24 27 51 58 59 72 77 79 88 96
[1] 97
[1] 7 15 98
[1] 2 99
[1] 34 100