Friday, June 27, 2014

We have the PacBio reads back! The first step is to assemble them.

I am using Sprai because it should output longer contigs, which are essential for distinguishing different olfactory receptors.

Sprai requires:
-Python > v 2.6 (which thank goodness we have on the server)
-NCBI BLAST v.2.2.27 (which we also have!)
-Celera Assembler (which we don't have)

Installing Celera Assembler:
Download the file (v. 8.1; I could not get v. 8.2 working) and navigate to the home folder:
  bzip2 -dc wgs-8.1.tar.bz2 | tar -xf -
  cd wgs-8.1
  cd kmer && make install && cd ..
  cd samtools && make && cd ..
  cd src && make && cd ..
  cd ..

For Sprai to work, I changed the source code to accept longer reads.
cd wgs-8.1/src
vi AS_global.h
Change:
#define AS_READ_MAX_NORMAL_LEN_BITS 11
to:
#define AS_READ_MAX_NORMAL_LEN_BITS 15
Whoops, I guess the new version is already set to 16, so there is nothing to change!
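
For my own reference, here is a minimal sketch of why this constant matters for PacBio reads. It assumes the maximum read length is derived from the bit count as (2^bits - 1), which is how I understand AS_global.h to work. The helper name max_read_len is hypothetical and not part of Celera Assembler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, NOT from Celera Assembler.
   Assumption: the longest storable read is (2^bits - 1) bases. */
uint32_t max_read_len(unsigned bits) {
    assert(bits > 0 && bits < 32);      /* keep the shift well defined */
    return ((uint32_t)1 << bits) - 1;   /* e.g. 11 bits -> 2047 */
}
```

Under that assumption, 11 bits caps reads at 2047 bases, 15 bits at 32767, and 16 bits at 65535, so the default of 11 would be far too small for multi-kilobase PacBio reads.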

Now install Sprai:

tar -xzf sprai-0.9.5.1.3.tar.gz
cd sprai-0.9.5.1.3
./waf configure
./waf build

And of course I get an error.
Error #1: 
/Users/loloyohe/sprai-0.9.5.1.3/col2fqcell.h:78:7: error: use of undeclared identifier 'number_of_ballots'
      number_of_ballots += ballot[i];
      ^
...and this repeats everywhere "number_of_ballots" is used.

Error #1 Solution:
"myrealigner.c" inherits the header "col2fqcell.h"
You can see that myrealigner.c has no declaration of "number_of_ballots".
In myrealigner.c, paste
int number_of_ballots = 0;
under
int maximum_ballots = 11;
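
As a minimal sketch of why that one line fixes Error #1: code in an included header can only use a variable that has already been declared at file scope. This is a stand-in I wrote for illustration, not the actual Sprai code. The function count_ballots is hypothetical and just mimics the `number_of_ballots += ballot[i];` line from the error message.

```c
#include <assert.h>

/* File-scope state, declared before any code that uses it. */
int maximum_ballots = 11;
int number_of_ballots = 0;   /* the declaration that was missing */

/* Hypothetical stand-in for the loop in col2fqcell.h: totals the ballots. */
int count_ballots(const int *ballot, int n) {
    assert(ballot != 0 && n >= 0);
    number_of_ballots = 0;
    for (int i = 0; i < n; ++i) {
        number_of_ballots += ballot[i];   /* compiles only because of the declaration above */
    }
    return number_of_ballots;
}
```

Deleting the `int number_of_ballots = 0;` line from this sketch gives the same kind of "use of undeclared identifier" error.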

Error #2 & #3:
/Users/loloyohe/sprai-0.9.5.1.3/col2fqcell.h:25:47: error: function definition is not allowed here
  void set_vals(int col_index, int coded_base){
                                              ^
../myrealigner.c:583:102: error: function definition is not allowed here
    void print_fastq(char *chr, char *seq, char *depth, char *qual, char *base_exists, char *comment){
     
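Error #2 & #3 solution:
Facepalming at the person who wrote this code. In standard C (and C++), you cannot define a function inside another function GAHHHHHH
(GCC accepts nested functions in C as an extension, which may be why this compiled for the authors; the compiler I am using here does not.)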

In "col2fqcell.h", near line 26, move:
  void set_vals(int col_index, int coded_base){
    ++ballot[coded_base];
    max_qvs[coded_base] = (max_qvs[coded_base] < (col[col_index].qv-'!')) ? (col[col_index].qv-'!') : max_qvs[coded_base];
  }
outside of the function 
void col2fqcell(){
...
  }
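
Here is a rough sketch of what the hoisted version could look like. One catch: a nested function can see its parent's local variables, so once set_vals moves out, anything it touches (ballot, max_qvs, col) also has to be visible at file scope or passed in as arguments. I am assuming those are locals in the original. The cell_t struct, the array sizes, and col2fqcell_sketch are hypothetical stand-ins, not the real Sprai definitions.

```c
#include <assert.h>

#define N_BASES  6    /* hypothetical number of coded bases */
#define MAX_COLS 64   /* hypothetical column capacity */

typedef struct { char base; char qv; } cell_t;   /* hypothetical column cell */

/* State that set_vals touches, moved to file scope so it stays visible. */
int ballot[N_BASES];
int max_qvs[N_BASES];
cell_t col[MAX_COLS];

/* Formerly nested inside col2fqcell(); now an ordinary file-scope function. */
void set_vals(int col_index, int coded_base) {
    assert(col_index >= 0 && col_index < MAX_COLS);
    assert(coded_base >= 0 && coded_base < N_BASES);
    ++ballot[coded_base];
    max_qvs[coded_base] = (max_qvs[coded_base] < (col[col_index].qv - '!'))
                              ? (col[col_index].qv - '!')
                              : max_qvs[coded_base];
}

/* Stand-in for col2fqcell(): it now only calls set_vals. */
void col2fqcell_sketch(const int *coded_bases, int n_cols) {
    for (int i = 0; i < n_cols; ++i) {
        set_vals(i, coded_bases[i]);
    }
}
```

Passing the arrays as parameters instead of making them global would be cleaner, but it means changing every call site, so file-scope variables are the smaller patch.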

Okay, now the number of errors is worrisome. There are so many undeclared variables and functions. This version of the code should not have been published. I am writing to the group that published it and will follow up.
