Getting data assembled now. Not using Sprai; going to try Celera instead (which I have already installed).
After several hours of downloading a million things, it turns out there is a bug in the svn code, but not if you download the .tar.bz2 file directly from SourceForge. For some reason, kmer will not compile (it must be out of date) if you check it out with svn.
Anyway, here is what I got to work:
Visit: http://sourceforge.net/projects/wgs-assembler/files/wgs-assembler/wgs-8.1/
bzip2 -dc wgs-8.1.tar.bz2 | tar -xf - #don't compile Celera yet
cd kmer
gmake install
#oops don't have gmake; just make it the same as "make"
sudo ln -s /usr/bin/make /usr/bin/gmake
cd ..
cd samtools
make
cd ..
cd src
gmake
#i tried getting Figaro and UMDOverlapper to work but I don't want to mess
#things up; let's try this for now
In the README, it says you can run the assembler with:
wgs-8.1/*/bin/runCA
#in my case, the * is Darwin-i386
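Since the platform directory differs per machine, the glob from the README can be resolved in a small function instead of hardcoding Darwin-i386. This is only a sketch: find_runca is a made-up helper name, and the default location ~/wgs-8.1 is an assumption based on where I unpacked the tarball.

```shell
# Hypothetical helper: print the first executable runCA found under a
# wgs-8.1 tree. The default root ($HOME/wgs-8.1) is an assumption.
find_runca() {
    root="${1:-$HOME/wgs-8.1}"
    for candidate in "$root"/*/bin/runCA; do
        # If the glob matches nothing, $candidate is the literal pattern
        # and the -x test simply fails.
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no runCA found under $root" >&2
    return 1
}
```

Usage would be something like RUNCA=$(find_runca) and then calling "$RUNCA" in place of the full path.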
The sequences are now kept on the Spare drive
cd /Volumes/Spare/pacbio/C_sowelli
gunzip filtered_subreads.fast*
gunzip reads_of_insert.fast*
#make the FRG wrapper file to use as input
~/wgs-8.1/Darwin-i386/bin/fastqToCA -libraryname GPC -technology pacbio-raw -reads reads_of_insert.fastq >GPC_untrimmed.frg
#make the .spec file--for this first run:
#saved as GPC_spec.spec
merSize = 17
merThreshold = 0
merDistinct = 0.9995
merTotal = 0.995
doOBT = 0
doExtendClearRanges = 0
unitigger = bogart
ovlErrorRate = 0.05 # Compute overlaps up to 5% error
utgGraphErrorRate = 0.05 # Unitigs at 5% error
utgMergeErrorRate = 0.05 # Unitig merges at 5% error
cnsErrorRate = 0.05 # Needed to allow ovlErrorRate=0.05
cgwErrorRate = 0.05 # Needed to allow ovlErrorRate=0.05
ovlConcurrency = 16
cnsConcurrency = 16
ovlThreads = 1
ovlHashBits = 22
ovlHashBlockLength = 10000000
ovlRefBlockSize = 25000
#cnsReduceUnitigs = 0 0 # Always use only uncontained reads for consensus
cnsReuseUnitigs = 1 # With no mates, no need to redo consensus
cnsMinFrags = 1000
cnsPartitions = 256
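The comments on cnsErrorRate and cgwErrorRate suggest they have to keep up with ovlErrorRate. A quick check of the spec file before launching could look like the sketch below. It assumes the rule is "cnsErrorRate and cgwErrorRate must be at least ovlErrorRate", which is my reading of those comments and not something I have confirmed in the runCA docs. spec_value and check_error_rates are made-up helper names.

```shell
# Hypothetical helper: print the value of KEY from a "key = value" spec file,
# ignoring anything after a '#'.
spec_value() {
    awk -v key="$1" '
        { sub(/#.*/, "") }                 # drop comments
        {
            n = index($0, "=")
            if (n == 0) next               # not a key = value line
            k = substr($0, 1, n - 1)
            v = substr($0, n + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k) # trim whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) val = v          # last definition wins
        }
        END { if (val != "") print val }
    ' "$2"
}

# Assumed rule: cnsErrorRate and cgwErrorRate must be >= ovlErrorRate.
check_error_rates() {
    spec="$1"
    ovl=$(spec_value ovlErrorRate "$spec")
    status=0
    for key in cnsErrorRate cgwErrorRate; do
        val=$(spec_value "$key" "$spec")
        if ! awk -v a="$val" -v b="$ovl" 'BEGIN { exit !(a + 0 >= b + 0) }'; then
            echo "$key ($val) is below ovlErrorRate ($ovl)" >&2
            status=1
        fi
    done
    return $status
}
```

Running check_error_rates GPC_spec.spec should return quietly for the spec above, since all three rates are 0.05.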
#run the assembler
~/wgs-8.1/Darwin-i386/bin/runCA -d /Volumes/Spare/pacbio/C_sowelli/Assembly/GPC-trim -p GPC-trim -s /Volumes/Spare/pacbio/C_sowelli/Assembly/GPC_spec.spec GPC_untrimmed.frg
Trying with 5% error rates since the sequences are so similar.
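Because a run takes a long time, it may be worth checking the inputs before calling runCA. The sketch below only tests that the spec and FRG files are readable and that the output directory is given as an absolute path (an easy thing to get wrong on /Volumes). preflight is a made-up helper name, and it does not call runCA itself.

```shell
# Hypothetical pre-run check: preflight SPEC FRG OUTDIR
# Returns 0 if both files are readable and OUTDIR is an absolute path.
preflight() {
    spec="$1"
    frg="$2"
    outdir="$3"
    status=0
    [ -r "$spec" ] || { echo "cannot read spec file: $spec" >&2; status=1; }
    [ -r "$frg" ]  || { echo "cannot read FRG file: $frg" >&2; status=1; }
    case "$outdir" in
        /*) ;;  # absolute path, fine
        *)  echo "output directory is not absolute: $outdir" >&2; status=1 ;;
    esac
    return $status
}
```

For this run that would be: preflight /Volumes/Spare/pacbio/C_sowelli/Assembly/GPC_spec.spec GPC_untrimmed.frg /Volumes/Spare/pacbio/C_sowelli/Assembly/GPC-trim, followed by the runCA command only if it succeeds.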