Thursday, July 7, 2016

installing mysql

I am hoping to run a set of scripts that will identify and index olfactory receptors using BLAST and mysql. That means I need to sort out how mysql is installed on noctilio, which should be great fun.

The first thing I noticed when I downloaded mysql is that we have 3 other versions of mysql on the server! :)

I knew this was going to cause some problems, so I needed to uninstall all versions of MySQL to get this working properly. I basically followed this awesome post. Here it is reiterated for Mac OS X Yosemite:

Remove MySQL completely per The Tech Lab

  • ps -ax | grep mysql

  • stop and kill any MySQL processes
  • brew remove mysql
  • brew cleanup
  • sudo rm /usr/local/mysql
  • sudo rm -rf /usr/local/var/mysql
  • sudo rm -rf /usr/local/mysql*
  • sudo rm ~/Library/LaunchAgents/homebrew.mxcl.mysql.plist
  • sudo rm -rf /Library/StartupItems/MySQLCOM
  • sudo rm -rf /Library/PreferencePanes/My*
  • launchctl unload -w ~/Library/LaunchAgents/homebrew.mxcl.mysql.plist
  • edit /etc/hostconfig and remove the line MYSQLCOM=-YES-
  • rm -rf ~/Library/PreferencePanes/My*
  • sudo rm -rf /Library/Receipts/mysql*
  • sudo rm -rf /Library/Receipts/MySQL*
  • sudo rm -rf /private/var/db/receipts/*mysql*
  • restart your computer just to ensure any MySQL processes are killed
  • try to run mysql, it shouldn't work
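
Before reinstalling, it can help to list which of the locations above still exist. This is a minimal sketch and not part of the original post: the function name mysql_leftovers and its optional prefix argument are made up for illustration (the prefix only exists so you can try it on a scratch directory first).

```shell
# List the MySQL locations from the removal steps that still exist.
# mysql_leftovers and its optional prefix argument are hypothetical helpers.
mysql_leftovers() {
    root="${1:-}"
    for path in \
        "$root/usr/local/mysql" \
        "$root/usr/local/var/mysql" \
        "$root/Library/StartupItems/MySQLCOM" \
        "$root$HOME/Library/LaunchAgents/homebrew.mxcl.mysql.plist"
    do
        # -L also catches a dangling /usr/local/mysql symlink
        if [ -e "$path" ] || [ -L "$path" ]; then
            echo "$path"
        fi
    done
}
```

Calling mysql_leftovers with no argument checks the real paths; no output means none of the listed locations were found.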

Brew install MySQL per user Sedorner from this StackOverflow answer

  • brew doctor and fix any errors
  • brew update
  • brew install mysql
  • unset TMPDIR
  • mysqld --initialize --verbose --user="$(whoami)" --basedir="$(brew --prefix mysql)" --datadir=/usr/local/var/mysql --tmpdir=/tmp #note: this command has been edited to replace the deprecated mysql_install_db form
  • mysql.server start
  • run the commands Brew suggests, add MySQL to launchctl so it automatically launches at startup
Okay now to have MySQL start at launch, follow this lovely post. Homebrew can basically set this up for you.
brew tap homebrew/services
Apparently that tap is no longer supported, so also run these commands:
brew untap homebrew/boneyard
brew tap gapple/services

When you run this command, mysql should start:
brew services start mysql
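
To confirm that the server actually came up, one option is to ping it with mysqladmin, which ships with MySQL. A minimal sketch, assuming mysqladmin ended up on your PATH after the brew install; the helper name mysql_is_up is made up for illustration.

```shell
# Report whether the MySQL server answers; mysql_is_up is a hypothetical helper name.
mysql_is_up() {
    if ! command -v mysqladmin >/dev/null 2>&1; then
        echo "mysqladmin not found on PATH"
        return 1
    fi
    # mysqladmin ping exits 0 when the server is running
    mysqladmin ping
}
```

If the server is running, mysqladmin should report that mysqld is alive.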

Thursday, May 26, 2016

How to blast against a transcriptome

A guest post by Ramatu Abubakar.
  • Log into the server and make your own folder.
  • Import the fasta file you want to blast against the transcriptome. For example, "Mc1r_Fasta.txt"
Creating your own database:
  • Under your folder, make a single folder for all your transcriptome fasta files. 
  • Make a new folder and move one transcriptome file under that folder. For example my folder was called "PE111_Cabr_VNO_trinity_output" and my transcriptome file was "PE111_Cabr_VNO_trinity_output.Trinity.fasta"

Command for creating database:
  • cd into the folder containing the transcriptome file you want to make the database for.
Type in the following:

makeblastdb -in (transcriptome file name) -title (name of the folder containing the transcriptome) -dbtype (prot for a database of proteins, nucl for a database of DNA or RNA) -out (name of your output)

What it means:
  • makeblastdb tells the blast program to create a database.
  • -in represents the input file.
  • -title represents the title for the blast database to be created.
  • -dbtype tells the blast program whether it is a protein or nucleotide sequence.
  • -out represents the name of each database created. You can call it anything you want.
Example below:

105-238:PE111_Cabr_VNO_trinity_output grads$ makeblastdb -in PE111_Cabr_VNO_trinity_output.Trinity.fasta -title PE111_Cabr_VNO_trinity_output -dbtype nucl -out PE111_Cabr_VNO_trinity_output.Trinity.aa


Building a new DB, current time: 05/26/2016 13:18:13
New DB name:   /Users/grads/Ramatu_new_transcriptome/PE111_Cabr_VNO_trinity_output/PE111_Cabr_VNO_trinity_output.Trinity.aa
New DB title:  PE111_Cabr_VNO_trinity_output
Sequence type: Nucleotide
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 90410 sequences in 5.0665 seconds.

You should get results in your folder (make sure to refresh).

"nhr is the header file, nin is the index file and nsq is the sequence file. You dont really have to know this. Blast 'just' needs this"
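
If you would rather check from the command line than refresh the folder, the sketch below looks for the three files named above. It is only an illustration: blast_db_complete is a made-up helper, and it assumes a nucleotide database small enough to fit in a single volume (a protein database would have phr, pin and psq files instead).

```shell
# Check that makeblastdb left the three nucleotide database files for a given -out name.
# blast_db_complete is a hypothetical helper; pass it the same value you gave to -out.
blast_db_complete() {
    db="$1"
    for ext in nhr nin nsq; do
        if [ ! -f "$db.$ext" ]; then
            echo "missing: $db.$ext"
            return 1
        fi
    done
    echo "ok: $db"
}
```

For the example above, that would be: blast_db_complete PE111_Cabr_VNO_trinity_output.Trinity.aa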

Blasting against the transcriptome database:
  • cd back into your main folder
Type in the following:
blastn (use blastn for a nucleotide sequence and blastp for a protein sequence) -query (FASTA file you want to search for in the transcriptome) -db (name of the database you created) -out (anything you want your output file to be called)

105-238:PE111_Cabr_VNO_trinity_output grads$ cd
105-238:~ grads$ cd Ramatu_new_transcriptome
105-238:Ramatu_new_transcriptome grads$ ls
Mc1r_Fasta.txt PE111_Cabr_VNO_trinity_output new_transcriptomes
105-238:Ramatu_new_transcriptome grads$ blastn -query Mc1r_Fasta.txt -db /Users/grads/Ramatu_new_transcriptome/PE111_Cabr_VNO_trinity_output/PE111_Cabr_VNO_trinity_output.Trinity.aa -out Mc1r_blastn_1.txt
105-238:Ramatu_new_transcriptome grads$ 

If done correctly, you should get a new result file. (Make sure to refresh)
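
A result file is written even when nothing matched, so it is worth checking whether it contains any hits. A minimal sketch, assuming a single query and the default blastn report format, which prints "No hits found" when the search comes up empty; blast_has_hits is a made-up helper name.

```shell
# Say whether a default-format BLAST report contains hits.
# blast_has_hits is a hypothetical helper; it assumes one query per report.
blast_has_hits() {
    result="$1"
    if [ ! -s "$result" ]; then
        echo "no output: $result"
        return 1
    fi
    if grep -q "No hits found" "$result"; then
        echo "no hits: $result"
        return 1
    fi
    echo "hits: $result"
}
```

For the example above, that would be: blast_has_hits Mc1r_blastn_1.txt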



Thursday, March 3, 2016

installing FastCodeML

Trying out a program called FastCodeML to try to get faster estimates of my PAML simulations. Joe Parker wrote a nice little summary on this. However, contrary to Joe's "nothing a little make make install can't handle", it actually...can't handle it.

Installation:
This is optimized for a Linux environment so if you have a Linux machine, you can just run the binary in the downloaded folder.

First visit: ftp://ftp.vital-it.ch/tools/FastCodeML/ and download FastCodeML-1.1.0.tar.gz. (Not completely obvious at first).

Navigate to the directory and convince yourself that you can't run the binary.
105-238:FastCodeML-1.1.0 loloyohe$ file fast
fast: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.18, stripped
Nope, sorry... we have to build from source.
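
The tell here is the "ELF" in the output of file: ELF is the Linux executable format, while a native Mac binary shows up as "Mach-O". A small sketch of that check, with a made-up helper name (binary_platform) that just pattern-matches the text file prints:

```shell
# Guess the target platform from the output of `file`.
# binary_platform is a hypothetical helper; it only looks for the format name.
binary_platform() {
    case "$1" in
        *ELF*)    echo "linux" ;;
        *Mach-O*) echo "macos" ;;
        *)        echo "unknown" ;;
    esac
}
```

Usage: binary_platform "$(file fast)"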

The installation guidelines are vague and basically useless.
Here are the install instructions:
Requirements to generate the executable:
* C++ compiler, e.g. GCC 4
* CMake 2.8.0 (including ccmake) or later recommended, although compilation possible without
* Boost::Spirit, see http://boost-spirit.com/home/
* Reasonably new BLAS implementation (e.g. OpenBLAS, Goto2, ACML, MKL); packages from various Linux distributions can be used, but this deteriorates performance; recommended: OpenBLAS (http://xianyi.github.io/OpenBLAS/) or Intel MKL
* Reasonably new LAPACK library (e.g. original LAPACK or ACML, MKL); packages from various Linux distributions can be used, but this deteriorates performance

How to generate the FastCodeML executable:
* Generate BLAS if necessary
* Generate LAPACK if necessary
* Generate NLopt library (http://ab-initio.mit.edu/wiki/index.php/NLopt)
* Edit CMakeLists.txt if necessary
* Set paths for libraries (change and execute SETPATHS)
* "ccmake ." and switch USE_MPI and USE_OPENMP on/off (other default settings should be ok)
* make will create an executable "fast"

Computer system:
* Linux preferred, but sources are portable to other platforms

And here are further "detailed instructions" in the extra_install_doc folder for "Mac_Pro":
*) Create NLOpt in ~/lib
./configure --prefix=/home/mac/lib/nlopt
make
make install

*) Copy BLAS and LAPACK to ~/lib
cp libblas.a ~/lib 
cp liblapack.a ~/lib

*) Setting environment variables
export BLAS_LIB_DIR="/home/mac/lib" #we want to flexible here, hence we do not specify /usr/lib
export LAPACK_LIB_DIR="/home/mac/lib" #we want to flexible here, hence we do not specify /usr/lib
export NLOPT_LIB_DIR="/home/mac/lib/nlopt/lib"
export NLOPT_INCLUDE_DIR="/home/mac/lib/nlopt/include"
export MATH_LIB_NAMES="blas;lapack;lapack;blas;gfortranbegin;gfortran"
#export MPI_INCLUDE_PATH="/usr/include" #might not be necessary if CXX set correctly
#export MPI_LIBRARY="/usr/lib" #might not be necessary if CXX set correctly
export CXX="/usr/bin/mpicxx.mpich2" #remember to run as mpirun.mpich2 -np 2 ./fast

This is bringing back a past "lapack" nightmare I had several months ago. Looks like it's going to be great fun.

We have a "homebrew" environment. While mostly good, it has confused installers about where our libraries are and which compiler to use. That, along with Mavericks OS X being horribly configured, makes this extra challenging. I first tried to brew install everything they ask for. Basically, if you have a homebrew setup, don't do anything the instructions tell you.

I could really only get boost installed:
105-238:FastCodeML-1.1.0 loloyohe$ brew install boost
LAPACK and BLAS fail miserably. Luckily, these seem to be optional, so we can turn them off when we configure the build. Instead of a normal ./configure and make install setup, this program is set up for ccmake, the interactive front end to cmake (note: NOT plain cmake). Fun! This means that when you run it, it looks for "CMakeLists.txt". This is the file you edit.

After much hair-pulling, here is what my CMakeLists.txt looks like (everything else stayed the same):
# Get the configuration switches
OPTION(USE_LAPACK         "Use BLAS/LAPACK" OFF)
OPTION(USE_MKL_VML         "Use Intel MKL vectorized routines" OFF)
OPTION(USE_OPENMP         "Compile with OpenMP support" OFF)
OPTION(USE_MPI             "Use MPI for high level parallelization" OFF)
if(NOT WIN32)
OPTION(BUILD_NOT_SHARED   "Build FastCodeML not shared" OFF)
endif(NOT WIN32)
OPTION(BUILD_SEARCH_MPI   "Search for MPI installation?" OFF)
OPTION(USE_ORIGINAL_PROPORTIONS "Use the original CodeML proportion definition" OFF)
SET(USE_LIKELIHOOD_METHOD "Original" CACHE STRING "Select the type of likelihood computation method: Original, NonRecursive, FatVector, DAG")
SET_PROPERTY(CACHE USE_LIKELIHOOD_METHOD PROPERTY STRINGS Original NonRecursive FatVector DAG)
OPTION(USE_IDENTITY_MATRIX "Force identity matrix when time is zero" OFF)
OPTION(USE_CPV_SCALING "Scale conditional probability vectors to avoid under/overflow" OFF)

Now "ccmake" was a new experience for me. It wasn't working when I just typed "ccmake ." as instructed. I don't know why. But anyways, if I did:
105-238:FastCodeML-1.1.0 loloyohe$ ccmake /Applications/FastCodeML-1.1.0/
then a new interface shows up in the terminal, basically showing what I had switched on and off. It was not intuitive to me what was happening, but I just kept pressing "enter" "n" and "g" until finally I got out of the screen. Makefile, can I haz? I can haz!!!
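
Since ccmake and cmake read the same CMakeLists.txt, the switches can probably also be set without the interactive screen by passing -D options to plain cmake. This is an untested sketch for this particular package: the option names are copied from the CMakeLists.txt above, and fastcodeml_flags is a made-up helper name.

```shell
# Print one -DNAME=OFF flag per switch; option names are copied from CMakeLists.txt above.
# fastcodeml_flags is a hypothetical helper name.
fastcodeml_flags() {
    for opt in USE_LAPACK USE_MKL_VML USE_OPENMP USE_MPI; do
        printf '%s\n' "-D${opt}=OFF"
    done
}

# From the source directory, assuming cmake is installed:
#   cmake $(fastcodeml_flags) .
#   make
```

Values passed with -D go into the CMake cache and take precedence over the defaults given in the OPTION lines.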

105-238:FastCodeML-1.1.0 loloyohe$ make
Scanning dependencies of target fast
[  4%] Building CXX object CMakeFiles/fast.dir/fast.cpp.o
[  8%] Building CXX object CMakeFiles/fast.dir/CmdLine.cpp.o
[ 12%] Building CXX object CMakeFiles/fast.dir/Genes.cpp.o
[ 16%] Building CXX object CMakeFiles/fast.dir/Phylip.cpp.o
[ 20%] Building CXX object CMakeFiles/fast.dir/PhyloTree.cpp.o
[ 24%] Building CXX object CMakeFiles/fast.dir/Newick.cpp.o
[ 28%] Building CXX object CMakeFiles/fast.dir/TreeNode.cpp.o
[ 32%] Building CXX object CMakeFiles/fast.dir/BayesTest.cpp.o
[ 36%] Building CXX object CMakeFiles/fast.dir/FillMatrix.cpp.o
[ 40%] Building CXX object CMakeFiles/fast.dir/Forest.cpp.o
[ 44%] Building CXX object CMakeFiles/fast.dir/TransitionMatrix.cpp.o
[ 48%] Building CXX object CMakeFiles/fast.dir/BranchSiteModel.cpp.o
/Applications/FastCodeML-1.1.0/BranchSiteModel.cpp:381:27: warning: comparison of unsigned expression < 0 is always false
      [-Wtautological-compare]
        else if(aValidLen < 0)
                ~~~~~~~~~ ^ ~
1 warning generated.
[ 52%] Building CXX object CMakeFiles/fast.dir/ProbabilityMatrixSet.cpp.o
[ 56%] Building CXX object CMakeFiles/fast.dir/FatVectorTransform.cpp.o
[ 60%] Building CXX object CMakeFiles/fast.dir/CodonFrequencies.cpp.o
[ 64%] Building CXX object CMakeFiles/fast.dir/AlignedAllocator.cpp.o
/Applications/FastCodeML-1.1.0/AlignedAllocator.cpp:22:10: fatal error: 'malloc.h' file not found
#include <malloc.h>
         ^
1 error generated.
make[2]: *** [CMakeFiles/fast.dir/AlignedAllocator.cpp.o] Error 1
make[1]: *** [CMakeFiles/fast.dir/all] Error 2

make: *** [all] Error 2
We are getting closer. Now it's time to get hacky.
Basically, comment out the malloc.h include in anything that uses it.

Open up "AlignedAllocator.cpp".
Change: 
#include <malloc.h>
To:
//#include <malloc.h>
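
The same edit can be scripted if more than one file needs it. A minimal sketch with a made-up helper name (comment_out_malloc); it writes to a temporary file rather than using sed -i, because in-place editing takes different arguments on Mac and Linux.

```shell
# Comment out the malloc.h include in one source file.
# comment_out_malloc is a hypothetical helper name.
comment_out_malloc() {
    src="$1"
    # Prefix the matching line with //; lines already commented out are left alone
    sed 's|^#include <malloc\.h>|//&|' "$src" > "$src.tmp" && mv "$src.tmp" "$src"
}

# To see which files include it: grep -l 'malloc.h' *.cpp *.h
```

Usage: comment_out_malloc AlignedAllocator.cpp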

Try again!
105-238:FastCodeML-1.1.0 loloyohe$ make
Scanning dependencies of target fast
[  4%] Building CXX object CMakeFiles/fast.dir/AlignedAllocator.cpp.o
[  8%] Building CXX object CMakeFiles/fast.dir/HighLevelCoordinator.cpp.o
[ 12%] Building CXX object CMakeFiles/fast.dir/CodeMLoptimizer.cpp.o
[ 16%] Building CXX object CMakeFiles/fast.dir/ForestExport.cpp.o
[ 20%] Building CXX object CMakeFiles/fast.dir/ParseParameters.cpp.o
[ 24%] Building CXX object CMakeFiles/fast.dir/VerbosityLevels.cpp.o
[ 28%] Building CXX object CMakeFiles/fast.dir/DAGScheduler.cpp.o
[ 32%] Building CXX object CMakeFiles/fast.dir/TreeAndSetsDependencies.cpp.o
[ 36%] Building CXX object CMakeFiles/fast.dir/WriteResults.cpp.o
[ 40%] Linking CXX executable fast
[100%] Built target fast

There, that's better.
105-238:FastCodeML-1.1.0 loloyohe$ file fast
fast: Mach-O 64-bit executable x86_64

More on execution later.