This is the first part of a series of posts on the RAMMCAP suite of bioinformatics tools.
The Rapid Analysis of Multiple Metagenomes with a Clustering and Annotation Pipeline (RAMMCAP) is a tool used for analysis of metagenomic data. What it does is try to cluster and functionally annotate a set of metagenomic data, which means it takes the data, groups like pieces of data together into clusters, and then tries to figure out what the various clusters do. It’s made up of several tools, only two of which I’ve actually used and I’ll talk about later. Camera, the organization behind RAMMCAP, provides a web service where you can use RAMMCAP without installing it, but it has data limits that I’ll break very easily with my datasets plus they require registration, which seems non-functional right now (at least, I can’t get a new account set up and I’ve tried several times over the course of the last few days). So, I downloaded it a few weeks ago and worked on getting it to run over the course of several days. This post, and the others in this series, is a record of that process, including some initial miss-steps. If you have any questions or see other places where I stepped wrongly, leave a comment and let me know.
First off, the RAMMCAP download (found on the page linked above) is huge–the source code alone was a roughly 760 MB download, which extracts to around 3 gigs. That three gigs might contain some duplicate data–the folder structure is pretty disorganized and a casual glance shows a lot of folders with the same names. There are a lot of symlinks, though, so I could be wrong there. (The more I see, the more I’m convinced I’m right, though.)
Second, it looks like the source bundled a bunch of tools along with the main RAMMCAP code, including versions of BLAST, HMMER, and Metagene. That added a lot to the bulk of the download (roughly 450 MB). There’s also a huge amount of data here, including a version of the Pfam and TIGRFAM libraries (772 and 444 MB, respectively), and a couple other tools I haven’t heard of before that might be part of RAMMCAP.
Compiling, Phase 1
The README file in the main directory contains basic information on how to compile some of the tools including CD-HIT, ORF_FINDER, and CD-HIT-454, as well as the HMMERHEAD extension to the HMMER, which is optional. The instructions are pretty basic–just do the standard “make clean;make” and things should be good. I wrote a little build script to handle this, just in case I need to do this again for some reason. Everything seems to build fine with the exception of HMMERHEAD, but I’m just going to ignore that for now. Time for testing this puppy out.
Testing, Phase 1
The README indicates that there should be an examples folder somewhere with some basic test data I can use, but I don’t see it anywhere. Looking around…not seeing it. Turns out, it’s inside the rammcap directory inside the main directory.
Compiling, Phase 2
Inside the rammcap directory, I find a new README with some major differences from the one outside this directory, plus what looks like symlinks with the same names as some of the directories outside. Looks like they point to the same directories as the other ones, but I’ll recompile things, anyway, just in case. Good thing I wrote that build script.
Except that the build instructions aren’t the same–I don’t have to build CD-HIT-454, but I do need to make sure gnuplot and ImageMagick are installed. They are, which is good, because I’d either have to contact one of the tech guys to install it on this machine or I’d have to install it to my user directory, which I’ve done with several tools I need or don’t want to do without. Once I pull out that CD-HIT-454 reference, the build works fine.
Since my second round of testing was a much larger task, I’ll leave that for a later post.