extend_mOTUs_DB

1. Installation

#download extender archive
wget https://motu-tool.org/data/extend_mOTUs_DBv3.tar.gz
#decompress
tar -xzvf extend_mOTUs_DBv3.tar.gz
cd extend_mOTUs_DBv2
#create a conda environment with dependencies
conda env create -f env/update_mOTUs_v2.yaml
cd ..
source activate update_mOTUs_v2
#mOTUs should be installed without conda
git clone https://github.com/motu-tool/mOTUs_v2.git
cd mOTUs_v2
python setup.py
python test.py
cd ..
MOTUS_DIR="$(pwd)/mOTUs_v2/"

2. Running

2.1. Required input files

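The extend_mOTUs_DB/TEST/ directory shows what you have to prepare as input. The commands in the following steps use these files from it: a list of genome identifiers (genomes.list), one FASTA file per genome (genomes/<id>.fasta), and a taxonomy file (taxonomy_file.txt).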
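
Before starting the extraction, it can help to confirm that every identifier in the genome list has a matching FASTA file. The following is a minimal sketch; check_inputs is a hypothetical helper that is not part of extend_mOTUs_DB, and it assumes the <id>.fasta naming used in the TEST directory.

```shell
# Sketch: verify that every genome ID in the list has a non-empty FASTA file.
# check_inputs is a hypothetical helper, not part of extend_mOTUs_DB.
check_inputs() {
    local list_file="$1" genomes_dir="$2" missing=0 id
    while IFS= read -r id || [ -n "$id" ]; do
        [ -z "$id" ] && continue              # skip blank lines
        if [ ! -s "$genomes_dir/$id.fasta" ]; then
            echo "missing or empty: $genomes_dir/$id.fasta" >&2
            missing=$((missing + 1))
        fi
    done < "$list_file"
    # Return 0 when nothing is missing, 1 otherwise.
    return "$((missing > 0))"
}

# Usage with the test data referenced in this README:
# check_inputs extend_mOTUs_DB/TEST/genomes.list extend_mOTUs_DB/TEST/genomes
```

The function only reports problems on stderr, so it can be placed in front of the extraction loop without changing its output.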

2.2. Run MG extraction

for i in $(cat extend_mOTUs_DB/TEST/genomes.list); do ./extend_mOTUs_DB/SCRIPTS/extend_mOTUs_addGenome.sh extend_mOTUs_DB/TEST/genomes/$i.fasta $i newdbfolder extend_mOTUs_DB/SCRIPTS/ $MOTUS_DIR; done

This call extracts the marker genes (MGs) from each genome listed in genomes.list.
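
The one-line loop above keeps going if one genome fails. A possible variant that stops at the first failure is sketched below. extract_all and the ADD_GENOME variable are hypothetical names introduced for this example; the five positional arguments passed to extend_mOTUs_addGenome.sh are copied from the command above.

```shell
# Sketch: run MG extraction for every genome and stop at the first failure.
# extract_all and ADD_GENOME are hypothetical names, not part of extend_mOTUs_DB.
ADD_GENOME="${ADD_GENOME:-./extend_mOTUs_DB/SCRIPTS/extend_mOTUs_addGenome.sh}"

extract_all() {
    local list_file="$1" genomes_dir="$2" out_dir="$3" scripts_dir="$4" motus_dir="$5" id
    while IFS= read -r id || [ -n "$id" ]; do
        [ -z "$id" ] && continue              # skip blank lines
        # stdin is redirected so the script cannot consume the genome list.
        "$ADD_GENOME" "$genomes_dir/$id.fasta" "$id" "$out_dir" "$scripts_dir" "$motus_dir" < /dev/null || {
            echo "extraction failed for $id" >&2
            return 1
        }
    done < "$list_file"
}

# Usage, equivalent to the loop above:
# extract_all extend_mOTUs_DB/TEST/genomes.list extend_mOTUs_DB/TEST/genomes newdbfolder extend_mOTUs_DB/SCRIPTS/ "$MOTUS_DIR"
```

Reading the list with while/read instead of $(cat ...) also avoids word splitting on unusual identifiers.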

2.3. Run DB generation

./extend_mOTUs_DB/SCRIPTS/extend_mOTUs_generateDB.sh extend_mOTUs_DB/TEST/genomes.list newdbname extend_mOTUs_DB/TEST/taxonomy_file.txt newdbfolder extend_mOTUs_DB/SCRIPTS/ $MOTUS_DIR

This call performs the clustering and creates a new database, which can be found in newdbfolder/newdbname/db_mOTU.

2.4. Make DB available for mOTUs

Copy the new database into the mOTUs_v2 directory:

cp -r newdbfolder/newdbname/db_mOTU $MOTUS_DIR

3. Testing

Test that the database was updated. The extend_mOTUs_DB/TEST/ directory contains a FASTQ file for this purpose (test1_single.fastq). Run:

$MOTUS_DIR/motus profile -g 1 -c -s extend_mOTUs_DB/TEST/test1_single.fastq 

If the database was updated correctly, you will see:

unknown Roseburia [meta_mOTU_v2_7798]	0
unknown Firmicutes [meta_mOTU_v2_7799]	0
unknown Clostridiales [meta_mOTU_v2_7800]	0
-1	0
Chryseobacterium indologenes [newdbname_1]	0
unknown Sphingobacterium [newdbname_2]	2
unknown Leadbetterella [newdbname_3]	0

The last three rows are the new mOTUs.
extend_mOTUs_DB/TEST/test1_single.fastq contains reads simulated from newdbname, so the new species ([newdbname_2], with a count of 2) can now be profiled.
Note that these mOTUs are specific to the genomes that were selected in step 2.1.
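
To pick the new mOTUs out of a long profile, the output can be filtered on the database name. This is a minimal sketch; list_new_motus is a hypothetical helper, and it assumes the tab-separated "name [dbname_N]<TAB>count" layout shown above.

```shell
# Sketch: print only the new mOTUs (tagged [<dbname>_N]) with a non-zero count.
# list_new_motus is a hypothetical helper; it reads a profile on stdin.
list_new_motus() {
    awk -F '\t' -v tag="[${1}_" 'index($1, tag) > 0 && $NF + 0 > 0 { print }'
}

# Usage with the test command from this section:
# "$MOTUS_DIR"/motus profile -g 1 -c -s extend_mOTUs_DB/TEST/test1_single.fastq | list_new_motus newdbname
```

With the example output above, only the "unknown Sphingobacterium [newdbname_2]" row would be printed, since it is the only new mOTU with a non-zero count.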