sourmash

sourmash is a command-line tool and Python library for computing MinHash sketches from DNA sequences, comparing them to each other, and plotting the results. This allows you to estimate sequence similarity between even very large data sets quickly and accurately.

Information

Usage

List the versions of sourmash that are available:

$ module avail sourmash

Load one version into your environment and run it:

$ module load sourmash/1.0
$ sourmash
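
Once the module is loaded, a typical session follows the three steps named in the description: compute sketches, compare them, and plot the result. The following is a minimal sketch, assuming the compute, compare, and plot subcommands of the sourmash 1.x series (confirm them with the tool's built-in help). The run_sourmash_demo function, the genomes directory, and the cmp output name are hypothetical and only illustrate the flow.

```shell
# Hypothetical helper: compute -> compare -> plot for a directory of FASTA files.
# Subcommand names are assumed from the sourmash 1.x series; check the tool's help.
run_sourmash_demo() {
    input_dir=${1:-genomes}   # assumed directory holding *.fa files
    out=${2:-cmp}             # assumed name for the comparison matrix

    # Refuse to run if the module has not been loaded yet.
    if ! command -v sourmash >/dev/null 2>&1; then
        echo "sourmash not on PATH; run 'module load sourmash/1.0' first" >&2
        return 1
    fi

    # 1. Compute a MinHash signature (.sig file) for each input sequence file.
    # 2. Compare all signatures and save the similarity matrix.
    # 3. Plot the saved matrix.
    sourmash compute "$input_dir"/*.fa &&
        sourmash compare ./*.sig -o "$out" &&
        sourmash plot "$out"
}
```

After loading the module, call it as run_sourmash_demo genomes cmp from a scratch directory, since the signature files are expected to be written to the current directory.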

Installation

Notes from the sysadmin during installation:

$ module load python/3.5
$ sudo mkdir -p /export/apps/sourmash/1.0
$ sudo chown aorth /export/apps/sourmash/1.0
$ python -m venv /export/apps/sourmash/1.0
$ . /export/apps/sourmash/1.0/bin/activate
$ pip install --upgrade pip setuptools
$ pip install Cython
$ pip install jupyter jupyter_client ipython pandas matplotlib scipy scikit-learn khmer
$ sudo yum install gcc-c++
$ pip install sourmash
$ sudo chown -R root:root /export/apps/sourmash
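
A quick sanity check after the steps above can confirm that the virtual environment is intact before ownership is handed to root. This is a sketch only: the check_sourmash_install function is hypothetical, and it assumes pip placed a sourmash launcher in the environment's bin directory.

```shell
# Hypothetical post-install check for the virtual environment created above.
check_sourmash_install() {
    prefix=${1:-/export/apps/sourmash/1.0}

    # The venv should contain its own Python interpreter.
    if [ ! -x "$prefix/bin/python" ]; then
        echo "no venv python under $prefix" >&2
        return 1
    fi

    # Assumption: pip installs the sourmash launcher into the venv's bin directory.
    if [ ! -x "$prefix/bin/sourmash" ]; then
        echo "no sourmash launcher under $prefix" >&2
        return 1
    fi

    # Print the installed package metadata (name, version, location).
    "$prefix/bin/pip" show sourmash
}
```

Run it as check_sourmash_install with no arguments to test the default prefix, or pass another prefix when installing a newer version alongside.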