sourmash is a command-line tool and Python library for computing MinHash sketches from DNA sequences, comparing them to each other, and plotting the results. This allows you to estimate sequence similarity between even very large data sets quickly and accurately.
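To make the idea of a MinHash sketch concrete, here is a minimal, self-contained Python sketch of the technique. It is not sourmash's own code or API: the function names (`kmers`, `sketch`, `estimate_jaccard`), the use of MD5 as the hash function, and the default k-mer and sketch sizes are all assumptions chosen for illustration. Each sequence is reduced to the smallest hash values of its k-mers, and similarity is estimated from those small sets rather than from the full sequences.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping k-mer of a DNA sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def sketch(seq, k=5, size=50):
    """Return a bottom-k MinHash sketch: the `size` smallest k-mer hashes.

    MD5 is used only because it ships with Python; it is an
    illustrative choice, not what sourmash uses internally.
    """
    hashes = {int(hashlib.md5(km.encode()).hexdigest(), 16)
              for km in kmers(seq, k)}
    return sorted(hashes)[:size]


def estimate_jaccard(a, b, size=50):
    """Estimate Jaccard similarity from two bottom-k sketches."""
    # Sketch of the union: the smallest hashes across both sketches.
    merged = sorted(set(a) | set(b))[:size]
    if not merged:
        return 0.0
    shared = set(a) & set(b)
    # Fraction of the union sketch that is present in both inputs.
    return sum(1 for h in merged if h in shared) / len(merged)


if __name__ == "__main__":
    s1 = sketch("ACGTACGTGGCATTACGGA")
    s2 = sketch("ACGTACGTGGCATTACGGT")
    print(estimate_jaccard(s1, s2))
```

Because only a fixed number of hashes is kept per sequence, comparing two sketches costs the same regardless of how large the original data sets are, which is what makes the approach fast.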
See which versions of sourmash are available:
$ module avail sourmash
Load one version into your environment and run it:
$ module load sourmash/1.0
$ sourmash
Notes from the sysadmin during installation:
$ module load python/3.5
$ sudo mkdir -p /export/apps/sourmash/1.0
$ sudo chown aorth /export/apps/sourmash/1.0
$ python -m venv /export/apps/sourmash/1.0
$ . /export/apps/sourmash/1.0/bin/activate
$ pip install --upgrade pip setuptools
$ pip install Cython
$ pip install jupyter jupyter_client ipython pandas matplotlib scipy scikit-learn khmer
$ sudo yum install gcc-c++
$ pip install sourmash
$ sudo chown -R root:root /export/apps/sourmash