sourmash is a command-line tool and Python library for computing MinHash sketches from DNA sequences, comparing them to each other, and plotting the results. This allows you to estimate sequence similarity between even very large data sets quickly and accurately.
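To give a feel for what a MinHash sketch is, here is a minimal, self-contained Python sketch of the general bottom-k MinHash idea. It is not sourmash's own implementation: the function names (`kmers`, `sketch`, `jaccard_estimate`), the use of SHA-1 as the hash function, and the default k-mer size and sketch size are all assumptions chosen for illustration.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping k-mer of a DNA sequence."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def sketch(seq, k=5, num_hashes=20):
    """Return the num_hashes smallest distinct k-mer hashes (a bottom-k sketch)."""
    hashes = {
        # Illustrative hash: first 8 bytes of SHA-1 as an integer (an assumption).
        int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")
        for kmer in kmers(seq, k)
    }
    return sorted(hashes)[:num_hashes]


def jaccard_estimate(a, b):
    """Estimate Jaccard similarity of two sequences from their sketches."""
    size = min(len(a), len(b))
    # Look only at the smallest hashes of the combined sketches ...
    merged = sorted(set(a) | set(b))[:size]
    if not merged:
        return 0.0
    # ... and count how many of them appear in both sketches.
    shared = set(a) & set(b)
    return sum(1 for h in merged if h in shared) / len(merged)


if __name__ == "__main__":
    s1 = sketch("ACGTACGTTAGCCGATAGGCTTAACGGATCCA")
    s2 = sketch("ACGTACGTTAGCCGATAGGCTTAACGGTTCCA")
    print(jaccard_estimate(s1, s2))
```

Because only a small, fixed number of hashes is kept per sequence, two sketches can be compared cheaply no matter how large the original data sets were.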
List the versions of sourmash that are available:
$ module avail sourmash
Load one version into your environment and run it:
$ module load sourmash/1.0
$ sourmash
Notes from the sysadmin during installation:
$ sudo mkdir /export/apps/sourmash
$ sudo chown aorth /export/apps/sourmash
$ scl enable devtoolset-2 bash
$ module load python/3.5.3
$ python -m venv /export/apps/sourmash/1.0
$ . /export/apps/sourmash/1.0/bin/activate
$ pip install Cython
$ pip install jupyter jupyter_client ipython pandas matplotlib scipy scikit-learn khmer
$ pip install sourmash
$ sudo chown -R root:root /export/apps/sourmash