By globdbadmin, 1 April, 2026

We've just released version 0.1.1 of the amino acid sequence toolkit (AASTK). 

This update fixes some minor bugs we found in the first weeks after release 0.1.0, such as a missing CASM plot, the formatting of single CUGO plots, and a lack of clarity around the --metadata flag. A full changelog is available on the AASTK GitHub page.

By globdbadmin, 25 March, 2026

Associating environmental metadata with microbial genomes retrieved from metagenomes is a notoriously hard problem. Metadata for the source sample(s) is often limited and can be hard to retrieve at scale. We circumvented this problem by using detections of GlobDB genomes in the 700k+ metagenomes of the Sandpiper database, which was created with SingleM by Ben Woodcroft and colleagues.

By globdbadmin, 11 March, 2026

We are very happy to announce the first release of the amino acid sequence toolkit (AASTK).

AASTK is a suite of tools designed to leverage the genomic diversity captured by the GlobDB to create and analyze datasets of homologous proteins. Current functionality of AASTK includes tree-of-life scale dataset building, curation, and maintenance, as well as clustering of protein datasets, genomic context analysis, and metadata retrieval.