SKiM: accurately classifying metagenomic ONT reads in limited memory
Trevor Schneggenburger, Jaroslav Zola
Published 2025 in bioRxiv
ABSTRACT
Motivation: Oxford Nanopore Technologies’ devices, such as MinION, permit affordable, real-time DNA sequencing, and come with targeted sequencing capabilities. Such capabilities create new challenges for metagenomic classifiers that must be computationally efficient yet robust enough to handle potentially erroneous DNA reads, while ideally inspecting only a few hundred bases of a read. Currently available DNA classifiers leave room for improvement with respect to classification accuracy, memory usage, and the ability to operate in targeted sequencing scenarios.
Results: We present SKiM: Short K-mers in Metagenomics, a new lightweight metagenomic classifier designed for ONT reads. Compared to state-of-the-art classifiers, SKiM requires only a fraction of memory to run, and can classify DNA reads with higher accuracy after inspecting only their first few hundred bases. To achieve this, SKiM introduces new data compression techniques to maintain a reference database built from short k-mers, and treats classification as a statistical testing problem.
Availability: SKiM source code, documentation and test data are available from: https://gitlab.com/SCoRe-Group/skim.
Contact: tcschneg@buffalo.edu
PUBLICATION RECORD
- Publication year
2025
- Venue
bioRxiv
- Publication date
2025-05-16
- Fields of study
Biology, Medicine, Computer Science, Environmental Science
- Source metadata
Semantic Scholar, PubMed