Recent developments in long-read and single-molecule sequencing have brought new technological features. One of these is offered by sequencers from Oxford Nanopore Technologies (ONT), which allow the sequencing of an individual molecule to be aborted at any point after reading has begun. This capability enables what is commonly referred to as selective sequencing. Such an interactive way of measuring a sequence opens up many new use cases, such as targeting specific regions or filtering out unwanted background, and poses interesting bioinformatics problems.
Our research focuses on the fast and accurate classification of a read while it is still being sequenced, in order to decide whether its sequencing should continue. Depending on the use case and on the accuracy of this decision, this either avoids unnecessary sequencing costs or allows more uniform sampling of a diverse and biased population of sequences. We employ compressed dynamic data structures and alignment techniques to record information about the already sequenced population and to select newly sequenced reads according to a given objective.
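To make the decision loop concrete, the following is a minimal sketch of the idea only, not the method described above: it replaces compressed dynamic data structures and alignment with plain Python sets and exact k-mer voting. The class `SelectiveSampler`, the parameter `target_reads`, the k-mer length `K`, and the toy sequences are all hypothetical names and values chosen for illustration. The objective assumed here is uniform sampling: a read prefix is accepted only while its best-matching reference has received fewer than `target_reads` reads.

```python
from collections import Counter

K = 4  # k-mer length; a hypothetical toy value for short example sequences


def kmers(seq, k=K):
    """Return the set of all k-mers contained in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


class SelectiveSampler:
    """Toy stand-in for a read classifier plus a record of what was sequenced."""

    def __init__(self, references, target_reads):
        # Assumption: a plain k-mer set per reference replaces a real index.
        self.index = {name: kmers(seq) for name, seq in references.items()}
        self.target_reads = target_reads
        self.accepted = Counter()  # reads accepted so far, per reference

    def classify(self, prefix):
        """Return the reference sharing the most k-mers with prefix, or None."""
        query = kmers(prefix)
        best, best_hits = None, 0
        for name, ref_kmers in self.index.items():
            hits = len(query & ref_kmers)
            if hits > best_hits:
                best, best_hits = name, hits
        return best

    def decide(self, prefix):
        """Return 'continue' to keep sequencing the read, 'reject' to abort it."""
        label = self.classify(prefix)
        if label is None or self.accepted[label] >= self.target_reads:
            return "reject"
        self.accepted[label] += 1
        return "continue"


references = {"A": "ACGTACGGTCA", "B": "TTTTGGGGCCCC"}
sampler = SelectiveSampler(references, target_reads=1)
prefixes = ["ACGTACG", "ACGGTCA", "TTGGGGC", "AAAAAAA"]
decisions = [sampler.decide(p) for p in prefixes]
print(decisions)  # ['continue', 'reject', 'continue', 'reject']
```

In this toy run the second read is rejected because reference A has already reached its quota, and the last read is rejected because it matches no reference. A practical system would have to make the same decision from noisy data within the short time window before the molecule has passed through the pore.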