bioinformatics · Coding · Nanopore · Plotting · Python

Comparing the pore status of flow cells

A (for us) very useful diagnostic of a nanopore run is the status of the pores: whether they are sequencing (‘single_pore’), saturated, unavailable or multiple. It is also interesting to see how fast you are losing pores or killing your flow cell with e.g. a particularly blocky library. Fortunately, these metrics have recently started being saved…

Coding · Nanopore

Removing accidental 2D/1D^2 reads from an alignment

In surpyvor I collect scripts which can be helpful for structural variant analysis. Today I added a “purge2d” subcommand to remove accidental 2D/1D^2 nanopore reads from a BAM file. Most typically we do 1D ligation preps, but in some cases the complement fragment is sequenced directly after the template and the two are not recognized as separate molecules,…

Nanopore · Plotting · Python

Methplotlib examples

We recently published methplotlib, a tool for the visualization and analysis of modified nucleotides from nanopore sequencing. It works downstream of tools like nanopolish, nanocompore and direct methylation calling by the guppy basecaller. More information can be found on GitHub. Feedback, suggestions, reporting problems and feature requests are very much appreciated. Below are some example…

Coding · Nanopore · Plotting · Python

Comparing the end reason of reads in a nanopore experiment

Since a recent version of MinKNOW, the software controlling a nanopore sequencer, a sequencing_summary file is created before basecalling, in which one column is of particular interest: the end_reason. Although I’m not yet sure what each value means, I believe it gives per read the reason why the software decided to stop sequencing here, mostly…
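As a rough illustration of how such a column could be summarized, here is a minimal sketch that tallies the end_reason values per read using only the Python standard library. It assumes the sequencing_summary file is tab-separated with a header row containing an end_reason column; the function name and the tiny in-memory example (including the values shown) are illustrative and not taken from the post.

```python
import csv
import io
from collections import Counter


def count_end_reasons(handle):
    """Count how often each end_reason value occurs in a tab-separated summary."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row["end_reason"] for row in reader)


# Tiny made-up table standing in for a real sequencing_summary file
example = io.StringIO(
    "read_id\tend_reason\n"
    "read1\tsignal_positive\n"
    "read2\tsignal_positive\n"
    "read3\tunblock_mux_change\n"
)
counts = count_end_reasons(example)
for reason, n in counts.most_common():
    print(reason, n)
```

For a real run, the same function could be given an open file handle instead of the in-memory example, and the resulting counts compared between experiments.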

bioinformatics · Coding · Python

Stacked bar chart of FILTER information from a multi-sample VCF

I wanted to make a stacked bar chart to show the number of variants with a certain FILTER status per sample from a multi-sample VCF. Nowadays I make all plots with plotly, because it’s fast and convenient to write, and the dynamic HTML output makes it easier afterwards to select the bits I want to show. In…
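The counting step behind such a chart might look roughly like the sketch below. This is not the code from the post: it assumes an uncompressed, tab-separated VCF, takes the record-level FILTER column, and counts a record for a sample only when that sample carries at least one non-reference allele (GT is assumed to be the first FORMAT field). The function name and the example records are hypothetical.

```python
import io
from collections import Counter, defaultdict


def filter_counts_per_sample(handle):
    """Return {sample: Counter({FILTER value: number of non-reference calls})}."""
    counts = defaultdict(Counter)
    samples = []
    for line in handle:
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        filter_status = fields[6]
        for sample, call in zip(samples, fields[9:]):
            genotype = call.split(":")[0]  # assumption: GT is the first FORMAT field
            alleles = genotype.replace("|", "/").split("/")
            if any(allele not in ("0", ".") for allele in alleles):
                counts[sample][filter_status] += 1
    return counts


# Made-up three-record, two-sample VCF
example_vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2\n"
    "chr1\t100\t.\tA\tT\t.\tPASS\t.\tGT\t0/1\t0/0\n"
    "chr1\t200\t.\tG\tC\t.\tLowQual\t.\tGT\t1/1\t0/1\n"
    "chr1\t300\t.\tT\tG\t.\tPASS\t.\tGT\t./.\t1/1\n"
)
per_sample = filter_counts_per_sample(example_vcf)
for sample, filters in per_sample.items():
    print(sample, dict(filters))
```

With plotly, counts like these could then be drawn as one `go.Bar` trace per FILTER value, with `barmode="stack"` set in the figure layout.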

bioinformatics

bcftools concat: Failed to open variants.vcf.gz: could not load index

If you are like me and like to massively parallelize jobs, then you may come across the following, initially cryptic error when using bcftools concat with thousands of vcf files:
bcftools concat -a *.vcf.gz | bcftools sort -o all_variants.vcf
Failed to open a_certain_variant_file.vcf.gz: could not load index
I believe the problem is that too many files…
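One possible way around this, assuming the cause really is the number of files opened at once (the excerpt is cut off before it says so), is to concatenate in smaller batches first. The sketch below only builds the bcftools command lines and does not run them; the helper names, batch size and `batch_N.vcf.gz` output names are hypothetical.

```python
def batches(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def concat_commands(vcfs, batch_size):
    """Build one 'bcftools concat' command (as an argument list) per batch."""
    commands = []
    for n, batch in enumerate(batches(vcfs, batch_size)):
        commands.append(
            ["bcftools", "concat", "-a", "-O", "z", "-o", f"batch_{n}.vcf.gz", *batch]
        )
    return commands


# Hypothetical input: five indexed VCF files, concatenated two at a time
example_files = [f"variants_{i}.vcf.gz" for i in range(5)]
for command in concat_commands(example_files, batch_size=2):
    print(" ".join(command))
```

Each intermediate file would need to be indexed (for example with `bcftools index`) before the batches are concatenated with `-a` in a final step and sorted as in the original command.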

Uncategorized

A very interesting false positive variant

Today I was, just like any other day, looking at structural variants (SVs) from Oxford Nanopore PromethION data, aligned using minimap2 and called with Sniffles. I came across a 100 Mb pericentromeric inversion of chr12, called in multiple individuals and visible in the alignment in all my samples. The region is not crazily repetitive (just…