Nanopore · Python

NanoFilt using albacore sequencing_summary for quality filtering

Due to a discrepancy between the quality scores calculated from the reads and those from the sequencing_summary.txt from albacore, I added an option to NanoFilt to filter using the qualities specified in the sequencing_summary. NanoFilt now (v1.1.0) also optionally takes a --summary flag for the sequencing_summary file. As a nice bonus, it’s also faster! This added a…
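To make the idea of filtering on the summary qualities concrete, here is a minimal sketch. It is not NanoFilt's actual implementation: the function names are hypothetical, and the column names `read_id` and `mean_qscore_template` are assumptions about the layout of the albacore summary file, so check them against your own sequencing_summary.txt.

```python
import csv
import io


def read_ids_passing_quality(summary_handle, min_quality,
                             id_column="read_id",
                             quality_column="mean_qscore_template"):
    """Return the set of read ids whose summary quality is >= min_quality.

    The default column names are assumptions about the summary layout.
    """
    reader = csv.DictReader(summary_handle, delimiter="\t")
    return {row[id_column] for row in reader
            if float(row[quality_column]) >= min_quality}


def filter_fastq(fastq_handle, keep_ids):
    """Yield 4-line FASTQ records whose read id is in keep_ids."""
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # skip stray blank lines between records
        sequence = fastq_handle.readline()
        separator = fastq_handle.readline()
        qualities = fastq_handle.readline()
        # The read id is the first word after the leading '@'.
        read_id = header[1:].split()[0]
        if read_id in keep_ids:
            yield header + sequence + separator + qualities


# Tiny in-memory example standing in for real files.
summary = "read_id\tmean_qscore_template\nr1\t12.5\nr2\t6.0\n"
fastq = "@r1 extra\nACGT\n+\n!!!!\n@r2\nACGT\n+\n!!!!\n"
keep = read_ids_passing_quality(io.StringIO(summary), min_quality=10)
kept_records = list(filter_fastq(io.StringIO(fastq), keep))
```

Looking up qualities in the summary means the quality string of each read never has to be decoded, which is one plausible reason such an approach would be faster.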

Nanopore · Plotting

Calculated average quality vs. Albacore summary

I discussed earlier the difference between a) the calculated per read average basecall quality and b) the quality score given by albacore in the sequencing_summary file. Today I had a closer look at this difference and for one dataset I calculated the average quality from the fastq file and compared that to the accompanying sequencing_summary. The…

Nanopore · Plotting

Median basecall quality score to estimate accuracy

I’ve had a few posts already about basecall quality scores and how those compare to the percent identity of the reads. This post is again a short follow-up on those stories. Today I investigate whether the median basecall quality (Phred) score of the aligned fragments is a good or better estimator for the percent identity…

Nanopore · Plotting

The distribution of basecall quality scores

I also investigated the Oxford Nanopore quality scores in some of my previous posts, and this post is a follow-up of those. Today I investigate the distribution of the basecall quality scores and how the mean and median per read behave. At the bottom of this post is the script I used to generate the…

Nanopore · Plotting

Averaging basecall quality scores the right way

For my NanoPlot tool, I have been calculating the average basecall quality of a read by simply calculating the arithmetic mean of the Phred scores. I recently also added an option to generate plots based on the sequencing_summary.txt file generated by the albacore basecaller, which completely avoids parsing the fastq file and calculating the mean.…
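The excerpt stops before the alternative is described, so the following is only a sketch of the commonly used approach, not necessarily the exact code in NanoPlot: because Phred scores are logarithmic, convert each score to an error probability, average the probabilities, and convert the result back to a Phred score. The function names are hypothetical.

```python
import math


def arithmetic_mean_quality(quals):
    """Plain arithmetic mean of Phred scores (the simple approach)."""
    return sum(quals) / len(quals)


def probability_mean_quality(quals):
    """Average in error-probability space, then convert back to Phred."""
    error_probs = [10 ** (-q / 10) for q in quals]
    mean_error = sum(error_probs) / len(error_probs)
    return -10 * math.log10(mean_error)


# One low-quality base dominates the probability-based average.
quals = [10, 30]
simple = arithmetic_mean_quality(quals)
via_probability = probability_mean_quality(quals)
```

For a read with scores 10 and 30, the arithmetic mean is 20, while averaging the error probabilities gives roughly 13, which shows how much the simple mean can overstate quality when a read contains low-scoring bases.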

Nanopore · Plotting

Per base sequence content and quality (end of reads)

After my earlier posts investigating the sequence content and quality at the start of Oxford Nanopore sequencing reads, I also wanted to include some code to look at the end of reads. These functions are part of the unfinished tool nanoQC, in which I want to replicate some of the plots made by FastQC. Creating the same plots as…

Nanopore · Plotting

Oxford Nanopore basecall quality scores

A recent switch in Oxford Nanopore basecaller software (albacore v1.0.1) substantially improved the per-base quality scores, as mentioned in a previous post. I wondered if those quality scores are accurate. As shown below, the average base quality of a read is above 16. These scores are Phred-scaled quality scores, meaning they correspond to the -10*log10(Probability of incorrect…
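As a quick illustration of the Phred scale, the standard definition Q = -10*log10(P_error) can be inverted to see what a score of 16 would imply if the scores were accurate. The helper names below are hypothetical and serve only this example.

```python
import math


def phred_to_error_probability(quality):
    """Convert a Phred score to the error probability it encodes."""
    return 10 ** (-quality / 10)


def error_probability_to_phred(probability):
    """Convert an error probability to a Phred score."""
    return -10 * math.log10(probability)


# A Phred score of 16 corresponds to about a 2.5% chance of a wrong call.
error_at_q16 = phred_to_error_probability(16)
expected_accuracy = 1 - error_at_q16
```

Under this definition, a score of 16 corresponds to an expected per-base accuracy of about 97.5%; whether the reads actually reach that is the question the post goes on to examine.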