Coding · development

How to avoid duplicating your package version number using a version.py file

I thought it was rather annoying to specify the version of your package both in the tool itself and in your setup.py, so I searched the internet for solutions and below I’ll explain how I set it up. I have a version.py file in my project folder and this is the only spot where I…
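A minimal sketch of the single-source idea, since the excerpt above is cut off before the details: version.py holds one `__version__` assignment, and setup.py reads it instead of repeating the number. The helper name `read_version` and the demo file layout are assumptions for illustration, not necessarily the setup described in the full post.

```python
import os
import tempfile


def read_version(path):
    """Return the __version__ string defined in the Python file at `path`.

    The file is executed in an isolated namespace, so the package itself
    does not need to be importable (useful inside setup.py).
    """
    namespace = {}
    with open(path) as handle:
        exec(handle.read(), namespace)
    return namespace["__version__"]


# Demo with a throwaway version.py (hypothetical content).
with tempfile.TemporaryDirectory() as project_dir:
    version_file = os.path.join(project_dir, "version.py")
    with open(version_file, "w") as handle:
        handle.write('__version__ = "1.2.3"\n')
    demo_version = read_version(version_file)

# In setup.py this would be used roughly as:
#   setup(name="mypackage", version=read_version("mypackage/version.py"), ...)
```

The tool itself can then do `from .version import __version__`, so the number lives in exactly one place.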

Coding · development

Getting the setup.py long_description in reStructuredText from your Markdown README

I write my README files for Python/GitHub projects in Markdown, which is quite easy and convenient. But the PyPI guidelines for projects require a README.rst file in “reStructuredText”. The setup.py file also has a field for a “long description”, which gets inserted on the PyPI project page, see for example the one for NanoPlot. UPDATE: TURNS…

Nanopore · Python

NanoFilt using albacore sequencing_summary for quality filtering

Due to a discrepancy between the quality scores calculated from the reads and those from the sequencing_summary.txt from albacore, I added an option to NanoFilt to filter using the qualities specified in the sequencing_summary. NanoFilt now (v1.1.0) also optionally takes a --summary flag for the sequencing_summary file. As a nice bonus, it’s also faster! This added a…
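To illustrate the idea of filtering on summary qualities rather than recalculating them from the fastq, here is a rough sketch. It is not NanoFilt's actual code: the function name `passing_read_ids` is hypothetical, and the column names `read_id` and `mean_qscore_template` are assumed to match the tab-separated albacore summary.

```python
import csv
import io


def passing_read_ids(summary_handle, min_quality):
    """Collect read ids whose summary quality is at least `min_quality`.

    Assumes a tab-separated file with `read_id` and `mean_qscore_template`
    columns (assumed albacore sequencing_summary layout).
    """
    reader = csv.DictReader(summary_handle, delimiter="\t")
    return {
        row["read_id"]
        for row in reader
        if float(row["mean_qscore_template"]) >= min_quality
    }


# Demo with a tiny, made-up summary table.
demo_summary = io.StringIO(
    "read_id\tmean_qscore_template\n"
    "read_a\t11.4\n"
    "read_b\t6.2\n"
    "read_c\t9.0\n"
)
demo_passing = passing_read_ids(demo_summary, min_quality=9)
```

A set lookup on the read id is then enough to decide whether to keep a fastq record, with no per-base quality calculation at all, which is one plausible reason for a speed-up.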

Nanopore · Plotting

Calculated average quality vs. Albacore summary

I earlier discussed the difference between a) the calculated per-read average basecall quality and b) the quality score given by albacore in the sequencing_summary file. Today I had a closer look at this difference: for one dataset I calculated the average quality from the fastq file and compared that to the accompanying sequencing_summary. The…

Nanopore · Plotting

Median basecall quality score to estimate accuracy

I’ve had a few posts already about basecall quality scores and how those compare to the percent identity of the reads. This post is again a short follow-up on those stories. Today I investigate whether the median basecall quality (Phred) score of the aligned fragments is a good or better estimator for the percent identity…

Nanopore · Plotting

The distribution of basecall quality scores

I also investigated the Oxford Nanopore quality scores in some of my previous posts, and this post is a follow-up of those. Today I investigate the distribution of the basecall quality scores and how the mean and median per read behave. At the bottom of this post is the script I used to generate the…

Nanopore · Plotting

Averaging basecall quality scores the right way

For my NanoPlot tool, I have been calculating the average basecall quality of a read by simply calculating the arithmetic mean of the Phred scores. I recently also added an option to generate plots based on the sequencing_summary.txt file generated by the albacore basecaller, which completely avoids parsing the fastq file and calculating the mean.…
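The excerpt stops before naming the alternative, so the following is an assumption about what "the right way" refers to: because Phred scores are logarithmic (Q = -10·log10(p)), a common approach is to convert each score to an error probability, average those, and convert the mean back. The function names are illustrative and not taken from NanoPlot.

```python
from math import log10


def mean_phred_naive(quals):
    """Arithmetic mean of the Phred scores themselves."""
    return sum(quals) / len(quals)


def mean_phred_via_error(quals):
    """Average in error-probability space, then convert back to Phred."""
    error_probs = [10 ** (-q / 10) for q in quals]
    mean_error = sum(error_probs) / len(error_probs)
    return -10 * log10(mean_error)


# A read with one poor base (Q10) and one good base (Q30).
demo_quals = [10, 30]
demo_naive = mean_phred_naive(demo_quals)          # 20.0
demo_corrected = mean_phred_via_error(demo_quals)  # roughly 12.97
```

The naive mean treats Q10 and Q30 as symmetric around Q20, while the error-probability mean is dominated by the low-quality base, so it comes out lower whenever the scores within a read vary.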