Coding · Plotting · Python

Altmetric scores for bioRxiv preprints, part 2

This post is part of the series on altmetric attention scores and bioRxiv preprints; see also the earlier posts about my Twitter bot, getting the altmetric scores, and part 1 of the analysis. Today I’ll look at the top 10% of altmetric scores and how fast the preprints reached that score. These are the articles…

Plotting · Python

Altmetric scores for bioRxiv preprints, part 1

This post follows my earlier posts about my Twitter bot for interesting preprints and how to get altmetric scores for all bioRxiv preprints. Here I want to take a first look at the obtained data, using the Python pandas module. When working with pandas DataFrames interactively it’s convenient to…

Nanopore · Plotting

Calculated average quality vs. Albacore summary

I discussed earlier the difference between (a) the calculated per-read average basecall quality and (b) the quality score given by albacore in the sequencing_summary file. Today I had a closer look at this difference: for one dataset I calculated the average quality from the fastq file and compared it to the accompanying sequencing_summary. The…

Nanopore · Plotting

Median basecall quality score to estimate accuracy

I’ve already written a few posts about basecall quality scores and how they compare to the percent identity of the reads. This post is another short follow-up on those stories. Today I investigate whether the median basecall quality (Phred) score of the aligned fragments is a good or better estimator for the percent identity…

Nanopore · Plotting

The distribution of basecall quality scores

I investigated the Oxford Nanopore quality scores in some of my previous posts, and this post is a follow-up to those. Today I look at the distribution of the basecall quality scores and how the mean and median per read behave. At the bottom of this post is the script I used to generate the…

Nanopore · Plotting

Averaging basecall quality scores the right way

For my NanoPlot tool, I have been calculating the average basecall quality of a read by simply calculating the arithmetic mean of the Phred scores. I recently also added an option to generate plots based on the sequencing_summary.txt file generated by the albacore basecaller, which completely avoids parsing the fastq file and calculating the mean.…
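
The excerpt above is cut off before it says what the "right way" is. As a rough illustration only, and assuming the point is that Phred scores are logarithmic (Q = -10 · log10(P), with P the error probability), one common alternative to the arithmetic mean is to average the error probabilities and convert that mean back to a Phred score. The function names below are hypothetical and are not taken from NanoPlot or albacore.

```python
import math


def arithmetic_mean_quality(quals):
    """Plain arithmetic mean of Phred scores (the 'simple' approach)."""
    if not quals:
        raise ValueError("quals must not be empty")
    return sum(quals) / len(quals)


def mean_quality_via_error_prob(quals):
    """Average in probability space, then convert back to Phred.

    Assumption: each score follows Q = -10 * log10(P), so
    P = 10 ** (-Q / 10). This is a sketch, not NanoPlot's code.
    """
    if not quals:
        raise ValueError("quals must not be empty")
    mean_error_prob = sum(10 ** (-q / 10) for q in quals) / len(quals)
    return -10 * math.log10(mean_error_prob)


if __name__ == "__main__":
    # Hypothetical per-base Phred scores for one short read.
    read_quals = [10, 30]
    print(arithmetic_mean_quality(read_quals))
    print(mean_quality_via_error_prob(read_quals))
```

With mixed scores, the probability-based average comes out lower than the arithmetic mean, because low-quality bases dominate the mean error probability.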

Nanopore · Plotting

Per base sequence content and quality (end of reads)

After my earlier posts investigating the sequence content and quality at the start of Oxford Nanopore sequencing reads, I also wanted to include some code to look at the end of reads. These functions are part of the unfinished tool nanoQC, in which I want to replicate some of the plots made by FastQC. Creating the same plots as…