Published May 29, 2023 | Version v1
Dataset

Methylation-free E.coli nanopore sequencing (ONT R9.4.1) data set

  • 1. KTH
  • 2. SciLife Lab

Description

The data set consists of fast5 files divided into five zip files (fast5_[1-5].zip), a genome record (Ecoli_K12_MG1655.fasta), an Illumina draft assembly (illumina_contigs.fasta), and a fastq file produced by Guppy 5 (guppy_basecalled.fastq.gz). We sequenced E. coli non-methylated genomic DNA (D5016, Zymo Research) with an ONT MinION device. The sequencing libraries were prepared by fragmenting the genomic DNA with a Covaris g-TUBE and using the Ligation Sequencing Kit (SQK-LSK109, Oxford Nanopore) with flow cell chemistry R9.4.1. We also performed short-read Illumina sequencing on the same sample, using TruSeq PCR-free library preparation on the MiSeq sequencing platform (Illumina, USA), and constructed a draft assembly from the Illumina sequencing results using SPAdes v3.6.0. We also provide a reference genome obtained directly from the website of the E. coli sample producer.

In addition, the data set contains two fastq files produced by the Lokatt basecaller (lokatt_basecalled.fasta.gz) and a locally trained Bonito basecaller (bonito_local_basecalled.fastq.gz), respectively; these are used for benchmarking in the Lokatt basecaller paper.
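
As a quick sanity check after downloading, the basecalled reads can be summarized with a few lines of Python. The following is a minimal sketch, not part of the data set itself: the function names `fastq_read_lengths` and `summarize` are hypothetical, and it assumes the file is a gzipped FASTQ with exactly four lines per record (no wrapped sequence lines).

```python
import gzip
from itertools import islice


def fastq_read_lengths(path):
    """Yield the sequence length of each record in a gzipped FASTQ file.

    Assumption: every record spans exactly four lines
    (header, sequence, '+' separator, qualities).
    """
    with gzip.open(path, "rt") as handle:
        while True:
            record = list(islice(handle, 4))
            if not record:
                break  # end of file
            if len(record) < 4 or not record[0].startswith("@"):
                raise ValueError("malformed FASTQ record")
            yield len(record[1].strip())


def summarize(path):
    """Return the read count, total bases, and mean read length for a file."""
    lengths = list(fastq_read_lengths(path))
    total = sum(lengths)
    mean = total / len(lengths) if lengths else 0.0
    return {"reads": len(lengths), "bases": total, "mean_length": mean}
```

For example, calling `summarize("guppy_basecalled.fastq.gz")` from the download directory would return a dictionary with the number of reads, the total number of bases, and the mean read length for the Guppy output. The same call should work on any of the basecalled files that follow the four-line FASTQ layout.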

Additional details

Related works

Is derived from
Publication: 10.5281/zenodo.7995806 (DOI)
Is part of
Preprint: 10.1101/2022.07.13.499873 (DOI)