The Single-molecule long-read sequencing of Scylla paramamosain

Aug 29, 2019 · Scientific Reports

Detailed genetic sequencing of the mud crab Scylla paramamosain

AI simplified

Abstract

A total of 79,005 high-quality unique transcripts were obtained from 284,803 full-length non-chimeric reads sequenced from various tissues of the crab Scylla paramamosain.

  • The transcriptome was constructed from 12 different tissues, including gill, hepatopancreas, and muscle.
  • A total of 52,544 transcripts were successfully annotated using multiple protein databases.
  • The study identified 23,644 long non-coding RNAs (lncRNAs) and 131,561 simple sequence repeats (SSRs).
  • Isoforms of many genes were also detected in the analysis.
  • This research provides a comprehensive set of full-length cDNA sequences for Scylla paramamosain, which may enhance future studies on this species.

Key numbers

  • 79,005: Total unique transcripts. High-quality unique transcripts obtained from sequencing.
  • 66.5%: Annotated transcripts percentage. Percentage of transcripts annotated against protein databases.
  • 23,154: Identified lncRNAs. lncRNAs predicted in common by three different bioinformatics programs.
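
Keeping only the predictions shared by three programs amounts to a set intersection. The sketch below is illustrative only: the summary does not name the three programs, so `tool_a`, `tool_b`, and `tool_c` are placeholders, and the transcript IDs are made up.

```python
def consensus_calls(*call_sets):
    """Return the IDs that appear in every one of the given call sets."""
    if not call_sets:
        return set()
    shared = set(call_sets[0])
    for calls in call_sets[1:]:
        shared &= set(calls)  # keep only IDs also reported by this tool
    return shared


# Hypothetical transcript IDs flagged by three placeholder tools.
tool_a = {"t1", "t2", "t3", "t4"}
tool_b = {"t2", "t3", "t4", "t5"}
tool_c = {"t3", "t4", "t6"}

consensus = consensus_calls(tool_a, tool_b, tool_c)
print(sorted(consensus))  # ['t3', 't4']
```

Requiring agreement across all tools trades sensitivity for fewer false positives, which is a common reason for reporting the intersection rather than the union.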

Full Text

What this is

  • This research focuses on sequencing the transcriptome of the crab Scylla paramamosain using single-molecule long-read technology.
  • A library from 12 different tissues was constructed and sequenced, yielding a wealth of transcript data.
  • The study aims to enhance genetic understanding and facilitate future research in crustaceans.

Essence

  • The study produced 79,005 high-quality unique transcripts from Scylla paramamosain, significantly enriching the genetic information available for this species.

Key takeaways

  • A total of 284,803 full-length non-chimeric reads were obtained, leading to 79,005 high-quality unique transcripts. This extensive dataset enhances the genetic resources available for S. paramamosain.
  • A total of 52,544 transcripts, 66.5% of the 79,005 unique transcripts, were annotated against protein databases. This annotation rate supports the utility of the dataset for functional studies.
  • The analysis revealed 23,154 lncRNAs, providing a foundation for further exploration of their roles in crustacean biology.
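
The 66.5% annotation rate quoted above can be checked directly from the two transcript counts in the summary. A minimal sketch, with variable names chosen here for illustration:

```python
# Counts taken from the summary above.
total_transcripts = 79_005      # high-quality unique transcripts
annotated_transcripts = 52_544  # transcripts annotated against protein databases

annotation_rate = annotated_transcripts / total_transcripts
print(f"{annotation_rate:.1%}")  # 66.5%
```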

Caveats

  • The study relies on third-generation sequencing, which, while advantageous, may still be affected by raw-read error rates and by the absence of a reference genome.
  • Identifying lncRNAs was limited by the lack of genomic data for S. paramamosain, which could affect the classification and validation of these transcripts.

Definitions

  • long non-coding RNAs (lncRNAs): Non-coding RNA molecules longer than 200 nucleotides that play critical roles in regulating gene expression.
  • simple sequence repeats (SSRs): Short, repetitive sequences in DNA that can serve as genetic markers.
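
The two definitions above can be made concrete with a small sketch. It is not the pipeline used in the study: the motif sizes (2 to 6 bases) and the minimum repeat count are assumptions chosen for illustration, and the function names are hypothetical. The length check covers only the 200-nucleotide part of the lncRNA definition, since length alone does not show that a transcript is non-coding.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The thresholds are illustrative assumptions, not the study's settings.
    """
    # Lazy quantifier: prefer the shortest motif that repeats often enough.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip single-base runs such as AAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


def exceeds_lncrna_length(seq, cutoff=200):
    """True if the sequence is longer than the 200-nucleotide cutoff."""
    return len(seq) > cutoff


example = "TTGACACACACACAGGT"  # made-up sequence containing (AC) x 5
print(find_ssrs(example))  # [(3, 'AC', 5)]
print(exceeds_lncrna_length(example))  # False
```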
