Update list of sequencers for poly-g trimming #508

Open
wants to merge 1 commit into
base: master


@semenko semenko commented Jul 14, 2023

This PR updates and expands the list of serial numbers of Illumina 2-color SBS sequencers for poly-g trimming.

These values come from: https://knowledge.illumina.com/instrumentation/general/instrumentation-general-reference_material-list/000003880

This expands the previous 2-color list by adding:

  • NovaSeq 1000/2000 (@VL @VH)
  • NovaSeq X Plus (@LH)

This also broadens the NovaSeq 6000 serial prefix from @A0 to @A, per Illumina's documentation.

(I do not see @NDX documented by Illumina, but this might be their NextSeq 550Dx FDA-regulated sequencer.)
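
As a rough illustration of the prefix check described above, here is a minimal Python sketch. It is not fastp's actual code (fastp is written in C++), the function name `looks_two_color` is hypothetical, and the tuple contains only the prefixes named in this PR description; the entries already present in the previous list are not shown in this thread, so the tuple is incomplete.

```python
# Hypothetical helper for illustration only; not fastp's implementation.
# Prefixes are limited to those named in this PR description.
TWO_COLOR_PREFIXES = ("@VL", "@VH", "@LH", "@A", "@NDX")


def looks_two_color(read_name: str) -> bool:
    """Return True if a FASTQ read name starts with one of the listed prefixes."""
    # str.startswith accepts a tuple and matches if any element is a prefix.
    return read_name.startswith(TWO_COLOR_PREFIXES)


print(looks_two_color("@VH00123:1:AAAXYZ:1:1101:1000:1000 1:N:0:1"))
print(looks_two_color("@M01234:1:000000000-ABCDE:1:1101:1000:1000 1:N:0:1"))
```

Note the trade-off in broadening @A0 to @A: any read name that begins with "@A" now matches, whatever follows.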

semenko added a commit to semenko/liquid-cell-atlas that referenced this pull request Jul 14, 2023
OK for our data so far, and I submitted a PR to fastp:
OpenGene/fastp#508
@dlaehnemann

I would like to rely on the automatic setting of --trim_poly_g, and it would be great to see this pull request, as well as #598, merged. Is anything holding this back?

Also, the Illumina page linked above is gone; they regularly do this with their docs. 🤦
But here's a page that at least mentions which models should be 2-channel (and also the 1-channel iSeq):
https://web.archive.org/web/20250701072529/https://www.illumina.com/science/technology/next-generation-sequencing/sequencing-technology/2-channel-sbs.html

And it seems like 10X put in quite some effort to collect all of Illumina's machine codes right here, although this seems to be 8 years old, so there will also be stuff missing:
https://github.com/10XGenomics/supernova/blame/b82c3d8efa68bda2d95f30621cd6d91308ce11a2/tenkit/lib/python/tenkit/illumina_instrument.py#L12-L45

So maybe these pull requests could be merged into one and amended if any of the other machine codes are missing.

@dlaehnemann

Ah, and I finally also found this, where the author actually got info from Illumina support (this seems to be the only way of getting useful and somewhat structured info from them):
https://github.com/nickp60/fcid/blob/04bd2e6aab1979a6902a4470cc82c574991242a0/fcid/run.py#L5-L123

@dlaehnemann

OK, I couldn't help myself and went to figure out which Illumina machine models will have the polyG issue. The list is (TL;DR):

  • iSeq 100
  • MiniSeq
  • MiSeq i100 Series
  • NextSeq 500/550
  • NextSeq 1000/2000
  • NovaSeq 6000
  • NovaSeq X
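
On the instruments listed above, guanine is the dark base, so a cluster whose signal has faded tends to be called as G, and reads can end in long runs of G. The following is a minimal sketch of tail trimming under that assumption. The function name and the `min_len` threshold are illustrative choices, and the logic is deliberately simpler than a production trimmer (it tolerates no mismatches inside the G run).

```python
def trim_poly_g_tail(seq: str, min_len: int = 10) -> str:
    """Remove a trailing run of G if it is at least min_len bases long.

    min_len is an assumed threshold for illustration; shorter G runs are
    kept because they are plausibly real sequence.
    """
    stripped = seq.rstrip("G")
    run_length = len(seq) - len(stripped)
    return stripped if run_length >= min_len else seq


print(trim_poly_g_tail("ACTA" + "G" * 12))
print(trim_poly_g_tail("ACTAGG"))
```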

And here is the full story, with receipts:

Illumina Instrument Imaging Channel Systems

| Instrument Model | Imaging Technology | Channel System Details | Documentation Link | Checked manually |
|---|---|---|---|---|
| iSeq 100 | One-channel | one dye, compact system | Chemistry and Imaging on iSeq 100 | Done. |
| MiniSeq | Two-channel | red and green | Chemistry and Imaging on MiniSeq | Done. |
| NextSeq 500/550 | Two-channel | red and green | Chemistry and imaging on NextSeq 500/550 | Done. |
| NextSeq 1000/2000 | Two-channel | blue and green, standard or XLEAP (A blue) | Chemistry and imaging on the NextSeq 1000/2000 | Done. |
| MiSeq | Four-channel | oldest SBS (Sequencing by Synthesis) chemistry | Chemistry and imaging on MiSeq | Done. |
| MiSeq i100 Series | Two-channel | blue and green, XLEAP (C blue) | Two Channel Chemistry and Imaging on the MiSeq i100 Series | Done. |
| HiSeq 1000/2500 | Four-channel | oldest SBS (Sequencing by Synthesis) chemistry | Chemistry and imaging on MiSeq (also mentions HiSeq series) | Done. |
| HiSeq X | Four-channel | oldest SBS (Sequencing by Synthesis) chemistry | HiSeq X System Guide (15050091 v07) (mentions "the four color channels") | Done. |
| NovaSeq 6000 | Two-channel | red and green | Chemistry and Imaging on NovaSeq 6000 | Done. |
| NovaSeq X Series | Two-channel | blue and green, XLEAP (A blue) | Chemistry and Imaging on the NovaSeq X Series Instruments | Done. |
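
The TL;DR list can be derived from the table by keeping every model with fewer than four channels, since those are the systems where guanine has no dye of its own. A small sketch, with the channel counts copied from the table and the names `CHANNELS` and `poly_g_prone` made up for this example:

```python
# Channel counts copied from the table above, in table order.
CHANNELS = {
    "iSeq 100": 1,
    "MiniSeq": 2,
    "NextSeq 500/550": 2,
    "NextSeq 1000/2000": 2,
    "MiSeq": 4,
    "MiSeq i100 Series": 2,
    "HiSeq 1000/2500": 4,
    "HiSeq X": 4,
    "NovaSeq 6000": 2,
    "NovaSeq X Series": 2,
}


def poly_g_prone(channels: dict[str, int]) -> list[str]:
    """Return the models where guanine is read as the absence of signal."""
    return [model for model, count in channels.items() if count < 4]


for model in poly_g_prone(CHANNELS):
    print(model)
```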

Channel System Summary

one-channel

Good overview: https://web.archive.org/web/20250701121553/https://knowledge.illumina.com/instrumentation/iseq-100/instrumentation-iseq-100-reference_material-list/000008434

Machine series:

  • iSeq 100

General setup:

  • one dye
  • each sequencing cycle has two rounds of chemistry + imaging

Color scheme:

  • adenine: first image only
  • cytosine: second image only
  • thymine: both images
  • guanine: permanently dark
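
The one-channel scheme above amounts to a lookup on whether a cluster is lit in each of the two images. A minimal sketch of that lookup follows; the booleans stand in for thresholded intensities, which is a simplification, and the names are hypothetical.

```python
# Keys are (lit in first image, lit in second image), per the scheme above.
ONE_CHANNEL_CALLS = {
    (True, False): "A",
    (False, True): "C",
    (True, True): "T",
    (False, False): "G",  # permanently dark
}


def call_base_one_channel(first_image: bool, second_image: bool) -> str:
    """Map the two on/off observations of a cycle to a base."""
    return ONE_CHANNEL_CALLS[(first_image, second_image)]


print(call_base_one_channel(True, True))
print(call_base_one_channel(False, False))
```

A cluster that emits no signal at all is indistinguishable from a true guanine in this scheme.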

two-channel

Good overview: https://web.archive.org/save/https://knowledge.illumina.com/instrumentation/novaseq-x-x-plus/instrumentation-novaseq-x-x-plus-reference_material-list/000007970

General setup:

  • two dyes (different colors, different base associations)
  • each sequencing cycle has two rounds of chemistry + imaging

two-channel: red and green

Machine series:

  • MiniSeq
  • NextSeq 500/550
  • NovaSeq 6000

Color scheme:

  • thymine: green
  • cytosine: red
  • adenine: both
  • guanine: dark
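
To make the link to poly-G concrete, here is a small sketch of base calling under the red-and-green scheme above. It uses booleans in place of real intensities and hypothetical function names; the point is only that cycles with no signal in either channel come out as G.

```python
def call_base_red_green(green: bool, red: bool) -> str:
    """Call a base from on/off signals in the green and red channels."""
    if green and red:
        return "A"
    if green:
        return "T"
    if red:
        return "C"
    return "G"  # dark in both channels


def call_read(signals: list[tuple[bool, bool]]) -> str:
    """Call one base per cycle from (green, red) signal pairs."""
    return "".join(call_base_red_green(green, red) for green, red in signals)


# Three cycles with signal, then two cycles where the signal is gone.
cycles = [(True, True), (True, False), (False, True), (False, False), (False, False)]
print(call_read(cycles))
```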

two-channel: blue and green

Standard reagents

Machine series:

  • NextSeq 1000/2000

Color scheme:

  • thymine: green
  • cytosine: blue
  • adenine: both
  • guanine: dark

XLEAP Reagents (A blue)

Machine series:

  • NextSeq 1000/2000
  • NovaSeq X

Color scheme:

  • thymine: green
  • adenine: blue
  • cytosine: both
  • guanine: dark

XLEAP Reagents (C blue)

Good overview: https://web.archive.org/web/20250701122436/https://knowledge.illumina.com/instrumentation/miseq-i100-series/instrumentation-miseq-i100-series-reference_material-list/000009348

Machine series:

  • MiSeq i100 Series

Color scheme:

  • thymine: green
  • cytosine: blue
  • adenine: both
  • guanine: dark
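
The three blue-and-green schemes listed above differ only in which base carries blue alone and which carries both dyes. A table-driven sketch makes that comparison explicit; the scheme keys (`standard`, `xleap_a_blue`, `xleap_c_blue`) and the function name are labels invented for this example, and the dye assignments are copied from the lists above.

```python
# Dye sets per base, copied from the color schemes listed above.
BLUE_GREEN_SCHEMES = {
    "standard": {"T": {"green"}, "C": {"blue"}, "A": {"blue", "green"}, "G": set()},
    "xleap_a_blue": {"T": {"green"}, "A": {"blue"}, "C": {"blue", "green"}, "G": set()},
    "xleap_c_blue": {"T": {"green"}, "C": {"blue"}, "A": {"blue", "green"}, "G": set()},
}


def call_base_blue_green(scheme: str, lit: set[str]) -> str:
    """Return the base whose dye set equals the set of lit channels."""
    for base, dyes in BLUE_GREEN_SCHEMES[scheme].items():
        if dyes == lit:
            return base
    raise ValueError(f"no base matches channels {sorted(lit)}")


print(call_base_blue_green("standard", {"blue"}))
print(call_base_blue_green("xleap_a_blue", {"blue"}))
print(call_base_blue_green("xleap_c_blue", set()))
```

Whichever reagent set is used, an empty set of lit channels maps to G.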

four-channel

Good overview: https://web.archive.org/web/20250701122141/https://knowledge.illumina.com/instrumentation/miseq/instrumentation-miseq-reference_material-list/000003757

Machine series:

  • MiSeq (except the i100 series)
  • HiSeq Series (the docs mention HiSeq 1000/2500 and HiSeq X, but not other HiSeqs)

General setup:

  • four dyes
  • each sequencing cycle has four rounds of chemistry + imaging

Color scheme:

  • thymine: green
  • cytosine: yellow
  • adenine: red
  • guanine: blue

Data compiled from Illumina Knowledge Base documentation as of July 1st, 2025. The initial table was created by asking Claude Sonnet 4 to aggregate the relevant info scattered across Illumina Knowledge Base pages. All entries and linkouts were then checked manually; in particular, those for the HiSeq series were adjusted to point somewhere with a useful citation. All pages were archived on the Wayback Machine (as Illumina often changes their links). Finally, I made the table much more concise by moving the detailed channel system descriptions, which I compiled during cross-checking, into the summary below it.
