The Genomics Core blog: October 2014

Indexes are one of the simplest improvements in the last five years of sequencing, yet they have had incredibly far-reaching effects. Today I will share a complementary pair of posts tackling the problems our customers experience most frequently when submitting indexed libraries for sequencing.

Why did I get very different yields for the libraries in my pool?

We've seen this so many times. You think you have carefully quantified and pooled your libraries, and then your sequencing data comes back with a massive variation in the number of reads for each library in your pool. What a nightmare!

Don't be fooled - there is nothing that your sequencing provider can do on the sequencer to cause a variable yield from your different indexes. Any imbalance between indexes is already present in your library pool before it is loaded, so an imbalanced result indicates that something went wrong when the pool was prepared.

Normally the problem is one of the following:

1. Different libraries in the pool are of different lengths
2. Quantification of the libraries prior to pooling was not accurate
3. The process of mixing the libraries into the pool was not robust

First check #1: Are your libraries of different average size?

Measure the length of every library prior to pooling on the Bioanalyzer or TapeStation (or similar).
Make sure you are including all of the visible peaks in your length measurement, including any adapter dimers, since they all contribute to the clustering.
Check that all of the libraries in your pool are a similar length to one another.

Clustering efficiency is a non-linear function of length, because small fragments cluster disproportionately more efficiently than large ones. So if you mix a library of 200bp 50:50 with a library of 600bp, you will receive much more data for the short 200bp library.

As a guideline, all libraries should ideally be within +/- 50bp of one another.
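As a rough illustration of that guideline, here is a minimal Python sketch. It reads "within +/- 50bp of one another" as "the longest and shortest library differ by no more than 50bp", which is my interpretation; the function name and the example sizes are hypothetical.

```python
def check_pool_sizes(sizes_bp, tolerance_bp=50):
    """Return (spread, ok) for a dict of library name -> average size in bp.

    Assumption: "within +/- 50bp of one another" is read as
    max size - min size <= tolerance_bp.
    """
    spread = max(sizes_bp.values()) - min(sizes_bp.values())
    return spread, spread <= tolerance_bp


# Hypothetical average sizes, as read off a Bioanalyzer or TapeStation trace
sizes = {"A": 310, "B": 325, "C": 350}
spread, ok = check_pool_sizes(sizes)
print(f"Size spread: {spread} bp - OK to pool together: {ok}")
```

The 200bp and 600bp libraries from the example above would fail this check by a wide margin.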

Then check #2: Was your quantification prior to pooling accurate?

If your quantification is not reproducible then your library balance will be way off, whatever else you do well. When troubleshooting an imbalanced pool, I recommend you repeat the quantification of your individual libraries and see if you get the same result.

It is worth asking your NGS provider to share their quantification results with you, so you can compare them to your own expectation. No two quantification measurements will ever be in precise agreement, but your NGS provider must have a very robust process in order to provide you with a reliable per-lane yield, so you can use their result as a gold standard during troubleshooting.

If you are quantifying by qPCR, here are some valuable tips to improve robustness:

Perform quantification measurements in triplicate on your plate
Check your triplicate measurements are within ~0.5 Ct of one another
Take the median value of your triplicates
Quantify all libraries which you plan to pool together on a single qPCR plate
Always run a no-template control to check for nonspecific amplification or contamination
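The triplicate check and the median step can be sketched in a few lines of Python. This is only an illustration: the function name and the Ct values are made up, and I am reading "within ~0.5 Ct" as the difference between the highest and lowest replicate.

```python
from statistics import median


def summarise_triplicate(ct_values, max_spread=0.5):
    """Return (median Ct, ok) for one library's replicate Ct values.

    Assumption: replicates "agree" when max Ct - min Ct <= max_spread.
    """
    spread = max(ct_values) - min(ct_values)
    return median(ct_values), spread <= max_spread


# Hypothetical Ct values for two libraries on the same qPCR plate
plate = {
    "A": [14.2, 14.3, 14.5],
    "B": [14.0, 14.1, 15.2],  # one replicate is clearly off
}
for name, cts in plate.items():
    med, ok = summarise_triplicate(cts)
    status = "OK" if ok else "repeat this library"
    print(f"Library {name}: median Ct {med} ({status})")
```

Taking the median rather than the mean means a single stray replicate, like the third well for library B, does not drag the value, but a library that fails the spread check is still worth re-running.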

If you are quantifying by Qubit or Bioanalyzer, I recommend that you swap to qPCR as soon as possible - and I bet you will see a better pooling balance afterwards.

Finally, have a look at #3: Was the process of mixing the libraries robust?

A common mistake when pooling is to quantify your library, perform a dilution, and then assume the diluted library will be exactly the concentration you aimed for. Unfortunately this is only true if your original concentration is close to your goal. As a guideline, any dilution greater than 1:5 is unlikely to be sufficiently robust for multiplexing. Using small volumes during dilution steps can really exacerbate this problem.

The best practice for diluting highly concentrated libraries prior to pooling is to dilute them to a low value just higher than your goal, then re-quantify, then do a final small dilution to reach your goal. Use large volumes for your dilution steps, and keep your final dilution step as small as possible - and definitely less than 1:5. I often aim for a final 1:2 dilution step.

Consider this simple example:

Library A is at 100nM, so I dilute 1ul in 9ul of buffer to give me 10nM
Library B is at 300nM, so I dilute 1ul in 29ul of buffer to give me 10nM
Library C is at 600nM, so I dilute 1ul in 59ul of buffer to give me 10nM
I then mix 10ul of the diluted A, B and C.

Frankly, my pooling balance is going to be rubbish.
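To put a rough number on why, here is a small Python sketch. It assumes, purely for illustration, that the library aliquot is over-pipetted by 0.1ul; that figure is hypothetical rather than a measured pipette specification, and the function names are my own.

```python
def diluted_conc(stock_nM, aliquot_ul, buffer_ul):
    """Concentration (nM) after mixing aliquot_ul of stock into buffer_ul of buffer."""
    return stock_nM * aliquot_ul / (aliquot_ul + buffer_ul)


def percent_error(stock_nM, aliquot_ul, buffer_ul, pipetting_error_ul=0.1):
    """Percent deviation from the intended concentration if the aliquot
    is off by pipetting_error_ul (an assumed, illustrative error)."""
    intended = diluted_conc(stock_nM, aliquot_ul, buffer_ul)
    actual = diluted_conc(stock_nM, aliquot_ul + pipetting_error_ul, buffer_ul)
    return 100 * (actual - intended) / intended


# Library C at 600nM: 1ul into 59ul versus 10ul into 290ul
small = percent_error(600, 1, 59)
large = percent_error(600, 10, 290)
print(f"1ul aliquot:  {small:.1f}% off target")
print(f"10ul aliquot: {large:.1f}% off target")
```

Under that assumption the 1ul aliquot ends up close to 10% off target, while the 10ul aliquot is off by about 1%, and with three libraries the errors can easily point in different directions.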

Here's what I should do instead:

Library A is at 100nM, so I dilute 10ul in 40ul of buffer to aim for 20nM, then I re-quantify and find out it is actually at 18nM. I mix 10ul of this with 8ul of buffer to give 10nM
Library B is at 300nM, so I dilute 10ul in 140ul of buffer to aim for 20nM, then I re-quantify and find out it is actually at 22nM. I mix 10ul of this with 12ul of buffer to give me 10nM
Library C is at 600nM, so I dilute 10ul in 290ul of buffer to aim for 20nM, then I re-quantify and find out it is actually at 15nM. I mix 10ul of this with 5ul of buffer to give me 10nM
I then mix 10ul of the diluted A, B and C.

My pooling balance will be beautiful.

For the true NGS novices out there, if you don't know how I calculated the dilution steps in the example above then check this out.
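The dilution volumes in that example all follow from C1 x V1 = C2 x V2. A minimal Python sketch, using a hypothetical helper called buffer_to_add, reproduces them:

```python
def buffer_to_add(stock_nM, target_nM, stock_ul):
    """Buffer volume (ul) to add to stock_ul of library to reach target_nM.

    From C1 * V1 = C2 * V2: final volume V2 = C1 * V1 / C2,
    and the buffer is whatever is needed on top of V1.
    """
    if target_nM > stock_nM:
        raise ValueError("Cannot dilute a library to a higher concentration")
    final_ul = stock_nM * stock_ul / target_nM
    return final_ul - stock_ul


# (name, starting nM, re-quantified nM after the first dilution) from the example
libraries = [("A", 100, 18), ("B", 300, 22), ("C", 600, 15)]
for name, start_nM, requant_nM in libraries:
    first = buffer_to_add(start_nM, 20, 10)    # aim for 20nM using 10ul of library
    final = buffer_to_add(requant_nM, 10, 10)  # then take 10ul down to 10nM
    print(f"Library {name}: add {first:g}ul buffer, re-quantify, then add {final:g}ul buffer")
```

This gives 40ul, 140ul and 290ul of buffer for the first step, and 8ul, 12ul and 5ul for the final step, matching the volumes in the example.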

If you have checked #1, #2, and #3 and everything looks perfect, then get in touch with Illumina's tech support team (techsupport@illumina.com) or with your NGS provider.

How do I pool my library at a defined concentration?

I get asked this a lot. Our current submission requirements are 10nM - 20nM in 15ul, but what does this mean? Is the total DNA concentration in the pool 10nM, and each individual library therefore much less? Or is it that each library within the pool is at a final concentration of 10nM?

Simply put, our submission guidelines IGNORE your indexes. Quantification and clustering cannot differentiate between indexes on a sample, so all we are interested in is the total quantity of DNA in your pool. So, for example, if you have five libraries in an evenly balanced pool, the final pool DNA concentration must be at least 10nM - which means that each library within that pool is at a concentration of at least 2nM.

Here is the simplest at-a-glance method to dilute and pool your libraries. For more detailed hints and tips read on to my next post!

Quantify and quality check all of your libraries
Select a goal concentration for pooling - at or below the lowest concentration of your set of libraries.
Make sure this is within our current submission guidelines.
Dilute all of your libraries to that concentration, using Illumina Resuspension Buffer, EB, or 10mM Tris pH 8.5 with 0.1% Tween.
Combine an equal volume of all of your libraries in your pool tube
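Those steps can be sketched in Python as follows. Treat it as an illustration under stated assumptions rather than a submission tool: the function name, the 10ul aliquot and the example concentrations are hypothetical, and the 10nM - 20nM window is simply the submission requirement quoted earlier in this post.

```python
def plan_pool(concs_nM, aliquot_ul=10.0, min_nM=10.0, max_nM=20.0):
    """Plan a single-step dilution of every library to a common goal concentration.

    concs_nM: dict of library name -> quantified concentration in nM.
    Returns (goal_nM, buffer_ul per library, per-library nM in the final pool).
    Assumptions: the goal is the lowest library concentration, capped at max_nM,
    and aliquot_ul of each library is diluted before equal-volume pooling.
    """
    goal_nM = min(min(concs_nM.values()), max_nM)
    if goal_nM < min_nM:
        raise ValueError("Lowest library is below the submission minimum")
    buffer_ul = {
        name: aliquot_ul * conc / goal_nM - aliquot_ul
        for name, conc in concs_nM.items()
    }
    # Equal volumes of equally concentrated libraries: the pool stays at goal_nM,
    # and each library contributes an equal share of it.
    per_library_nM = goal_nM / len(concs_nM)
    return goal_nM, buffer_ul, per_library_nM


goal, buffers, share = plan_pool({"A": 18, "B": 22, "C": 15})
print(f"Goal: {goal}nM pool, {share:g}nM per library")
for name, vol in buffers.items():
    print(f"Library {name}: 10ul library + {vol:.1f}ul buffer")
```

If any library needs much more than a 1:5 dilution to reach the goal, the two-step approach with re-quantification described above is the safer route.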

Ta-da! You are ready to submit your pool for sequencing.

Friday, 17 October 2014

Indexing 2: Troubleshooting a bad index balance