Library Preparation – The First Step in an NGS Setup

Welcome back! Last quarter we discussed why Next Generation, or Massively Parallel, Sequencing is the next big thing in the world of Molecular Diagnostics. The sensitivity, the depth of coverage and the ability to interrogate many different areas of the genome at the same time were just a few of the benefits of these types of assays. Next, I would like to describe a couple different methods of library preparation, which is the first step necessary to run an NGS assay.

First of all, let’s define “Library.” I find this is the most common question technologists new to this technology ask. Essentially, a library is a specimen’s collection of amplicons produced by the assay that have been barcoded, tagged with appropriate platform adapters and purified. These will serve as the input for the next part of the NGS workflow, clonal amplification (the topic of next quarter’s blog!). How these libraries are prepared differs depending on platform (e.g., Ion Torrent vs. MiSeq), starting material (RNA vs. DNA), and type of assay (targeted amplicon vs. exome).

Before we begin the library prep discussion, a note about the input specimen. The DNA must be quantitated using a method that is more specific than spectrophotometry – it must be specific for double-stranded DNA. Spectrophotometry measures absorbance from all of the nucleic acid in the specimen, not just double-stranded DNA, so it will lead to an overestimation of the amount of DNA in the specimen, which will lead to over-dilution and, consequently, a lower quantity of final library. Real-time PCR and fluorometry with a double-stranded DNA-specific kit are two examples of assays that will give accurate concentrations of double-stranded DNA.

Our lab has begun using NGS for some of our oncology assays, so I will focus on the two types we perform currently, but keep in mind, there are many other types of assays and platforms.

Image 1: Ion Torrent amplicon library preparation. Source: Ion AmpliSeq™ Library Preparation User Guide – MAN0006735, Rev. 10 September 2012.

The assay we use for our Ion Torrent platform is a PCR amplicon-based assay. The first step is to amplify the 207 regions across 50 genes that contain hotspot areas for a number of different cancer types. This all occurs in one well for each specimen. Once those areas are amplified, the next step is to partially digest the primer sequences in order to prepare the ends of the amplicons for the adapters necessary for the sequencing step. As shown in the figure above, two different combinations of adapters may be used. The top one, listed as the A adapter (red) and the P1 adapter (green), would be used if only one specimen were to be sequenced on the run. The A and P1 adapters provide universal priming sites so that every amplicon of every sample can be primed with the same primers, rather than having to use gene-specific primers each time. The second possibility is listed below that, with the same P1 adapter (green) and a Barcode Adapter labeled X (red and blue) – it still contains the A adapter necessary for sequencing (red), but it also contains a short oligonucleotide sequence called a “barcode” (blue) that will be recognized during the analysis step based on the sequence. For example, Barcode 101’s sequence is CTAAGGTAAC – this will be assigned to specimen 1 in the run and all of the amplicons for that specimen will be tagged with this sequence. Specimen 2 will have the barcode 102 (TAAGGAGAAC) tag on all of its amplicons. During analysis, the barcodes will be identified and all of the reads with the 101 sequence will be binned together and all of the reads with the 102 sequence will be binned together. This allows many specimens to be run at the same time, thus increasing the efficiency of NGS even more. Lastly, the tagged amplicons are purified and normalized to the same concentration.
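The binning step described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the reads are made up, the barcode is assumed to sit at the very start of each read, and only exact matches are accepted. In practice the platform's analysis software handles this step, and it is not shown here.

```python
# Barcode sequences quoted in the text above.
BARCODES = {
    "CTAAGGTAAC": "Barcode 101 (specimen 1)",
    "TAAGGAGAAC": "Barcode 102 (specimen 2)",
}


def bin_reads_by_barcode(reads, barcodes=BARCODES):
    """Group reads by the barcode found at the start of each read.

    Assumes an exact barcode match at position 0 (an illustrative
    simplification). The barcode is trimmed off before the read is stored.
    """
    bins = {name: [] for name in barcodes.values()}
    bins["unassigned"] = []
    for read in reads:
        for barcode, name in barcodes.items():
            if read.startswith(barcode):
                bins[name].append(read[len(barcode):])
                break
        else:
            # No known barcode at the start of this read.
            bins["unassigned"].append(read)
    return bins


# Hypothetical reads, invented for this example.
example_reads = [
    "CTAAGGTAACGATTACA",
    "TAAGGAGAACCCGGTTA",
    "CTAAGGTAACTTGACCA",
    "GGGGGGGGGGACGTACG",
]
binned = bin_reads_by_barcode(example_reads)
for name, members in binned.items():
    print(name, len(members))
```

Each specimen ends up with its own list of reads, and anything without a recognizable barcode is set aside rather than guessed at.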

Image 2: MiSeq amplicon library preparation. Image source: https://www.illumina.com/content/dam/illumina-marketing/documents/applications/ngs-library-prep/for-all-you-seq-dna.pdf

The assay we use for our MiSeq platform is an amplicon-based assay in which hybridization is followed by PCR. The first step is to hybridize probes to 568 regions across 54 genes that contain hotspots for a number of different cancer types. This occurs in one well for each specimen. Once the probes have hybridized, the unbound probes are washed away using a size selection filter plate. Next, the area between the probes is extended and ligated so that each of the 568 amplicons is created. These are then amplified in a PCR step using primers that are complementary to a universal priming site on the probes, but also contain adapters plus the two indices required for paired-end sequencing (the Ion Torrent platform utilizes single-end sequencing – this will be discussed in the sequencing portion in an upcoming blog!). As in the previous method, after PCR, these tagged amplicons are purified and normalized to the same concentration in preparation for the next step – clonal amplification.

Stay tuned for next quarter’s post – clonal amplification!

 


-Sharleen Rapp, BS, MB (ASCP)CM is a Molecular Diagnostics Coordinator in the Molecular Diagnostics Laboratory at Nebraska Medicine. 

Massively Parallel – the Next Generation of Sequencing

Sounds like a good title for a sci-fi novel, right?  What is the big deal about Next Generation Sequencing (NGS)?  Otherwise known as massively parallel sequencing or high throughput sequencing, NGS has become a technique used by many molecular labs to interrogate multiple areas of the genome in a short amount of time with high confidence in the results.  Throughout the next few blogs, we’ll discuss why NGS has become the next big thing in the world of molecular.  We’ll go through the steps of setting up the specimens to prepare them to be sequenced (library preparation), what types of platforms are available and what technologies they use to sequence.  Lastly, we’ll go through some of the challenges with this type of technology.

Let’s start with a review of dideoxy sequencing, otherwise known as Sanger sequencing, which has been the gold standard since its inception in 1977. A typical setup in our lab for this assay begins with a standard PCR to amplify a region of the genome that we are interested in, say PIK3CA exon 21, specifically amino acid 1047, a histidine (CAT). The setup would include primers complementary to an area around exon 21, a 10x buffer, MgCl2, a deoxynucleotide mix (dNTPs), and Taq polymerase. After amplification, the resulting products would be purified with exonuclease and shrimp alkaline phosphatase (SAP). Next, another PCR would be set up using the purified products as the sample and using a similar mix as in the original amp, but with the addition of a low concentration of fluorescently labeled dideoxynucleotides. These bases have no 3’ -OH group, so when they are incorporated into the product, extension ceases on that strand. Because they are present in a lower concentration, the incorporation of these is random and will occur at each base in the strand eventually. The resulting products are then run and analyzed on a capillary electrophoresis instrument that will detect the fluorescent label on the dideoxynucleotides at the end of each fragment. Shown below is an example of the output of the data:

[Image: dideoxy sequencing output showing the forward and reverse reads across PIK3CA codon 1047]

The bases will be shown as peaks as they are read across the laser.  The base in question in the middle of the picture is, in a “normal” sequence, an adenine (A), as seen in green.  In this case, there is also a thymine (T) detected at that same location, as seen in red.  This indicates that some of the DNA in this tumor sample has mutated from an A to a T at this location.  This causes a change from a histidine amino acid to a leucine (p.His1047Leu) and is a common mutation in colorectal cancers.
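To make the amino acid change concrete, here is a minimal Python sketch that swaps the A for a T in the CAT codon and looks up the result. The lookup table holds only the two codons needed here (taken from the standard genetic code), and the function name is just illustrative.

```python
# Only the two codons needed for this example (standard genetic code).
CODON_TO_AMINO_ACID = {"CAT": "His", "CTT": "Leu"}


def substitute_base(codon, position, new_base):
    """Return the codon with the base at a 0-based position replaced."""
    return codon[:position] + new_base + codon[position + 1:]


reference_codon = "CAT"  # codon 1047 of PIK3CA, as given in the text
mutant_codon = substitute_base(reference_codon, 1, "T")  # the A becomes a T

change = "p.{}{}{}".format(
    CODON_TO_AMINO_ACID[reference_codon], 1047, CODON_TO_AMINO_ACID[mutant_codon]
)
print(mutant_codon, change)  # CTT p.His1047Leu
```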

So all of this looks great, right?  Why do we need to have another method since we have been using this one for so long and it works so well?  There are a few reasons:

  1. The sensitivity of dideoxy sequencing is only about 20%, meaning a mutation generally needs to be present in roughly 20% of the DNA in the sample to be detected. Lower level mutations could be missed. The sensitivity of NGS can get down to 5% or even lower in some instances.
  2. The above picture shows the sequencing in the forward direction as well as the reverse direction. This area then has 2x coverage – we can see the mutation in both reads. If we could get a higher coverage of this area and be able to sequence it multiple times and see that data, we could feel more confident that this mutation is real. In our lab, we require that each area have 500x coverage so that we feel sure that we have not missed anything. The picture below displays the same sequenced area as in the dideoxy sequencing above. This is a typical readout from an NGS assay and, as you can see, this base has a total of 4192 reads, so it has been sequenced over four thousand times. In 1195 of those reads, a T was detected, not an A. We can feel very confident in these results due to how many times the area was covered.
  3. The steps above detailed only amplifying this one area, but with colorectal cancer specimens, we want to know the status of the KRAS, BRAF, NRAS, and HRAS genes as well as other exons in PIK3CA. Covering all of these with the dideoxy sequencing method takes a lot of time and effort. NGS can cover these areas in these five genes as well as multiple other areas (our assay looks at 207 areas total) all in the same workflow.

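[Image: NGS readout of the same PIK3CA position, showing 4192 total reads with a T detected in 1195 of them]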
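The read counts quoted in point 2 can be turned into an allele fraction with some quick arithmetic. The sketch below uses only the numbers from the text (4192 total reads, 1195 with a T, and our 500x coverage requirement); the function and variable names are made up for the example.

```python
MIN_COVERAGE = 500  # minimum depth our lab requires, per the text


def variant_allele_fraction(variant_reads, total_reads):
    """Fraction of reads at a position that carry the variant base."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


total_reads = 4192
t_reads = 1195
vaf = variant_allele_fraction(t_reads, total_reads)
meets_coverage = total_reads >= MIN_COVERAGE
print(f"Coverage OK: {meets_coverage}, T allele fraction: {vaf:.1%}")
# Coverage OK: True, T allele fraction: 28.5%
```

At roughly 28.5%, this mutation sits above the approximately 20% sensitivity quoted for dideoxy sequencing, which fits with it also being visible in the dideoxy reads.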

Join me for the next installment to discover the first steps in NGS workflow!

 


-Sharleen Rapp, BS, MB (ASCP)CM is a Molecular Diagnostics Coordinator in the Molecular Diagnostics Laboratory at Nebraska Medicine. 

Association for Molecular Pathology – A Bunch of Party Loving Pathologists…

I was privileged to attend this year’s Association for Molecular Pathology (AMP) meeting in Charlotte, North Carolina, in the beginning of November. I really enjoy this meeting – it is relevant to everything our lab does with sessions offered in topics of Hematopathology, Infectious Diseases, Solid Tumors, Inherited Diseases, and just recently added, Bioinformatics.

It is exciting to meet and discuss with others in this field, especially other laboratory technologists. AMP has done a wonderful job of including those of us who perform the bench work, offering discounted memberships, as well as learning opportunities on their website, and even an award especially for technologists’ exemplary posters/abstracts presented at the annual meeting.

This year’s meeting offered the previously mentioned topics, but an emerging trend was evident – testing cell-free DNA (cfDNA) obtained from sources other than tissue biopsies, such as plasma or urine. This quarter’s post will deal with the reason behind this and the technology for testing such specimens, specifically plasma.

Cell-free DNA has become an attractive source for tumor testing recently. This source can be tested when a tissue biopsy is just not possible, such as when a patient has progressed to the point that surgery is not recommended. Here is the biology behind why this can work as a source of tumor DNA:


Figure 1. http://www.intechopen.com/books/methylation-from-dna-rna-and-histones-to-diseases-and-treatment/circulating-methylated-dna-as-biomarkers-for-cancer-detection

The sources of DNA in a sample of whole blood (as shown in Figure 1) are:

  • white blood cells
  • degraded white blood cells (cfDNA)
  • degraded tumor cells (cfDNA)
  • circulating tumor cells (CTCs).

Because of their biology, tumor cells have a higher turnover than other cells in the body. Due to this, a larger fraction of the cfDNA in the plasma comes from tumor cells than would otherwise be expected. We can take advantage of this with a so-called “liquid biopsy” – with 10 cc of whole blood, we can attempt to capture about 10 ng of cfDNA and test this for possible resistance mutations to the therapies the patient may be on.

Many of the posters and several of the sessions at the AMP meeting dealt with cfDNA. Several pre-analytical steps were stressed in order to have success with this type of specimen.

  1. The whole blood needs to be collected, as any other blood specimen should, with care taken to not lyse any of the cells during collection.
  2. The collection tube type varies depending on how much time it will take to centrifuge the specimen to obtain the plasma. If it can be spun within two to four hours, a simple EDTA tube is sufficient. If it cannot be spun within a short time, then another tube with special preservatives is required. A Streck tube has been the tube of choice in these situations, but others are becoming available on the market as the demand increases. These specific tubes offer a greater amount of time to capture the cfDNA without white blood cell lysis becoming an issue. This is important, because as the white blood cells lyse, the plasma is flooded with the patient’s normal cfDNA that will dilute out the tumor cfDNA fraction, making it even more difficult to detect.
  3. Centrifugation procedures must be altered. The brake should not be applied when stopping the centrifuge because braking can cause the white blood cells to be sheared, which will, again, flood the plasma sample with normal cfDNA. An initial spin should be performed to obtain the plasma, then an additional spin should be performed before extraction of the DNA.

There are multiple kits available on the market for extraction of cfDNA. Once the DNA is extracted, it is best to measure it with a method that displays the size of the fragments, such as a Bioanalyzer. Cell-free DNA is about 160-170 bp in size and, with the readout from an instrument such as the Bioanalyzer, one can see the size of the DNA, quantitate it, and observe any contamination from genomic DNA (shown by a peak much larger than 170 bp in size).

Many types of testing are being performed on this cfDNA fraction, such as real-time PCR, droplet digital PCR, and next generation sequencing. Whichever platform is used, a validation must be performed to ensure a sufficiently low limit of detection (as low as 0.1% or 0.01%) because, many times, the tumor cfDNA allele fraction will be very low due to the normal cfDNA in the plasma.
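A rough back-of-the-envelope calculation shows why those detection limits are so demanding. The sketch below assumes a commonly used approximation of about 3.3 pg of DNA per haploid human genome (that figure is my assumption, not something from the meeting) and uses the roughly 10 ng cfDNA yield mentioned earlier.

```python
PG_PER_HAPLOID_GENOME = 3.3  # approximate mass of one haploid human genome (assumed value)


def genome_copies(mass_ng):
    """Approximate number of haploid genome copies in a mass of human DNA."""
    return mass_ng * 1000 / PG_PER_HAPLOID_GENOME


def expected_mutant_copies(mass_ng, allele_fraction):
    """Approximate number of mutant copies at a given allele fraction."""
    return genome_copies(mass_ng) * allele_fraction


input_ng = 10  # the approximate cfDNA yield mentioned earlier
for fraction in (0.001, 0.0001):  # 0.1% and 0.01%
    copies = expected_mutant_copies(input_ng, fraction)
    print(f"{fraction:.2%} of {input_ng} ng: about {copies:.1f} mutant copies")
```

Under these assumptions, 10 ng holds only about 3,000 genome copies, so a 0.1% mutation corresponds to roughly three mutant molecules in the whole sample, and a 0.01% mutation to less than one on average.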

This method of testing non-invasive specimens from patients is an amazing way to help save possibly very sick people from having to undergo a risky surgery. This is yet another use of a new technique in the ever changing world of Molecular Diagnostics!

 


-Sharleen Rapp, BS, MB (ASCP)CM is a Molecular Diagnostics Coordinator in the Molecular Diagnostics Laboratory at Nebraska Medicine. 

The Exciting World of Molecular Diagnostics

Hello everyone! I am Sharleen Rapp and I’m a Molecular Diagnostics Coordinator at Nebraska Medicine. I feel lucky to be able to share the exciting world of Molecular Diagnostics with you. For my first post, I’d like to give you a little background about myself and why I feel I am lucky to be in the career that I’m in.

Ever since I was little, science has intrigued me. Perhaps it was the experiments my Dad performed in our kitchen as practice for his labs for his high school chemistry classes (who doesn’t enjoy watching salt crystals “grow” on string in peanut butter jars?) or watching my brother set up his fruit fly experiment for his high school science class, but I’ve always enjoyed learning about how things work.

I went to a small parochial school in the middle of Nebraska, and unfortunately we didn’t have the funds for elaborate science class labs. Interestingly enough, the event that clinched science for me was a project that I did for my government class. We were responsible for writing, essentially, a textbook, complete with chapters, endnotes, quizzes and tests, on a topic of our choosing. I chose to write about the Human Genome Project. I wrote this in the year 2000, when the Project was in full swing. I had read about it in the previous years, and I was completely amazed by what it accomplished. In the middle of the school year, in fact, Time magazine came out with an issue titled “The Future of Medicine – How genetic engineering will change us in the next century.” It contained nineteen different articles, all focused on how the information from the Human Genome Project would impact the future – one of which discussed the way pharmaceutical companies were designing drugs to combat the mutations in different types of cancer. I knew then I would be a part of that future; I just didn’t know how. At this time, I had no idea how I could go about working in this field. I had never heard of the discipline “Molecular Diagnostics” or medical technology.

I went off to college and got a degree in Biological Sciences with the intent to go to graduate school and study in Genetics, but I still had no real idea about how to get into the field of study of DNA. Through some interesting twists and turns, including working in a fruit fly lab in college and an amazing internship at Washington University under Elaine Mardis, I ended up at a small private company where my job was to sequence mitochondrial DNA and mitochondrial-related genes, and in doing this, I knew I had found my career. I am a self-proclaimed science nerd and I love sequencing, the whole process from wet bench to analysis, more than anything that I have ever done. When I moved over to Nebraska Medicine and began working in the Molecular Diagnostics lab, I was amazed at the work that was being done there. I’ve had some amazing opportunities to work with all different types of sequencing – dideoxy sequencing, pyrosequencing, and now, massively parallel (aka, next generation) sequencing. I am so excited to be sharing some of my experiences and case studies from the work that we do in our lab in future posts.

Thanks for reading!!

 


-Sharleen Rapp, BS, MB (ASCP)CM is a Molecular Diagnostics Coordinator in the Molecular Diagnostics Laboratory at Nebraska Medicine. 

Probe Structure for the Molecular Laboratory Professional

The purpose of real-time PCR is to perform efficient amplification of a target sequence and quantify the PCR products in “real time” by employing the use of a fluorescent reporter. Fluorescent reporters can be found in the form of DNA-binding dyes or fluorescently labeled primers or probes. It is extremely important to understand the difference between DNA-binding dyes and the various fluorescent primer and probe based chemistries. The best way to grasp these theories is often to have a visual illustration of each of the different chemistries.

DNA-binding Dyes

  1. SYBR Green Dye – SYBR Green I is a fluorescent DNA binding dye that is commonly used as it binds to all double-stranded DNA.
    • SYBR Green is detected by quantifying the increase in fluorescence during PCR.
    • Advantages to using SYBR Green are that it is inexpensive, easy to use, and easily incorporated into the PCR reaction.
    • Disadvantages of using SYBR Green are that there is usually an increase in background and non-specific binding that can lead to detection of false positive results.
Image courtesy of: http://www.sigmaaldrich.com/technical-documents/protocols/biology/sybr-green-qpcr.html

 Fluorescent PCR Primer and Probe Based Chemistries

  1. TaqMan Chemistry – Utilizes the 5’ – 3’ exonuclease activity of Taq Polymerase (the enzyme that copies DNA and is necessary for PCR) to generate a signal.
    • The probe is composed of a single stranded DNA oligonucleotide which is complementary to the specific target sequence of the PCR template.
    • The probe has a modification to the 3’ end so that the polymerase cannot extend the sequence.
    • The 5’ end has the fluorescent dye and the 3’ end contains the quencher.
    • During DNA synthesis, the exonuclease activity of the Taq Polymerase will degrade the probe, thus resulting in release of the reporter from the quencher.
Image courtesy of: https://es.wikipedia.org/wiki/TaqMan#/media/File:TaqMan_GX_cartoon.jpg
  2. Fluorescence Resonance Energy Transfer (FRET) – Energy is transferred between two light-sensitive molecules.
    • Increase in target → More probes bind → Increase in fluorescence
    • The 5’ end is the donor (catalyst) and the 3’ end is the acceptor (fluorophore)
    • The energy is detected in the form of heat or fluorescence emission.
    • If probes bind, energy is transferred from donor to acceptor and generates the signal.
Image courtesy of: http://www.cdc.gov/meningitis/lab-manual/images/chapt10-figure01.gif

 

  3. Molecular Beacon – This type of chemistry measures the accumulation of product during the annealing phase of PCR.
    • Signal is detected only when probes are bound to the template before displacement by the polymerase.
    • A chemical modification prevents degradation during the extension step of PCR.
    • The 5’ end contains the reporter fluorophore and the 3’ end contains the quencher.
    • The amount of fluorescence is directly related to the amount of initial template available for binding and inversely proportional to the cycle threshold (CT) value.
    • During extension, the probe is displaced by Taq Polymerase and the hair-pin (non-fluorescent) structure is restored.
    • Unbound molecular beacon probe → reporter is too close to quencher → no signal is generated.
    • Beacon probe binds to target → reporter is separated → signal is generated.
Image courtesy of: http://www.bio-rad.com/webroot/web/images/lsr/solutions/technologies/gene_expression/qPCR_real-time_PCR/technology_detail/real-time-pcr-detection-standard-pcr-primer-and-molecular-beacon.gif
  4. Scorpion – Scorpion probes use two PCR primers, where one also serves as a probe and contains a stem-loop structure.
    • The stem-loop structure contains a 5’ fluorescent reporter and a 3’ quencher.
    • The loop of the scorpion probe contains complementary sequence to the internal portion of the target sequence.
    • If the primer binds and extends, the reporter is separated from the quencher and a signal is given off.
Image courtesy of: http://www.bio-rad.com/webroot/web/images/lsr/solutions/technologies/gene_expression/qPCR_real-time_PCR/technology_detail/real-time-pcr-detection-scorpions-pcr-primer-probe.gif
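The inverse relationship between starting template and the cycle threshold (CT) mentioned under Molecular Beacon can be illustrated with a little arithmetic. This sketch assumes an idealized reaction in which the product doubles every cycle (efficiency = 1.0); real assays rarely reach that, and the function name is only illustrative.

```python
import math


def ct_shift(fold_change, efficiency=1.0):
    """Change in CT when the starting template changes by fold_change.

    efficiency=1.0 means the product doubles every cycle (an idealized
    assumption); 0.9 would mean a 1.9-fold increase per cycle.
    """
    if fold_change <= 0 or efficiency <= 0:
        raise ValueError("fold_change and efficiency must be positive")
    return -math.log(fold_change) / math.log(1 + efficiency)


print(round(ct_shift(10), 2))   # -3.32: ten times more template, about 3.3 cycles earlier
print(round(ct_shift(0.5), 2))  # 1.0: half the template, one cycle later
```

In other words, more starting template means the signal crosses the threshold sooner, which is why a lower CT value indicates a higher initial amount of target.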

Understanding the various primer-probe chemistries including the interactions between the reporters and quenchers will provide some basic groundwork for those interested in pursuing a career in molecular biology.

 


-LeAnne Noll, BS, MB(ASCP)CM is a molecular technologist in Wisconsin and was recognized as one of ASCP’s Top Five from the 40 Under Forty Program in 2015.

International Lung Cancer Experts Seek Public Comments on Revised Molecular Testing Guideline

From the press release:

BETHESDA, MD. June 28, 2016 — The College of American Pathologists (CAP), the International Association for the Study of Lung Cancer (IASLC), and the Association for Molecular Pathology (AMP) announced today the open comment period for the revised evidence-based guideline, “Molecular Testing Guideline for Selection of Lung Cancer Patients for EGFR and ALK Tyrosine Kinase Inhibitors.”

The open comment period begins today and will close on August 2, 2016. The online format provides an opportunity for public review of new draft recommendations for several key topics, as well as recommendation statements that have been reaffirmed since the initial guideline was jointly published online in April 2013 by Archives of Pathology & Laboratory Medicine, The Journal of Thoracic Oncology, and The Journal of Molecular Diagnostics.

The guideline revisions are designed to provide state-of-the-art molecular testing of lung cancer recommendations for pathologists, oncologists, and other cancer and molecular diagnostic laboratory professionals. The revisions are all based on evidence from an unbiased review of published experimental literature since 2013 and include the recommendations from an expert panel of renowned worldwide leaders in the field. The final recommendations will be approved and jointly published after consideration of the public comments, further panel discussion, and a complete evidence analysis. For more information and to provide comments, visit www.amp.org/LBGOCP.

Hybridization Conditions and Melting Temperature

Stringency is a term that many molecular technologists are very familiar with. It describes the combination of conditions under which a target is exposed to the probe. Typically, conditions that exhibit high stringency are more demanding of probe to target complementarity and length. Low stringency conditions are much more forgiving.

  • If conditions of stringency are too HIGH → Probe doesn’t bind to the target
  • If conditions of stringency are too LOW → Probe binds to unrelated targets

 

Important Factors That Affect Stringency and Hybridization

  • Temperature of hybridization and salt concentration
    • Increasing the hybridization temperature or decreasing the amount of salt in the buffer increases probe specificity and decreases hybridization of the probe to sequences that are not 100% complementary to it.
  • Concentration of the denaturant in the buffer
    • For example: Deionized Formamide and SDS can be used to reduce non-specific binding of the probe
  • Length and nature of the probe sequence
STRINGENCY AND BINDING

  • Long probe, or probe with an increased number of G and C bases → Binding occurs under more stringent conditions
  • Short probe, or probe with an increased number of A and T bases → Binding occurs under less stringent conditions

Melting Temperature (Tm) for Long Probes

  • The ideal hybridization conditions are estimated from the calculation of the Tm.
  • The Tm of the probe sequence is a way to express the amount of energy required to separate the hybridized strands of a given sequence.
  • At the Tm, half of the hybrids remain double stranded and half have separated into single strands.
  • Tm = 81.5°C + 16.6 log M + 0.41(%G+C) – 0.61(%formamide) – (600/n), where log is the base-10 logarithm, M = sodium concentration in mol/L, and n = number of base pairs in the smallest duplex

  • If we keep in mind that RNA is single stranded (ss) and DNA is double stranded (ds), then the following must be true:

 

RNA : RNA Hybrids   More stable

DNA : RNA Hybrids        ↓

DNA : DNA Hybrids    Less stable

 

  • Tm of RNA probes is higher, therefore RNA : DNA hybrids increase the Tm by 20 – 25°C
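The long-probe formula above is easy to put into a small Python function. This is just a sketch of the equation as written; the input values in the example are hypothetical and were chosen only to keep the arithmetic easy to follow, and the 20 – 25°C adjustment for RNA : DNA hybrids is not modeled.

```python
import math


def tm_long_probe(sodium_molar, gc_percent, formamide_percent, duplex_length_bp):
    """Tm (°C) = 81.5 + 16.6*log10(M) + 0.41(%G+C) - 0.61(%formamide) - 600/n."""
    if sodium_molar <= 0 or duplex_length_bp <= 0:
        raise ValueError("sodium_molar and duplex_length_bp must be positive")
    return (
        81.5
        + 16.6 * math.log10(sodium_molar)
        + 0.41 * gc_percent
        - 0.61 * formamide_percent
        - 600 / duplex_length_bp
    )


# Hypothetical conditions: 1.0 mol/L sodium, 50% G+C, 600 bp duplex.
no_formamide = tm_long_probe(1.0, 50, 0, 600)
with_formamide = tm_long_probe(1.0, 50, 50, 600)
print(round(no_formamide, 1), round(with_formamide, 1))  # 101.0 70.5
```

The comparison shows how a denaturant such as formamide lowers the Tm, which is one way to raise stringency without raising the hybridization temperature.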

 

Calculating the Tm for Short Probes (14 – 20 base pairs)

  • Tm = 4°C x number of G/C pairs + 2°C x number of A/T pairs
  • The hybridization temperature (annealing temp) of oligonucleotide probes is approximately 5°C below the melting temperature.


Sequence Complexity (Cot)

  • Sequence complexity refers to the length of unique, non-repetitive nucleotide sequences.
  • Cot = Initial DNA Concentration (Co) x time required to reanneal it (t)
  • Cot1/2 = The Cot value at which half of the double-stranded sequence has reannealed under a given set of conditions.
  • Short probes can hybridize in 1 – 2 hours, where long probes require more time.

 

Test Your Knowledge

  1. Calculate the melting temperature of the DNA sequence below:

ATCTGCGAAATCAGTCCCGG
TAGACGCTTTAGTCAGGGCC

 

Answer
The number of G/C pairs = 11 and the number of A/T pairs = 9, so the calculation is as follows:
4(11) + 2(9) = X
X = 62°C
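The answer above can be checked with a short Python sketch of the short-probe formula. It counts the bases on the top strand only (each base stands for one base pair) and also reports a hybridization temperature 5°C below the Tm; the function name is made up for the example.

```python
def tm_short_probe(sequence):
    """Tm (°C) = 4 x (number of G/C) + 2 x (number of A/T), for short probes."""
    seq = sequence.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "AT")
    if gc + at != len(seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return 4 * gc + 2 * at


top_strand = "ATCTGCGAAATCAGTCCCGG"
tm = tm_short_probe(top_strand)
hybridization_temp = tm - 5  # approximately 5°C below the Tm, per the text
print(tm, hybridization_temp)  # 62 57
```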


-LeAnne Noll, BS, MB(ASCP)CM is a molecular technologist in Wisconsin and was recognized as one of ASCP’s Top Five from the 40 Under Forty Program in 2015.