molecular biology – Lablogatory

Omicron: Variant of High Significance?

Omicron is now the dominant variant in the United States and gained that title faster than any variant before it. I have been tracking variants in the North Texas region since February of this year and detected the first Alpha variant (B.1.1.7). During this time, there were multiple substrains circulating. Some, like Epsilon (origin California), rose in prominence and then declined to extinction. The rise of the Alpha (origin U.K.) and Delta (B.1.617.2, origin India) variants was tracked over the course of weeks, but Omicron has been tracked on a daily basis because it is rising so quickly.

Many places are using S-Gene Target Failure (SGTF) as a surrogate for the Omicron variant (Yale, University of Washington below).

Photo credit @NathanGrubaugh (Yale, Left) and @pavitrarc (UW virology, right)

SGTF occurs when two of the three targets in the TaqPath COVID-19 multiplex test amplify successfully while the S-gene target does not, or “drops out.” This phenomenon was first observed in the Alpha variant, because the probe for this target overlapped a characteristic mutation: S:Del69_70 (deletion of the 69th and 70th amino acids in the spike protein from a 6 base pair deletion). This mutation is absent in Delta but present in Omicron, so it has been used as an early tracker of Omicron prevalence.
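As a rough illustration of the SGTF logic, the sketch below flags a specimen when the S target fails while the other two targets amplify strongly. The Ct cutoffs and the function name are assumptions made for this example, not values from the assay's instructions for use; the other two target names (ORF1ab and N) are the TaqPath targets alongside S.

```python
# Hypothetical cutoffs chosen for illustration only.
CT_POSITIVE_MAX = 37.0  # a target counts as amplified at or below this Ct
CT_STRONG_MAX = 30.0    # require solid signal in the other targets before calling SGTF


def call_sgtf(ct_orf1ab, ct_n, ct_s):
    """Classify one specimen from its three Ct values.

    Pass None for a target that did not amplify.
    Returns 'S detected', 'SGTF', or 'indeterminate'.
    """
    def amplified(ct, limit):
        return ct is not None and ct <= limit

    if amplified(ct_s, CT_POSITIVE_MAX):
        return "S detected"
    # Only call a dropout when the other two targets are clearly positive;
    # a weak specimen can lose the S signal for reasons unrelated to a deletion.
    if amplified(ct_orf1ab, CT_STRONG_MAX) and amplified(ct_n, CT_STRONG_MAX):
        return "SGTF"
    return "indeterminate"


# The deletion removes 2 codons, and each codon is 3 bases long.
DELETED_CODONS = 2
deleted_bases = DELETED_CODONS * 3
```

Requiring strong signal in the two remaining targets is a design choice to avoid mistaking a low-viral-load specimen for a true dropout. Because Alpha carries the same deletion, an SGTF call is only a surrogate for Omicron when Alpha is no longer circulating.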

Most of this discussion is speculative and we won’t ever really know, but given the rate of transmission of this variant, it seems unlikely that it would have acquired so many mutations and not been detected before now. The most recent common ancestor is from over a year ago suggesting it was incubating for a long time.

We’ve seen a case of a person severely immunocompromised with no antibody response to vaccination + booster who still has an unmutated wild type strain in their system. With no immune pressure, the virus has not evolved.

However, in HIV+ patients with variable/ low immunity, there could be enough pressure to drive the immune evasion properties seen in Omicron. Southern Africa has over 30% of their HIV+ patients not on therapy who would be likely candidates for this type of host.

Did we see this coming?

Yes. Other immune evasive variants have arisen in areas with high prevalence of previous infection (Brazil/ S. Africa). Organisms evolve just enough to overcome the challenges in their environment. Thus the level of immunity provided by various immune exposures is approximately:

Previous infection < 2x vaccine < 2x vaccine + previous infection ~ 3x vaccine

Scientists theorized that either Delta would evolve more immune evasive mutations or a totally new variant would arise. However, I didn’t think it would spread this quickly.

What is the impact?

Therapies. Most antibody therapies are directed at the business end of the spike protein, the receptor binding domain (RBD). The rest of the protein is covered in glycosylation modifications that block most antibody recognition. Thus, with many mutations in Omicron compared to the wild type strain (white), most therapeutic antibodies no longer bind to or inactivate the virus.

Source: https://biorxiv.org/content/10.1101/2021.12.12.472269v1.full.pdf

Only one monoclonal antibody, Sotrovimab from GSK, is effective, because it binds a conserved pan-coronavirus epitope outside of the receptor-binding motif. However, this antibody is in short supply.

Thus, knowing which variant someone has can direct therapy. Several hospitals in our area are out of Sotrovimab, and only people with the Delta variant can access other options, so identifying the variant quickly has clinical implications.

Whole genome sequencing takes too long, so the FDA has agreed to review PCR genotyping approaches for clinical use. I have described some previous approaches, but many of these methods are useful only for screening and would not have sufficient specificity to determine whether an Omicron variant is present. Next time, I will discuss variant genotyping, why it is important, how it can be done, and what clinical actions can be taken with the knowledge.

Severity. There are signs that it is less severe. Is this due to an increase in immunity? We have now been prepared by either previous infection or vaccination to be protected from hospitalization or severe disease.

@Jburnmurdoch https://twitter.com/jburnmurdoch/status/1478339769646166019/photo/1

Or is the decline in severity due to lower pathogenicity? A recent study that has not yet been peer reviewed indicates the virus replicates 70 times faster than Delta in the upper airways (left), but infects lung tissue only about 10% as well as the original strain.

From: https://www.med.hku.hk/en/news/press/20211215-omicron-sars-cov-2-infection?utm_medium=social&utm_source=twitter&utm_campaign=press_release

We all hope this will continue to be better news about the severity of Omicron, but from the lab side, I’ve heard of positivity rates >50% at some places. So this can still have a broad impact.

-Jeff SoRelle, MD is Assistant Professor of Pathology at the University of Texas Southwestern Medical Center in Dallas, TX working in the Next Generation Sequencing lab. His research interests include the genetics of allergy, COVID-19 variant sequencing, and lab medicine of transgender healthcare. Follow him on Twitter @Jeff_SoRelle.

Microbiology Case Study: A Female in her 60s with Retro-orbital Headaches

Case History

The patient was a previously healthy female who presented with a five day history of retro-orbital headaches, lightheadedness, and intermittent falls. Her presentation was consistent with meningitis and further studies were pursued. Head computed tomography (CT), CT angiogram of the head and neck, brain magnetic resonance imaging (MRI), and electroencephalogram (EEG) were unremarkable. Analysis of the cerebrospinal fluid (CSF) demonstrated an elevated white blood cell count (605 white blood cells/µL) of which 88% were lymphocytes, 9% were monocytes, and 3% were neutrophils. CSF glucose was slightly decreased at 33 mg/dL and protein was elevated at 81 mg/dL. Additional tests requested on the CSF included herpes simplex virus (HSV), varicella zoster virus (VZV), West Nile virus (WNV), and Epstein-Barr virus (EBV). The CSF was positive for HSV-2 and negative for HSV-1, VZV, and EBV by PCR. WNV IgG and IgM were negative. Of note, the patient had two episodes of viral meningitis in the past of unknown etiology. The patient received a one week course of valacyclovir and was discharged. Per the patient, she continues to have fluctuating headaches and occasional lightheadedness. Follow-up imaging has been unremarkable.

Figure 1. Results of the HSV-1 and HSV-2 PCR. HSV-2 (green) and internal control (purple) amplified. HSV-1 (red) was not detected.

Discussion

Herpes simplex virus 1 and 2 (HSV-1 and HSV-2) are enveloped, double stranded DNA viruses that are members of the Herpesviridae family. They are common viruses that cause cold sores or fever blisters. HSV infection is lifelong, and latent virus can reactivate. While both HSV-1 and HSV-2 can affect any area, HSV-1 is typically associated with non-genital sites whereas HSV-2 typically causes genital infections. In addition to herpetic gingivostomatitis, herpes labialis, and herpes genitalis, other associated clinical conditions include encephalitis, meningitis, keratitis, esophagitis, neonatal herpes, and disseminated primary infection. Most cases of HSV encephalitis have been linked to HSV-1 while HSV meningitis is typically caused by HSV-2. As seen in our patient, HSV-2 has been implicated in recurrent, aseptic, and self-limiting meningitis, also known as Mollaret meningitis.¹ There are no specific treatment guidelines for HSV-2 meningitis, and the main therapeutic strategy is symptom management. The use of acyclovir to manage uncomplicated HSV-2 meningitis is controversial and there is no current consensus.²

Clinically, patients with meningitis typically present with acute onset of fever, headache, and neck stiffness. Other associated symptoms include malaise, rash, nausea, vomiting, sore throat, lymphadenopathy, and genitourinary symptoms. In order to differentiate between the infectious etiologies (i.e. viral, bacterial, tuberculous, or fungal) that cause meningitis, a lumbar puncture may be performed. For viral meningitis, CSF will usually show an elevated white count with predominantly mononuclear cells. The CSF:serum glucose ratio is usually normal or only mildly decreased, while protein levels are often elevated. The most common CSF viral pathogens in the non-immunosuppressed population are enteroviruses, HSV-1, HSV-2, and VZV, which can all be detected by real-time polymerase chain reaction (RT-PCR) technology. Molecular methods are faster, more sensitive, and more widely available than viral culture.³ Antibody tests are not recommended for HSV as ~70% of adults will be positive for HSV-1 and ~20-50% of adults will be positive for HSV-2.⁴

Given the broad range of infectious etiologies that can cause meningitis, there has been interest in the development of a multiplex molecular test. Currently, the FilmArray meningitis/encephalitis panel is the only one that has received FDA clearance. This panel includes 14 bacterial, fungal, and viral targets, including HSV-1 and HSV-2. However, this panel should be used cautiously as several studies have shown a high proportion of false negatives in the detection of HSV-1, HSV-2, and Cryptococcus neoformans/gattii. It has been suggested that for HSV-1 and HSV-2, the multiplex panel does not work as well if the viral load is near the limit of detection of the assay or if the patient is having a reactivation of HSV. If there is a high clinical suspicion, particularly in neonates and immunosuppressed patients, an assay for detection of only HSV-1 and HSV-2 should be performed.⁵

References

Koelle DM and Corey L. (2008) Herpes simplex: insights on pathogenesis and possible vaccines. Annual Review of Medicine, 59: 381-395.
Bamberger DM. (2010) Diagnosis, initial management, and prevention of meningitis. American Family Physician, 82: 1491-1498.
Logan SAE and MacMahon E. (2008) Viral meningitis. The BMJ, 336: 36-40.
Schiffer JT and Corey L. (2020) Herpes simplex virus. Bennett’s Principles and Practice of Infectious Diseases, 9th edition.
Tansarli GS and Chapin KC. (2020) Diagnostic test accuracy of BioFire FilmArray meningitis/encephalitis panel: a systematic review and meta-analysis. Clinical Microbiology and Infection, 26: 281-290.

-Melissa Tjota, MD, PhD is a Molecular Genetic Pathology fellow at the University of Chicago Medicine and NorthShore University HealthSystem. She completed her MD/PhD (Immunology) and AP/CP residency at the University of Chicago.

-Paige M.K. Larkin, PhD, D(ABMM), M(ASCP)CM is the Director of Molecular Microbiology and Associate Director of Clinical Microbiology at NorthShore University HealthSystem in Evanston, IL. Her interests include mycology, mycobacteriology, point-of-care testing, and molecular diagnostics, especially next generation sequencing.

Pitfalls of Artificial Intelligence for COVID-19 Variant Classification

While you have surely heard about all of the SARS-CoV-2 variants and how concerning they are, I would bet that you may not know how they are classified. Sure, from my last post, the technical aspects of whole genome sequencing and targeted approaches have been described, but bioinformatic (big data) analyses are essential to assign lineages. Furthermore, the advances of machine learning have been integrated into this system for SARS-CoV-2 lineage assignment.

How VOC lineages are assigned

First, phylogenetic trees (circular example below) are formed to demonstrate relatedness of strains based on how many mutations they share. The more similar they are, the closer they are together. These trees are not new nor do they rely on artificial intelligence, but they can give visual clues as to whether a lineage is new. For instance, when the first variant of concern B.1.1.7 (now called Alpha) was discovered, it branched away from other limbs of the phylogenetic tree.

Within these new viral variants, there is a set of mutations that is present in most of the isolates. For instance, there are 17 protein coding changes in the Alpha variant. However, these exact 17 mutations may not be in every Alpha isolate. Individually, mutations may be present in 98% of isolates or lower; the spike gene deletion of amino acids 242-244 of the Beta variant (B.1.351, South Africa origin) is only present in 88% of specimens sequenced. This could be due to issues in sequencing, data processing, or just the prevalence/biology of the virus.
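To make the idea of an incomplete mutation signature concrete, here is a minimal sketch that scores a sample by the fraction of a lineage's defining mutations it carries. This is not how Pangolin works internally; the function names, the 80% threshold, and the placeholder mutation labels are all assumptions for illustration.

```python
def signature_match(observed, signature):
    """Fraction of a lineage's defining mutations found in the sample."""
    signature = set(signature)
    if not signature:
        raise ValueError("signature must not be empty")
    return len(set(observed) & signature) / len(signature)


def best_lineage(observed, signatures, min_fraction=0.8):
    """Return (lineage, score) for the best match, or (None, score) if the
    best score falls below min_fraction."""
    scores = {name: signature_match(observed, sig) for name, sig in signatures.items()}
    name = max(scores, key=scores.get)
    if scores[name] >= min_fraction:
        return name, scores[name]
    return None, scores[name]


# Placeholder labels, not real mutation names.
SIGNATURES = {
    "lineage_A": {"a1", "a2", "a3", "a4", "a5"},
    "lineage_B": {"b1", "b2", "b3", "b4"},
}

# A sample missing one of lineage_A's five mutations still matches at 80%.
print(best_lineage({"a1", "a2", "a3", "a4", "x9"}, SIGNATURES))
```

Tolerating a missing mutation or two reflects the point above: a defining mutation may be absent from a real isolate or simply lost to poor sequencing coverage.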

As there are many mutations that fit into certain variants, it would be difficult for a human to process all of this information in a probabilistic manner to assign lineages. Thus, machine learning tools (the most common program for SARS-CoV-2 is Pangolin) have been added onto the end of bioinformatic analyses to assign the lineage to a sample.

How machine learning works

The subject of machine learning has been discussed in a previous post about Protein folding prediction. Briefly, it is helpful to remember that machine learning is a process to create algorithms that give an outcome based on training data. The more diverse, large, and well curated the data, the better the accuracy of the program. One pitfall is they are based on previous data, which works well for many situations: using AI to find a lung cancer on chest x-ray would work well, because lung cancers have consistent characteristics.

However, with COVID-19, new variants keep arising and current variants are evolving (think Delta and Delta “plus”). Furthermore, if the classifier Pangolin is trained on high quality data, then trying to interpret lower quality data (missing genome regions, few sequencing reads) may confuse Pangolin and lead to inaccurate results. What follows is an example of how this occurred at our institution.

Case study

We have been sequencing COVID-19 positive specimens at UT Southwestern for the last several months. Many of the cases have been the Alpha variant (B.1.1.7, origin U.K.). However, it was around this time that Delta (B.1.617.2, origin India) cases started to arise. In one week, we found two specimens that were classified as B.1.95. This was an unusual variant I had not heard of before. There are several “wild type” strains that are B.1.1/ B.1.2 and other derivations, but I had not seen anything like this before.

Clinical history

The two specimens sequenced belonged to Hispanic, adolescent brothers whose mother had recently been hospitalized with COVID-19. There was information on the mother’s travel history.

Therefore, I performed a manual review of the specific mutations. Many of the diverse mutations occur in the spike protein, so this was analyzed first. Immediately, I noticed two classic mutations of the Delta variant: a 2 amino acid deletion in the spike gene (S:Del157_158) and a receptor binding site mutation (S:L452R) also seen in the variant from California (B.1.429). Other mutations could be evaluated, but the combination of these two mutations is unique to the Delta variant.
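The manual check described above can be written as a simple rule that flags a sample when its spike markers disagree with the lineage the classifier assigned. This is only a sketch: the function name is hypothetical, and it assumes the two markers named above are sufficient and that Delta is reported as B.1.617.2 (sublineage names are not handled).

```python
# The two spike mutations named in the text as jointly unique to Delta.
DELTA_MARKERS = frozenset({"S:Del157_158", "S:L452R"})


def flag_for_review(assigned_lineage, spike_mutations):
    """Return a review note when both Delta markers are present but the
    classifier assigned some other lineage; otherwise return None."""
    has_delta_markers = DELTA_MARKERS <= set(spike_mutations)
    if has_delta_markers and assigned_lineage != "B.1.617.2":
        return "review: Delta markers present but lineage is " + assigned_lineage
    return None


# S:D614G is included only as an example of an additional spike mutation.
sample = ["S:Del157_158", "S:L452R", "S:D614G"]
print(flag_for_review("B.1.95", sample))
```

A rule like this does not replace the classifier; it only surfaces the discordant cases that deserve a closer look.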

One suspected cause was that the Pangolin lineage classifier had an issue. Specifically, it had not been updated since February 2021, when Delta did not yet exist. Thus, there were no data for the program to classify the variant properly. Upgrading to the latest version of Pangolin provided the correct lineage classification.

A few weeks later…

Once again, I was checking the lineages reported by the classifier and there were several B.1.617.2 and B.1.617.1. Both of these are variants from India (before the helpful WHO Delta designation), but they are distinct sub-variants. It was odd to see B.1.617.1, because this was found to be less infectious compared to the dominant B.1.617.2 variant (later named Delta) and B.1.617.1 was not spreading across the globe.

Intervention:

Therefore, I once again went to the sequence data for the spike protein to compare some mutations. Although these are sub-variants from the same original variant, they have several mutually exclusive mutations in the spike protein. The figure below compares the prevalence of specific mutations in the spike protein of B.1.617.1 and B.1.617.2 (dark purple = common in a variant, white = rare).

Upon manual review, all of the spike gene mutations were specific to B.1.617.2. So why was there an issue in classification? Again, there were few sequences for either of these sub-variants at that time, so the classifier wasn’t as well trained. Updating the Pangolin version brought the benefit of new data and more accurate classifications.

Take away messages

Updating Lineage classification software (Pangolin) on a regular basis is needed for accurate results.
Manual review is essential for any abnormal findings. It is a typical process for pathologists, and it also plays an important role in COVID-19 variant monitoring.
Know what you’re looking for and know which mutations differentiate the variants.
Delta is now the dominant strain in the U.S. (graphic below).

References

Outbreak.info
https://pangolin.cog-uk.io/

–Jeff SoRelle, MD is Assistant Instructor of Pathology at the University of Texas Southwestern Medical Center in Dallas, TX working in the Next Generation Sequencing lab. His clinical research interests include understanding how lab medicine impacts transgender healthcare and improving genetic variant interpretation. Follow him on Twitter @Jeff_SoRelle.

How to Detect COVID-19 Variants of Concern

It’s a little deja-vu writing this title one year after a similar blog post on how to validate a COVID-19 assay at the start of the pandemic. In many ways, the challenges are similar: limited reagents/control material, and rising case counts. At least now, there is increasing support in the way of funding from the federal government that could help with monitoring and surveillance. I’m going to summarize the current methods available for detecting the Variants of Concern and emerging variants.

Whole Genome Sequencing

The principal method used by many is whole genome sequencing. It has the advantage of being able to comprehensively examine every letter (nucleotide) of the SARS-CoV-2 genome (30 kilobases long). At our institution, I’ve been working on the effort to sequence all of our positive specimens. While it is achievable, it is neither simple nor feasible at most locations. Limitations include:

Financial: must already own expensive sequencers
Expertise: advanced molecular diagnostics personnel needed who perform NGS testing
Data Analytics: bioinformatics personnel needed to create pipelines, analyze data and report it in a digestible format.
Timing: the process usually takes a week at best and several weeks if there is a backlog or not enough samples for a sequencing run to be financially viable.
Sensitivity: NGS requires a Ct value of about 30 or lower, which for us includes only about 1/3 to 1/2 of all positive COVID-19 specimens.

Bottom line: WGS is the best at detecting new/ emerging strains or mutations when cost/ time is not a concern.
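To see how the Ct cutoff limits what can be sequenced, the sketch below computes the share of positive specimens that fall at or below a cutoff. The function name and the example Ct values are made up for illustration; only the cutoff of 30 comes from the list above.

```python
def sequenceable_fraction(ct_values, max_ct=30.0):
    """Share of positive specimens at or below the Ct cutoff for sequencing."""
    if not ct_values:
        raise ValueError("no Ct values supplied")
    eligible = [ct for ct in ct_values if ct <= max_ct]
    return len(eligible) / len(ct_values)


# Made-up Ct values for six positive specimens.
example_cts = [18.2, 24.5, 29.9, 31.0, 33.4, 35.8]
print(sequenceable_fraction(example_cts))
```

Running the same calculation on a lab's own Ct distribution gives a quick estimate of how much of the positive pool sequencing alone would miss.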

Mutation Screening

Other institutions have begun efforts to screen for variants of concern by detecting characteristic mutations. For instance, the N501Y mutation in the spike protein is common to the major Variants of Concern (UK B.1.1.7, Brazil P.1, and S Africa B.1.351) and E484K is present in the Brazil (P.1), S Africa (B.1.351) and New York Variant (B.1.526). Thus, several institutions (listed below) took approaches to 1) screen for these mutations and then 2) perform WGS sequentially.

Institution	Method	Targets
Hackensack Meridian Health (HMH)	Molecular Beacon Probes, melting temp	N501Y, E484K molecular beacons
Rutgers, New Jersey	Molecular Beacon Probes, melting temp	N501Y molecular beacons
Vancouver	Probe + melting curve (VirSNiP SARS-CoV-2 Mutation Assays)	N501Y screen + qPCR reflex; Probe, melt curve assay
Yale	RT-qPCR probe assay	S:144del, ORF1Adel
Columbia	RT-qPCR probe-assay	N501Y, E484K

As you can see, HMH, Rutgers and Vancouver use probes specific to characteristic alleles, combined with melting temperature curves, to detect a mutation-induced change. Melting curve analysis is normally performed after qPCR to ensure that a single, correct PCR product is formed. The curve is calculated from the change in fluorescence that occurs when the fluorescent marker binds to its target DNA. Thus the Tm (melting temperature) is similar to the annealing temperature. When a mutation is present in the probe (DNA fragment) binding site, binding is disrupted and occurs at a lower temperature, as seen by the downward shift of 5 degrees Celsius in the graph below.
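A minimal sketch of how a melting temperature shift could be turned into a call: estimate Tm as the point of steepest fluorescence change, then compare it against a wild-type reference. The curves, the 3 degree threshold, and the function names are assumptions for illustration and are not taken from any of the assays listed.

```python
def melting_temperature(temps, fluorescence):
    """Estimate Tm as the midpoint of the interval with the steepest
    change in fluorescence (the peak of the derivative curve)."""
    best_tm, best_slope = None, 0.0
    for i in range(len(temps) - 1):
        slope = abs((fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i]))
        if slope > best_slope:
            best_slope = slope
            best_tm = (temps[i] + temps[i + 1]) / 2
    return best_tm


def call_from_tm(sample_tm, wild_type_tm, min_shift=3.0):
    """Call 'mutant' when the probe melts at least min_shift degrees C
    below the wild-type reference, otherwise 'wild type'."""
    return "mutant" if (wild_type_tm - sample_tm) >= min_shift else "wild type"


# Made-up melt curves sampled every 2 degrees C.
TEMPS = [50.0, 52.0, 54.0, 56.0, 58.0, 60.0]
WILD_TYPE_CURVE = [100.0, 98.0, 95.0, 60.0, 20.0, 18.0]
MUTANT_CURVE = [100.0, 70.0, 30.0, 22.0, 20.0, 18.0]

wt_tm = melting_temperature(TEMPS, WILD_TYPE_CURVE)
mut_tm = melting_temperature(TEMPS, MUTANT_CURVE)
print(wt_tm, mut_tm, call_from_tm(mut_tm, wt_tm))
```

In practice the wild-type reference would come from a control run on the same plate, since absolute Tm values drift between instruments and reagent lots.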

Figure 1. Schematic showing the melting temperature shift for the HMH designed probe binding normal and mutant (E484K variant) sequences at decreasing concentrations.

Figure 2. Similar shift downward in melting temperature for the Rutgers assay when a wild type probe encounters a mutant vs. WT sequence.

These approaches are quick, but can only perform 2-3 reactions per well and require much of the same expenses as diagnostic RT-qPCR assays. Most of the studies describe this method as a way of screening for samples to be NGS sequenced; however, they will not be as good at detecting emerging strains. For example, the N501Y mutation is not present in the New York or California variants.

Multiplex RT-qPCR can solve some of these problems. At Columbia and Yale, multiple targets are designed to detect B.1.1.7 (N501Y only at Columbia and S144del + ORF1A del at Yale) vs. Brazil/ S. Africa variants (N501Y & E484K at Columbia and ORF1A only at Yale). As new variants have arrived, we found the New York strain carrying both ORF1A deletion and the E484K mutation. It is now clear there are some hotspot areas for mutation within the SARS-CoV-2 genome, which can complicate interpretations. Therefore, these RT-PCR assays are still useful for screening, but do not replace the need for Whole Genome Sequencing.
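The two-marker screening logic can be summarized as a small lookup. The lineage lists follow the mutation-to-variant relationships stated earlier in this post; the function name is hypothetical, and the result is a list of candidates for follow-up, not a final lineage.

```python
def candidate_variants(n501y, e484k):
    """Map N501Y / E484K screening results to candidate variants.

    Relationships used (from the text): N501Y is found in B.1.1.7, P.1 and
    B.1.351; E484K is found in P.1, B.1.351 and B.1.526.
    """
    if n501y and e484k:
        return ["P.1", "B.1.351"]
    if n501y:
        return ["B.1.1.7"]
    if e484k:
        return ["B.1.526"]
    # Neither marker: not necessarily wild type, since some variants
    # (for example the California variant) carry neither mutation.
    return []


print(candidate_variants(n501y=True, e484k=False))
```

The ambiguous double-positive result and the uninformative double-negative result show why these screens still need sequencing behind them.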

Genotyping

Given the overlapping spectrum of mutations, it would be helpful to test several markers all at once in a single reaction. At a certain point, this would effectively “genotype” a variant as well as WGS. The assays above have been limited to 2 targets/ reaction due to limited light detection channels. Therefore, I’ve created a multiplex assay that can be scaled up to include 30-40 targets within a single reaction without the need for expensive probes. This method is multiplex PCR fragment analysis, which is traditionally used for forensic fingerprinting or bone marrow transplant tracking. In this method, DNA fragments of different lengths are amplified by PCR, then separated by capillary electrophoresis on the same instrument that performs Sanger sequencing.

Fragment analysis can be performed to detect deletion/ insertion mutations and single nucleotide polymorphisms (SNPs) by allele-specific primers or with restriction enzymes that only cut the WT or Mutant sequence.

I designed the assay to target 3 deletion mutations in B.1.1.7: S:Del69_70, S:Del144, and ORF1A:Del3675_3677. Each deletion has a specific length and if 3/3 mutations are present, then there is 95% specificity for the B.1.1.7 strain. Samples from December to present were tested and in the first batch, I detected the characteristic B.1.1.7 pattern (expected pattern and observed pattern below).
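A sketch of how fragment sizes translate into deletion calls: each deleted amino acid removes 3 base pairs, so the three targets should run 6, 3, and 9 bp shorter than wild type. The wild-type amplicon lengths, the 1 bp sizing tolerance, and the function names below are hypothetical and are not the actual assay design.

```python
# Base pairs removed by each deletion (3 bp per deleted amino acid).
DELETION_BP = {
    "S:Del69_70": 6,          # 2 amino acids
    "S:Del144": 3,            # 1 amino acid
    "ORF1A:Del3675_3677": 9,  # 3 amino acids
}

# Hypothetical wild-type amplicon lengths, chosen only for this example.
WILD_TYPE_BP = {"S:Del69_70": 150, "S:Del144": 200, "ORF1A:Del3675_3677": 250}


def detect_deletions(observed_bp, tolerance=1):
    """Return the targets whose observed fragment size matches the
    deletion-bearing length within the sizing tolerance (in bp)."""
    found = set()
    for target, wt_len in WILD_TYPE_BP.items():
        expected = wt_len - DELETION_BP[target]
        size = observed_bp.get(target)
        if size is not None and abs(size - expected) <= tolerance:
            found.add(target)
    return found


def consistent_with_b117(observed_bp):
    """True only when all three deletions are detected."""
    return detect_deletions(observed_bp) == set(DELETION_BP)


sample = {"S:Del69_70": 144, "S:Del144": 197, "ORF1A:Del3675_3677": 241}
print(consistent_with_b117(sample))
```

The tolerance has to stay below the smallest deletion (3 bp here), otherwise a wild-type peak could be mistaken for a deleted one.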

Theoretical picture of what the fragment analysis assay would look like for B.1.1.7. An actual patient sample result is shown below, with the expected deletions exactly as predicted:

We have tested and sequenced over 500 positive specimens, and we found increasing levels of the B.1.1.7 strain prevalence up to nearly 30% by the middle of March. All screened B.1.1.7 specimens were validated by WGS. These results and the ability to detect the New York and California variants are detailed in our recent pre-print.

Weekly prevalence of isolates consistent with B.1.1.7 in North Texas.

Implications for future Variant Surveillance

As B.1.1.7 has become the dominant strain and sequencing efforts are increasing, I would argue that assays should be used for what they are best at. For instance, it could be considered a waste of NGS time and resources to sequence all variants when >50% are going to be B.1.1.7 if other tests can verify the strain faster for 10-20% of the cost. Instead, I think WGS should be focused on discovering emerging variants, for which it is best suited. Across the US, case numbers have been decreasing, and the number of testable specimens could be expanded by using a more sensitive PCR assay.

References

Clark AE et al. Multiplex Fragment Analysis Identifies SARS-CoV-2 Variants. https://www.medrxiv.org/content/10.1101/2021.04.15.21253747v1
Zhao Y et al. A Novel Diagnostic Test to Screen SARS-CoV-2 Variants Containing E484K and N501Y Mutations. medRxiv.
Banada P et al. A Simple RT-PCR Melting temperature Assay to Rapidly Screen for Widely Circulating SARS-CoV-2 Variants. medRxiv.
Annavajhala MK et al. A Novel SARS-CoV-2 Variant of Concern, B.1.526, Identified in New York. medRxiv.
Matic N et al. Rapid detection of SARS-CoV-2 variants of concern identifying a cluster of B.1.1.28/P.1 variant in British Columbia, Canada. medRxiv.
Vogels CBF et al. PCR assay to enhance global surveillance for SARS-CoV-2 variants of concern. medRxiv.

COVID Variants

Since my last post on the B.1.1.7 (UK) variant, several other variants have arisen. I wanted to describe what makes some Variants of Interest and other Variants of Concern. While a “variant” is often synonymous with a mutation in genetic terms, in the context of SARS-CoV-2, variant means an alternative strain of the virus.

To be designated a Variant of Interest (VOI) by the World Health Organization (WHO) or Centers for Disease Control (CDC), a variant has the following characteristics:

Evidence of mutations that affect transmission, resistance to vaccines/ therapeutics, mortality, or diagnostic tests
Evidence that the variant is contributing to a rise in the proportion of cases in an area.
However, limited geographical spread.

Examples: P.2 (from Brazil), B.1.525 (New York), and B.1.526 (New York).

Variants of Concern have increased problems with the same characteristics listed above:

Evidence of reduced vaccine protection from severe disease
Evidence of substantially reduced response to neutralizing antibodies or therapeutics
Evidence of wide geographic spread
Increased Transmissibility or disease severity

Current VOCs: B.1.1.7 (UK), B.1.351 (South Africa), P.1 (Brazil), and B.1.427/ B.1.429 (California).

The initial VOCs (B.1.1.7, B.1.351, and P.1) were identified because they showed increased spread and more mutations than expected, especially in the Spike gene region (Figure 1).

The N501Y mutation in the Spike protein is present in each of these initial VOCs. It is located at the tip of the protein that binds the ACE2 receptor, increasing binding strength.

So far, vaccine-elicited sera still neutralize the B.1.1.7 variant. However, B.1.351 pseudovirus shows decreased neutralization by sera from both Moderna and Pfizer vaccine recipients. Specifically, the E484K mutation in the Spike protein confers resistance to neutralizing antibodies. Thus, the strains B.1.351 and P.1 are more likely to be resistant, as would any other strain with the E484K mutation.

Lastly, the California variant came to attention as its prevalence rose from November to February. The key mutations include W152C and L452R, but the significance of this variant is uncertain. However, this variant has begun to spread over much of Southern California and Nevada.

References

Wu K, Werner AP, Moliva JI, Koch M, Choi A, Stewart-Jones GBE, Bennett H, Boyoglu-Barnum S, Shi W, Graham BS, Carfi A, Corbett KS, Seder RA, Edwards DK. mRNA-1273 vaccine induces neutralizing antibodies against spike mutants from global SARS-CoV-2 variants. bioRxiv [Preprint]. 2021 Jan 25:2021.01.25.427948. doi: 10.1101/2021.01.25.427948. PMID: 33501442; PMCID: PMC7836112.
Tada T, Dcosta BM, Samanovic-Golden M, et al. Neutralization of viruses with European, South African, and United States SARS-CoV-2 variant spike proteins by convalescent sera and BNT162b2 mRNA vaccine-elicited antibodies. Preprint. bioRxiv. 2021;2021.02.05.430003. Published 2021 Feb 7. doi:10.1101/2021.02.05.430003
Gangavarapu, Karthik; Alkuzweny, Manar; Cano, Marco; Haag, Emily; Latif, Alaa Abdel; Mullen, Julia L.; Rush, Benjamin; Tsueng, Ginger; Zhou, Jerry; Andersen, Kristian G.; Wu, Chunlei; Su, Andrew I.; Hughes, Laura D. outbreak.info. Available online: https://outbreak.info/ (2020)
https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/variant-surveillance/variant-info.html

AI vs. Crystallography: Predicting Pathogenic Variants?

Some very exciting news was recently announced about Artificial Intelligence impacting protein structure prediction. But like many of us, you probably thought, “Oh that’s nice.” Followed by either, “But that’s unlikely to impact lab medicine” or “I have no idea how they did that.” Today I will help turn around those last two thoughts for you!

The big news was that a U.K. company specializing in artificial intelligence, DeepMind (owned by Google, of course), won the CASP14 competition. CASP14 is the 14th edition of the biennial bake-off competition where teams use bioinformatic approaches to predict protein structures. The organizers then judge how well predictions match experimentally derived structures using a score called GDT. This score reflects the distance between where something is and where it should be. Each of the ~150 protein sequences is scored on this basis and given a final percentage score (0-100%).

Figure 1. The Z-score is just the difference of a sample’s value with respect to the population mean, divided by the standard deviation. The groups that are markedly better than the average will have larger Z-scores.
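The Z-score definition in the caption can be written out directly. This is a generic sketch using the population standard deviation; the function name and example numbers are illustrative, and CASP's own ranking procedure may include additional steps not shown here.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Z-score of each value: (value - population mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical; Z-scores are undefined")
    return [(v - mu) / sigma for v in values]


# Illustrative scores for eight groups (mean 5, standard deviation 2).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_scores(scores))
```

A group scoring well above the field ends up with a large positive Z-score, which is why the leading group stands out so sharply in this kind of plot.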

Since the competition started in 1994, the winners have scored ~50% on average. That is until 2 years ago when AlphaFold, the AI created by DeepMind, won with a top score of 55%. Their paper was published open access (Ref: https://www.nature.com/articles/s41586-019-1923-7) and used techniques similar to those applied by others, in which proteins are progressively folded by a computer until the lowest energy state is revealed.

Figure 2. Improvements in the median accuracy of predictions in the free modelling category for the best team in each CASP. Measured as best-of-5 GDT.
GDT: Global Distance Test (0-100); the percentage of amino acid residues within a threshold distance from the correct position. GDT of around 90 is considered competitive with results obtained from experimental methods.
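As a simplified sketch of the GDT idea, the function below averages, over several distance cutoffs, the percentage of residues whose predicted position lies within the cutoff of the experimental position. It assumes the two structures have already been superimposed and that per-residue distances (in angstroms) are supplied; the commonly reported GDT_TS variant uses cutoffs of 1, 2, 4, and 8 angstroms, while the real calculation also searches over superpositions, which is omitted here. The function name and distances are illustrative.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Average, over the cutoffs, of the percentage of residues whose
    predicted position is within the cutoff of the experimental position."""
    if not distances:
        raise ValueError("no residue distances supplied")
    percentages = [
        100.0 * sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return sum(percentages) / len(percentages)


# Four residues with made-up prediction errors in angstroms.
print(gdt_ts([0.5, 1.5, 3.0, 9.0]))
```

Averaging across loose and tight cutoffs rewards a model for getting the overall fold right while still giving extra credit for atomic-level precision.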

The programs driving this folding may consider amino acid charge, size, and polarity, genetic conservation (Ref), or similarity to other protein domains. However, the innovation here was that DeepMind used artificial intelligence to examine sequence information with a convolutional neural network to identify structural constraints that are used to predict accurate protein folding.

Figure 3. Sequence of events from Data → Deep neural network (Artificial intelligence) → Predictions → protein folding process. (Figure 2 of this reference: https://www.nature.com/articles/s41586-019-1923-7/figures/2).

This results in one of those famous algorithms you’ve heard about. However, these algorithms are more complex than a simple linear regression and it is nearly impossible to trace exactly how different levels of importance were assigned to each variable. An important requirement for an accurate A.I. derived algorithm is that it has a large training data set. Fortunately for DeepMind, they were able to train AlphaFold using about 170,000 structures that were determined experimentally using x-ray crystallography, nuclear magnetic resonance spectroscopy, and electron microscopy.

Although we haven’t seen what was changed between AlphaFold and AlphaFold 2, we have learned that AlphaFold 2 vastly outperformed the original in CASP14 with 91% accuracy. When programs are >90% accurate they are considered to be essentially as good as experimentally derived structures. In fact, AlphaFold 2 was able to provide more information than the experiments! One researcher found that their experimentally derived structure had a different configuration than the one predicted by AlphaFold 2, so they assumed the prediction was incorrect. After further analysis, the experimentally derived structure was found to be very similar to the structure predicted by AlphaFold 2. In another case, AlphaFold 2 predicted that an amino acid was in an infrequently found conformation, so the researchers figured AlphaFold 2 had made a mistake. After reanalyzing the experimental data, they found that AlphaFold 2 was correct. It was even suspected that several lower-scoring structures based on NMR data may reflect lower accuracy in the experimental structure instead of a problem with the algorithm.

Figure 4. (Left) Model for the T1064 target (red) superimposed onto the structure from DeepMind in CASP14 (blue). (Right) Black and green structures are from the runners-up who made predictions for the same structure (correct in blue). Obtained from CASP14 webpage on Tuesday December 1st, 2020.

Will AI replace experimental crystallography? To answer this question, I turned to a colleague in my basic science lab, Lijing Su, who has been a structural biologist for many years. Like many cases of AI, this is a useful tool, but it doesn’t entirely replace her work because a lot of the structural biology research focuses on how proteins move and change as they do their job. Structural biology has moved beyond structures of single proteins and is now focused on how different proteins interact. There is still a role for crystallographers as AlphaFold cannot perform this role…yet.

All this still raises the laboratorian’s question: “Who needs to know protein structure anyway?” We understand that knowing protein structures can help explain function, which has implications for drug development. However, our main role is to provide tests that diagnose disease. A major challenge in molecular pathology is to predict whether a genetic variant causes loss of protein function. Current software performs poorly (PolyPhen2: sensitivity = 45%, specificity = 50%) because it mainly measures changes in chemical properties and amino acid site conservation. One potential application of AlphaFold is to examine the effect of genetic variants on protein structure. Pathogenic changes would be predicted to deform portions of the structure, impairing activity or provoking degradation through the unfolded protein response.

Because the program is currently quite slow, this could be difficult to implement immediately, but it is imaginable that it will become quicker. A straightforward way to validate this AI software would be to use confirmed pathogenic or benign variants from the public database ClinVar. There are over 1,000,000 entries in this database, which would provide a useful training and validation set. It is likely that change in protein structure would be a stronger mechanism of disease for certain types of proteins (ion channels for epilepsy or myosin chains for muscular disorders) and a weaker predictor of pathogenicity for other types of proteins (enzymes for metabolic disorders or signaling proteins where protein-protein interaction is important for function).
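To show what such a validation might look like in practice, here is a minimal sketch that scores a predictor against variants with known classifications and reports sensitivity and specificity. The labels and predictions below are invented placeholders, not ClinVar data or output from any real tool, and the function name is hypothetical.

```python
def sensitivity_specificity(truth, predicted):
    """Compare predictions to known labels (True = pathogenic, False = benign)."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one pathogenic and one benign variant")
    sensitivity = tp / (tp + fn)  # fraction of pathogenic variants called pathogenic
    specificity = tn / (tn + fp)  # fraction of benign variants called benign
    return sensitivity, specificity

# Invented example: 4 known pathogenic and 4 known benign variants
known_labels = [True, True, True, True, False, False, False, False]
tool_calls   = [True, True, True, False, False, False, True, True]

sens, spec = sensitivity_specificity(known_labels, tool_calls)
print(f"sensitivity={sens:.0%} specificity={spec:.0%}")
```

The same comparison could be run separately for each protein class (ion channels, enzymes, and so on) to see where a structure-based predictor helps most.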

This blog entry was written with the very helpful insights and knowledge of Lijing Su, PhD.

References

Senior AW et al. Improved protein structure prediction using potentials from deep learning. Nature. 2020; 577: 706–710.
CASP14 website: https://predictioncenter.org/casp14/
Arnold CN et al. ENU-induced phenovariance in mice: inferences from 587 mutations. BMC Res Notes. 2012; 5: 577.
https://deepmind.com/blog/article/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology

-Lijing Su is Assistant Professor in the Center for Genetics of Host Defense at the University of Texas Southwestern Medical Center. She specializes in structural biology and helps determine multi-protein interactions to explain unknown mechanisms of genes important to immunology.

Stable Chimerism Post-Double Cord Transplant

Hello again! The last case study was an example of a patient with a loss of allele at two STR loci on a shared chromosome. Today, I wanted to share an interesting and unusual case that we monitor in our lab. This case explores the use of cord bloods as the source of the donor, and in this case, a double cord blood transplant.

Cord blood (CB) unit transplants can be advantageous over other donor sources, such as bone marrow or peripheral blood. The Leukemia and Lymphoma Society summarizes these advantages well, with some being their availability (CB can be prescreened/tested and then frozen for use when needed – decreasing the risk of disease transmission), less-strict HLA matching requirements, decreased graft versus host disease (GVHD) occurrence and severity, long-term storage (CB over 10 years old has been successfully transplanted), increased diversity of donors, and reduced risk of disease relapse, to name a few.^{2, 3}

CB also has its disadvantages. These include fewer stem cells for engraftment, which leads to longer engraftment times; longer immunological recovery and a higher risk of infection as a result of those longer engraftment times; less available clinical data relative to stem cell and bone marrow transplants (it is a comparatively newer procedure in transplant); and no additional cells for infusions later on in treatment. Further, selecting the best cords for transplant can be challenging due to the static variables of a CB (again, there is no donor to go back to for more cells). Despite all that CB has to offer, haplo-identical transplants are preferred in the U.S. over CB transplants.^{2, 3, 4}

Before the University of Minnesota pioneered the strategy of double cord transplants, single cord transplants gave rise to a high incidence of graft failure and transplant-related mortality.² Double cord transplants have now become standard when utilizing CB as the donor, because a single CB unit contains only a small number of the cells needed for a successful transplant, and using two units helps overcome the issues that this presents.

Double cord transplants are interesting and complicated for analysis purposes (and in general!). All stem cell transplants involve a dynamic process between the cells of the donor and recipient. Yet, double cords bring in another dynamic process including an additional donor.^1,2 Through the chimerism monitoring process, the complexity of the engraftment process can be appreciated as one cord ultimately becomes the “winner” and the other the “loser”. In other words, one engrafts and is detectable, while the other cord fails to engraft and becomes undetectable. Figure 1 demonstrates this process, where both cords are present initially after transplant. Then, at 43 days post-transplant, a single donor cord (D2) engrafts while the other donor cord (D1) does not engraft. D1 is most likely eliminated from the host, potentially explained by multiple theories, and no longer is detectable by chimerism testing.

Figure 1. “D1” (blue) and “D2” (pink) represent donor cord one and two alleles, respectively. “D2R” (green) represents a shared allele among donor cord two and the recipient. Each image is a time lapse of the “D18S51” STR locus post-transplant. Alleles 12, 14, 15, and 19 are present at this locus. At 21 days post-transplant, both donors are present. At 43 days post-transplant and following, only donor 2 is present and alleles 14 and 15 are no longer observed.

In the case study below, the patient was diagnosed with chronic myeloid leukemia and received a double cord transplant in 2014. One would expect, as described above, that one cord would become the “winner” while the other is rejected, becomes the “loser,” and is no longer detectable. Interestingly enough, this patient never achieved a status of a “winner” or “loser” cord. Rather, both remained persistent within the patient’s chimerism profile and over time have become relatively stable in their percentages.

In the electropherogram below (Figure 2), alleles from both donors can be appreciated from the CD3 (top) and CD33 (bottom) lineages. Each lineage exhibits different constitutions of the donor cord percentages, where CD3 has a greater proportion of cord two than CD33; yet both lineages have a greater overall percentage of cord two than cord one. Looking at the line graph (Figure 3), the differences between the cord percentages can be further appreciated over time. It can even be noted that the cord proportions in the CD33 lineage swapped in 2017, only to swap back to favor cord two and to remain that way since. Changes in donor-recipient relative percentages occur throughout the post-transplant journey, and these events are due to complex processes. Some patients show transient mixed chimerism (initially mixed chimerism, later achieving total/complete chimerism), others achieve complete chimerism, and yet others develop stable mixed chimerism. It is important to note that, even in cases where complete chimerism is not achieved, disease remission can still be present.¹ In this case, the patient has achieved a stable mixed chimerism status among both donor cords and, to our lab’s knowledge, is doing well clinically.

Figure 2. “D1” (blue) and “D2” (pink) represent donor cord one and two alleles, respectively. Green D1D2R, D2R, and D1D2 represent shared alleles (where “R” represents recipient alleles). Comparing the top (CD3) and bottom (CD33) electropherograms, it can be appreciated that the percentage of each cord is different for each lineage population.

Figure 3. The red line graph on the left depicts the donor percentage of each cord blood unit (CBU) of CD3 lineage over time (11/2016 – 07/2020). It can be appreciated that CBU 2 is the dominant cord for CD3. The blue line graph on the left depicts the donor percentage of each CBU of CD33 lineage over time (11/2016 – 07/2020). It can be appreciated that CBU 2 is also dominant, but the differences between the cord donor percentages are much smaller than those of the CD3 lineage. Also, you can see that the percentages of the two cords are stabilizing over time.

This case brings me back to a memory of my professor, who spoke briefly of this occurrence in a lecture, only to quickly acknowledge its rarity. This is an interesting case because it represents one of those extremely uncommon instances. It is a privilege to be a part of a transplant center, like Northwestern’s, where we can witness rare and unique presentations like this. It opens up opportunities to learn and explore the complexities that transplant medicine and molecular HLA have to offer.

References

Faraci M, Bagnasco F, Leoni M, et al. Evaluation of Chimerism Dynamics after Allogeneic Hematopoietic Stem Cell Transplantation in Children with Nonmalignant Diseases. Biol Blood Marrow Transplant. 2018;24(5):1088-1093. doi:10.1016/j.bbmt.2017.12.801
Gutman JA, Riddell SR, McGoldrick S, Delaney C. Double unit cord blood transplantation: Who wins-and why do we care?. Chimerism. 2010;1(1):21-22. doi:10.4161/chim.1.1.12141
Leukemia & Lymphoma Society. Transplantation Facts. https://www.lls.org/sites/default/files/file_assets/FS2_Cord_Blood_Transplantation_6_16FINAL.pdf. Published May 2016. Accessed December 15, 2020.
Gupta AO, Wagner JE. Umbilical Cord Blood Transplants: Current Status and Evolving Therapies. Front Pediatr. 2020;8:570282. Published 2020 Oct 2. doi:10.3389/fped.2020.570282

-Ben Dahlstrom is a recent graduate of the NorthShore University HealthSystem MLS program. He currently works as a molecular technologist for Northwestern University in their transplant lab, performing HLA typing on bone marrow and solid organ transplants. His interests include microbiology, molecular, immunology, and blood bank.

Will the B.1.1.7 variant evade the Vaccine/Tests?

This question came up recently and I wanted to share some cutting-edge information that addresses it. This was in part adapted from Akiko Iwasaki’s (Yale HHMI immunologist) Twitter discussion of this subject.¹

Will B.1.1.7 evade our tests?

The UK variant commonly called lineage B.1.1.7 (officially Variant of Concern 202012/01) has 23 genetic variants that result in 17 protein coding changes.² Most tests, including the ones at our institution (Abbott), are not currently affected (see below). Only the ThermoFisher assay has declared a target that covers the 69-70del variant in the S gene (in green). This conversely makes the TaqPath® assay one way to detect a potential B.1.1.7 variant.

Figure 1. A picture of the SARS-CoV-2 genome with red lines indicating mutation sites and different assays and relative location of their qPCR targets.

Will the vaccine protect against the B.1.1.7 variant?

The Pfizer and Moderna RNA vaccines create an immune response against the spike protein. We don’t know the exact sequences or reactivity of the vaccines’ spike protein. However, a recent study looked at the antibody reactivity to linear epitopes of COVID-19 in 579 patients who were naturally infected with COVID-19. For the antibodies against the spike, the major reactive linear epitopes are indicated in Red at the bottom. None of the B.1.1.7 mutations (Orange) overlap with these major reactive epitopes.³

For a closer look, see below.

A limitation of these analyses is the use of only linear epitopes. Mutations might impact a 3D epitope, affecting Ab binding. However, people make multiple antibodies to the spike protein.⁴ So, broad coverage should arise after exposure to either the vaccine or natural infection with COVID-19.

The vaccine should induce a polyclonal antibody response that recognizes multiple parts of the spike protein, making it effective even against novel variants. Also, there should be few to no false-negative COVID-19 tests due to the new variant, but we will continue to monitor and test this experimentally.

References

Prof. Akiko Iwasaki @VirusesImmunity
Chand, Meera et al. Investigation of novel SARS-COV-2 variant: Variant of Concern 202012/01 Public Health England.
Haynes WA et al. High-resolution mapping and characterization of epitopes in COVID-19 patients. MedRxiv. https://www.medrxiv.org/content/10.1101/2020.11.23.20235002v1#p-5
Shrock E et al. Viral epitope profiling of COVID-19 patients reveals cross-reactivity and correlates of severity. Science 2020 370(6520). https://science.sciencemag.org/content/370/6520/eabd4250

What to Expect When You Don’t Know What You’re Expecting: COVID-19 and Flu Season in the Laboratory

Welcome to October 2020 and a flu season unlike any other. What can we expect? Well, it’s complicated. And if we aren’t sure what to expect, can we still be prepared? Yes (at least for some things)!

From the beginning of the COVID-19 pandemic and throughout the summer of 2020, clinicians and laboratorians have been anxiously wondering what effect the global presence of the respiratory virus SARS-CoV-2 would have on the 2020-2021 flu season. “Flu season,” the annual, relatively predictable period of increased cases and deaths due to Influenza A and B, occurs during colder, winter months. In the northern hemisphere this is September through March. We have extensive experience tracking the onset and genetic variability of the predominant influenza viruses. We manufacture flu vaccines based on data about which influenza strains are likely to circulate. Other viruses that cause respiratory symptoms follow similar seasonal patterns. These include common (non-SARS-CoV-2) human coronaviruses and Respiratory Syncytial Virus (RSV). In short: this is a known, annual occurrence that we can usually prepare for to some extent.

So what will that look like this year? During the historic 1918 pandemic influenza, deaths seen during the first winter of the outbreak paled in comparison to those seen the following winter. Even if that kind of terrible scenario doesn’t occur during this pandemic year, it is possible we are facing a “perfect storm” of COVID-19 plus influenza resulting in overwhelmed hospitals and depleted testing supplies. [https://www.cidrap.umn.edu/news-perspective/2020/09/fears-perfect-storm-flu-season-nears]

We know that COVID-19 spreads well in enclosed spaces with prolonged person-to-person contact, regardless of climate and temperature, via respiratory secretions. Because of this, there has been widespread adoption of mask wearing, social distancing, and limitations on in-person gathering. Promisingly, these interventions to prevent the spread of COVID-19 seem to be contributing to historically low influenza rates in the Southern Hemisphere! [https://www.cdc.gov/mmwr/volumes/69/wr/mm6937a6.htm] But these mitigation strategies are not being universally or rigorously followed in all regions and communities. As temperatures drop, we could see more people conducting activity indoors – will this change transmission patterns? Will regions with ongoing COVID-19 outbreaks be more prone to influenza as well? If hospital capacity becomes strained, will criteria for ordering tests change?

During COVID-19 laboratories have responded heroically and rapidly to test kit shortages, supply chain issues, and staffing challenges. At this stage (October of 2020) many high-level decisions about SARS-CoV-2 testing, like test platform purchasing and validation or manufacturer test kit allocations, might already be set in stone. So is there anything that can be done to help labs and laboratory workers successfully make it through flu season?

Here are 3 suggestions:

1) Establish testing algorithms and clear sample workflows.

Each facility and laboratory will have their own platforms for testing COVID-19 and other respiratory pathogens. Depending on the service ordering the test, there can be both immediate and downstream consequences for when a test comes back positive, negative, or even when that test result is slower than expected!

An algorithm helps set institutional expectations for what tests are ordered under different scenarios. For example, symptomatic patients presenting to a hospital with influenza-like illness (ILI), especially when they will be admitted, should likely have both SARS-CoV-2 and influenza tests ordered simultaneously. But asymptomatic patients being admitted for procedures may only require a SARS-CoV-2 test.

Let’s say your lab has both a SARS-CoV-2 PCR test and a SARS-CoV-2 rapid antigen test. But due to the risk of a false negative, lab and clinical leaders are uncomfortable using only a rapid antigen test to conclusively rule out COVID-19 in patients being admitted to the hospital. Your algorithm could specify the use of SARS-CoV-2 antigen testing in symptomatic patients to quickly “rule in” potential positives, where antigen-negative patients will also have a PCR test. Algorithm specifics come down to the needs and capacity of your institution’s stakeholders (clinical AND laboratory). The details of an algorithm will depend on your lab test platforms and your available test orders, and may need to be modified to accommodate restricted test allocations.

Along with clinical algorithms, a clear workflow for specimens and test types can help laboratory workers get tests where they need to go within the lab. Not all SARS-CoV-2 tests have approval in the instructions for use for, say, nasal swabs. If a nasal swab comes to the lab with orders for both influenza and SARS-CoV-2 tests, what is the procedure for informing the floor so that an appropriate sample can be collected? Or say that your test platforms for different tests live in different areas of the lab. Your workflow may be to set up one test and do a pour off into an aliquot tube so tests can be run at the same time. Or you may have sufficient test collection materials to request a separate sample for each test.

Probably the most important part of developing or reviewing your existing algorithms and laboratory workflow is doing it in connection with others. The purpose is to streamline the entire process from clinical decision making to test performing and reporting and help everyone be on the same page.

2) Communicate to clinical staff frequently about your tests.

Because of the intense interest surrounding COVID-19 laboratory testing, it’s entirely possible that more people have had to learn about previously niche laboratory concepts like “sensitivity vs. specificity” and “PCR vs. antibody vs. antigen tests” than at any previous time in human history! However, it is also likely that many clinicians or administrators in your own institution know more about a test platform they read about in the news than about the COVID-19 test platform that their own laboratory uses.

Even at this stage in the pandemic, with perhaps more exposure (pun not intended!) than the laboratory has ever had, miscommunication and unclear expectations abound surrounding test performance or turnaround times.

Whenever possible, lab leaders who interact with clinicians and administrators should look for ways to educate on test platforms, testing capacity, and expected test performance (i.e. time to result, comparative sensitivity etc.). This could include asking for time to provide formal updates during monthly meetings, monitoring test statistics (e.g. a test “dashboard”), or just informal reminders about what tests the lab performs during phone calls.

3) Keep the lab staff off the phone.

A critical part of the job of the lab is to provide information and updates on when test results are available. But when the hospital floors or clinics are busiest with patients, often the lab is busiest performing those patients’ tests. A phone call about the status of a respiratory virus test can be undeniably helpful to that patient’s clinical care team! But a dozen such phone calls over the course of a lab worker’s shift, especially under normal lab conditions (e.g. no staff shortages or instrument issues) is a failure of communication and can be detrimental to both lab performance and lab worker wellbeing.

In addition to the need for regular education about testing mentioned above, to help protect your lab staff’s bench time, here are some possible ways to keep from being overwhelmed with phone calls:

In some institutions, passive reminders (for example about hand hygiene or upcoming events) cycle through computer screen savers or on television screens in clinical areas. You could see if a message like “Reminder from the lab: COVID-19 tests are completed in [length of time].” could be put on a rotation.
If there is no client service or switchboard for your lab, but people call the lab directly for updates, you could institute a message stop. This is where callers routed to the laboratory must listen to a reminder that (for example), “If you are calling for an update of a COVID-19 test, these tests cannot be completed faster than [length of time] after arriving in the lab.” These messages can be undeniably annoying and disruptive for people calling the lab for other reasons (and become less effective over time), but if phone calls get out of hand, this option could be considered.
A lab instrument going down can result in test backlogs and numerous phone calls to the lab. Some institutions centralize their information in the form of a duty officer (for example in the emergency department). This will be a person who can be informed of actionable information, like test delays due to instrument issues, and who will post and distribute that information to those affected.

There is a lot we don’t know about what’s to come in the COVID-19 pandemic. While we can’t predict the ways the lab may be challenged with the next unforeseen disruption, or even what our flu season testing needs may look like, hopefully we can prepare now to continue to support our patients by helping and supporting our labs.

-Dr. Richard Davis, PhD, D(ABMM), MLS(ASCP)^CM is a clinical microbiologist and regional director of microbiology for Providence Health Care in Eastern Washington. A certified medical laboratory scientist, he received his PhD studying the tropical parasite Leishmania. He transitioned back to laboratory medicine (though he still loves parasites!), and completed a clinical microbiology fellowship at the University of Utah/ARUP Laboratories in Utah before accepting his current position. He is a 2020 ASCP 40 Under Forty Honoree.

Monitoring Bone Marrow Transplant Recipients

Hello everyone, it’s been quite some time since my last post. I hope everyone has remained safe and healthy during these times!

My last post dived into short tandem repeat (STR) analysis for bone marrow engraftment monitoring. Today is a presentation of a patient who was transplanted for treatment of acute myeloid leukemia (AML). With all patients (with minor exceptions), donor and pre-transplant recipient samples are taken before transplant. Their informative alleles are then identified and used to determine the percent of donor and any recipient cells in subsequent post-transplant samples.

This patient was unique in that we were not able to obtain the donor sample (they were transplanted outside of our system), and therefore we used a buccal swab for their pre-transplant recipient informatives.

Buccal swabs are chosen because they are a non-invasive way to obtain squamous epithelial cells. These cells are important because they are of the recipient origin and will not change. With this technique, it is essential that the patient has no mucosal inflammation or is not too rough when swabbing their cheek. Otherwise, the buccal sample may become contaminated with blood which would contain donor cells.

We then inferred the donor informatives from the data of a mixed sample and the buccal swab.

Calculation of recipient and donor percentage in a post-transplant sample is based on specific formulas that utilize these informative alleles. But what happens when a patient relapses and new mutations or deletions are introduced into their genome, causing a change in these informative alleles?
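As a simplified illustration of how such a formula works at a single locus, the sketch below takes the peak areas of the donor-only and recipient-only informative alleles and computes percent donor as donor signal over total signal. This is a generic textbook-style form, not necessarily the formula our software uses, and the peak areas are invented. It also shows why a lost recipient allele is a problem: if the recipient peak disappears, the locus reads as 100% donor.

```python
def percent_donor(donor_peak_areas, recipient_peak_areas):
    """Percent donor at one locus from informative (non-shared) allele peak areas."""
    donor = sum(donor_peak_areas)
    recipient = sum(recipient_peak_areas)
    if donor + recipient == 0:
        raise ValueError("no informative peaks at this locus")
    return 100.0 * donor / (donor + recipient)

# Invented peak areas: the recipient alleles dominate the sample
intact_locus = percent_donor(donor_peak_areas=[400, 400],
                             recipient_peak_areas=[9600, 9600])

# Same sample, but the recipient-specific allele has been lost at this locus,
# so no recipient peak is detected
locus_with_lost_allele = percent_donor(donor_peak_areas=[400],
                                       recipient_peak_areas=[])

print(intact_locus)            # 4.0
print(locus_with_lost_allele)  # 100.0
```

The second result is mathematically valid but biologically wrong, which is exactly the kind of value an analyst needs to recognize.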

In this case, the patient had a loss of allele at two loci (CSF1PO – allele 11 and D5S818 – allele 13) after having previously obtained full engraftment (Figure 1).

Figure 1. The pre-sample was acquired through a buccal swab. There was no donor sample that was acquirable, and therefore the donor informative alleles were inferred through available data. In September of 2019, the patient was at 100% donor. Almost a year later, the patient is now at 4% donor and missing previously identified recipient alleles, indicating a loss of allele/mutation. Brown box with “R” stands for recipient. Blue box with “D1” stands for donor. Green box with D1R stands for shared.

The key point here is that the true percent donor is 4% (Figure 2). If we take a look at the affected informative alleles, we see an erroneous result of 100% donor and NI (which means the locus is non-informative, eliminating it from the calculations). This underscores how important it is for an analyst to be attentive to the results presented. While this case was clearly evident and was caught by our error measurements, a loss of allele could theoretically go unnoticed and skew the result, especially in cases where the recipient percentages are smaller. Furthermore, this phenomenon stresses the importance of including multiple informative alleles in our analysis, which increases our confidence in the measurement.¹

Figure 2. CSF1PO and D5S818 are incorrectly representing the patient’s status. CSF1PO is representing the patient at 100% donor and D5S818 is automatically identified as a non-informative by our software. After automatic and manual loci ignores, the total percent donor was 4%
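To illustrate why multiple informative loci help, here is a minimal sketch that combines per-locus results, skips non-informative loci, and flags any locus that disagrees strongly with the others. It is not how our analysis software works; the 10-point tolerance, the function name, and every percentage below are invented for the example (only the locus names are real STR loci).

```python
from statistics import mean, median

def summarize_percent_donor(per_locus, tolerance=10.0):
    """Average percent donor across loci, flagging loci far from the median.

    per_locus maps locus name -> percent donor, or None if non-informative (NI).
    Returns (mean percent donor of the kept loci, sorted list of flagged loci).
    """
    informative = {locus: v for locus, v in per_locus.items() if v is not None}
    if not informative:
        raise ValueError("no informative loci")
    mid = median(informative.values())
    flagged = sorted(l for l, v in informative.items() if abs(v - mid) > tolerance)
    kept = [v for l, v in informative.items() if l not in flagged]
    return mean(kept), flagged

# Invented values: one locus reads 100% donor, one is non-informative
results = {
    "CSF1PO": 100.0,   # erroneous after loss of the recipient allele
    "D5S818": None,    # called NI
    "D18S51": 4.5,
    "TH01": 3.5,
    "vWA": 4.0,
    "FGA": 4.0,
}

total, ignored = summarize_percent_donor(results)
print(total, ignored)
```

With only one or two informative loci there would be nothing to compare the outlier against, which is why a larger panel gives more confidence in the final number.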

We know that a loss of allele (loss of heterozygosity) is the likely explanation because both loci are in locations specific to the disease. Looking at Figure 3 below, the two alleles were affected because they were both present on the long arm of chromosome 5. Further, this chromosome is known to be involved in AML, and is also, of course, associated with other disorders like MDS.² Additionally, the patient had cytology testing that identified this as an affected chromosome.

Figure 3. CSF1PO and D5S818 are both located on the long arm of chromosome 5. CSF1PO’s location is 5q33.1 and D5S818’s location is 5q23.2.

This is an interesting phenomenon and one that shows in measurable terms how a patient’s status can affect their molecular results. It’s further an expression of the molecular mechanisms of a disease, one of my first measurable experiences of how a disease affects the physical molecular constituents of another human.

To me, this encounter was an expression of how complicated, and yet connected, the entire genome has been designed. I am continuously amazed and look forward to expanding my understanding of molecular science.

References

Crow J, Youens K, Michalowski S, et al. Donor cell leukemia in umbilical cord blood transplant patients: a case study and literature review highlighting the importance of molecular engraftment analysis. J Mol Diagn. 2010;12(4):530-537. doi:10.2353/jmoldx.2010.090215
Crow J, Youens K, Michalowski S, et al. Donor cell leukemia in umbilical cord blood transplant patients: a case study and literature review highlighting the importance of molecular engraftment analysis. J Mol Diagn. 2010;12(4):530-537. doi:10.2353/jmoldx.2010.090215