Screening or diagnostic test. What is the difference?

Today was an interesting day at work.  A genetic counselor I work with emailed me that a pregnant patient wanted to have "every single Down syndrome screening test that was available."  While this was problematic in and of itself (more about that later), this patient also planned to have an amniocentesis regardless of the results of the screening test.

Do you see a problem with this line of thinking?  If not, read on.

Let's start with what a screening test is.  I've written about this before here, but to recap: a screening test is NOT the same as a diagnostic test.  A test that screens for Down syndrome doesn’t identify if a woman is pregnant with a baby that has Down syndrome; it identifies women who are pregnant with babies that are at increased risk of having Down syndrome.  In other words, the screening test puts tested women into one of two camps: those without increased risk and those with increased risk.  Women who screen positive and who are at increased risk are offered a diagnostic test that can confirm if their baby does or does not have Down syndrome.  A screening test cannot do that.

The diagnostic test for Down syndrome is determining the karyotype of the fetus in order to identify how many copies of chromosome 21 have been inherited (unaffected fetuses have 2 copies; affected fetuses have 3 copies).

The results of a Down syndrome screening test are used to identify women who should be offered diagnostic testing (karyotype).  Women who have a positive screening test result are offered amniocentesis in order to obtain  the amniotic fluid required for karyotyping the fetus.  However, because amniocentesis is an invasive procedure, there is a small risk of miscarriage (usually less than 0.5 percent).

The problematic request of the patient to have more than one Down syndrome screening test should now be apparent for two reasons:

  1. Her desire to have every available screening test is illogical if she has already made up her mind to have an amniocentesis and diagnostic testing.  The fetal karyotype is THE definitive (i.e. diagnostic) test and so screening for a disorder makes no medical or economic sense because, regardless of the results, the diagnostic test will still be performed.
  2. Requesting all available screening tests is a complete waste of health care resources.  Granted, the number of Down syndrome screening tests available is a source of much confusion for both physicians and patients.  That said, patients (with help from their doctors) should choose the screening test that is best for them.  Choosing multiple screening tests is not a wise idea.  Consider what might be done if the results of these tests don't agree with each other.

Consumers of health care often (mistakenly) believe that more testing is better.  Few take the time to consider that tests may have downstream consequences that they might not be prepared for.  In this case, the patient had already decided to have the "best" test.  That is her choice and one that I support.  What I don't support is the wasted time, money, and effort required to perform tests that are, in this specific situation, meaningless.

Screening for neural tube defects

A neural tube defect (NTD) is a birth defect of the spinal cord and/or brain.  The term is used to describe a group of disorders that occur very early in pregnancy and can be mild to severe or even fatal.

During the first 3 weeks of pregnancy, specific cells fuse to form a hollow tube (the neural tube) that forms the basis of what will become the spinal cord and brain.  A NTD occurs when that neural tube fails to close completely somewhere along its length.

The two most common NTDs are spina bifida and anencephaly.  Spina bifida is the most common.  There are different types of spina bifida and each has varying degrees of severity but it nearly always results in some nerve damage that can cause at least some paralysis of the legs.  Anencephaly is the most severe NTD and results in the lack of development of the brain and skull and is not compatible with life.  NTDs that are covered by skin are called “closed” defects while those that are not covered by skin are considered to be “open.”  Only open NTDs are detected by screening tests.

Alpha-fetoprotein (AFP) testing is used to screen for a NTD during the second trimester of pregnancy.  Ideally it takes place between 16 and 18 weeks of gestation but between 15 and 22 weeks is acceptable.  The concentration of AFP in fetal blood is 100,000 times greater than it is in maternal blood.  Some of the fetal AFP normally enters the maternal blood and so the AFP concentration in maternal blood will begin to increase.  A fetus with an open NTD will transfer more AFP into maternal blood than an unaffected fetus and so an unusually high AFP concentration in maternal blood can indicate that the fetus has an open NTD.

Because AFP concentrations normally increase during pregnancy (by about 15 percent each week), a statistic called the “multiple of the median” (MoM) is used to normalize the test result.  The MoM is a measure of how far an individual test result deviates from the median (middle) value of a large set of AFP results obtained from unaffected pregnancies.  For example, if the median AFP result at 16 weeks of gestation is 30 ng/mL and a pregnant woman’s AFP result at that same gestational age is 60 ng/mL, then her AFP MoM is equal to 60 divided by 30 (60/30) or 2.0.  In other words, her AFP result is 2 times higher than “normal.”
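For readers who like to see the arithmetic, here is a minimal Python sketch of the MoM calculation using the numbers from the example above. The function name `multiple_of_median` is my own, chosen for illustration; in practice the median comes from a lab's own data for unaffected pregnancies at the same gestational age.

```python
def multiple_of_median(result_ng_ml: float, median_ng_ml: float) -> float:
    """Return the MoM: the patient's AFP result divided by the median AFP
    of unaffected pregnancies at the same gestational age."""
    if median_ng_ml <= 0:
        raise ValueError("median must be a positive concentration")
    return result_ng_ml / median_ng_ml


# Example from the text: median of 30 ng/mL at 16 weeks, patient result of 60 ng/mL.
afp_mom = multiple_of_median(60.0, 30.0)
print(afp_mom)  # 2.0
```

Because the result is divided by a gestational-age-specific median, the same MoM means the same relative elevation whether the sample was drawn at 16 weeks or at 20 weeks.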

So how is the AFP MoM interpreted?  What is considered an abnormal result?  Although the AFP MoM cutoff varies by lab, the two most commonly used are 2.0 and 2.5.  Results above the cutoff are considered to be abnormal.  A cutoff of 2.0 will detect about 85 percent of open NTDs and a cutoff of 2.5 will detect about 75 percent.  Most cases of anencephaly are detected with maternal serum AFP screening.  The figure below illustrates the distribution of AFP MoM results in women with unaffected fetuses, those with spina bifida, and fetuses with anencephaly.

Results to the right of the blue line (a cutoff of 2.5 MoM) would be interpreted as "abnormal" while an AFP MoM to the left of the line would be considered "normal."  Note that there is no single MoM cutoff that can completely separate unaffected from affected fetuses.  There will always be affected fetuses that screen normal and unaffected fetuses that screen abnormal.
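The effect of the cutoff choice can be sketched in a few lines of Python. This is only an illustration: the function name `interpret_afp_mom` and the example MoM values are hypothetical, and I have assumed that a result exactly equal to the cutoff counts as normal, since the rule is stated as "above the cutoff."

```python
def interpret_afp_mom(mom: float, cutoff: float = 2.5) -> str:
    """Label an AFP MoM as 'abnormal' if it is above the lab's cutoff.

    Assumption: a result exactly at the cutoff is treated as normal.
    """
    if cutoff <= 0:
        raise ValueError("cutoff must be positive")
    return "abnormal" if mom > cutoff else "normal"


# Hypothetical MoM results interpreted at the two commonly used cutoffs.
for mom in (1.0, 2.2, 3.1):
    print(mom, interpret_afp_mom(mom, cutoff=2.0), interpret_afp_mom(mom, cutoff=2.5))
```

A result of 2.2 MoM screens abnormal at a lab using 2.0 but normal at a lab using 2.5, which is the trade-off between detection rate and false positives that the cutoff controls.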

Because this is a screening test, women with an abnormal result require additional testing to confirm if the fetus has a NTD.  More about these tests in a future post.

Lastly, it’s important to keep in mind that most abnormal NTD screening tests are false-positives.  There are several reasons why AFP might be elevated in the absence of an open NTD such as: an abnormality in the fetal kidneys, a ventral wall defect (opening in the abdomen), the death of the fetus, a twin gestation, or, most commonly, underestimated gestational age.

Trimester-Specific Reference Intervals for TSH

In September 2011, The American Thyroid Association (ATA) published new guidelines on the diagnosis and management of thyroid disease during pregnancy and postpartum.  There are many recommendations in the guidelines, but I wanted to highlight one in particular.

Recommendation 2

"If trimester-specific reference ranges for TSH are not available in the laboratory, the following reference ranges are recommended: First trimester, 0.1-2.5 mIU/L; second trimester, 0.2-3.0 mIU/L; third trimester, 0.3-3.0 mIU/L."

These reference intervals are lower than the non-pregnant reference intervals. This is because hCG has mild thyroid-stimulating ability.  Therefore, hCG stimulates the thyroid and suppresses TSH. This is most apparent during the first trimester (7-11 weeks) when hCG is at its highest concentration. TSH concentrations actually decrease, although usually not below the normal, non-pregnant, reference interval.

This means that hypothyroidism during pregnancy needs to be defined using these pregnancy-specific reference intervals. Overt hypothyroidism is defined as decreased fT4 with TSH > 2.5 mIU/L. Subclinical hypothyroidism is defined as serum TSH 2.5-10 mIU/L with normal fT4. The ATA recommends treating overt hypothyroidism, but not subclinical hypothyroidism, unless women are also positive for anti-TPO antibodies. When patients are treated, the goal is to achieve the trimester-specific reference intervals listed above.
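The reference ranges and definitions above can be written down as a small Python sketch. It is illustrative only and not a clinical tool: the names `ATA_TSH_RANGES`, `tsh_in_range`, and `classify_hypothyroidism` are my own, and treating exactly 2.5 mIU/L as not elevated is an assumption, because the text does not say how the boundary is handled.

```python
# Reference ranges quoted in Recommendation 2 (mIU/L), keyed by trimester.
ATA_TSH_RANGES = {1: (0.1, 2.5), 2: (0.2, 3.0), 3: (0.3, 3.0)}


def tsh_in_range(tsh_miu_l: float, trimester: int) -> bool:
    """True if TSH falls inside the quoted range for the given trimester."""
    low, high = ATA_TSH_RANGES[trimester]
    return low <= tsh_miu_l <= high


def classify_hypothyroidism(tsh_miu_l: float, ft4_decreased: bool) -> str:
    """Apply the two definitions given in the text.

    Assumption: a TSH of exactly 2.5 mIU/L is not counted as elevated.
    """
    if tsh_miu_l > 2.5 and ft4_decreased:
        return "overt hypothyroidism"
    if 2.5 < tsh_miu_l <= 10.0 and not ft4_decreased:
        return "subclinical hypothyroidism"
    return "not classified by the definitions in the text"


print(tsh_in_range(2.8, trimester=1))   # outside the first-trimester range
print(tsh_in_range(2.8, trimester=2))   # inside the second-trimester range
print(classify_hypothyroidism(4.0, ft4_decreased=False))
```

The same TSH value can be in range in one trimester and out of range in another, which is the practical reason for using trimester-specific intervals.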

Interestingly, the ATA recommends that women who are taking T4 therapy have their dose adjusted to achieve a TSH concentration of 2.5 mIU/L before pregnancy. This reduces the risk of hypothyroidism during the first trimester. Likewise once women who are on T4 therapy get pregnant, their T4 therapy should be adjusted to keep them within the pregnancy-specific TSH reference intervals and serum TSH should be monitored approximately every 4 weeks during the first half of pregnancy. Serum TSH should also be checked again between weeks 26 and 32.

Similarly, according to the ATA, euthyroid (normal functioning thyroid gland) women who are not on T4 replacement therapy but are TPO antibody positive, should also have serum TSH monitored every 4 weeks during the first half of pregnancy and again between weeks 26 and 32.

The gestational diabetes mellitus debate continues

I have just returned from the annual meeting of the AACC where I attended a very interesting debate on the diagnosis of gestational diabetes mellitus (GDM). I've written about the current controversy in diagnosing GDM before and you can read about those here and here. Basically, the controversy boils down to one issue: should recently recommended criteria for identifying pregnant women with GDM be globally implemented or not?

Arguing for that position was Dr. Donald Coustan from Brown University and regional principal investigator for North America of the Hyperglycemia and Adverse Pregnancy Outcomes (HAPO) study. He correctly pointed out that lack of a universal testing strategy when screening for GDM makes it impossible to compare clinical studies on GDM. He reviewed how the new IADPSG glucose cutoffs came into being (they were based on risk of adverse infant outcomes). He advocates referring to them as the ADA criteria because the ADA is recommending the use of the new testing method.

Arguing against the use of the ADA criteria was Dr. Sean Blackwell from the University of Texas Health Science Center at Houston, TX. He agreed with several of Dr. Coustan's points. Among them:

  1. The HAPO study was well conducted.
  2. There was a positive association between glucose concentration and adverse infant and maternal outcomes at lower glucose cutoffs than are currently used to diagnose GDM.
  3. There is benefit in having a single, universal screening test for GDM.
  4. There is evidence that, as currently defined, treatment of GDM improves outcomes.

He had two major problems with use of the new ADA criteria. The first was that its use would double the number of women diagnosed with GDM (from about 7% to 16%). The second was that the HAPO study was an observational study, not a treatment trial and, as such, there is no evidence that treating these additional women for GDM is effective or safe.

Dr. Coustan argued that the increase in the number of GDM diagnoses is not surprising given that 31% of adult US women have either diabetes or pre-diabetes. He also argued that the Australian Carbohydrate Intolerance Study of Pregnant Women (ACHOIS) demonstrated that treatment of women with mild GDM reduced adverse outcomes such as large for gestational age newborns, macrosomia, and preeclampsia.

Dr. Blackwell pointed out that most of the additional 10% of women that would be diagnosed with GDM under the ADA criteria would, by definition, have "milder" GDM and would only require nutritional modification and glucose monitoring rather than drugs to control their GDM. These women would have glucose control similar to those of obese women without diabetes. Further, he added that several studies in obese women without diabetes have failed to demonstrate that nutritional interventions have any impact on any infant health outcome.

The moderator of this debate was my co-blogger, Ann Gronowski. Prior to its start, she polled the audience of (mostly) laboratorians to see which testing strategy they currently offered at their institutions. Most indicated they offered the current ACOG criteria (advocated by Dr. Blackwell). At the end of the debate, the audience was asked if they would support switching to the new, ADA criteria. The majority response was "yes." Dr. Coustan argued his points effectively.

It's my belief that the evidence, while not complete, is strong enough to support widespread adoption of the ADA criteria when screening for and diagnosing GDM.

Should I get my iodine measured during pregnancy?


The short answer is no, but let me explain why.

Iodine is necessary for the production of the thyroid hormones T3 and T4. A deficiency of iodine leads to decreased production of these hormones and can cause goiter (enlargement of the thyroid) and hypothyroidism.

During pregnancy, a number of normal changes occur that involve the thyroid gland and the need for iodine including:

  1. hCG is similar in structure to TSH, the hormone that stimulates the thyroid gland, and so hCG can also stimulate the thyroid gland;
  2. There is an increased demand for T3 and T4; and,
  3. Clearance of iodine through the kidneys is increased.

In areas where there is iodine deficiency, pregnancy is associated with a 20-40% increase in the size of the thyroid gland. In areas where iodine is replete, like the United States, the thyroid increases in size by only around 10% during pregnancy.

Because of these changes, dietary iodine requirements for pregnant women are higher than they are for non-pregnant women. If iodine intake was adequate before pregnancy, women should have sufficient iodine stores and therefore have no difficulty meeting the needs for iodine during pregnancy and lactation. If their iodine intake was not sufficient, it can result in overt hypothyroidism which is associated with miscarriage, stillbirth, and, in very severe cases, cretinism (characterized by severe mental retardation and deafness). Iodine deficiency is the leading cause of preventable mental retardation worldwide. According to public health experts, iodization of salt may be the world's simplest and most cost-effective measure available to improve health.

While the U.S. is an iodine replete country, some studies have suggested that women of reproductive age may be at risk of iodine deficiency.  This might make one think that iodine status should be determined in these women. Iodine status is usually assessed by measuring urine iodine concentrations. However, there is significant day-to-day variation in urine iodine excretion, such that measurement in a single individual is not useful. Urine concentrations are most useful to assess the iodine status of a whole population.

In 2011, the American Thyroid Association (ATA) published guidelines for the diagnosis and management of thyroid disease during pregnancy and postpartum. In these guidelines, the ATA recommends that all pregnant and lactating women ingest a minimum of 250 ug of iodine daily. For U.S. women that means supplementing their diet with a daily oral supplement that contains 150 ug of iodine (optimally potassium iodide).

In 1924, the Morton Salt Company began distributing iodized salt nationally, which is a good source of iodine. While iodized salt is the main source of iodine in the American diet, only ~20% of the salt Americans eat contains iodine!  Reasons for this include:

  1. Increase in popular designer salts like sea salt and Kosher salts (see photo below);
  2. Iodized salt is not used in most fast and processed foods or in the production of commercial breads; and,
  3. Patient concerns about salt intake and hypertension.

Good dietary sources of iodine include kelp seaweed, seafood (cod, sea bass, haddock, and perch are good sources) and dairy products.

In summary, if you are pregnant make sure you are taking a supplement that contains iodine, but do not worry about having your iodine concentration measured.


Diligence recommended during lamellar body count validation

Today's post is by a guest author, Patrick Kyle, Ph.D.  Dr. Kyle is the Director of Analytical Toxicology and an Associate Director of Clinical Chemistry at the University of Mississippi Medical Center in Jackson, MS.  He discovered a curious phenomenon when the lamellar body count is performed on a specific cell counter and he shares his observations here.  A report of this phenomenon has been published in Clinical Chemistry and Laboratory Medicine.

The lamellar body count (LBC) is a relatively recent assay used for determination of fetal lung maturity (FLM) that has been discussed in previous posts (here and here) on this blog.

As with any laboratory-developed test, laboratorians should use the utmost care during LBC validation.  As reported here, during recent validation protocols in my laboratory, striking imprecision was noted in LBC values acquired from human amniotic fluid using a Beckman Coulter UniCel DxH 800.  This was unexpected because the manufacturer’s previous model (LH 750) had always yielded good precision (<10.0 CV%) with LBC.  In hopes that the problem was confined to a single instrument, the LBC results of amniotic fluid acquired from three DxH 800 platforms, a Beckman Coulter LH 750, and a Sysmex XE-5000 were compared.  Each of the five instruments was used to analyze two pools (low and high concentrations) of human amniotic fluid twice per day for ten non-consecutive days.  During the course of the experiment, samples were stored at 4 °C and were never centrifuged or frozen.  Each day of analysis, samples were allowed to come to room temperature then inverted 5-10x immediately prior to analysis.  Each instrument aspirated sample from the same tubes during the same days.

Aberrantly low counts were randomly produced with each DxH 800 instrument, whereas the XE-5000 and LH 750 produced consistent counts.  The aberrant counts were consistently 25-50% lower than target values obtained on the Coulter LH750 and Sysmex XE-5000.  The coefficients of variability (CV%) ranged from 28.1-45.3% for the three DxH 800 instruments and were considerably higher than those of the Beckman LH750 (6.1-7.0%) and Sysmex XE-5000 (4.4-8.0%).
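As a reminder of how the CV% figures above are derived, here is a minimal Python sketch. The replicate counts are hypothetical numbers invented for illustration, not data from the study, and the use of the sample standard deviation (n - 1 denominator) is an assumption about the calculation.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent: 100 * SD / mean.

    Uses the sample standard deviation (assumption; n - 1 denominator).
    """
    if len(values) < 2:
        raise ValueError("need at least two replicate results")
    average = mean(values)
    if average == 0:
        raise ValueError("mean of zero; CV is undefined")
    return 100.0 * stdev(values) / average


# Hypothetical replicate counts of one pool (arbitrary units).
consistent_counts = [50, 52, 48, 51, 49, 50]
counts_with_dropouts = [50, 30, 48, 51, 27, 50]  # two aberrantly low results

print(round(cv_percent(consistent_counts), 1))
print(round(cv_percent(counts_with_dropouts), 1))
```

Even a couple of randomly low results among otherwise tight replicates are enough to push the CV% well past the 10% mark.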

Interestingly, a review of the daily quality control values obtained with each instrument using three concentrations of manufacturer-specific controls revealed less than 10 CV% with each instrument.  This seemed to indicate that the DxH 800 instruments were performing as designed.  Therefore, LBC proficiency test (PT) data was examined in order to compare the results of these DxH 800 platforms to those of other laboratories.  My laboratory’s College of American Pathologists 2011 LBC-B survey results were all acceptable.  As a whole, the results of the DxH 800 group were comparable to those of other Coulter instrument groups.  Most importantly, the standard deviations of the DxH 800 group results were comparable to that of other instruments and exhibited <2.0 CV%.  Because this was inconsistent with the imprecision described above, the matrix of the PT samples was questioned.  When asked, the College of American Pathologists indicated that the PT samples were composed of synthetic amniotic fluid to which porcine platelets had been added.  In other words, the PT samples had tested the instruments’ ability to count porcine platelets, not their ability to count lamellar bodies.   

The manufacturer’s instrument literature was reviewed in order to investigate the source of the issue.  The DxH 800 incorporates Beckman Coulter’s new “Data Fusion” software, which allows intercommunication between flow cells to automatically correct values when specific morphologies are detected.  For example, lymphocyte counts are automatically corrected upon detection of giant platelets.  The DxH 800 includes particles from 2-25 fL in the platelet counts.  Particles less than 2 fL are categorized as debris, whereas particles >20 fL are categorized as giant platelets.  The DxH 800 histograms of EDTA blood (A) and amniotic fluid (B) below, reveal that the volumes of many lamellar bodies are smaller than those of platelets with many less than 2 fL.  Therefore, the aberrant values may have been caused by the limitations in platelet inclusion criteria (2 fL cutoff) and/or the algorithms applied by the Data Fusion technology.

Figure: DxH 800 histograms of EDTA blood (A) and amniotic fluid (B).
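To make the size-cutoff hypothesis concrete, here is a toy Python sketch of what a fixed lower volume limit does to a particle count. It is not the instrument's algorithm, which is proprietary and also involves the Data Fusion corrections; the 2 fL limit is the only value taken from the text, and the particle volumes are hypothetical.

```python
DEBRIS_CUTOFF_FL = 2.0  # lower volume limit quoted in the text


def count_at_or_above_cutoff(volumes_fl, cutoff_fl=DEBRIS_CUTOFF_FL):
    """Count particles whose volume is at or above the cutoff.

    Anything smaller is ignored, mimicking a 'debris' classification.
    """
    return sum(1 for volume in volumes_fl if volume >= cutoff_fl)


# Hypothetical particle volumes in fL, invented for illustration.
platelet_like = [4.0, 6.5, 8.0, 9.5, 12.0, 15.0]
lamellar_body_like = [0.8, 1.2, 1.6, 1.9, 2.4, 3.1]

print(count_at_or_above_cutoff(platelet_like), "of", len(platelet_like))
print(count_at_or_above_cutoff(lamellar_body_like), "of", len(lamellar_body_like))
```

In this toy example all six platelet-sized particles are counted but only two of the six smaller particles are, which is the kind of undercount that a population straddling the cutoff would be expected to show.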

This issue emphasizes a very important fact: the LBC test should be validated using actual amniotic fluid samples.  In recent communications with two laboratories in separate states that are using the DxH 800 for LBC testing, I learned that they had not used human amniotic fluid for validation nor were they using it for quality control purposes.  One lab chose to use commercial hematology controls for validation due to the lack of commercial controls, and to avoid issues with sample stability.  Given this information, many laboratories may not be aware of this issue and its potential problems.

In summary, all laboratory-developed assays, such as the LBC test, should be rigorously validated.  Matrix appropriate materials should be used whenever possible.  Beckman Coulter representatives are aware of the LBC issue on the new DxH 800.  However, imprecise LBC test results may preclude use of the DxH 800 for this assay.

Assessing Ovarian Reserve

Women in their mid to late 30s and early 40s with infertility constitute the largest portion of the total infertility population. These women are also at an increased risk for pregnancy loss. This reflects a decline in oocyte quality and a diminished ovarian reserve as a result of follicular depletion. Ovarian reserve is a term that is used to describe the capacity of the ovary to provide eggs that are capable of fertilization resulting in a healthy and successful pregnancy.

While there is no gold standard for assessing the ovarian reserve of individual women, its indirect determination has been used to help direct infertility treatment.

Serum concentrations of follicle-stimulating hormone (FSH) and estradiol on day 3 of the menstrual cycle have been the tests of choice for assessing ovarian reserve. Cycle day 3 is chosen because at this time the estrogen concentration is expected to be low, a critical feature, as FSH concentrations are subject to negative feedback from estradiol. In general, day 3 FSH concentrations >20 to 25 IU/L are considered to be elevated and associated with poor reproductive outcome.   FSH concentrations are expected to be below 10 IU/L in women with reproductive potential.  Concomitant measurement of serum estradiol adds to the predictive power of an isolated FSH determination. Basal estradiol concentrations >75-80 pg/mL are associated with poor outcome. 
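The day 3 cutoffs described above can be summarized in a short Python sketch. It is illustrative only: the function name `assess_day3` is hypothetical, and because the text gives ranges (>20 to 25 IU/L for FSH, >75-80 pg/mL for estradiol), the default cutoffs of 20 and 75 are my assumption and are left as parameters that a lab would set for its own assays.

```python
def assess_day3(fsh_iu_l, estradiol_pg_ml, fsh_cutoff=20.0, estradiol_cutoff=75.0):
    """Return a list of findings for cycle day 3 FSH and estradiol results.

    Default cutoffs are assumptions taken from the low end of the quoted ranges.
    """
    findings = []
    if fsh_iu_l > fsh_cutoff:
        findings.append("FSH elevated")
    elif fsh_iu_l < 10.0:
        findings.append("FSH below 10 IU/L")
    else:
        findings.append("FSH between 10 IU/L and the cutoff")
    if estradiol_pg_ml > estradiol_cutoff:
        findings.append("estradiol elevated")
    return findings


print(assess_day3(8.0, 40.0))
print(assess_day3(8.0, 95.0))   # low FSH, but estradiol is elevated
print(assess_day3(26.0, 40.0))
```

The second example shows why the two tests are measured together: an FSH below 10 IU/L alongside an elevated estradiol is flagged, whereas the FSH result alone would not be.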

Inhibin B is produced by the developing follicles and concentrations peak during the follicular phase. Concentrations of inhibin B can be used in conjunction with serum FSH and estradiol to assess ovarian function. As women age, serum FSH concentrations in the early follicular phase begin to increase. It has been suggested that this is due to a decline in the number of small follicles secreting inhibin B.  Because inhibin is produced by the ovaries, it is thought to be a more direct marker of ovarian activity and ovarian reserve than FSH. In addition, cycle day 3 inhibin B concentrations may demonstrate a decrease before day 3 FSH concentrations. 

Seifer et al reported that women undergoing in vitro fertilization (IVF) with day 3 inhibin B concentration <45 pg/mL had a pregnancy rate of 7% and a spontaneous abortion rate of 33% as compared to pregnancy rate of 26% and abortion rate of 3% in women with day 3 inhibin B concentrations of > 45 pg/mL. 

In recent years, anti-Müllerian hormone (AMH) has been suggested to be a more useful predictor of ovarian reserve. AMH is expressed by the granulosa cells of the ovary during the reproductive years, and controls the formation of primary follicles by inhibiting excessive follicular recruitment by FSH. In 2005 Tremellen reported that plasma AMH concentrations start to drop rapidly by age 30, and are ~10 pmol/L by the age of 37. David has blogged previously about the use of AMH as a predictor of IVF outcome.

Using a cutoff value of 8.1 pmol/L, plasma AMH could predict poor ovarian reserve on a subsequent IVF cycle with a sensitivity of 80% and a specificity of 85%.  In 2008, Riggs and colleagues confirmed that AMH concentrations correlated the best with the number of retrieved oocytes relative to age, FSH, inhibin B, LH, and estradiol.
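Sensitivity and specificity alone do not say how likely it is that a woman with a low AMH truly has poor ovarian reserve; that also depends on how common poor reserve is in the group being tested. The Python sketch below works this through using the 80% and 85% figures from the text. The 20% prevalence is purely an assumed value for illustration and does not come from the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) computed from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity from the text; prevalence of 0.20 is an assumption.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.85, prevalence=0.20)
print(round(ppv, 2), round(npv, 2))  # 0.57 0.94
```

Under that assumed prevalence, a little over half of the women flagged by the cutoff would actually have poor reserve, while a result above the cutoff would be reassuring most of the time.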

High concentrations of AMH can also be present in women with polycystic ovarian syndrome (PCOS), a cause of female infertility.  Therefore, in PCOS patients AMH should not be used alone, but should be combined with transvaginal ultrasound to count the number of follicles.

Women who are diagnosed with diminished ovarian reserve should be counseled regarding options such as oocyte donation or adoption.

Should there be a critical value for hCG test results?

A short while back, a colleague asked Ann and me if we were aware of any need to have a critical value for hCG tests.  Our colleague had been asked by a physician to implement one in his laboratory because the physician had “missed” a molar pregnancy diagnosis due to her being unaware of the hCG test result that had been performed by the lab.  The physician argued that if the lab had notified her that the hCG result was very elevated, it would have alerted her to the fact that this patient may have had an abnormal pregnancy.

Seems reasonable enough, right?  Maybe not.

Let’s first consider the definition of a “critical result.”  Strictly speaking, a critical result is a test result above (or below) a pre-determined cutoff that, if observed, would require immediate medical attention due to the threat of an adverse event (e.g. death) to a patient.

The selection and use of critical results is a giant issue in healthcare laboratory medicine.  Doctors order all kinds of tests on many different patients every day.  The results of those tests are most often delivered electronically through the use of integrated data networks.  The lab performs a test, reports the result via a computer network where it is delivered to the physician.  While the test result is likely to be informative to the care and treatment of a patient, the doctor doesn’t necessarily have to immediately know about the result.  A cholesterol test is a good example.  The result is informative and may guide treatment decisions but the decision to initiate or modify treatment isn’t urgent so the doctor doesn’t need any special notifications.  As such, it would be highly unusual for a lab to have a critical result notification in place for a cholesterol test.

Now consider a result of a blood glucose test.  There can be very serious medical problems if the concentration of glucose is too high (e.g. greater than 450 mg/dL) or too low (e.g. less than 50 mg/dL).  As such, labs have critical results for glucose.  If the concentration exceeds the critical cutoff then the healthcare provider is immediately notified so that appropriate interventions can take place to prevent medical harm to the patient.
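In software terms, a critical result rule for glucose is just a pair of cutoffs checked against each result. Here is a minimal Python sketch using the example cutoffs mentioned above; the function name `is_critical_glucose` is hypothetical, and real laboratories set their own limits.

```python
def is_critical_glucose(glucose_mg_dl, low_cutoff=50.0, high_cutoff=450.0):
    """True if a glucose result falls outside the critical limits.

    Default limits are the example values from the text, not a standard.
    """
    return glucose_mg_dl < low_cutoff or glucose_mg_dl > high_cutoff


# Hypothetical results: only the first and last would trigger a phone call.
for result in (38.0, 95.0, 210.0, 520.0):
    print(result, "CRITICAL" if is_critical_glucose(result) else "report routinely")
```

Note that a result of 210 mg/dL is abnormal but not critical under these limits, so it would be reported through the usual electronic route without a call.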

While these two examples are clear-cut, many others are not.  The result is that the lab struggles to find answers to several questions such as 1) What tests are considered “critical?” and 2) What is (are) the cutoff(s) to be used for tests that make the list?  The answers to these questions aren’t trivial.  If the list of tests includes those that aren’t truly critical to patient care then lab personnel spend a lot of time calling results of tests to doctors.  The same thing happens if the cutoffs aren’t selected appropriately.  Furthermore, not only are lab personnel affected, so too are the doctors whose time from caring for their patients is consumed answering those phoned results.

It’s not unusual for a physician to request that the lab add a test to the critical result list.  Frequently this occurs when the physician fails to notice a test result and makes an inappropriate decision as a result.  That was the case with the molar pregnancy patient.  A molar pregnancy is an abnormal pregnancy without a viable fetus that usually results in a very elevated hCG concentration, often much higher than that seen in a normal pregnancy.  Women with molar pregnancies have to be treated to prevent potential malignant disease from developing.

While that may seem like a good reason for having a critical hCG test result it’s actually not.  That’s because there isn’t a single, useful hCG concentration above which a molar pregnancy can be easily differentiated from a normal pregnancy.  Some molar pregnancies have hCG concentrations that are more typical of normal pregnancies and some normal pregnancies can have hCG concentrations that are as elevated as they are with molar pregnancies.  There is too much overlap between the two types of pregnancies for an effective hCG cutoff to be established for use as a critical value.  The use of one would be chaotic and would likely result in extreme confusion between doctors, laboratories, and patients.  And in the delivery of any type of healthcare, confusion is definitely not a good thing.

Syphilis Testing

Recently, I blogged about TORCH testing (sometimes called TORCHES).  The "S" in TORCHES stands for Syphilis. Syphilis is a sexually transmitted infection caused by the spirochete bacterium Treponema pallidum. Syphilis can present in one of four different stages: primary, secondary, latent, and tertiary.  Many people infected with syphilis are asymptomatic, but can be at risk for complications if the infection is not treated. Syphilis is transmitted from person to person through direct contact with a syphilis lesion or sore.  These lesions occur most frequently on the vagina, anus, rectum or external genitals. The bacteria may also be transmitted from mother to fetus during pregnancy or at birth, resulting in congenital syphilis which can cause stillbirth or infant death shortly after birth. Untreated babies may become developmentally delayed, have malformations, seizures, or die. Here I'll discuss syphilis testing further.

The CDC 2010 "Sexually Transmitted Diseases Treatment Guidelines"  recommend that pregnant women be screened on their first prenatal visit for sexually transmitted diseases such as Syphilis.

Blood tests for syphilis are divided into nontreponemal and treponemal tests.

Nontreponemal tests are used for the screening of pregnant and non-pregnant people, and include venereal disease research laboratory (VDRL) and rapid plasma reagin (RPR) tests. These tests do not detect antibodies against the actual bacterium, but rather antibodies against substances released by infected cells when they are damaged by T. pallidum (e.g. cardiolipin). Nontreponemal tests are inexpensive, relatively rapid to perform, and widely available. They can detect an active infection and will become negative over time with proper treatment; therefore, they can be used to follow therapy.  The diagnostic sensitivity of these tests varies between 80-90% for detecting primary infection, and their specificity is approximately 98%. However, because there are occasionally false positive results, confirmation of all positive nontreponemal test results is required. This is accomplished by using a more specific treponemal test.

Treponemal tests detect antibodies that are specific to T. pallidum. These tests include the Treponema pallidum particle agglutination (TP-PA) test and the fluorescent treponemal antibody absorption (FTA-Abs) test. These tests cannot distinguish between past and current infection because the antibodies are generally present for life. Treponemal tests have diagnostic sensitivities of about 84-88% for detecting primary infection and close to 100% for detecting later infections. They also have specificities >95%. However, testing is time and labor intensive.

Syphilis is often diagnosed by using a nontreponemal screening test first and then performing treponemal tests on any positive samples. However, the interpretation of a nontreponemal test is subjective, and so this approach does not allow for high test throughput. In the last few years, a number of automated treponemal enzyme immunoassays (EIA) have been developed for the diagnosis of syphilis. A recent paper compares the clinical utility of seven of these assays. The authors of this study concluded that the automated treponemal assays have sensitivities and specificities comparable to the FTA-Abs test (considered to be the gold standard) and that comparison against a consensus panel of treponemal tests showed an overall concordance of 95-99%. They also concluded that the cost and turnaround time of the automated assays are comparable to the FTA-Abs test, with the benefit of less hands-on time and more objective results. This would be especially true for large laboratories that process a large number of samples.

Recently, the Centers for Disease Control and Prevention (CDC) and the Association of Public Health Laboratories (APHL) convened an expert panel to evaluate available information and produce recommendations that will ultimately result in the development of guidelines for the laboratory diagnosis of syphilis in the U.S. The Meeting Summary Report is currently available and provides a summary of what was discussed. It is anticipated that the formal Testing Guidelines will be released in 2011. The summary states that in areas with a low prevalence of syphilis, samples may be screened using a treponemal-specific EIA, and those with positive results analyzed with a nontreponemal test to assess for an active disease state and treatment status. This is the reverse of what is currently practiced.

However, there is a potential problem with this new algorithm. The use of a treponemal test (FTA-Abs or treponemal EIA) for screening purposes can result in patients who are positive by a treponemal-specific screening test yet are negative by nontreponemal tests. This can occur in patients with past or recently treated syphilis and in patients with very early or late/latent disease. Therefore, physicians must review patient disease and treatment histories in order to properly interpret syphilis testing results. In addition, follow-up FTA-Abs testing may be required. Several publications have examined the "reverse syphilis algorithm" in practice.
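The decision logic of the reverse algorithm, including the discordant case just described, can be sketched as a small function. This is only an illustration of the reasoning in this post, not a clinical rule: the function name, its parameters, and the wording of each interpretation are my own, and the final branch (a negative second treponemal test suggesting a false-positive screen) is an assumption that goes beyond what the summary report states here.

```python
from typing import Optional


def interpret_reverse_algorithm(
    eia_positive: bool,
    nontreponemal_positive: Optional[bool] = None,
    second_treponemal_positive: Optional[bool] = None,
) -> str:
    """Hypothetical sketch: treponemal EIA screen first, nontreponemal test second."""
    if not eia_positive:
        return "screen negative: no treponemal antibodies detected"
    if nontreponemal_positive is None:
        return "screen positive: reflex to a nontreponemal test (RPR or VDRL)"
    if nontreponemal_positive:
        return "both positive: consistent with syphilis; assess disease and treatment status"
    # Discordant: treponemal screen positive, nontreponemal test negative.
    if second_treponemal_positive is None:
        return "discordant: review history; follow-up treponemal test may be required"
    if second_treponemal_positive:
        return ("treponemal antibodies confirmed: past or recently treated, "
                "very early, or late/latent disease; review history")
    # Assumption, not stated in the post: an unconfirmed screen may be a false positive.
    return "screen not confirmed: possible false-positive EIA result"


print(interpret_reverse_algorithm(True, False))
```

The point of the sketch is that the discordant branch cannot be resolved by the laboratory results alone; it always leads back to the patient's history.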

So what is the current thought? Currently, the CDC still recommends using a nontreponemal screening test followed by a treponemal test as confirmation of syphilis. The traditional algorithm is designed to identify patients with active syphilis while minimizing false-positive results in populations with a low prevalence of syphilis. Although the reverse syphilis algorithm has some attractive features, such as automation of testing, objective results, and detection of latent syphilis, it may not be suitable in all clinical situations.

Is the qualitative serum pregnancy test obsolete?

I’ve written several times about qualitative hCG tests in this blog. As a reminder, qualitative tests can be performed using urine or serum samples. Urine tests can be performed close to the patient or even at home because the urine sample requires no special processing. However, when serum is used, the test really can’t be performed at home or at the point-of-care because the blood sample first has to be centrifuged to obtain the serum, and centrifugation is usually only performed in the clinical laboratory. Notably, clinical labs are often able to do quantitative hCG testing on serum, too.

So, if a lab can do qualitative and quantitative hCG testing on serum, why not just offer one test instead of two?  In other words, might the qualitative test be considered obsolete?  My lab recently published a study that addressed that question.

We surveyed several hundred doctors, and the results revealed the following:

  1. When requesting serum hCG tests, 49% of physicians preferred to order a qualitative rather than a quantitative test even though they believed quantitative tests were more accurate.
  2. Physicians preferred qualitative tests because they believed that they received the test results faster.

However, when we examined the turnaround time data, that last point was not supported.  There are a few definitions of turnaround time to consider.  Doctors consider it to be the time it takes to get a result after the sample is collected while laboratorians consider it to be the time it takes to produce the result after they receive the sample.

By the lab’s definition, qualitative tests were performed more rapidly than quantitative tests, but there were no differences using the doctors’ definition of turnaround time. That’s because the time it takes to transport the sample to the laboratory is known to contribute the most to delays in the total testing process. So, although physicians believed they got results from qualitative tests more quickly, that doesn’t seem to be the case.
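The two definitions of turnaround time are easy to compare with a small calculation. The timestamps below are entirely hypothetical (they are not data from our study) and are chosen only to show how transport time can dominate the interval that the doctor experiences.

```python
from datetime import datetime


def minutes_between(start, end):
    """Elapsed time between two timestamps, in minutes."""
    return (end - start).total_seconds() / 60


# Hypothetical timestamps for a single sample.
collected = datetime(2011, 1, 1, 8, 0)   # sample drawn from the patient
received = datetime(2011, 1, 1, 9, 30)   # sample arrives in the laboratory
resulted = datetime(2011, 1, 1, 9, 50)   # result reported

lab_tat = minutes_between(received, resulted)        # laboratorian's definition
total_tat = minutes_between(collected, resulted)     # doctor's definition
transport_share = minutes_between(collected, received) / total_tat

print(f"lab turnaround time: {lab_tat:.0f} min")
print(f"collection-to-result time: {total_tat:.0f} min")
print(f"share spent in transport: {transport_share:.0%}")
```

In this made-up example the laboratory needs 20 minutes, but the doctor waits 110 minutes, so shaving a few minutes off the analysis itself would barely change what the doctor sees.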

We also compared the analytical sensitivities of the two types of tests. The qualitative test that we used had a claimed detection limit of 25 IU/L. That is, a sample with an hCG concentration above 25 IU/L should produce a positive result. Of the samples that gave a positive result, about 20% had an hCG concentration that was <25 IU/L, which indicated that the qualitative test was more analytically sensitive than we expected it to be. In my opinion, that’s a good thing.

Because we determined the actual pregnancy status of all the patients with a positive result, we were also able to determine how well the qualitative and quantitative tests performed at determining pregnancy status. Both tests did quite well and showed high sensitivity and specificity. That is, there were very few false-negative or false-positive results. From a clinical perspective, a false-negative result is more concerning than a false-positive one because a pregnant patient who is incorrectly identified as not being pregnant risks being exposed to a medical intervention that could harm the fetus. The false-negative rate was lowest, only 0.1%, when the qualitative test was evaluated against pregnancy status and the detection threshold of 25 IU/L. The performance of the quantitative serum hCG test was identical. So, both the qualitative and quantitative serum hCG tests do a very good job at ruling out a possible pregnancy.
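For readers who want to see how sensitivity, specificity, and the false-negative rate relate to one another, here is a minimal sketch. The counts are hypothetical and are not the data from our study; I picked them so that the false-negative rate works out to 0.1%, and I am assuming that rate is defined as the fraction of pregnant patients who get a negative result.

```python
def diagnostic_performance(tp, fp, tn, fn):
    """Summarize test performance from true/false positive and negative counts."""
    return {
        "sensitivity": tp / (tp + fn),          # pregnant patients correctly detected
        "specificity": tn / (tn + fp),          # non-pregnant patients correctly excluded
        "false_negative_rate": fn / (tp + fn),  # pregnant patients missed (1 - sensitivity)
    }


# Hypothetical counts, chosen only for illustration.
performance = diagnostic_performance(tp=999, fp=5, tn=995, fn=1)
for name, value in performance.items():
    print(f"{name}: {value:.1%}")
```

Because the false-negative rate is simply one minus the sensitivity, a test that misses only 0.1% of pregnancies has a sensitivity of 99.9%.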

So, given this evidence, I would conclude that while qualitative hCG tests could be replaced by quantitative tests, there is really no compelling reason to do so.