This article is rated C-class on Wikipedia's content assessment scale.
Request merge of "negative predictive value", "positive predictive value" and "Sensitivity and specificity". These terms are intimately related and should be in one place, possibly with a discussion of ROC. Further, I suggest modeling on <http://www.musc.edu/dc/icrebm/sensitivity.html>; this is a great exposition of this complicated stuff. Cinnamon colbert (talk) 22:25, 16 November 2008 (UTC) PS: the three articles are a great start
See Talk:Sensitivity (tests) re past wish list for a simpler description, setting out what it is before launching into mathematical jargon. I have also added a table and in Sensitivity (tests) added a worked example. The table is now consistent in Sensitivity, Specificity, PPV & NPV with the relevant row or column for calculation highlighted. David Ruben Talk 02:45, 11 October 2006 (UTC)
The link to false discovery rate should be removed as the (linked) false discovery rate includes an expected value. The definition here is non-standard.
"Physician's gold standard" seems to be an unhelpful phrase as it is used in this article.
My experience has been that when "gold standard" is used in this context it refers to the reference test against which the accuracy of a test is measured. As we all know, sensitivity, specificity, PPV, etc., require a "gold standard" test for reference -- otherwise we don't have a basis for claims about % true positives and % true negatives.
Here it seems that "physician's gold standard" means something like "it is the statistical property of a test that is most useful to physicians".
It seems that either the author was confused about the use of "gold standard" in biostatistics or there's another (unfortunate) use of the phrase that I'm not familiar with. Since I don't know which, I'm not editing the page. If others agree, perhaps this phrase should be replaced.
--will 02:19, 24 July 2007 (UTC)
Let's consider the following table (Grant Innes, 2006, CJEM. Clinical utility of novel cardiac markers: let the buyer beware.)
Table 3. Diagnostic performance of ischemia modified albumin (IMA) in a low (5%) prevalence population.
          ACS Yes   ACS No   Total
IMA +     35        722      757
IMA –     15        228      243
Total     50        950      1000
Sensitivity (true-positive rate) = 35/50 = 70%
Specificity (true-negative rate) = 228/950 = 24%
Positive predictive value = 35/757 = 4.6%
Negative predictive value = 228/243 = 94%
The positive predictive value is smaller than the prevalence. We must conclude that a positive test result decreases the probability of disease or, in other words, that the post-test probability of disease, given a positive result, is smaller than the pre-test probability (prevalence): a very strange and unusual conclusion.
From a statistical point of view this very strange conclusion can be avoided by interchanging the rows of the table: IMA- becomes a positive test result. This operation results in a predictive value of 6.17%. The conclusion is that a positive test result, if the test is of any value at all, increases the post-test probability as it is expected to do and in no case decreases this value.
This example illustrates the need for an unequivocal definition of a positive test result. If a positive test result is unequivocally defined, the positive predictive value is mathematically unequivocally defined. A text providing such an unequivocal definition was removed by someone who called it 'garble'. I intend to put the text back, any objections? —Preceding unsigned comment added by Michel soete (talk • contribs) 18:57, 22 September 2007 (UTC)
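To make the arithmetic in this thread easy to check, here is a minimal Python sketch that recomputes the figures of Table 3 and of the row-swapped version. The function name `diagnostic_metrics` and the dictionary keys are illustrative choices, not anything taken from the article or from Innes's paper.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return the usual 2x2 summary measures as fractions (not percentages)."""
    total = tp + fp + fn + tn
    return {
        "prevalence": (tp + fn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (fn + tn),
    }


# Table 3 as quoted above: IMA+ is treated as the "positive" row.
original = diagnostic_metrics(tp=35, fp=722, fn=15, tn=228)

# Rows interchanged: IMA- is now treated as the "positive" row.
swapped = diagnostic_metrics(tp=15, fp=228, fn=35, tn=722)

for label, metrics in (("IMA+ positive", original), ("IMA- positive", swapped)):
    print(label, {name: round(value, 4) for name, value in metrics.items()})
```

With IMA+ as the positive row the PPV falls below the prevalence; with the rows interchanged it rises above it, which is the point being argued in the comment above.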
Yes - makes no sense, 'garble' indeed. I've removed it and placed here in talk page where we can work on this.
And, alternatively, too:
PPV = PR * LR+ / (PR * (LR+ - 1) + 1)
where PR = the prevalence (pre-test probability) of the disease, * = the multiplication sign and LR+ = the positive likelihood ratio. LR+ = sensitivity / (1 - specificity). The prevalence, the sensitivity and the specificity must be expressed as fractions of one, not as percentages, per mille and so on. The frequency of the True Positives must be this frequency that exceeds or equals the expected value, mathematically expressed: True Positives >= (True positives + False Positives) (True Positives + False Negatives) / N wherein N = True Positives + False Positives + True Negatives + False Negatives. If this condition is not met and if the sensitivity differs from .50 (50%) then two different results after the calculation of sensitivity are possible since the rows of two by two tables can be interchanged and then a former positive result can be called a negative, a former negative result can be called a positive (Michel Soete, Wikipedia, Dutch version, Sensitiviteit en Specificiteit, 2006, December 16th).
As a start, let's use the same terminology as the rest of the article, i.e. call PR just Prevalence; there is no need to explain maths symbols. If LR+ is "sensitivity / (1 - specificity)", then I get:
$$\mathrm{PPV} = \frac{\text{Prevalence} \cdot \dfrac{\text{sensitivity}}{1 - \text{specificity}}}{\text{Prevalence} \cdot \left(\dfrac{\text{sensitivity}}{1 - \text{specificity}} - 1\right) + 1}$$
Let's multiply through by (1 - specificity):
$$\mathrm{PPV} = \frac{\text{Prevalence} \cdot \text{sensitivity}}{\text{Prevalence} \cdot \bigl(\text{sensitivity} - (1 - \text{specificity})\bigr) + (1 - \text{specificity})}$$
Which is:
$$\mathrm{PPV} = \frac{\text{Prevalence} \cdot \text{sensitivity}}{\text{Prevalence} \cdot \text{sensitivity} - \text{Prevalence} + \text{specificity} \cdot \text{Prevalence} + 1 - \text{specificity}}$$
and so to:
$$\mathrm{PPV} = \frac{\text{Prevalence} \cdot \text{sensitivity}}{\text{Prevalence} \cdot \text{sensitivity} + (1 - \text{specificity})(1 - \text{Prevalence})}$$
i.e. exactly the same as the last formula already given in the article! It therefore fails to add any new insight into its derivation or meaning.
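The algebra above can also be checked numerically. The following is a small sketch, with function names of my own choosing, that evaluates both the likelihood-ratio form and the Bayes form of PPV over a few arbitrary input values and confirms they agree. Specificity of exactly 1 is avoided because LR+ is then undefined.

```python
import math


def ppv_from_lr(prevalence, sensitivity, specificity):
    """PPV via the positive likelihood ratio, as in the removed text."""
    lr_pos = sensitivity / (1 - specificity)
    return prevalence * lr_pos / (prevalence * (lr_pos - 1) + 1)


def ppv_from_bayes(prevalence, sensitivity, specificity):
    """PPV in the form reached at the end of the derivation above."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# Arbitrary test values, chosen only to exercise both formulas.
for prevalence in (0.01, 0.05, 0.3, 0.9):
    for sensitivity in (0.1, 0.7, 0.99):
        for specificity in (0.1, 0.24, 0.95):
            lr_form = ppv_from_lr(prevalence, sensitivity, specificity)
            bayes_form = ppv_from_bayes(prevalence, sensitivity, specificity)
            assert math.isclose(lr_form, bayes_form, rel_tol=1e-9)

print("both forms agree on all sampled inputs")
```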
As for "The frequency of the True Positives must be this frequency that exceeds or equals the expected value, mathematically expressed: True Positives >= (True positives + False Positives) (True Positives + False Negatives) / N wherein N = True Positives + False Positives + True Negatives + False Negatives. If this condition is not met and if the sensitivity differs from .50 (50%) then two different results after the calculation of sensitivity are possible since the rows of two by two tables can be interchanged and then a former positive result can be called a negative, a former negative result can be called a positive" - sorry can't even begin to get my head around this.
My mother tongue is Dutch. Initially I did not quite understand what 'garble' means, but now I think it means the same as nonsense.
Let us assume that allowing ambiguity is a good option. Following tables can then be constructed:
            D+       D-
blue (P)    99 (a)   1 (b)
red (N)     1 (c)    99 (d)

            D+       D-
red (P)     1        99
blue (N)    99       1
In constructing these tables I respected some conventions: the frequencies of diseased people are in the first column, the frequencies of the positives in the first row, the frequency of the true positives in cell a, and so on.
Now we can write that sensitivity is a / (a + c). For those for whom blue is positive the sensitivity is 99%; for those for whom red is positive the sensitivity is 1%. The positive predictive value (a / (a + b)) is 99% (blue is positive) or 1% (red is positive).
Such a possibility for ambiguity is not in line with traditional medical thinking, and it therefore leads to (at least seemingly) contradictory statements and thus to confusion.
Megan Davidson writes (2002, The interpretation of diagnostic tests: A primer for physiotherapists): 'Where sensitivity or specificity is extremely high (98-100%), interpretation of test results is simple. If the sensitivity is extremely high, we can be sure that a negative test result will rule the disease out.' If ambiguity is allowed we have to add 'or extremely low (0-2%)' and 'If the sensitivity is extremely low, we can be sure that a positive test result will rule disease out'. Moreover, the relatively new concepts SpPIn and SnNOut are described in the article. They are acronyms. A SpPIn is a test with such an extremely high Specificity that if a test result is Positive the disease can be ruled In. A SnNOut is a test with such an extremely high Sensitivity that if the test result is Negative the disease can be ruled Out.
Thus our demand that a > the expected value in cell a is a solid basis for these concepts, their names and the classical ideas that they incorporate. The still widely held idea that a positive test result always points to disease also finds a firm basis in this demand.
I hope that the arguments above are convincing enough and that the removed text will be put back by the person who removed it.
81.244.101.52 12:07, 29 September 2007 (UTC)
        Disease   Healthy
+ve     980       10
-ve     20        10

        Disease   Healthy
+ve     20        1
-ve     980       99
Hi Davidruben,
I did not claim that a test with a high sensitivity, given a negative test result, rules disease out; that was Megan Davidson. It was not Megan Davidson who claimed that a test with a low sensitivity, given a positive test result, rules disease out; it was I who stated that this should be added to her article if ambiguity were allowed. Davidson did not write a textbook but an excellent six-page article on the subject.
You disagree with those claims. I disagree too, but Davidson and Grant Innes are examples of classical thinking about the subject. Moreover, they have strong arguments for their point of view. Davidson writes: 'Unfortunately the predictive values only apply when the clinical prevalence is identical to that reported in the study. Prevalence changes dramatically depending on where the test is being performed.'
Grant Innes writes: 'In reality, predictive value is less a measure of test performance than it is a reflection of disease prevalence in the population being tested.' (op.cit.)
His illustrative examples are good. But I disagree with both of them. Your examples are good, and so is the following example:
        D+    D-
+ve     99    99
-ve     1     1
No further comment on this table is needed, I suppose.
I consider their point of view an expression of what was generally believed in the previous century and, I suppose, is still believed by many if not most physicians today.
I stress the point that allowing ambiguity in defining a positive result does not result in ambiguity of the conclusion for the testee. Blue remains in any case the color that ends up in the conclusion D+. For the patient it is of no importance whether the sensitivity is called 90% or 10%, or whether this conclusion is the result of what is called a positive or a negative test result. For him, blue is disease.
The null hypothesis on its own does not say which test result is positive. For two by two tables the null hypothesis says that experimental data will not deviate (significantly) from the table of expected values. My demand is decisive for what must be considered a positive result (a must be higher than the expected value in cell a). It results in a situation in which only one sensitivity, and so on, is possible. A positive result is then not the result of a decision but of a calculation.
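The demand that cell a be at least its expected value under no association can be written out as a short check. This is only a sketch of the condition as stated in this thread, with illustrative function names; it is applied here to the Innes table quoted earlier on this page and to the same table with its rows interchanged.

```python
import math


def expected_true_positives(a, b, c, d):
    """Expected count in cell a under no association: row total * column total / N."""
    n = a + b + c + d
    return (a + b) * (a + c) / n


def positive_row_is_consistent(a, b, c, d):
    """True if the row labelled 'positive' satisfies a >= expected value of a."""
    return a >= expected_true_positives(a, b, c, d)


# Innes table with IMA+ as the positive row: a=35, b=722, c=15, d=228.
print(expected_true_positives(35, 722, 15, 228))
print(positive_row_is_consistent(35, 722, 15, 228))

# Same data with the rows interchanged (IMA- labelled positive).
print(expected_true_positives(15, 228, 35, 722))
print(positive_row_is_consistent(15, 228, 35, 722))
```

Under this condition the labelling follows from the counts rather than from a prior choice, which is what the comment above means by "a calculation".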
By the way, in my opinion, a cheap, harmless, poor test may have very good utility. A potentially good test is a test whose result shows association with disease. The utility of a test depends on decisions and is not only a characteristic of the test, provided there is association between test results and disease. Let us assume that the physician or the patient requires a probability of 97% before deciding on a dangerous treatment; then a very poor, cheap test that increases the post-test probability from 93% to 97% is potentially a very useful test.
So, I hope that you will be convinced that it is preferable to put my 'confusing' text back.
Michel soete 20:05, 29 September 2007 (UTC)
Hi David Ruben
I think I can understand your hesitation quite well. I suppose that nobody would hesitate to prefer a sensitivity of 99% in the last example I gave, and therefore my new example was not convincing (but perhaps somewhat shocking). It is logical for a measure such as sensitivity that everyone wants it to be high, and for these tables there is no good reason to prefer a very low sensitivity. Applying my requirement, the conclusion is likewise that the sensitivity is 99% and not 1%. But the problem of Grant Innes's table is not thereby solved, and this is not an esoteric, pedantic problem. It is a real-life problem.
I looked at the website of wynneconsult.com and there I found the following (in Dutch): 'The probability of a positive test result, given that the patient has the disease, is called sensitivity. The sensitivity has to be as high as possible.' This is quite reasonable, I believe. They also write: 'The probability of a negative test result in the absence of the disease is called specificity and it must also be as high as possible.' This too is quite reasonable, I think, but it is a pity that, reconsidering the table of Grant Innes, both requirements cannot be met at the same time. Indeed both should be at least 50%. We must make a choice, and on what basis? So the requirements are not generally valid, and it is for that reason that I proposed a new requirement: it is an objective basis on which to make this choice.
Moreover, if the independent variable is numerical, sensitivity and specificity can be manipulated by changing the cut-off points. If the positivity decreases the sensitivity will decrease too, and there will be a cut-off point that is low enough to cause a sensitivity lower than 50%. What then? Switch positive results into negative results to meet the requirement of as high a sensitivity as possible again? I don't like the idea.
For all those reasons, and a few others, I proposed my requirement, which solves those problems. There is not even a loss: if sensitivity in some cases is lower than 50%, it will be to the benefit of specificity and it will be justified.
I thank you for your efforts to answer.
81.244.101.52 20:11, 30 September 2007 (UTC)
Hi David Ruben
If nobody objects in the next few days I will add to the text, after 'is the proportion of patients with positive test who are correctly diagnosed.', the following: 'The positive predictive value must exceed the prevalence.' This amounts to the same as the text that was removed.
Justification:
Let us consider following table filled with expected frequencies.
              D+       D-       Total
red (pos)     12 (a)   28 (b)   40
blue (neg)    18 (c)   42 (d)   60
Total         30       70       100
Prevalence = 30%, positive predictive value = 30%. If there is no association between colour and D+/D-, prevalence and positive predictive value are equal. Let prevalence and positivity (a + b)/(a + b + c + d) remain the same but let a = 13 (then b = 27, c = 17, d = 43). The positive predictive value then increases by 2.5 percentage points and exceeds the prevalence. The more a increases, the more the positive predictive value exceeds the prevalence. So far no problems. But what if a decreases? Let a = 11 (then b = 29, c = 19 and d = 41). The predictive value decreases to 27.5% and is lower than the prevalence. The more a decreases, the more the positive predictive value decreases. Conclusions: in some cases it is better to predict the presence of the disease from the prevalence (that is, picking 30 persons at random out of a hundred patients as possibly having the disease rather than relying on a positive test result), and a positive test result can decrease the probability of the presence of the disease in comparison with the prevalence. I think many will dislike such conclusions. The problem can easily be solved by interchanging the rows in the table and calling blue positive and red negative. I suppose that Grant Innes in the table above thought that a high level of IMA would make it more probable that ACS was present (or would become present) and therefore called IMA+ a positive test result. The data do not confirm this theory, and it would have been better to adapt to the data (or reject the study for lack of quality) and call a low level of IMA positive. Accepting the demand that a positive predictive value must exceed the prevalence makes possible conclusions that are easy to accept and that were perhaps believed in earlier times, and are still believed today, by most people: that a positive test result always increases the probability of the presence of the disease, that LR+ is always greater than 1, and others. This demand brings more order to this area of medical statistics.
Therefore this demand (in this form or in another) is, in my opinion, an essential element of the definition of the positive predictive value and cannot be omitted without risking seemingly contradictory statements with regard to some tables.
Michel soete 18:46, 6 October 2007 (UTC)
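The three variants of the table discussed above (a = 12, 13 and 11, with the marginal totals held fixed) can be regenerated with a few lines of Python. This is a sketch only; the helper name `table_with_fixed_margins` and its default margins of 40 positives and 30 diseased out of 100 are taken from the example, not from the article.

```python
def table_with_fixed_margins(a, positives=40, diseased=30, total=100):
    """Fill in cells b, c, d of a 2x2 table from cell a and fixed marginal totals."""
    b = positives - a          # rest of the positive row
    c = diseased - a           # rest of the diseased column
    d = total - positives - c  # rest of the negative row
    return a, b, c, d


def ppv(a, b):
    """Positive predictive value: true positives over all positives."""
    return a / (a + b)


prevalence = 30 / 100
for a in (12, 13, 11):
    a, b, c, d = table_with_fixed_margins(a)
    print(f"a={a} b={b} c={c} d={d} PPV={ppv(a, b):.3f} prevalence={prevalence:.3f}")
```

At a = 12 the PPV equals the prevalence; raising a pushes it above the prevalence and lowering a pushes it below, as described in the comment.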
            D+    D-    Total
Test pos    9     1     10
Test neg    31    59    90
Total       40    60    100

            D+    D-    Total
Test pos    29    1     30
Test neg    11    59    70
Total       40    60    100
Hi David Ruben,
If PPV = prevalence then a predictive value was calculated but not a positive predictive value. We could perhaps say we calculated a neutral predictive value. If we thought we calculated a PPV and find that 'PPV' < prevalence, then we calculated the post-test probability of disease given a negative test result, not the PPV. Our initial hypothesis was proven wrong if 'PPV' < prevalence. A test result is positive if it lets us assess a higher probability of disease than we can on the basis of prevalence alone. Therefore I remain convinced that PPV must exceed prevalence and that this is essential for the definition of PPV. PPV is not a measure of the quality of a test. For every PPV it is possible to construct tables that show that the test is of no, low or high value. The suggestion of taking into account both PPV and NPV in assessing the quality of a test is very interesting, but it seems to me that it is not relevant to the definition of PPV. By the way, accuracy ((a + d) / (a + b + c + d)) is an overall measure of the quality of a dichotomous test. Perhaps your suggestion leads to a yet more meaningful measure of the overall quality of a test. I fear that such overall measures could hide the fact that a test can have very moderate overall quality but be excellent at ruling disease in or out. For instance the ANA test is, on its own, a very moderate test for ruling in SLE but excellent for ruling out SLE.
Michel soete 15:43, 8 October 2007 (UTC)
Michel, you are needlessly complicating the terminology. "Positive" Predictive Value is fine irrespective of the relative values of PPV and prevalence. Without the word "positive" it is unclear what is being measured. It also doesn't make sense for the name of a statistic to depend on its value in comparison to another statistic. PPV tells me the number of actual diseased individuals who had a "positive" value on the test. Prevalence is not part of that definition.
--Loonatickle —Preceding unsigned comment added by Loonatickle (talk • contribs) 21:07, 14 May 2008 (UTC)
Hi David Ruben,
I have the feeling that thus far I have not been fully convincing. In everyday words my demand equals stating that the post-test probability of disease given a positive result must always be greater than the post-test probability of disease given a negative test result, except in the case of no association between variable and disease.
Let's prove it. Let the expected frequencies be a', b', c' and d'. For such a table we agreed that PPV = prevalence. My demand was initially that a > a', thus a = a' + x, x being a positive number. Since the marginal totals do not change we can write a' + b' = a + b and b = b' - x. It is obvious that (a' + x)/(a' + b') > a'/(a' + b') since a' + x > a'. Thus we can write a / (a + b) > prevalence, since a'/(a' + b') = prevalence. Note that this is the same as saying that PPV > prevalence. If a = a' + x then c = c' - x. It can be proven in a similar manner that c/(c + d) < prevalence. Now we can write a / (a + b) > prevalence > c / (c + d). Thus a/(a + b) > c/(c + d), which was to be proven.
Now we can state that the post-test probability of disease given a positive test result is always greater than the post-test probability of disease given a negative test result. Without my demand this cannot be stated. This demand also makes other statements possible, for instance that LR+ > 1 and that LR- < 1, and so on. For those who think that this conclusion is possible without a demand on a, I recommend calculating LR+ and LR- for the table of Grant Innes above.
Michel soete 11:47, 17 October 2007 (UTC)
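The calculation recommended above can be carried out directly. The sketch below uses the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity; the function name is illustrative. It is applied to the Innes table as published and to the same table with its rows interchanged.

```python
def likelihood_ratios(tp, fp, fn, tn):
    """Return (LR+, LR-) for a 2x2 table of counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (fp + tn)
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Innes table with IMA+ as the positive row.
lr_pos, lr_neg = likelihood_ratios(tp=35, fp=722, fn=15, tn=228)
print(f"IMA+ positive: LR+ = {lr_pos:.3f}, LR- = {lr_neg:.3f}")

# Rows interchanged: IMA- as the positive row.
lr_pos_swapped, lr_neg_swapped = likelihood_ratios(tp=15, fp=228, fn=35, tn=722)
print(f"IMA- positive: LR+ = {lr_pos_swapped:.3f}, LR- = {lr_neg_swapped:.3f}")
```

For the table as published, LR+ comes out below 1 and LR- above 1; interchanging the rows exchanges the two values.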
The article uses a lot of language very specific to the field of medicine. Even the very definition uses terms like "patients" and "diagnosed." Unless I am mistaken, I believe that this is not a term specific for medicine but rather for any binary classification scheme. I think there are a lot of ambiguities/inconsistencies throughout related articles about this and about multiple terms for the same concept (see precision and recall framed differently). It seems that the articles should be a lot clearer on whether this term is just the specific term used in medicine for a concept with other names in other fields, or is the term truly domain agnostic as I believe it is. I posted something along these lines in the statistics project page but have yet to receive a response. What am I missing? Mickeyg13 (talk) 15:29, 24 May 2010 (UTC)
Surely it's absurd to say "a negative result is very good at reassuring that a patient does not have cancer (NPV = 99.5%) given that this is only marginally better than pointing to a random person and declaring they don't have cancer (probability 98.5%)" Also, the sooner this and Positive predictive value are merged, the better, IMO - the articles are essentially identical. Jmc200 (talk) 16:58, 3 June 2010 (UTC)
A merge has been proposed for 1 1/2 years with no objections, and in addition there are three statements of support (one in 2008, one above, and one on the Negative Predictive value talk page). In this light, I have done my best to merge these two articles; however, I feel this article would benefit from a significant copyedit. --LT910001 (talk) 08:09, 22 December 2013 (UTC)
Hello all, I was nervous to make the edit myself, but there is a big problem with the beginning of this article. It says NPV and NPA are the same and cites an FDA article. The article says they are not the same. There is an important difference. NPA is TN/(FP + TN) and NPV = TN/(FN + TN). This actually caused me a lot of trouble, as I believed the text in Wikipedia and made a determination about how far to trust a rapid COVID test that listed a 100 percent NPA. The NPV may be more relevant and much worse, because I got a conflicting positive result days later after I had already resumed some activity. — Preceding unsigned comment added by 2601:249:8D80:3120:A586:2211:DEF8:5C1B (talk) 19:55, 22 August 2020 (UTC)
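The difference between the two formulas quoted above is easy to see with numbers. The counts below are entirely hypothetical, invented for illustration and not taken from any real test or from the FDA document; they show how a test can report 100% NPA while its NPV is noticeably lower.

```python
def npa(fp, tn):
    """Negative percent agreement as defined above: TN / (FP + TN)."""
    return tn / (fp + tn)


def npv(fn, tn):
    """Negative predictive value as defined above: TN / (FN + TN)."""
    return tn / (fn + tn)


# Hypothetical counts: no false positives, but 20 infected people test negative.
tp, fn, fp, tn = 80, 20, 0, 100

print(f"NPA = {npa(fp, tn):.3f}")
print(f"NPV = {npv(fn, tn):.3f}")
```

NPA depends only on the people without the condition, so false negatives never affect it; NPV is the figure that speaks to how far a negative result can be trusted.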
§ Other individual factors states that
PPV is directly proportional to the prevalence of the disease or condition
But the definition of PPV in § Positive predictive value (PPV) states that
which means that PPV is directly proportional to prevalence, since it shows up right there in the equation.
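One way to examine the "directly proportional" wording is to evaluate PPV at several prevalences. The sketch below assumes the equation meant above is the Bayes form derived earlier on this page, and it uses a hypothetical sensitivity and specificity of 0.9 each; both are assumptions for illustration only.

```python
def ppv(prevalence, sensitivity=0.9, specificity=0.9):
    """PPV in Bayes form; the default sensitivity and specificity are hypothetical."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


for prevalence in (0.01, 0.02, 0.10, 0.20, 0.50):
    print(f"prevalence={prevalence:.2f}  PPV={ppv(prevalence):.4f}")

# Under strict proportionality this ratio would be exactly 2.
ratio = ppv(0.02) / ppv(0.01)
print(f"PPV(0.02) / PPV(0.01) = {ratio:.3f}")
```

Under these assumptions PPV rises with prevalence, but doubling the prevalence does not double the PPV, because prevalence also appears in the denominator.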