Let's say we do coin tosses and assume P(head) = 0.1 and P(tail) = 0.9 as probabilities. That's a legitimate probability function according to Kolmogorov. But now the LLN becomes obviously false. So there must be some premise of the LLN that forbids such a configuration. Which one is it? — Preceding unsigned comment added by Rs220675 (talk • contribs) 19:47, 20 February 2021 (UTC)
→I don't understand your message, but why would the LLN "become obviously false"? If you toss your coin a large number of times, the number of tails divided by the number of tosses should tend toward 0.9. I don't see what the problem is with that. — Preceding unsigned comment added by 2A01:CB11:88F:A800:6CDA:5A2C:D4F2:CDC (talk) 11:32, 30 October 2021 (UTC)
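A minimal simulation sketch of the reply above, assuming independent tosses with P(tail) = 0.9. The function name `tail_frequency`, the seed, and the sample sizes are illustrative choices, not taken from the discussion. It suggests that the relative frequency of tails settles near 0.9, which is what the LLN predicts for this coin, rather than near 0.5.

```python
import random

def tail_frequency(n_tosses, p_tail=0.9, seed=0):
    """Relative frequency of tails in n_tosses independent biased tosses."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    tails = sum(1 for _ in range(n_tosses) if rng.random() < p_tail)
    return tails / n_tosses

# The frequency should drift toward p_tail as the number of tosses grows.
for n in (10, 1_000, 100_000):
    print(n, tail_frequency(n))
```

The LLN makes no claim that the frequency approaches 1/2; it only says the frequency approaches whatever the assumed probability is.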
The section on the strong law gets excessively wordy describing exactly what it means to be strong. It is also unclear, since a theorem can be called strong either when its hypothesis is weaker (so that it implies the weak one and applies to more cases) or when both the hypothesis and the conclusion are stronger, as is the case here. I think it would be better to rewrite most of that verbiage, specifically:
"The strong law implies the weak law but not vice versa, when the strong law conditions hold the variable converges both strongly (almost surely) and weakly (in probability). However the weak law may hold in conditions where the strong law does not hold and then the convergence is only weak (in probability)."
I suggest instead a general explanation of strong vs weak theorems be made in theorem and linked to from here (and perhaps any other place that uses the terms). I'll be adding a talk over there about that.
This section also gets wordy on another count, as it appears some are arguing as to whether the strong and weak forms are possibly equivalent. There are examples of probability distributions for which the weak law applies, but not the strong law.[1] As such, I suggest the following be removed:
To date it has not been possible to prove that the strong law conditions are the same as those of the weak law.
as well as the clarification needed prior, and the citation needed after. Is the StackExchange conversation a sufficient reference to make such an edit?
— Preceding unsigned comment added by Jandew (talk • contribs) 22:49, 23 November 2016 (UTC)
Should Borel's law of large numbers get merged into this article and made a redirect page? Michael Hardy (talk) 16:04, 10 January 2008 (UTC)
correction of misdirected merger proposal from March2008. Melcombe (talk) 13:11, 12 May 2008 (UTC) -- and copying from Talk:Law of Large Numbers corrected again — ciphergoth 07:44, 21 May 2008 (UTC)
Agreed. In fact the other article should probably simply be removed. OliAtlason (talk) 15:28, 21 May 2008 (UTC)
Consider the paragraph:
This is simply explaining in words what convergence in probability is. I don't consider it useful. I'll remove it shortly if no-one objects. Aastrup (talk) 22:17, 23 July 2008 (UTC)
I removed the annoying references/citations tag and added a few references. Should there be more citations? Should I have left the tag where it was? I don't think so. Aastrup (talk) 19:44, 24 July 2009 (UTC)
From reading this article many can get the wrong impression that a sequence of averages almost surely converges, and converges to the expected value. But in reality the law of large numbers only works when the expected value of the distribution exists, and there are many heavy-tailed distributions which don't have an expected value. Take, for example, the Cauchy distribution. A sequence of sample means won't converge, because the average of n samples drawn from the Cauchy distribution has *exactly* the same distribution as the samples. I think the article definitely needs a section about this misconception with examples and a neat graph of a diverging sequence of averages, but as you might see, my English is too bad for writing it myself. --87.117.185.161 (talk) 12:53, 21 November 2009 (UTC)
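A minimal sketch of the Cauchy example above, with standard normal draws for contrast. It assumes standard Cauchy samples generated by the inverse-CDF transform tan(π(U − 1/2)); the helper names, the seed, and the checkpoints are illustrative assumptions. The normal running mean should settle near 0, while the Cauchy running mean typically keeps jumping, although any single run can look calm for a while.

```python
import math
import random
from itertools import accumulate

def running_means(samples):
    """Return the sample means after 1, 2, ..., len(samples) draws."""
    return [total / k for k, total in enumerate(accumulate(samples), start=1)]

def cauchy_samples(n, seed=0):
    """Standard Cauchy draws via the inverse CDF: tan(pi * (U - 1/2))."""
    rng = random.Random(seed)
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

def normal_samples(n, seed=0):
    """Standard normal draws, which do have a finite expected value."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

n = 100_000
cauchy_means = running_means(cauchy_samples(n))
normal_means = running_means(normal_samples(n))

# Compare the two running means at a few checkpoints.
for k in (100, 1_000, 10_000, 100_000):
    print(k, cauchy_means[k - 1], normal_means[k - 1])
```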
There is a proof of the Strong Law of Large Numbers that is accessible to students with an undergraduate study of measure theory. It is established by applying the dominated convergence theorem to the limit of indicator functions, and then using the Weak Law of Large Numbers on the resulting limit of probabilities. Would this be appropriate for inclusion with the group of articles on the Laws of Large Numbers? Insightaction (talk) 21:17, 20 January 2010 (UTC)
I have created and uploaded an image similar to the current image, but in SVG format instead of GIF and with source code available. It also looks a little different and has different data (new data may be generated by anyone with the inclination using my provided source code (or their own)). I would like to propose that we switch to my image, File:Largenumbers.svg. --Thinboy00 @175, i.e. 03:11, 3 February 2010 (UTC)
It has been suggested at the Wikipedia:Proposed mergers page, that Law of averages be merged with Law of large numbers (LLN). Please state your comments regarding this action. --TitanOne (talk) 20:58, 3 March 2010 (UTC)
I'm looking for a quick answer, trying to resolve a certain issue. Does the LLN hold even when we're collecting samples from and for a model/function that has infinite VC dimension?
I was reading some papers on statistical learning (http://dl.acm.org/citation.cfm?id=76371). They mention that "C is uniformly learnable if and only if the VC dimension of C is finite," where "a learning function for C is a function that, given a large enough randomly drawn sample of any target concept in C, returns a region in E (a hypothesis) that is with high probability a good approximation to the target concept." My understanding of "concept" here is function. Perhaps my understanding of "concept" is wrong or the LLN has limitations. — Preceding unsigned comment added by 150.135.222.152 (talk) 04:02, 18 October 2011 (UTC)
The article says that finite variance is not required, without citation or justification. Every single other source I saw said that finite variance is in fact needed. This writing titled Law of Large Numbers even specifically says that finite variance is needed, and uses the Cauchy distribution as an example where the variance is not finite, and the Law of Large Numbers does not hold (in the section 'Cauchy case'). — Preceding unsigned comment added by 62.49.144.162 (talk) 10:23, 29 May 2012 (UTC)
Why on earth does an article on the law of large numbers lead to Nasim Taleb's vanity page? I have also deleted the rest of the sentence, which was unencyclopedic and unnecessary. If you disagree, could you please show a reference from the serious LLN literature that mentions lightning or references the black swan?
Otherwise it's not appropriate. — Preceding unsigned comment added by 82.132.235.94 (talk) 19:29, 31 August 2012 (UTC)
"with the accuracy increasing as more dice are rolled." This is not correct, and in the figure the accuracy for n=100 is greater than for n=200 or even 300.
"Convergence in probability is also called weak convergence of random variables". I don't think this is standard or fortunate. Convergence in distribution is already called weak convergence. The MSC (Mathematics Subject Classification) category 60F05 is "Central limit and other weak theorems", meaning theorems with convergence in distribution, not convergence in probability (as far as I know).
"Differences between the weak law and the strong law". It may be interesting to add here that the Weak Law may hold even if the expected value does not exist (see e.g. Feller's book). This underlines that, in their full generality, none of the laws follows directly from the other.
"Uniform law of large numbers". The uniform LLN holds under quite weaker hypotheses. This is definitely uninteresting to the average reader, but a reference to the Blum-DeHardt LLN or the Glivenko-Cantelli problem might be very valuable to a small fraction of readers.
"Borel's law of large numbers, named after Émile Borel, states that if an experiment is repeated a large number of times, independently under identical conditions, then the proportion of times that any specified event occurs approximately equals the probability of the event's occurrence on any particular trial;" The LLN as stated might as well be Bernoulli's original LLN from 1713. There seems to be no reason to attribute *that* statement to Borel. Compare e.g. http://www.encyclopediaofmath.org/index.php/Borel_strong_law_of_large_numbers 93.156.35.219 (talk) 02:30, 2 January 2013 (UTC)
OK, I'll be up front and admit that the math on this page is beyond me. I looked up the Law of Large Numbers to try and find out why it happens. (I mean why it happens, not how it happens.) So can someone explain in plain (or even complicated) English why something that is random each time you do it (e.g. tossing a coin or betting on roulette) tends to give a pattern over a large number of incidences? Why is it that we can anticipate (roughly) what the average will be, rather than its being completely random and not able to be anticipated? Surely there's a place for that issue in the article, if there is some literature on it. Thanks. 89.100.155.6 (talk) 20:36, 25 January 2013 (UTC)
"In particular, it implies that with probability 1, we have that for any ε > 0 the inequality $|\overline{X}_n - \mu| < \varepsilon$ holds for all large enough n.[1]". I do not believe this sentence (if I am wrong, please ignore me and delete this post). If you fix ε > 0 then for every n there is some small probability that $|\overline{X}_n - \mu| > \varepsilon$. Indeed, with non-zero probability all the first n tosses are, say, $> \mu + \varepsilon$.
--93.219.149.62 (talk) 19:54, 6 March 2014 (UTC)
What if we used a different legitimate integration method to define expectation, such as:
which is much more general than the Lebesgue one?
Then we will surely have random variables with finite expectation for which the LLN does not hold. http://www.math.vanderbilt.edu/~schectex/ccc/gauge/venn.gif — Preceding unsigned comment added by Itaijj (talk • contribs) 20:23, 2 February 2014 (UTC)
In my opinion, many statements expressed on this page are not correct, such as:
"It follows from the law of large numbers that the empirical probability of success in a series of Bernoulli trials will converge to the theoretical probability. For a Bernoulli random variable, the expected value is the theoretical probability of success, and the average of n such variables (assuming they are independent and identically distributed (i.i.d.)) is precisely the relative frequency."
"The LLN is important because it "guarantees" stable long-term results for the averages of random events."
"According to the law of large numbers, if a large number of six-sided die are rolled, the average of their values (sometimes called the sample mean) is likely to be close to 3.5, with the precision increasing as more dice are rolled."
This confuses conclusions from the mathematical theorem proven from Kolmogorov's axioms (of which there are very few, for the axioms are very weak and do not provide a definition or constraints strong enough for a meaningful interpretation of probability) with its intuitive interpretation, which requires additional assumptions equivalent to assuming the law itself true a priori. See a more elaborate explanation here:
Stable relative frequencies in the real world are discovered empirically and are not conclusions from any mathematical theorem. Ascribing "P()'s" to events with a frequency interpretation in mind is the same as already assuming that the relative frequencies of those events converge, in the limit of an infinite number of trials, to a definite number, this "P()". The only thing the theorem allows us to conclude is that if all the relative frequencies involved in the given reasoning are stable in the first place, the difference between the measured mean from a finite number of trials and the "ideal" mean is likely to be less than so and so.
Jarosław Rzeszótko (talk) 06:56, 2 May 2014 (UTC)
The six numbers on a die are interchangeable with any other set of symbols, so is an integer mean relevant? My first guess is that the result would be 3, if I'm looking at random integers [0-6], rather than 7/2, which seems like part of a different concept, or an artifact of the way dice are labeled. Cegandodge (talk) 03:34, 19 March 2016 (UTC)
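The arithmetic behind the two numbers in the comment above can be checked with a short sketch, assuming every face is equally likely. The helper name `mean_of_faces` is illustrative. The expected value is the average of the face labels, so it depends on the labeling: faces 1 to 6 give 7/2, while the integers 0 to 6 (seven equally likely values) give 3.

```python
from fractions import Fraction

def mean_of_faces(faces):
    """Expected value of one roll when every listed face is equally likely."""
    faces = list(faces)
    return Fraction(sum(faces), len(faces))

print(mean_of_faces(range(1, 7)))  # standard die, faces 1..6 -> 7/2
print(mean_of_faces(range(0, 7)))  # seven values 0..6 -> 3
```

The expected value need not be a value the die can actually show; it is the number the sample mean approaches, not a typical outcome.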
Looking through the references currently given for the uniform law of large numbers, I notice a technical issue in the current statement. One of the references (Newey & McFadden 1994) gives a formulation of the uniform LLN which allows for the function f to be continuous almost everywhere, but provides only for uniform convergence in probability. The other (Jennrich 1969) gives a formulation which allows only for the function f to be continuous everywhere (not allowing a set of measure 0 on which discontinuities may occur), but gives the stronger mode of a.s. convergence. Is there an obvious synthesis of these two statements which yields the hybrid given in the article (a.e. continuous function f as well as a.s. convergence) or is there a reference out there which gives the stronger statement? If not, it may be worth revising the statement to more accurately reflect the references. Gillespie09 22:17, 21 March 2021 (UTC) — Preceding unsigned comment added by Gillespie09 (talk • contribs)
The comment(s) below were originally left at Talk:Law of large numbers/Comments, and are posted here for posterity. Following several discussions in past years, these subpages are now deprecated. The comments may be irrelevant or outdated; if so, please feel free to remove this section.
I'd love a proof of the strong law. Aastrup 11:28, 29 July 2007 (UTC)
Last edited at 11:28, 29 July 2007 (UTC). Substituted at 20:03, 1 May 2016 (UTC)
I don't know if this page is active, but it's a classic mistake to think that because the empirical average gets closer to its theoretical value as the number of trials increases, the same holds for the sum of the outcomes, i.e. that it gets closer to the sum of the average outcomes. In fact this is not the case (on the contrary, the standard deviation of the sum increases); the difference converges only if we divide it by the number of trials.
To underline the fact that only the average converges, and not the sum of the outcomes minus the theoretical sum, I think it would be good to add a paragraph about this, along with a diagram showing a dice-throwing experiment (made with Excel, with the number of throws on the abscissa and the sum of the results on the ordinate), so that we can see that the curve of the results does not converge towards the theoretical curve. Next to it would be the curve of the empirical average of the same experiment, which we would see converging towards the theoretical line.
I'm not making the modification myself because I'm not too used to the Wikipedia codes (and I already got a slap on the wrist for doing that). But as this page doesn't seem to be very active, if nobody reacts after a month or so, I'll make the modification myself. — Preceding unsigned comment added by 2A01:CB11:88F:A800:6CDA:5A2C:D4F2:CDC (talk) 11:21, 30 October 2021 (UTC)
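A minimal sketch of the dice experiment proposed above, done in Python rather than Excel and assuming a fair six-sided die with mean 3.5. The function name `dice_deviations`, the seed, and the checkpoints are illustrative assumptions. For each checkpoint n it reports the raw deviation of the sum, |S_n − 3.5n|, and the deviation of the average, |S_n/n − 3.5|.

```python
import random

def dice_deviations(n_rolls, checkpoints, seed=0):
    """Return {n: (|S_n - 3.5 n|, |S_n / n - 3.5|)} for each checkpoint n."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    wanted = set(checkpoints)
    total = 0
    out = {}
    for n in range(1, n_rolls + 1):
        total += rng.randint(1, 6)  # one fair die roll
        if n in wanted:
            out[n] = (abs(total - 3.5 * n), abs(total / n - 3.5))
    return out

results = dice_deviations(200_000, (100, 10_000, 200_000))
for n, (sum_dev, mean_dev) in sorted(results.items()):
    print(n, sum_dev, mean_dev)
```

The second column is the first divided by n, so the average can converge even while the raw deviation of the sum typically grows on the order of √n.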
The entire introductory section is currently as follows:
"In probability theory, the law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value and tends to become closer to the expected value as more trials are performed.
"The LLN is important because it guarantees stable long-term results for the averages of some random events.[1][2] For example, while a casino may lose money in a single spin of the roulette wheel, its earnings will tend towards a predictable percentage over a large number of spins. Any winning streak by a player will eventually be overcome by the parameters of the game. Importantly, the law applies (as the name indicates) only when a large number of observations are considered. There is no principle that a small number of observations will coincide with the expected value or that a streak of one value will immediately be "balanced" by the others (see the gambler's fallacy).
"It is also important to note that the LLN only applies to the average. Therefore, while
other formulas that look similar are not verified, such as the raw deviation from "theoretical results":
not only does it not converge toward zero as n increases, but it tends to increase in absolute value as n increases."
It is an astonishingly terrible idea to use a wholly unexplained symbol in the introductory paragraph ... or anywhere.
The introduction uses infinitely many unexplained symbols.
I hope that someone knowledgeable about this topic can rewrite the introduction so that it is comprehensible to most readers.
Preferably by someone who understands what writing an encyclopedia article entails.
BUT ALSO: The introduction states the law of large numbers as an equality. It is not an equality; it is an equality with probability one. 2601:200:C000:1A0:808C:2579:CA92:A870 (talk) 15:48, 15 July 2022 (UTC)
The introductory section contains this passage:
"It is also important to note that the LLN only applies to the average. Therefore, while
other formulas that look similar are not verified, such as the raw deviation from "theoretical results":
not only does it not converge toward zero as n increases, but it tends to increase in absolute value as n increases."
But: The meaning of the term "theoretical results" is not necessarily clear to many readers.
Unfortunately, the meaning of the term " "theoretical results" " (the previous term, but this time inside quotation marks) is even less clear. 2601:200:C000:1A0:3938:3645:9394:290D (talk) 01:52, 24 August 2022 (UTC)
One of the introductory sentences:
not only does it not converge toward zero as n increases, but it tends to increase in absolute value as n increases."
is false: the difference in absolute value will be unbounded, but it will also be 0 infinitely many times. The lim inf will be zero, and the lim sup will be infinity. 2A01:11:8A10:8690:9A7:D52C:C665:FDF (talk) 23:31, 14 February 2023 (UTC)
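A minimal sketch of the point above, assuming fair six-sided dice, so that the "raw deviation" is S_n − 3.5n. The function name `zero_returns_and_max` and the seed are illustrative assumptions. It counts how often the deviation is exactly 0 and tracks the largest absolute deviation seen. A finite run can only hint at the lim inf / lim sup behaviour; it proves nothing.

```python
import random

def zero_returns_and_max(n_rolls, seed=0):
    """Count how often S_n == 3.5 n exactly and track max |S_n - 3.5 n|."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    total = 0
    zero_returns = 0
    max_dev = 0.0
    for n in range(1, n_rolls + 1):
        total += rng.randint(1, 6)
        if 2 * total == 7 * n:  # exact integer test for S_n == 3.5 n
            zero_returns += 1
        max_dev = max(max_dev, abs(total - 3.5 * n))
    return zero_returns, max_dev

zeros, max_dev = zero_returns_and_max(200_000)
print(zeros, max_dev)
```

Exact returns to 0 can only happen at even n, since 7n must be even; the running maximum never decreases, which is consistent with an unbounded lim sup alongside repeated returns to zero.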