To: All Msg #217, Apr-13-93 12:45PM Subject: Bayes


Skeptic Tank!

From: Simon Clippingdale
To: All
Msg #217, Apr-13-93 12:45PM
Subject: Bayesian Statistics, theism and atheism
Organization: Department of Computer Science, Warwick University, England
From: (Simon Clippingdale)
Message-ID: <>
Newsgroups: alt.atheism

This is a cut-down version (believe it or not) of part of a file I'm in the process of writing. Since I don't want this distributed before it's been through the process of causing outrage on the net and being suitably modified (this isn't even an alpha version), here comes a copyright notice.

**** Contents copyright 1993 Simon Clippingdale, so there. ****

Apologies for the length and the occasional reference to stuff unrelated to the current thread, but this came out of an e-mail discussion and I haven't had time to make it totally stand-alone or edit all the flab.

This part examines the question "Can a lack of evidence for something be considered as evidence against it?" using Bayesian statistics.

The general framework involves the updating of notional running estimates of probability for each of a number of hypotheses H[i], as new observations x[n] arrive at times n. The hypotheses are assumed to be Ha (atheism, correct if no gods exist) and Ht (theism, correct if one or more gods exist). Since the hypotheses form a partition (gods either exist or they don't), P(theism) + P(atheism) = 1.

The conditional probability P(X | Y) is read "probability of X given Y" and is equal to

               P(XY)     Prob. of X and Y
    P(X | Y) = -----  =  ----------------
               P(Y)         Prob. of Y

Enough preliminaries; here goes. Cut to the file.

************************* begin included material *****************************

The question is what happens when `running' estimates of probability are updated upon the arrival of a new observation.
If the hypothesis in question is H, prior observations are denoted by x[1],...,x[n-1] and the new observation by x[n], then the relevant statement of Bayes' Rule in this case expresses the updated estimate in terms of the previous estimate as:

                                               P(x[n] | Hx[1]...x[n-1])
    P(H | x[1]...x[n]) = P(H | x[1]...x[n-1])  ------------------------
                                               P(x[n] | x[1]...x[n-1])

or

                                      Prob. of new obs.
    Prob. of H       Prob. of H       given H and prior obs.
    given prior   =  given prior      ----------------------
    and new obs.     obs.             Prob. of new obs.
                                      given prior obs.

The fraction on the right multiplies the old estimate to give the new one. The denominator of that fraction is independent of H, so we need worry only about the numerator (nevertheless, I'll leave the denominator in for clarity).

Assuming statistical independence of the observations x[k], 1 <= k <= n, [i.e. P(x[k]x[m]) = P(x[k])P(x[m]), hence knowing past observations does not help in predicting future observations], both the denominator and numerator may be simplified:

    P(x[n] | Hx[1]...x[n-1]) = P(x[n] | H),  and

    P(x[n] | x[1]...x[n-1])  = P(x[n]).

So the updating term which multiplies the estimate at `time' n-1 to give the estimate at `time' n simplifies to

    P(x[n] | H)
    -----------  .
      P(x[n])

We need to look at P(x[n] | H) for various observations x[n] and hypotheses H. Recall that the denominator P(x[n]) is independent of H.

Now we are back to the business of what I've called `event spaces', which are discrete or continuous spaces of all possible observations x[.], upon which the various hypotheses each define some conditional probability density function (pdf) f(x | H). I'll only deal with the general case of continuous x; the discrete case simply involves Dirac delta functions at the permissible observation values for each hypothesis.
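The updating rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original file: the function name `update`, the hypothesis labels and the likelihood values are all hypothetical. It assumes the denominator P(x[n]) is obtained by summing estimate times likelihood over the partition of hypotheses, which keeps the running estimates summing to 1.

```python
def update(estimates, likelihoods):
    """One updating step: multiply each running estimate by P(x[n] | H) / P(x[n])."""
    # P(x[n]) by total probability over the partition of hypotheses (an assumption
    # of this sketch; it is the same number for every hypothesis H).
    evidence = sum(estimates[h] * likelihoods[h] for h in estimates)
    return {h: estimates[h] * likelihoods[h] / evidence for h in estimates}


# Hypothetical numbers: the new observation is twice as probable under H1 as under H2.
estimates = {"H1": 0.5, "H2": 0.5}
estimates = update(estimates, {"H1": 0.2, "H2": 0.1})
print(estimates)
```

Because the denominator is shared by all hypotheses, only the ratio of the numerators decides which estimate rises and which falls.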
The important assumption is that there are *some* observations which are compatible with the theist hypothesis and not with the atheist hypothesis, and thus would falsify atheism; these are what I called `appearances of god/s', but this need not be taken too literally. Any observation which requires for its explanation that one or more gods exist will count. All other observations are assumed to be compatible with both hypotheses. This leaves theism as unfalsifiable, and atheism as falsifiable in a single observation only by such `appearances of god/s'.

What follows is a schematic representation of the conditional pdfs corresponding to the theist hypothesis Ht and to the atheist hypothesis Ha. The exact shape isn't important, and neither is the extent (I'm actually going to represent f(x | Ht) as nonzero only on a finite interval, even though unfalsifiability implies that it is nowhere zero, just because it's easier to draw. Extending its range to infinity doesn't affect the result). Here goes (Ia, It are defined below):

                            f(x | Ha)
 1/|Ia|  ________________________________________________
        |                                                |
 1/|It| |------------------------------------------------+---------
        |                   f(x | Ht)                    |/////////|
        |                                                |// area /|
        |                                                |//  A   /|
        |                                                |/////////|
    0 ---------------------------------------------------------------------- --> x
                                                         |         |
                                         appearances --> |         | <-- of god/s
                                                         |         |

Also define two intervals on the event space as Ia, the space of all x compatible with atheism:

    Ia = {x : f(x | Ha) > 0}

and similarly for It:

        |------------------------------------------------|  Ia

        |----------------------------------------------------------|  It

Note that f(x | Ha) is larger on Ia than is f(x | Ht); this is because of the normalisation condition

       inf
    Integral   f(x) dx = 1
     x = -inf

for any pdf f(x), conditionals included. For densities other than uniform, f(x | Ha) may dip below f(x | Ht) but the area under it on Ia is still larger, and that is the important point.
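The uniform case drawn in the schematic can be checked numerically. The following is a rough sketch only: the interval endpoints (Ia = [0, 8), It = [0, 10)) and the helper name `uniform_pdf` are hypothetical choices made for illustration, not values from the file.

```python
def uniform_pdf(lo, hi):
    """Return a uniform density on [lo, hi): height 1/(hi - lo) inside, 0 outside."""
    width = hi - lo

    def f(x):
        return 1.0 / width if lo <= x < hi else 0.0

    return f


# Hypothetical intervals: Ia is contained in It, and the remainder of It
# is the region of `appearances of god/s'.
IA = (0.0, 8.0)
IT = (0.0, 10.0)
f_a = uniform_pdf(*IA)   # f(x | Ha), height 1/|Ia|
f_t = uniform_pdf(*IT)   # f(x | Ht), height 1/|It|

# Area A under f(x | Ht) on the part of It lying outside Ia.
A = (IT[1] - IA[1]) * f_t(9.0)

x = 3.0  # an observation on Ia, compatible with both hypotheses
print(f_a(x), f_t(x), A)
```

With these numbers f(x | Ha) exceeds f(x | Ht) everywhere on Ia, purely because both densities must integrate to 1 and Ht spreads its probability over the wider interval.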
The implication is that the theist hypothesis Ht `wastes' some proportion (the area A in the schematic) of its available probability f(x | Ht) on appearances of god/s, and this is finally its undoing in the absence of such appearances, no matter how small the area A is, provided that it is nonzero. (If A = 0, then theism says that everything will always appear exactly as if no gods existed, and is indistinguishable from atheism other than as a thought experiment.)

                                              P(x[n] | H)
    And so back to the updating multiplier    -----------  .
                                                P(x[n])

For continuous x, numerator and denominator are both zero for any exact value of x[n], so we have to consider a small interval around the observed value x0 of x[n], and let the width dx of this interval tend to zero:

    P(x0 <= x[n] < x0 + dx | H) = f(x0 | H) dx

    P(x0 <= x[n] < x0 + dx)     = f(x0) dx

which gives the multiplier as

    P(x[n] | H)     f(x[n] | H)
    -----------  =  -----------
      P(x[n])         f(x[n])

and in the case illustrated in the schematic above, we have

                          f(x[n] | Ha)
    multiplier for Ha  =  ------------
                            f(x[n])

and

                          f(x[n] | Ht)
    multiplier for Ht  =  ------------
                            f(x[n])

                          f(x[n] | Ha)
                       =  ------------ (1 - A)
                            f(x[n])

for an observation x[n] on the interval Ia. (For the uniform densities drawn, f(x | Ha) = 1/|Ia| and f(x | Ht) = 1/|It| on Ia, and 1 - A = |Ia|/|It|.)

Thus for an observation on Ia, compatible with both theism and atheism, the multiplier for Ht is smaller than that for Ha by a factor of 1 - A, where A is the area [or probability, integral of f(x | Ht) dx] which Ht `wastes' on making possible the appearance of god/s.

After a large number N of observations all of which fall on the interval Ia, the estimate of conditional probability for the theistic hypothesis will be down on that of the atheistic hypothesis by a factor of (1 - A)^N.

As N becomes arbitrarily large, with all observations on Ia, and no observed `appearances of god/s' in [It & (!Ia)], the running estimates asymptotically approach zero in the case of the theist hypothesis Ht and unity in the case of the atheist hypothesis Ha.

And there you have it.
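The (1 - A)^N behaviour can be checked with a short simulation. This is only a sketch under stated assumptions: the value A = 0.2, the count N = 50 and the equal initial estimates of 0.5 are hypothetical choices for illustration, and every observation is taken to fall on Ia.

```python
import math

A = 0.2               # hypothetical area `wasted' on appearances of god/s
N = 50                # hypothetical number of observations, all on Ia
p_a, p_t = 0.5, 0.5   # assumed equal initial estimates for Ha and Ht

for _ in range(N):
    # On Ia, f(x | Ht) = (1 - A) f(x | Ha); the common factor f(x | Ha)
    # cancels on normalisation, so relative likelihoods 1 and 1 - A suffice.
    w_a = p_a * 1.0
    w_t = p_t * (1.0 - A)
    total = w_a + w_t
    p_a, p_t = w_a / total, w_t / total

ratio = p_t / p_a                              # estimate for Ht relative to Ha
closed_form_a = 1.0 / (1.0 + (1.0 - A) ** N)   # 1 / [1 + (1 - A)^N]

print(ratio, (1.0 - A) ** N)
print(p_a, closed_form_a)
```

The two numbers on each printed line should agree to rounding error, and raising N pushes the estimate for Ha towards 1 for any nonzero A.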
Summary: if theism states that god/s *may* `appear' or, more generally, give rise to observations incompatible with atheism, then observations which are compatible with both theism and atheism must tend statistically to support atheism. This means that a lack of evidence which specifically supports theism *is* evidence for atheism, because every observation compatible with both theism and atheism causes running estimates of the probability [of correctness] of atheism to increase and those of theism to decrease.

( Aside: as to the initial estimates of P(Ha) and P(Ht), before any
( observations are in, we are compelled by the so-called Principle of
( Insufficient Reason to set both equal to 0.5. This is because to do
( otherwise implies that we have some information on the strength of
( which we can discriminate between the hypotheses. In the absence of
( any such information, an arbitrary relabelling of the hypotheses
( cannot lead to a change in the probabilities assigned, and thus they
( are bound to be equal.
(
( In the case shown in the schematic, the updating multipliers are
( k for Ha and k(1 - A) for Ht, where k is a normalising constant
( which keeps the two estimates summing to 1. Its value depends on
( the number N of observations already in (all of them on Ia):
(
(            1 + (1 - A)^N
(     k = -------------------
(          1 + (1 - A)^(N+1)
(
( giving the multiplier for Ha as
(
(            1 + (1 - A)^N
(         -------------------
(          1 + (1 - A)^(N+1)
(
( and the multiplier for Ht as
(
(                    1 + (1 - A)^N
(         (1 - A) -------------------  .
(                  1 + (1 - A)^(N+1)
(
(
( The first few values of the running estimates in this case are:
(
(     obs.#     for Ha:               for Ht:
(
(       0       1 / 2                 1 / 2
(       1       1 / (2-A)             (1-A) / (2-A)
(       2       1 / (2-2A+A^2)        (1-2A+A^2) / (2-2A+A^2)
(
(      ...      ...                   ...
(
(       N       1 / [1+(1-A)^N]       (1-A)^N / [1+(1-A)^N]

************************ end included material ********************************

Cheers

Simon
--
Simon Clippingdale
Department of Computer Science    Tel (+44) 203 523296
University of Warwick             FAX (+44) 203 525714
Coventry CV4 7AL, U.K.

