Principal Analysis of the Qur’ânic Initials
My observations regarding the nature of the Qur’ânic initials based on analyses of statistical significance generated twenty data points in all. In this analysis, the data that do not include hamzah (the etiological variant of the alif that was introduced in written Arabic as an aid in pronunciation) and the data that do will be looked at simultaneously for the sake of comparison.
The Issue of Post-Selection. Many have contended that Dr. Khalifa's reports of intricately intertwined sums of letters, numbers, verses, chapters, and other structural features of the Qur’ân are the product of post-selection. That is, it is not difficult to find multiples of 19 in any random array of data, whatever the type, as long as the range is not unduly restricted. Post-selection involves picking out numbers that match your criterion (in this case, multiples of 19) after you have already been presented with the data. About every nineteenth datum should be a multiple of 19 by virtue of random chance alone, so about eighteen data are thrown out for every "significant" one encountered. Because Dr. Khalifa did not diligently report every insignificant finding that must inevitably have resulted from his searches, at least some of what he reported is likely to be the result of post-selection. Consequently, it is tempting to disregard all of his findings as being of the same quality. For this reason, many people dismiss Dr. Khalifa's findings without ever undertaking a serious investigation of them.
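The base rate behind this argument is easy to check with a small simulation. The sketch below is illustrative only: the function name, the range of random values, and the seed are assumptions made for the example, and none of the numbers come from the Qur’ânic counts.

```python
import random

def fraction_divisible(divisor, trials=190_000, seed=1):
    """Estimate how often a uniformly random positive integer is a multiple of `divisor`."""
    rng = random.Random(seed)
    # The upper bound is itself a multiple of the divisor, so exactly
    # 1/divisor of the possible values are divisible by it.
    upper = divisor * 5_000
    hits = sum(1 for _ in range(trials) if rng.randint(1, upper) % divisor == 0)
    return hits / trials

if __name__ == "__main__":
    print(f"observed fraction: {fraction_divisible(19):.4f}")
    print(f"expected fraction: {1 / 19:.4f}")
```

The observed fraction should land close to 1/19, which is the sense in which roughly eighteen "insignificant" data accompany each multiple of 19 in random data.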
Should Post-Selection Always Be Assumed? When faced with some of the fantastic claims by Dr. Khalifa and others who have probed the Qur’ânic text for its mathematical properties, most people respond by dismissing the claims as the consequence of post-selection and never seriously investigate them. Is this reasonable? My answer would be yes. There have been many claims by many people throughout history regarding miraculous mathematical properties in various holy texts, especially the Bible. Virtually all of these have been the product of post-selection* in one form or another. Consequently, experience would dictate that dismissing any new such claim without bothering to analyze it is quite rational. Unless such claims can be shown to be something other than the product of post-selection, it is reasonable to assume that they are invalid.
*In certain cases, the observations may actually have some validity. For example, the Greek text of the Book of Matthew has an unusual proliferation of multiples of seven, and some have claimed that the number 19 also plays a role in the Hebrew text of the Torah (the first five books of the Bible). In the case of the Book of Matthew, philologists have noted that several alterations had been made to at least some portions of the extant text in order to create a variety of multiples of seven. For example, the number of ancestors in the genealogy has been altered, as is evidenced by the variant text fragments that exist today. In the case of the Torah, the multiples of 19 reported by a rabbi of the medieval period have yet to be investigated seriously.
Dr. Khalifa's Inferences. I would personally contend that at some point Dr. Khalifa did indeed begin to view every multiple of 19 as part of an intricate web designed by the Creator, but this was the product of his eventual conviction that nothing in the structure of the Qur’ân was there by chance. Logically, if that is truly the case and there is a 19-based code in the Qur’ân, then every multiple of 19 in the Qur’ân is significant, regardless of whether or not it defies the laws of random chance. However, in order to establish this kind of claim in the first place, it must be shown that at least some aspects of that ostensible structure do indeed violate the laws of random chance.
Post-Selection and Choice of Structural Features. The post-selection hypothesis holds that every case of 19-divisibility within a given set of narrowly defined structural parameters is accompanied by eighteen cases of 19-indivisibility, on average. However, it also admits to the possibility that an unusually great number of multiples of 19 will be found in some structural feature if one spends enough time looking for such an anomaly. Why is this? Imagine examining one structural feature of the Qur’ân, such as the numerical value of all verses in initialed chapters that contain the word "rabb." What if you found a very high frequency of multiples of 19 in that analysis? Would it prove that the Qur’ân is a mathematically coded book? The answer is that it might, and it might not. A more basic question is, "What made you decide to examine that particular structural feature, of all that were available?"
Your answer should not be that you had tried several hundred variations until you chanced upon one that worked. If that is the case, then I cannot be surprised that you eventually ran across a structural feature showing an anomaly so unusual that it would take several hundred tries before you could reasonably expect it to occur by chance.*
*Proponents of this argument against Dr. Khalifa's findings readily admit that there may be more multiples of 19 than one might expect in those portions of the Qur’ân under study, simply because he is presumed to have tried several different approaches until one worked. On the contrary, Dr. Khalifa's interest in the muqaţţa‘ât preexisted his observations about the number 19 (cf. my home page for a brief review). Consequently, this argument cannot be made against his earliest observations, those surrounding the muqaţţa‘ât in particular.
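The effect of repeated searching described above can be put into numbers. As a rough sketch, assuming that the tries are independent and that each has the same chance `alpha` of producing the anomaly (both simplifying assumptions, and the function name is hypothetical), the chance of at least one hit in `tries` attempts is 1 - (1 - alpha)^tries:

```python
def chance_of_at_least_one(alpha, tries):
    """Probability that at least one of `tries` independent searches
    produces a result whose individual chance is `alpha`."""
    return 1.0 - (1.0 - alpha) ** tries

if __name__ == "__main__":
    # An anomaly with odds of one in 300 becomes likely after a few hundred tries.
    for tries in (1, 100, 300, 1000):
        print(tries, round(chance_of_at_least_one(1 / 300, tries), 3))
```

This is why the reason for choosing a structural feature matters as much as the anomaly found in it.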
Testing the Post-Selection Hypothesis. The post-selection hypothesis is equivalent to the null hypothesis in standard statistical analysis. The null hypothesis is a generic hypothesis that applies to all statistical tests. It holds that there is no significant relationship among the properties you are examining. In the case of 19-divisibility, it would hold that there is no significant relationship between the textual structure of the Qur’ân and 19-divisibility. In order to test it, two things must be done. First, the structural feature under examination must have some objective reason for being chosen. This involves the definition of the data set, which has already been discussed. Second, it must be shown that other numbers could not just as well have occurred in the same unusual manner as the number 19. The reason for this is that the post-selection hypothesis holds as a corollary that any number at all could be made out to appear "miraculous" in precisely the same manner as is claimed about the number 19. If similar anomalies are found among other numbers under the same constraints, this fact alone would render the number 19 quite ordinary by comparison.
To test the post-selection hypothesis using a data set that has already been defined as objectively as possible, it is sufficient to include other numbers in our analysis, along with the number 19. If these other numbers show statistical properties similar to 19, then we will have to conclude that 19-divisibility was never, in fact, intended as a facet of the structure of the Qur’ân.
Primes Versus Composites. The numbers that will be used for purposes of comparison in this analysis will all be prime numbers below 100. Composites (non-primes) will not be considered because they are literally the products of the interactions of their component primes. For example, say the prime numbers 5, 7, and 11 all happen to occur with somewhat great frequency (say, p = .05 in each case). Then it would actually be consistent with random chance to find that the numbers 35, 55, and 77 (their immediate products) stand out as especially frequent (p = .05^2 = .0025). Meanwhile, the number 385 would appear downright "miraculous" (p = .05^3 = .000125). In order to control for this interaction effect, the independent effects of each composite number’s factors must be partialed out as a separate procedure. As an alternative, we can simply restrict our analysis to prime numbers in order to address the issue of post-selection in a straightforward manner without having to correct for the interaction effects associated with composites.
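The comparison set and the arithmetic of the example can be sketched as follows. The value p = .05 is the hypothetical figure used above, the multiplication assumes that the effects of the primes are independent, and the helper name is an assumption made for illustration.

```python
def primes_below(limit):
    """Return all primes below `limit` using a simple sieve of Eratosthenes."""
    is_prime = [True] * limit
    is_prime[0:2] = [False, False]
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            for multiple in range(n * n, limit, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]

if __name__ == "__main__":
    primes = primes_below(100)
    print(len(primes), primes)  # the comparison set used in this analysis

    p = 0.05                     # hypothetical p-value for each of 5, 7, and 11
    print(p ** 2)                # product of two such primes, e.g. 35 = 5 * 7
    print(p ** 3)                # 385 = 5 * 7 * 11
```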
Revisiting the Data Set. Based on our prior analyses about the nature of the muqaţţa‘ât and how the associated findings inform our definition of the data set, our data are as follows. The chapter numbers included in each sum are indicated in parentheses. The sums in brackets refer to the counts made with the letter hamzah included:
Datum 1: alif-lâm-mîm third-order grouping (2-3,7,29-32) 24417 [24998] {number in brackets includes hamzah in the alif count}
Datum 2: alif-lâm-râ third-order grouping (10-15) 10577 [10891]
Datum 3: ţâ-sîn third-order grouping (26-28) 1313
Datum 4: alif-lâm-mîm second-order grouping (2-3,29-32) 19279 [19730]
Datum 5: alif-lâm-râ second-order grouping (10-12,14-15) 9141 [9415]
Datum 6: ţâ-sîn-mîm second-order grouping (26,28) 1192
Datum 7: alif-lâm-mîm first-order grouping—first series (2-3) 15096 [15458]
Datum 8: alif-lâm-mîm first-order grouping—second series (29-32) 4183 [4272]
Datum 9: alif-lâm-râ first-order grouping—first series (10-12) 7097 [7310]
Datum 10: alif-lâm-râ first-order grouping—second series (14-15) 2044 [2105]
Datum 11: ħâ-mîm first-order grouping—first series (40-42) 1121
Datum 12: ħâ-mîm first-order grouping—second series (43-46) 1026
Datum 13: kâf-hâ-yâ-‘ayn-şâd (19) 798 {note: hâ is different from ħâ}
Datum 14: ţâ-hâ (20) 279
Datum 15: yâ-sîn (36) 285
Datum 16: şâd (38) 29
Datum 17: ‘ayn-sîn-qâf (42) 209
Datum 18: qâf (50) 57
Datum 19: nûn (68) 132
Datum 20: grand total of all muqaţţa‘ât = 40243 [41138]
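The counting step applied to this list can be sketched in a few lines. The sums are copied from Datum 1 to Datum 20 above; the variable and function names are hypothetical, and the sketch assumes that a datum with no bracketed figure has the same sum in both counts.

```python
# (sum without hamzah, sum with hamzah) for Datum 1 to Datum 20.
# Assumption: where no bracketed figure is listed, both counts are equal.
DATA = [
    (24417, 24998), (10577, 10891), (1313, 1313), (19279, 19730),
    (9141, 9415), (1192, 1192), (15096, 15458), (4183, 4272),
    (7097, 7310), (2044, 2105), (1121, 1121), (1026, 1026),
    (798, 798), (279, 279), (285, 285), (29, 29),
    (209, 209), (57, 57), (132, 132), (40243, 41138),
]

def count_multiples(divisor, with_hamzah=False):
    """Count how many of the twenty sums are exact multiples of `divisor`."""
    column = 1 if with_hamzah else 0
    return sum(1 for pair in DATA if pair[column] % divisor == 0)

if __name__ == "__main__":
    for prime in (2, 3, 19):
        print(prime, count_multiples(prime), count_multiples(prime, with_hamzah=True))
```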
From these data, using the binomial formula (cf. Use of Statistics), we derive the following associated probabilities. The figures in brackets refer to the counts with hamzah included, wherever they differ. Probabilities (p-values) refer to the chances of getting "at least this many," rather than "exactly this many but no more," because this approach addresses our research question directly. That is, we expect to see high probabilities (closer to 1) for very common, uninteresting events. We expect to see low probabilities (closer to 0) for less common events. Only the extremely rare events should have probabilities lower than p = .001 (cf. Statistical Significance for an explanation of how this number was selected as our cut-off value).
Multiples of 2 = 6 (p = .97931) [10, p = .58810]
Multiples of 3 = 9 (p = .19055) [8, p = .33853]
Multiples of 5 = 1 (p = .98847) [5, p = .37035]
Multiples of 7 = 4 (p = .31778) [2, p = .80144]
Multiples of 11 = 3 (p = .27165) [2, p = .55407]
Multiples of 13 = 2 (p = .46207) [1, p = .79828]
Multiples of 17 = 1 (p = .70255)
Multiples of 19 = 6 (p = .00043360) [6, p = .00043360]
Multiples of 23 = 0 (p = 1.0000)
Multiples of 29 = 1 (p = .50432) [2, p = .15026]
Multiples of 31 = 1 (p = .48097)
Multiples of 37 = 1 (p = .42188) [0, p = 1.0000]
Multiples of 41 = 0 (p = 1.0000)
Multiples of 43 = 0 (p = 1.0000) [1, p = .37538]
Multiples of 47 = 2 (p = .06678) [0, p = 1.0000]
Multiples of 53 = 0 (p = 1.0000)
Multiples of 59 = 1 (p = .28957) [2, p = .04460]
Multiples of 61 = 0 (p = 1.0000)
Multiples of 67 = 0 (p = 1.0000)
Multiples of 71 = 0 (p = 1.0000)
Multiples of 73 = 1 (p = .24109) [0, p = 1.0000]
Multiples of 79 = 0 (p = 1.0000)
Multiples of 83 = 0 (p = 1.0000)
Multiples of 89 = 1 (p = .20227)
Multiples of 97 = 0 (p = 1.0000)
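The "at least this many" probabilities listed above can be approximated with a short binomial tail calculation. This is a minimal sketch, assuming that each of the twenty sums is independently a multiple of a prime q with probability 1/q; the function name is hypothetical. Under that assumption the results agree with the listed figures for 2, 19, and 47 to the precision shown.

```python
from math import comb

def tail_probability(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of at least k successes."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

if __name__ == "__main__":
    n = 20  # number of data points in the principal data set
    for prime, count in [(2, 6), (19, 6), (47, 2)]:
        print(prime, count, f"{tail_probability(count, n, 1 / prime):.8f}")
```

A count of zero always gives a tail probability of 1, which is why the primes with no multiples are listed with p = 1.0000.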
Although it is quite visible that the frequency of multiples of 19 stands out as rather deviant from what we would expect under conditions of random chance, the comparison among the primes would be clearer in a chart, hence the images below.





Explanation of the Above Charts
Each chart depicts the odds against this particular outcome (number of multiples) for any prime number from 2 to 97. The odds are indicated on the vertical axis. The horizontal axis lists the prime numbers. Although all twenty-five prime numbers below 100 are graphed on all five charts, the particular scaling method used in these charts only lists about every third one. The closest (blue) graph shows the counts with hamzah included, the middle (red) graph shows the counts with hamzah excluded, and the farthest (white) graph depicts the outcome with the alif-initialed chapters excluded.
The first chart is produced at a scale showing odds of up to one in 25. This is intended to demonstrate the ordinary distribution of outcomes in a completely random data set. Obviously, the odds associated with the number 19 do not fit within that chart.
The second chart is produced at 1/10 the scale of the first, or odds of up to one in 250. Here we see a diminishment in salience associated with all primes except for the number 19. Again, the odds associated with the number 19 do not fit within the chart.
The third chart at last accommodates the height of the spike associated with the number 19 for those outcomes that include the alif-initialed chapters. This is at 1/100 the scale of the first chart. Here, the contrast between the number 19 and the other contenders is hardly subtle. The contours in the remainder of the chart have been essentially obliterated by this contrast. Yet the full extent of the odds against finding this many multiples of 19 can only be displayed in this manner.
Finally, the level of significance associated with the number 19 when the alif-initialed chapters are excluded cannot be contained within the boundaries of the chart until it has been reduced by another factor of 100. (The outcomes generating the alif-exclusive graphs are not listed among the results above.) This offers substantial evidence that the orthographical differences influencing the alif counts have deeply affected the salience of the code. Although these statistical outcomes cannot tell us what the actual alif counts must be, they do tell us that there are some errors among them.
The first chart above provides a good example of a thoroughly uninteresting distribution of odds. Ignoring for the moment the number 19, this is what we should expect in a random sample of data. In this case, the numbers 47 and 59 happen to occur at a slightly unusual frequency. The odds against their occurring at these particular frequencies are about one in 15 and one in 22, respectively. Since we tried a total of 25 prime numbers, the outcomes for 47 and 59 are precisely what we should expect—deviations from the mean corresponding to roughly one in twenty-five trials.
The odds against the high frequency of multiples of 19, on the other hand, are quite another matter—one in 2,306 for the alif-inclusive outcomes, whether the hamzah is counted or not. Now, it is important to recall that these numbers are the result of a highly conservative approach to constructing a data set to test some claims that have been attributed to capitalizing on chance. The entire data set was generated solely on the basis of the innate characteristics of the relevant phenomenon, viz., the muqaţţa‘ât. Whether or not the data thus generated turned out to be multiples of 19 was completely ignored, except for one case, in which a multiple of 19 was removed from the data set for the sake of conservatism. All in all, everything within reason was done to ensure that excessive multiples of 19 would not be detected in the data unless its source was the textual structure of the Qur’ân itself.
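The "one in N" figures quoted here are the reciprocals of the p-values listed earlier. A minimal sketch of the conversion, with a hypothetical function name:

```python
def odds_against(p):
    """Express a probability p as 'one in N' by returning N = 1 / p."""
    return 1.0 / p

if __name__ == "__main__":
    cases = [
        ("47", 0.06678),
        ("59 (hamzah included)", 0.04460),
        ("19", 0.00043360),
    ]
    for label, p in cases:
        print(f"{label}: about one in {odds_against(p):,.0f}")
```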
If Dr. Khalifa's claims about a Qur’ân Code were illusory, as many have suggested, the number 19 should have produced no more than an uninteresting addition to the first chart above. Having carefully reviewed my methods since the first time I posted my analyses, I find this outcome somewhat difficult to fathom. The small size of the data set notwithstanding, there should certainly not be this many multiples of 19 among those sums. Four multiples of 19 would have been quite enough to dominate the charts (three if we exclude the alif-initialed chapters). It is quite extraordinary, to say the least.
In order to test the robustness of the structure that we have now confirmed to exist in the Qur’ân, I have conducted several analyses with one or more rules violated. This is partially driven by the assertions of some people that the Qur’ân Code is an artifact of a clever combination of rules that conspire to produce an inordinate number of multiples of 19. (Of course, the eight observations already address this issue by generating an objective data set, which explicitly ignores 19-divisibility.) The claim that 19-divisibility is an artifact of specially selected rules implies that subtle deviations from those rules should cause the phenomenon readily to vanish without a trace. My other motivation for conducting these supplementary analyses is that they may possibly reveal something about the structure of the code that can inform further research.
The Qur’ân Code:
Home page.
The Mysterious Qur’ânic Initials:
An explanation and chapter-by-chapter count of the Qur’ânic initials.
Use of Statistics:
How statistics can be used to analyze the Qur’ânic text.
Defining the Data Set to be Analyzed:
Overview of the nature of the Qur’ânic initials and definition of the principal data set.
The Principal Analysis:
Includes links to supplementary analyses.
Implications of the Present Findings:
Discusses the outcome of the principal analysis.
Supplementary Analyses:
Overview, with links to individual analyses.
General Summary of Findings by Statistical Significance:
Compilation of all analyses for comparison of outcomes.
© Copyright Abu Jamil and The Q Zone