Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Reprinted from FouNDATioNs op PHYsics Vol. 19, No. 1Z Deoember 1989 Prorted w Belgium Evidence for Consciousness-Related Anomalies in Random Physical Systems Dean I. Radin' and Roger D. Nelson2 Received May 6, 1988; revised June 12, 1989 Speculations about the role of consciousness in physical systems are frequently observed in the literature concerned with the interpretation of quantum mechanics. While only three experimental investigations can be found on this topic in physics j more Man 800 relevant experiments have been reported in the literature of parapsychology. A well-defined body of empirical evidence from this domain was reviewed using meta-analytic techniques to assess methodological quality and overall effect size. Results showed effects conforming to chance expectation in control conditions and unequivocal non-chance effects in experimental conditions. This quantitative literature review agrees with the findings of two earlier reviews, suggesting the existence ofsomeform ofconscrousness-related anomaly in random physical systems. 1. ENTRODUCTION The nature of the relationship between human consciousness and the physical world has intrigued philosophers for miUenia. In this century, speculations about mind-body interactions persist, often contributed by physicists in discussions of the measurement problem in quantum mechanics. Virtually all of the founders of quantum theory-Planck, de Broglie, Heisenberg, Schr6dinger, Einstein-considered this subject in depth,(') and contemporary physicists continue this tradition. (2-7) Department of Psychology, Princeton University, Princeton, New Jersey 085". Present 'address: Contel Technology Center, 15000 Conference Center Drive, P.O. Box 10814, Chantilly, Virginia 22021-3808. Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, New Jersey 08544. 14" 0015-9018/89/12OD-149M.00/0,t 1989 Plenum Publishing corporation Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Approved For Release 2000/08/08 : CIA-RDP96-00789ROO22005200Q.1-0 1500 Radin a nd Nelson The following expression of the problem can be found in al recent interpretation of quantum theory: If conscious choice can decide what particular observation I measure. and thtre- fore into what states my consciousness splits. might not conscious choice also be able to influence the outcome of the measurement? One possible place where mind may influence matter is in quantum effects. Experiments on whether it is possible to affect the decay rates of nuclei by thinking suitable thoughts wo','Uld presumably be easy to perform, and might be worth doing."' Given the distinguished history of speculations about the.: role of consciousness in quantum mechanics, one might expect that the' physics literature would contain a sizable body of empirical data on this topic. A search, however, reveals only three studies. The first is in an article by Hall, Kim, McElroy, and Shimo !by, who reported an experiment "based upon taking seriously the proposalithat the reduction of the wave packet is due to a mind-body interaction, ~:n which both of the interacting systems are changed.""' This experiment examined whether one person could detect if another person had previously 6bserved a quantum mechanical event (gamma emission from sodium-22i atoms). i The idea was based on the supposition that if person A's ob5.ervation actually changes the physical state of a system, then when person obser- ves the same system later, B's experience may be different according to whether A has or has not looked at the svstem. Hall et aVs results, based on a total of 554 trials, did not support the hypothesis; the 6bserved number of "hits" obtained in their experiment was precisely the: number expected by chance (277), while the variance of their measurements was significantly smaller than expected (p < 0.05).'91 The second study is referred to by Hall et aL, who end their article by pointing out that a similar, unpublished experiment using cobalt-P as the source was successful (40 hits out of 67 trials). 110 The third study is a more systematic investigation reported by Jahn and Dunne,"" who summarize results of over 25 millio0i binary trials collected during seven years of experimentation with randdin-event generators. These experiments, involving long-term data collection with 33 unselected individuals, provide -persuasive, replicable eviden4e of an anomalous correlation between conscious intention and the o tput of random number generators. . . I Thus, of three pertinent experiments referenced in mainstreanj physics journals, one describes results statistically too close to chance expectation and two describe positive effects.('-") Given the theoretical implic,4tions of such an effect, it is remarkable that no further experiments of this type can be found in the physics literature; but this is not to say that bo such experiments have been performed. In fact, dozens of researchas have Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Consciousness in Physical Systems . 1501 reported conceptually identical experiments in the puzzling and uncertain domain of parapsychology. Perhaps because of the insular nature of scientific disciplines, the vast majority of these experiments are unknown to most scientists. A few critics who have considered this literature have dismissed the experiments as being flawed, nonreplicable, or open to fraud, 112-161 but their assertions are countered by at least two detailed reviews which provide strong statistical support for the existence of anomalous consciousness-related effects with rando m- number, gen,e,r,ators."',"I in this paper, we describe the results of a comprehensive, quantitative meta-analysis which focused on the questions of methodologi- cal quality and replicability in these experiments. 2. THE EXPERIMENTS The experiments involved some form of microelectronic random number generator (RNG), a human observer, and a set of instructions for the observer to attempt to "influence" the RNG to generate particular numbers, or changes in a distribution, solely by intention. RNGs are usually based upon a source of truly random events such as electronic noise, radioactive decay, or randomly seeded pseudorandom sequences. (19) Feedback about the distribution of random events is often provided in the form of a digital display, but audio feedback, computer graphics, and a variety of other mechanisms have also been used. Some of the RNGs described in the literature are technically sophisticated, the best devices employing electromagnetic shielding, environmental failsafe mechanisms triggered by deviant voltages, currents, or temperature, automatic computer-based data recording on magnetic media, redundant hard copy output, periodic randomness calibrations, and so on. (18.20) RNGs are typically designed to produce a sequence of random bits at the press of a button. After generating a sequence of say, 100 random bits (O's or I's), the number of I's in the sequence may be provided as feedback. In an experimental protocol using a binary RNG, a run might consist of an observer being asked to cause the RNG to produce, in three successive button presses, a high number (sum of I's greater than chance expectation of 50), a low number (less than 50), and a control condition with no direc- tional intention. An experiment might consist of a group of individuals each contributing a hundred such runs, or one individual contributing several thousand runs. Results are usually analyzed by comparing high aim and low aim means against a control mean or theoretical chance expectation. Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Approved For Release 2000/08/08 : CIA-RDP96-00789ROO220052000:1-0 1502 Radin !and Nelson 3. META-ANALYTIC PROCEDURES The quantitative literature review, also called meta-analysis, has become a valuable tool in the behavioral and social sciences. (21) Meta-analysis is analogous to well-established procedures used in the physical sciences to determine parameters and constants. The 'technique assesses replication of an effect within a body of studies by examining the distribution of effect sizes."'-") In the present context, the null hypothesis (no mental influence on the RNG output) specifies an expected ml an effect ,e size of zero. A homogeneous distribution of effect sizes with nonz '~ro mean indicates replication of an effect, and the size of the deviation of.,the mean from its expected value estimates the magnitude of the effect. Meta-analyses assume that effects being compared are similar across different experiments, that is, that all studies seek to estimate the sAme pop- ulation parameters. Thus the scope of a quantitative review must be strictly delimited to ensure appropriate commonality across the different studies that are combined."',"' This can present a nontrivial problem! in meta- analytic reviews because replication studies typically investigate number of variables in addition to those studied in the original experim.eOts. In the present case, because different subjects, experimental protocols,.apd RNGs were employed within the reviewed literature, some heterogeneity 6ution of attributable to these factors was expected in the obtained distri, effect sizes. However, the circumscription for the review required that every study in the database have the same primary goal or hypothesis, 4nd hence estimate the same underlying effect. Experiments selected for review examined the following hypothesis: The statistical output of an electronic RNG is correlated with! observer intention in accordance with prespecified instructions, as indicated by the directional shift of distribution parameters (usually the moan) from expected values. Because this "directional shift" is most often reported as a'standard normal deviate (i.e., Z score) in the reviewed experiments, we determined effect size as a Z score normalized by the square root of the sa'mple size (N), e = Z/,/-N, where N was the total number of individual ran4rn events (with probability of a hit at p = 0.5, p = 0.25, etc.). This effect size; measure is equivalent to a Pearson product moment correlation. 3.1. Unit of Analysis To avoid redundant inclusion of data in a meta-analysis,: "units of analysis" are often specified. We employed the following method: If an author distinguished among several experiments reported in a single Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Consciousness in Physical Systerns 1503 article with titles such as "pilot test" or "confirmatory test," or provided independent statistical summaries, each of these studies was coded and quality-assessed separately. If an experiment consisted of two or more conditions comparing different intentions or types of RNG devices, the data were split into separate units of analysis to allow the results to be coded unambiguously. In general, within a given reviewed report, the largest possible aggregation of nonoverlapping data collected under a single intentional aim was defined as the unit of analysis (hereafter called an experiment or study), For each experiment, a Z score was assigned corresponding to whether the observed result matched the direction of intention. Thus, a negative Z obtained under intention to "aim low" was recorded as a positive score. When sufficient data were provided in a report, Z was calculated from those data and compared with the reported results; the new calculation was used if there was a discrepancy. If only probability levels were reported, these were transformed into the corresponding Z score. For experiments reported only as "nonsignificant," a conservative value of Z = 0 was assigned; if the outcome was reported only as "statisti- cally significant," Z = 1.645 was assigned; and if sample size was not repor- ted or could not be calculated from the information provided, a special code of N= I was assigned. 3.2. Assessing Quality Because the hypothesized anomalous effect is not easily accom- modated within the prevailing scientific world-view, it is particularly important to assess the trustworthiness of each reviewed experiment. Unfortunately, estimating experimental quality tends to be a subjective task confounded by prior expectations and beliefs.(26,211 Estimates of inter- judge reliability in assessing the quality of research reports, for example, rarely exceed correlations of 0.5.121) We addressed this problem by assigning to each experiment a single quality weight derived from a set of sixteen binary (present/absent) criteria. The first author coded and double-checked the coding for all studies; the second author independently coded the first 100 studies. Inter-judge reliability for quality criteria was r = 0.802 with 98 degrees of freedom. These criteria were developed from published criticisms about random-number generator experiments (14,15.29-33) and from expert opinion on important methodological considerations when performing studies involving human behavior. (20,34,35) Collectively, these criteria form a measure of credibility by which to judge the reported data. The criteria assess the integrity of the experiment in four categories-procedures, Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Approved For Release 2000/08/08 : CIA-RDP96-00789ROO22005200d.1-0 1504 Radin,pknd Nelson statistics, the data, and the RNG device-and they cover virtually all methodological criticisms raised to date. They are (I)control te4ts noted, (2) local controls conducted, (3) global controls conducted, (4J controls established through the experimental protocol, (5) randomness calibrations conducted, (6) failsafe equipment employed, (7) data automatically recor- ded, (8) redundant data recording employed, (9) data double I checked, (10) data permanently archived, (11) targets alternated on successive trials, (12) data selection prevented by protocol or equipment, (13) rixed run lengths specified, (14) formal experiment declared, (15) tampefl-resistant RNG employed, and (16) use of unselected subjects. Each criterion was coded as being present or absent in the ireport of an experiment, specifically excluding consideration of previously ublished descriptions of RNG devices or control tests. This strategy was imployed to reflect lower confidence in such experiments since, for example,: random ness tests conducted once on an RNG do not guarantee acceptab)e perfor mance in the same RNG in all future experiments. As a result,' assessed quality was conservative, that is, lower than the "true" quality Ifor some experiments, especially those reported only as abstracts or conference proceedings. Using unit weights (which have been shown to be robust in 1311) i such applications on each of the sixteen descriptors, the quality rating for an individual experiment was simply the sum of the descriptlqrs. Thus, ! while a quality score near zero indicated a low quality or poorly' reported experiment, a score near sixteen reflected a highly credible experiment. 3-3. Assessing Effect Size Assume that each of K experiments produces effect size estimates e of a parameter E, based on N samples, and that each e has a known'standard error s. The weighted mean effect size is calculated as e. = Z OteilY_ coi, where co j I Is2 = Ni, and i ranges from I to K The standard err r of e. is s,. co A test for homogeneity for the K estimates of ei isl given by HK=Y_o)j(eI-e_)2, where HK has a chi-square distribution with K-1 degrees of freedom."" The same procedure can be followed t6 test for homogeneity of effect size across M independent investigators. In this case, e. j and sj are calculated per investigator, and the test for homo$leneity is performed as H,I coj(e. j - e. f)2, where e. j and coj are mean 4eighted effect size and Ils2 per investigator, respectively, e. m coje. jlT_~ coj, and e j ranges from I to M. Hm has M - 1 degrees of freedom. For a quality-weighted analysis, we may determine: e. Q Z (QICOiej)1T_ (QiCOA where Qi is the quality assessed for experimp;nt i. The standard error associated with e,2 is seQ = (7_ (Q~Coy(Z QI(oi)2) -1/2; the I test for homogeneity is similar to that described above. Finally, following Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Consciousness in Physkai Systems 1505 the practice of reviewers in the physical sciences, (23,24) we deleted potential ~outlier" studies to obtain a homogeneous distribution of effect sizes and to reduce the possibility that the calculated mean effect size may have been spuriously enlarged by extreme values. The procedure used was as follows: If the homogeneity statistic for all studies was significant (at the p < 0.05 level), the study that would produce the largest reduction in this statistic was deleted; this was repeated until the homogeneity statistic had become nonsignificant. 4. RESULTS On-line bibliographic databases for psychology and physics journals were searched, as was a specialized database covering parapsychological articles, technical reports, conference proceedings and manuscripts. Altogether 152 references were found from 1959 to 1987. These reports described 832 studies conducted by 68 different investigators (597 experimental studies and 235 control studies). Fifty-four experimental and 33 control studies reported only as nonsignificant were assigned Z=O. Six experiments and two control studies coded as (N = 1, Z > 0) were eliminated from further meta-analysis because effect size could not be accurately estimated (this required the elimination of one investigator who reported a single study). Figures 1 and 2 show the distributions of Z scores reported for control and experimental studies, respectively. 12 6 3 CONTROL, N 235 THEORY FIT SHIFT ----- - 5 4 3 2 1 0 1 Z-SCORES Fig. 1. Distribution of Z scores reported in 235 control studies. Thirty-three of these studies were reported only as "nonsignificant" and were assigned Z scores of zero. To replace the spurious spike at Z=O, those 33 studies were recast as normally distributed Z scores, bounded by ± 1.64, averaging Z = 0. Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 1 Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 1506 Radin ind Nelson REG META-ANALYSIS: FREQUENCY OF 2-SCORES 12 EXPERIMENTAL, N 597 THEORY FIT 9 SHIFT 6 N cr 3 0 -5 -4 -3 -2 -1 1 2 3 4 5 Z-SCORES Fig. 2. Distribution of Z scores reported in 597 experimental studies. Fifty-fo4r of these studies were reported as "nonsignificant" and were assigned Z scores of zero. A~ in Fig. 1, those 54 studies were recast as normally distributed Z scores, bounded by ± 1.641' averaging Z 0. N_ N .1591 N .490 S91 490 (R)CNTL (b)EXP7 (C)EXP/I (d)EXPH 19)ED(PAH (I)QUAL (Q)QUALH Fig. 3. Mean effect size point estimates ± I standard error for (a) control studies and (b) individual experiments; (c) mean effect size per investigator, (d) homogeneous mean effect size for experiments. (e) homogeneous mean effect size per investigator, (f) mean effect size for quality-weighted expenments, and (g) mean effect size for homogeneous quality-weighted experiments. Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Approved For Release 2000/08/08 : CIA-.RDP96-00789ROO2200520001-0 Consciousness in Physical Systerns 1W These results, expressed as overall mean effect sizes, show that control studies conform well to chance expectation (Fig. 3a), and that experimental effects, whether calculated for studies or investigators, deviate significantly from chance expectation (Fig. 3b, 3c). To obtain a homogeneous distribu- tion of effect sizes, it was necessary to delete 17% of individual outlier studies (Fig. 3d) and 13 % of mean effect sizes across investigators (Fig. 3e). This may be compared with exemplary physical and social science reviews, where it is sometimes necessary to discard as many as 45 % of the studies to achieve a homogeneous effect size distribution. ~ ") Of individual studies deleted, 77% deviated from the overall mean in the positive direction, and of investigator means deleted, all were positive (i.e., supportive of the experimental hypothesis). 4.1. Effect of Quality Some critics have postulated that as experimental quality increases in these studies, effect size would decrease, ultimately regressing to the "true" value of zero, i.e., chance results. (12.13,15,32,33.38) We tested this conjecture with two linear regressions of mean effect size vs. mean quality assessed per investigator, one weighted with wj as defined above and the other weighted with the number of studies per investigator. The calculated slope for the former is -2.5 x 10-5 + 3.2 x 10 -', and for the latter, - 7.6 x 10 " ± 3.9 x 10 -'. These nonsignificant relationships between quality and effect size is typical of meta-analytic findings in other fields, (39,40) suggesting that the present database is not compromised by poor experimental methodology. Another assessment of the effect of quality was obtained by comparing unweighted and quality-weighted effect sizes per experiment (Fig. 3b vs. 3f). These are nearly identical, and the same is true after deleting outliers to obtain a homogeneous quality-weighted distribution (Fig. 3d vs. 3g), confirming that differences in methodological quality are not significant predictors of effect size. It might be argued that the quality assessment procedure employed here was nonoptimal because some quality criteria are more important than others, so that if appropriate weights were assigned, the quality-weighted effect size might turn out to be quite different. This was tested by Monte Carlo simulation, using sets of 16 weights, one per criterion, randomly selected over the range 0 to 6. A quality-weighted effect size was calculated for the 597 experiments as before, now using the random weights instead of unit weights, and this process was repeated one thousand times, yielding a distribution of possible quality ratings. The average effect size from the simulation was 3.18xlO-'+0.15x 10-4, indicating that in this particular database coded by these sixteen criteria, Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001 -0 1508 Radin pd Nelson the probable range of the quality-weighted mean effect size clearly'excludes chance expectation of zero. 4.2. The "Filedrawer" Problem Although accounting for differences in assessed quality does not nullify the effect, it is well known in the behavioral and social sciences t.6.at non- significant studies are published less often than significant studiq (this is called the "filedrawer" problem (21.41-43) ). If the number of nonsignificant studies in the filedrawer is large, this reporting bias may seriously inflate the effect size estimated in a meta-analysis. We explored several prp'cedures for estimating the magnitude of this problem and to assess the ppssibility that the filedrawer problem can sufficiently explain the observed fesults. The filedrawer hypothesis implicitly maintains that all or riearly all significant positive results are reported. If positive studies are not 6alanced by reports of studies having chance and negative outcomes, the 6.mpirical Z score distribution should show more than the expected propoirtion of scores in the positive tail beyond Z= 1.645. While no argumeni. can be made that all negative effects are reported, it is interesting to notel that the database contains 37 Z scores in the negative tail, where only 30 would be expected by chance. On the other hand, there are 152 scores in tho'positive tail, about five times as many as expected. The question is wheffier this excess represents a genuine deviation from the null hypothesis or: a defect in reporting or editorial practices. This question may be addressed by modeling based on the as4lumption that all significant positive results are reported. A four-parameterifit mini- mizing the chi-square goodness-of-fit statistic was applied to al I bbserved data with Z >, 1.645, using the exponential 1 /21 (1 ;J Ila j Y.= e-, v/2- a to simulate the effect of skew or kurtosis in producing the di:spropor- tionately long positive tail. This exponential is a probability disi:ribution with the same mean and variance as the normal distribution, but with kurtosis = 3.0. To begin, the null hypothesis of a (0, 1) normal distribution: with no kurtosis was considered. To account for the excess in the posi;!tive tail, N = 585,000 filedrawer studies were required, and the chi-squared statistic remained far too large to indicate a reasonable fit (see Table I). This large N, in comparison with the 597 studies actually reported togetherwith the poor goodness-of-fit statistic, suggests that the assumption of: a (0, 1) normal distribution is inappropriate. Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Consciousness in Physical Systems 1509 Table 1. Four-Parameter Fit (E: N, N, Mean, sd) Minimizing Chi-Square (10 df) Goodness-of-Fit Statistic to the Positive Tail of the Observed Z Score Distribution, for Several Exponential: Normal Ratios' Assumption E: N ratio N Mean sd Chi-square P Normal distribution585,0000 1 57,867.840 0 (null hypothesis) 5,300 0 1 220.97 0 1 2 4,800 0 1 167.84 0 3 4,600 0 1 148.45 0 10 4,400 0 1 119.69 0 Empirical distribution700 0.1452.10 23.94 0.008 0 1 747 0.3451.90 16.32 0.091 2 757 0.4451.80 14.21 0.164 3 777 0.4451.80 11.08 0.226 10 807 0.4451.80 11.08 0.351 The null hypothesis is tested by clamping the mean at 0 and the standard deviation at 1, allowing N and E:N to vary. The empirical database is addressed by allowing all four parameters to vary. Adding simulated kurtosis to a (0, 1) normal distribution by mixing exponential [Eq. (1 ) ] and normal distributions in a I: I ratio reduced N by two orders of magnitude, and ratios of 2:1, 3: 1, and 10: 1 exponential to normal (E:N) yielded further small improvements. However, the chi- squared statistic still indicated a poor fit to the empirical data. Applying the same mixture of exponential and normal distributions, but starting from the observed values of N = 597, mean Z score = 0.645, and standard deviation = 1.601, with the constraint that the mean could only decrease from 0.645, resulted in much better fits to the data. Table I shows the results. This procedure shows that the null hypothesis is unviable, even after allowing a huge filedrawer. The chi-square fit vastly improves with the addition of kurtosis, but only becomes a reasonably good fit when mean and standard deviation are allowed to approximate the empirical values. The filedrawer estimate from this model depends on a number of assump- tions (e.g., the true distribution is generally normal, but has a dispropor- tionately large positive tail). It suggests a total number of experimental studies on the order of 800, of which three-fourths have been formally reported. A somewhat simpler modeling procedure was applied to the data assuming that all studies with significant Z scores in either the positive or negative tail are reported. The model is based on the normal distribution with a standard deviation = 1, and estimates the mean and N required to Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Approved For Release 2000/08/08 : CIA-RDP96-00789R00220052ooq.1_o 1510 Radio &W. Neison account for the 152 Z scores in the positive tail and 37 Z score.~ in the negative tail. This mean-shift model, which ignores the shapei of the observed distribution, results in an N= 1,580 and a mean Z scorep~ 0.34. These modeling efforts suggest that the number of unreported or unretrieved RNG studies falls in the range of 200 to 1,000. A remaining question is, how many filedrawer studies with an average null resuit would be required to reduce the effect to nonsignificance (i.e., p<0.05)? This "failsafe" quantity is 54,000---approximately 90 times the number. of studies actually reported. Rosenthal suggests that an effect can be c: oh'sidered robust if the failsafe number is more than five times the observed number of studies. (21) ?Zx~ 5. DISCUSSION Repeatable experiments are the keystone of experimental sci.ence. In practice, repeatability depends upon a host of controllable and uncon- trollable ingredients, including factors such as stochastic variation,! changes in environmental conditions, difficulties in communicating tacit kriowledge employed by successful experimenters,'"' and so on. DifficVlties in achieving systematic replication are therefore ubiquitous, from expelpimental psychology 12',") to particle physics. 121.21) Of course, this is not to 'say that systematic replication is impossible in these or other fields, but it may appear to be extraordinarily difficult when experiments are considered individually rather than cumulatively. In the case of the present database, the authors of a recent report issued by the US National Researchi Council stated that the overall results of the RNG experiments could' not be explained by chance, 1,16 ) but they questioned the quality and replic4bility of the research. This meta-analysis shows that effects are not a fu4ction of experimental quality, and that the replication rate is as good as th~At found in exemplary experiments in psychology and physics. Besides the issue of replicability, five other objections are ofton raised about the present experiments. These are (a) the effect is inconsi0ent with prevailing scientific models, (b) the experimental methodology. i~ techni- cally naYve, thus the results are not trustworthy, (c) the experi illents are vulnerable to fraud by subjects or by experimenters, (d) skeptics cannot obtain positive results, and (e) there are no adequate theoretical ~Xplana- tions or predictions for the anomalous effect. These criticisms may be addressed as follows: (a) "Inconsisteocy with the scientific world-view" is essentially a philosophical argum:ent that carries little weight in the face of repeatable experimental evidence, as suggested by the present and two corroborating meta-analyses. (17,18) Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Consciousness in Physical System 1511 Indeed, if the "inconsistency" argument were sufficient to discount anomalous findings, we would have ignored much of the motivation leading to the development of quantum mechanics. (b) The "naive method- ology" argument was empirically addressed by the assessment of methodological quality in the present analysis. No significant relationship between quality and effect size was found. (c) Fraud postulated as the explanation of the results is untenable as it would have required widespread collusion among 68 independent investigators. In any case, even severe critics of parapsychological experiments have discounted fraud as a viable explanation. (32) (d) Skeptics often assert that only "believers" obtain positive results in such experiments. However, a thorough literature search finds not a single attempted replication of the RNG experiment by a publicly proclaimed skeptic; thus the assertion is not based on verifiable evidence. ' Furthermore, skeptics who claim to have attempted replications insist (without providing details or references) that they have never achieved positive results in any of their RNG experiments."',"' Such a claim is itself quite remarkable, as the likelihood of never obtaining a statistically significant result by chance in series of experiments can be extremely low, depending on the number of experiments conducted. Unfor- tunately, because we cannot determine how many experiments skeptics have actually conducted, it is impossible to judge the validity of this criticism. Finally, (e) the "no theoretical basis" argument is correct, but it does not support a negative conclusion about experimental observation. There are at present no adequate theories, with the possible exception of some interpretations of quantum mechanics, (2-1,1," 1 that convincingly explain or predict consciousness-related anomalies in random physical systems. We note, however, that the anomalous effects reviewed in this paper apparently can be operationally predicted under well-specified conditions. For exam- ple, when individuals are instructed to "aim" for high (or low) numbers in RNG experiments, it is possible to predict with some small degree of confidence that anomalous positive (or negative) shifts of distribution means will be observed. 6. CONCLUSION In this paper, we have summarized results of all known experiments testing possible interactions between consciousness and the statistical behavior of random-number generators. The overall effect size obtained in experimental conditions cannot be adequately explained by methodological flaws or selective reporting practices. Therefore, after considering all of the Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Approved For Release 2000/08/08 CIA-RDP96-00789ROO220052000 ;1-0 1 1512 Radin ind Nelsm retrievable evidence, published and unpublished, tempered by all legitimate criticisms raised to date, it is difficult to avoid the conclusion tliat under certain circumstances, consciousness interacts with random physical systems. Whether this effect will ultimately be established as an o%"erlooked methodological artifact, as a novel bioelectrical perturbation of sensitive electronic devices, or as an empirical contribution to the phil*phy of mind, remains to be seen. ACKNOWLEDGMENTS This study was supported by major grants from the James S. McDonnell Foundation, Inc. and the John E. Fetzer Foundation'.1 Inc. The authors express their gratitude to Dr. York Dobyns of the Princeton University Engineering Anomalies Laboratory for his assistance! with the filedrawer models. REFERENCES 1. R. G. Jahn and B. J. Dunne, Margins of Reality (Harcourt Brace Jovanov4, Orlando, Florida, 1987). 2. B. d'Espagnat, "The quantum theory and reality," Sci. Am., pp. 158-181 ~Novetnber, 1979). 3. 0. Costa de Beauregard, "S-matrix, Feynman zigzag and Einstein correlation;" Phys. Leu. 67A, 171-173 (1978). 4. N. D. Mermin, "Is the moon there when nobody looks? Reality and the quanturn theory, Phys. Today, pp. 38-47 (April, 1985). . I 5. A. Shimony, "Role of the observer in quantum theory," Am. J. Phys, 31, 7551 (1963). 6. E. P. Wiper, "The problem of measurement," Ant. J. Phys. 31, 6 (1963). 7. U. Ziemelis, "Quantum-mechanical reality, consciousness and creativity," Can. Res. 19, 62-68 (September, 1986). 8. E. J. Squires, "Many views of one world-an interpretation of quantum theory," Eur. J. Phys. 8, 173 (1987). 9. j. Hall, C. Kim, B. McElroy, and A. Shimony, "Wave-packet reduction as medium of communication," Found Ph -vs. 7, 759-767 (1977); p. 761. 10. R. Smith, unpublished manuscript, MIT, 1968. (Cited in Ref. 9, p. 767.) 11. R. G. Jahn and B. J. Dunne, "On the quantum mechanics of consciousness, ith applica- tion to anomalous phenomena," Found Phys. 16, 721-772 (1986). 12. J. E. Alcock, Parapsychology: Science or Magic? (Pergamon Press, Elmsford New York, 1981), pp. 124-125. New York, 13. M. Gardner, Science: Good, Bad, and Bogus (Prometheus Books, Buffalo,'; 1981). 14. R. Hyman, "Parapsychological research: A tutorial review and critical appfiaisal," Proc. IEEE 74, 823-849 (1986). 15. P. Kurtz, "Is parapsychology a science?" In Paranormal Border ds:;l of Science, K. Frazier, ed. (Prometheus Books, Buffalo, New York, 198 1). Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Consciousness in Physical Systems 1513 16. D. F. Marks, "Investigating the paranormal," Nature (London) 320, 119-124 (1986). 17. C. Honorton, "Replicability, experimenter influence, and parapsychology: An empirical context for the study of mind," paper presented at the annual meeting of the AAAS, Washington, D.C., 1978. 18. E. C. May, B. S. Humphrey, and G. S. Hubbard, "Electronic system perturbation techniques." SRI International Final Report, September 30, 1980 19. H. Schmidt, "Precognition of a quantum process," J. Parapsychol. 33, 99-108 (1969); "A PK test with electronic equipment," J. Parapsychol. 34, 175--181 (1970); "Mental influence on random events," New Sci. Sci. J. 50, 757-758 (1971); "PK tests with pre-recorded and pre-inspected seed numbers," J. Parapsychol. 45, 87-98 (1981). 20. R. G. Jahn, "The persistent paradox of psychic phenomena: An engineering perspective," Proc. IEEE 70, 136-170 (1982); R. D. Nelson, B. J. Dunne, and R. G. Jahn, "An REG experiment with large data-base capability, III: Operator-related anomalies," Technical Note PEAR 84003, Princeton Engineering Anomalies Research Laboratory, Princeton University, School of Engineering/Applied Science, September 1984; H. Schmidt, R. Morris, and L. Rudolph, "Channeling evidence for a PK effect to independent observers," J. Parapsychol. 50, 1-16 (1986). 21. R. Rosenthal, Meta-Analytic Procedures for Social Research (Sage Publications, Beverly Hills, California, 1984); K. Wachter, "Disturbed by meta-analysis?" Science 241, 1407-1408 (1988). We may note that Cohen's h, the difference between control and experimental proportions, is a common effect size measure that might have been used in the present study. This was rejected in favor of e, as defined, because some of the reviewed studies reported only Final p values or only overall Z scores; e was thus deemed more useful in the present meta-analysis. 22. R. L. Bangert-Drowns, "Review of developments in meta-analytic method," Psychol. Bull. 99, 388-399 (1986). 23. A. H. Rosenfeld, "The particle data group: Growth and operations." Annu. Rev. Nucl. Sci. 25, 555-599 (1975). 24. C. G. Wohl et al., Rev. Mod Phys. 56, Part II, p. S5 (1984). 25. G. V. Glass, "In defense of generalization," Behav. Brain Sci. 3, 394-395 (1978). 26. H. M. Cooper, "Scientific guidelines for conducting integrative reviews," Rev. Educ. Res. 52, 291-302 (1982). 27. R. M. Dawes, "You can't systematize human judgment: Dyslexia." In New Directions for Methodology of Social and Behavioral Science: Fallible Judgment in Behavorial Research, R. A. Shweder, ed. (Jossey-Bass, San Francisco, 1980), pp. 67-78. 28. S. D. Gottfredson, "Evaluating psychological research reports: Dimensions, reliability, and correlates of quality judgments," Am. Psychol. 33, 920-934 (1978). 29. C. Akers, "Methodological criticisms of parapsychology," In Advances in Parapsychologi- cal Research, Vol. 4, S. Krippner, ed. (McFarland, Jefferson, North Carolina, 1984); "Can meta-analysis resolve the ESP controversy?" In A Skeptic's Handbook of Parapsychology, P. Kurtz, ed. (Prometheus Books, Buffalo, New York, 1985). 30. J. E. Alcock, "Parapsychology: Science of the anomalous or search for the soul," Behav. Brain Sci~ 10, 553-565 (1987). 31. P. Diaconis, "Statistical problems in ESP research," Science 201, 131-136 (1978). 32. C. E. M. Hansel, ESP and Parapsychology: A Critical Reevaluation (Prometheus Books, Buffalo, New York, 1980). 33. R. Hyman, "The ganzield psi experiment: A critical appraisal," J. Parapsychol. 49, 3-50 (1985). 34. T. X Barber, Pitfalls in Human Research: Ten Pivotal Points (Pergamon Press, Elmsford, New York, 1976). Approved For Release 2000/08/08 : CIA-RDP96-00789ROO2200520001-0 Approved For Release 2000/08/08 : CIA-RDP96-00789ROO220052000 1-0 1514 Rodin ind Nelson 35. J. B. Rhine, "Comments: 'A new case of experimenter unTeliability,"' J. Par c ol. 38, 215-255 (1974). 36. R. M. Dawes, "The robust beauty of improper linear models in decision MaIJung," Am. Psychol. 34, 571-582 (1979). 37. L. V. Hedges, "How hard is hard science, how soft is soft science?" Am, sychol. 42, 443-455 (1987). 38. C. E. M. Hansel, ESP: A Scientific Evaluation (Charles Scribner's Sons, New York, 1966), p. 234. i 39. R. Rosenthal and D. B. Rubin, "Interpersonal expectancy effects: The first 30 studies," Behav. Brain Sci. 3, 377-415 (1978). 40. G. V. Glass, B. McGaw, and M. L. Smith, Meta-analysis in Social Research (cage Publi- cations, Beverly Hills, California, 1981). 41. Q. McNemar, "At random: Sense and nonsense," Am. Psychol. 15, 295-300 (.t%O). 42. S. lyengar and J. B. Greenhouse, "Selection models and the file-drawer problem," Technical Report 394, Department of Statistics, Carnegie-Mellon University ()uly, 1987). 43. L. V. Hedges, "Estimation of effect size under nonrandom sampling: Th~ effects of censoring studies yielding statistically insignificant mean differences," J. E4c. Stai. 9, 61-86 (1984). 44. H. H. Collins, Changing Order: Replication and Induction in Scientific PrOnctice (Sage Publications, Beverly Hills, California, 1985). i 45. S. Epstein, "The stability of behavior, 11: Implications for psychological research," Am. Psychol. 35, 790-806 (1980). 46. D. Druckman and J. A. Swets, eds. Enhancing Human Performance: Issues, T~eories, and Techniques (National Academy Press. Washington, D.C., 1988), p. 207. 1 47. A. Neher, The Psychology of Transcendence (Prentice-Hall, Englewood Cliffs, ~ew Jersey, 1980),p.147. Approved For Release 2000/08/08 Printed by Catherine Press, Ltd., Tempelhof 41, B-8000 Brugge, Belgium CIA-RDP96-00789ROO2200520001-0