Approved For Release 2000/08/15: CIA-RDP96-00792ROO0701020002-6

A SURVEY OF FREE-RESPONSE JUDGING PRACTICES

Julie Milton
Psychology Department
University of Edinburgh
7 George Square
Edinburgh EH8 9JZ
Scotland, U.K.

ABSTRACT

An idealised model of the free-response judging process is developed, and its elements discussed in terms of judging practices in those free-response studies published in full between 1964 and 1985. A wide variety of occasionally conflicting judging practices was found, along with valuable indications for further research in this important area.

ACKNOWLEDGEMENTS: My thanks are due to Nancy Zingrone and Deborah Weiner for allowing me to use a draft version of their free-response bibliography.

While free-response methodology has been popular in ESP studies over recent years, very little research has been directed to the important question of how best to judge the correspondence between free-response material and the target. However, many experimenters have commented on judging issues, or have reported relevant analyses or data which, when brought together, may suggest strengths and weaknesses in our judging practices, and promising directions for future research. With these aims in mind, I have examined various aspects of procedure which might influence the success of judging, using as a database eighty-five free-response studies in which statistical assessment of the results was attempted and which were published in full between 1964 and 1987 inclusive, in the Journal of Parapsychology, Journal of the American Society for Psychical Research, Journal of the Society for Psychical Research, International Journal of Parapsychology, and European Journal of Parapsychology. Space constraints prevent me from presenting a summary table of these studies and their full references, but these can be obtained from me on request.
All of the papers in these journals (whether experimental or not), and those appearing in Research in Parapsychology during the same period, were searched for commentary relevant to free-response judging, as were other sources where appropriate.

The survey is in two sections. In the first section, a model of an ideal judging process is presented, and its elements discussed in terms of their importance in current judging practices. The second section addresses the issue of whether percipients or independent judges are best suited to perform the complex judging task, and what qualities a judge should have. Finally, the findings of the review are discussed with their implications for further research.

THE ELEMENTS OF JUDGING

The underlying structure of the judging process

In a free-response ESP experiment, the percipient's task is to observe and report his or her thoughts, imagery, feelings and mental or physical experiences, which might relate to a randomly selected target. In free-response studies, the targets used are generally fairly complex (they may be people, geographical locations, objects, and so on). The targets may have elements (such as colour, or the presence or absence of people) which differ in their salience for the percipient, and in their frequency of occurrence. In addition, targets may be regarded as possessing various broad categories of content (such as semantic content, or emotional content), and these categories may themselves differ in their salience. The salience of both individual elements and categories of content may differ from one percipient to another, depending on individual differences.

Just as free-response targets are complex and varied, so too are the mentations reported by percipients. Mentations may be in the form of imagery in any sense modality, or merely abstract concepts; they may be vivid, bizarre, fleeting, spontaneous, or have other distinguishing characteristics.
Content of various kinds may be present in them, with varying chance frequencies of occurrence. Mentation items may relate in a variety of ways to the target material, such as semantically or by association, and to a greater or lesser degree. The type of correspondence may vary from percipient to percipient, or from mentation to mentation, or both. Certain types of mentation, and certain kinds of target-mentation correspondence, may be more likely to carry psi information than others.

The function of a free-response judge (in process-oriented research at least) is to estimate the probability that psi was responsible for any resemblance between the target and the mentation (or inversely, the strength of the ESP component on a given trial). In the complex situation described above, one way of looking at the task of an ideal judge is that he or she should:

(i) assign some numerical value in proportion to the degree of correspondence between a single mentation item and the target (and, in some types of judging, to the controls);

(ii) increase this value (given a perfect match) in accordance with the rarity of occurrence of the mentation item's content in the mentation of all percipients in similar experimental conditions (or in the mentation of that particular percipient on other trials, if such data are available);

(iii) increase this value (given a perfect match) in accordance with the rarity of occurrence of the mentation item's content in the entire experimental target pool;

(iv) increase this value in accordance with the likelihood that the mentation item, by virtue of its characteristics, is psi-related (e.g., whether it was bizarre, vivid, spontaneous, or whatever characteristics, if any, are shown by research to mediate ESP);

(v) increase this value in accordance with the salience which the content of the mentation has for the percipient (e.g. if research shows that the presence of people in a target is highly salient to a percipient, then a mentation item bearing on the presence or absence of people would be weighted relatively heavily);

(vi) increase this value in accordance with the likelihood that the type of correspondence (semantic, emotional, etc.) between mentation item and target carries psi-related information, if such differences in likelihood are indicated by research.

Having thus arrived at a weighted measure of the correspondence between each mentation item in a trial and the target (and controls if appropriate), the measures may be summed across the trial or otherwise combined to yield the ESP score for that trial. Although this procedure resembles an atomistic judging procedure most closely in its structure, it can also be thought of as an implicit or idealised basis for holistic or coded judging procedures. In holistic judging, it is possible to think of the overall rating assigned to items in the judging set as a sum of individual mentation ratings weighted as appropriate. In coded judging, the decision of whether a given content category was present or absent could be regarded as being made according to the sum of weighted ratings of relevant mentation items. Further weightings could then be assigned to each decision according to the known salience of the content category and the rarity of that value of the code in the target pool.

The importance of elements of judging in the literature

Each of the six elements of judging has, in various forms, received occasional attention either implicitly or overtly in experimental and theoretical papers, although very little direct or systematic research has been done on this topic. Most opinion about how best to judge free-response material seems to be based on anecdotal observations. While such observations may be unreliable, they may also contain useful information about aspects of judging which should be investigated empirically.
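The six elements of the idealised model, (i) to (vi), can be made concrete with a short sketch. The text does not specify how the value should be "increased", so everything about the arithmetic here is an assumption made for illustration: weights are combined multiplicatively, rarity is converted to a weight as 1 - log2(frequency), and the trial score is a simple sum. The class and function names are hypothetical, and the rarity weights are applied at every level of correspondence rather than only for a perfect match.

```python
from dataclasses import dataclass
from math import log2


@dataclass
class MentationItem:
    """One mentation item, with the quantities named in elements (i)-(vi)."""
    correspondence: float          # (i) rated match to the target, e.g. 0.0 to 1.0
    mentation_frequency: float     # (ii) chance frequency of this content in mentation
    target_pool_frequency: float   # (iii) frequency of this content in the target pool
    psi_weight: float = 1.0        # (iv) weight for mentation characteristics
    salience_weight: float = 1.0   # (v) weight for salience of the content
    type_weight: float = 1.0       # (vi) weight for the type of correspondence


def rarity_weight(frequency: float) -> float:
    """Assumed mapping from frequency to weight: 1.0 when certain, larger when rarer."""
    if not 0.0 < frequency <= 1.0:
        raise ValueError("frequency must lie in (0, 1]")
    return 1.0 - log2(frequency)


def item_score(item: MentationItem) -> float:
    """Weighted correspondence for a single item (multiplicative combination assumed)."""
    return (
        item.correspondence
        * rarity_weight(item.mentation_frequency)
        * rarity_weight(item.target_pool_frequency)
        * item.psi_weight
        * item.salience_weight
        * item.type_weight
    )


def trial_score(items: list[MentationItem]) -> float:
    """Sum the weighted item scores to give one score for the trial."""
    return sum(item_score(item) for item in items)
```

Under these assumptions, an item whose content is rare both in mentation and in the target pool contributes several times as much as an equally well-matched common item. Any of the default weights of 1.0 could be replaced once research indicated a suitable value.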
Each of the six elements of judging is therefore discussed in turn below in the context of commentary and experimental results in the literature surveyed.

(i) Assignment of a numerical value to correspondence

Ideally, the value assigned to the correspondence between a mentation item and a target should reflect the correspondence in some objective (and hence reliable) way. Sixteen studies reported in 10 papers in the database surveyed used atomistic judging, but in no case was interjudge reliability calculated for the allocation of such ratings. In eight of the studies, each point on the rating scale was labelled for the use of the judges (e.g., 0 = "no correspondence"), a practice which might be expected to increase interjudge reliability. The number of points on the rating scale ranged from two to eleven, with a mean of 4.2. It is possible that the scales at the low end of the range are too constrained to be sensitive, while those at the higher end require judges to make finer judgements than is appropriate, and so may be insensitive in effect because they increase error variance. In this latter case of large rating range, interjudge reliability may be reduced. The same may be true of holistic rating scales, which ranged from 4 points to 101, and which were clearly reported as being labelled in only 14 out of the 52 studies in which a holistic scale was used. The number of items in the judging set may be a factor in determining the appropriate rating scale; in the studies surveyed, set size ranged from 2 to 36 items. Any future research which addresses the issue of the appropriate rating scale in this task could most usefully do so in the context of active training of judges, with feedback, in the use of such scales.
Boerenkamp (1984) had considerable success in training eight independent judges to rate each statement made by a "psychic" about a missing person on a fully-labelled four-point scale of likelihood that it would apply to anyone in the population. To test the reliability of the judges' ratings, the judges were randomly assigned to two groups of four, and the two groups' average ratings of each statement were correlated, yielding correlations ranging from rs = +0.66 (36 df, p<0.01) to rs = +0.93 (19 df, p<0.01). The training consisted of having each judge rate independently the first statement in the transcript, followed by a discussion among the judges about the differences in their scoring. Then the second statement was scored and discussed, and so on, allowing the judges to discover why they differed from the group norm and to adjust their rating strategy accordingly. Similar training in rating statements for the likelihood that they were the product of deductive reasoning, also on a fully-labelled four-point scale, yielded similarly respectable interjudge reliabilities, ranging from rs = +0.66 (72 df, p<0.01) to rs = +0.95 (20 df, p<0.01). Although no pretraining measures of reliability were taken, two untrained judges in a study by Tart and Smith (1968), who rated on a fully-labelled three-point scale the likelihood that a statement would be true of a person, showed perfect agreement only 49% of the time. The reliability of Boerenkamp's judges is relatively high compared to that generally obtained in free-response judging, and this may be a useful method for training judges in the reliable assignment of ratings to mentation-target correspondence.

Maren (1986) discusses the application of artificial intelligence (AI) to give measures of the correspondence between target and mentation. However, she stresses that the development of appropriate AI systems is at an early stage.
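The split-half reliability check described for Boerenkamp's judges (judges divided into two groups, group mean ratings per statement correlated) can be sketched as follows. This is an illustration only: the "rs" values are taken here to be Spearman rank correlations, the function names are hypothetical, and the tie handling (average ranks) is an assumption rather than a detail reported in the text.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def split_half_reliability(ratings_by_judge, group_a, group_b):
    """Correlate two groups' mean ratings per statement.

    ratings_by_judge[j][s] is judge j's rating of statement s;
    group_a and group_b are lists of judge indices.
    """
    n_statements = len(ratings_by_judge[0])
    mean_a = [mean(ratings_by_judge[j][s] for j in group_a) for s in range(n_statements)]
    mean_b = [mean(ratings_by_judge[j][s] for j in group_b) for s in range(n_statements)]
    return spearman(mean_a, mean_b)
```

With eight judges, the two groups of four could be formed by shuffling the judge indices once and splitting the shuffled list in half. Running the same function on ratings collected before and after training would give the pretraining comparison that the text notes was not made.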
It seems that for the time being, the best bet for improving the reliability of atomistic (and possibly holistic) ratings may be the training of judges, with feedback, in the use of fully-labelled scales with a range appropriate for the task.

(ii) Weighting in accordance with the rarity of the mentation item's content

The probability of an exact match between a mentation item and an element of the target is equal to the product of the probability that the mentation item should occur on that trial out of all the others, and of the probability that that target element should be present in that target out of all other targets. This being so, the rarer the mentation item, the more weight it should receive. Stanford's response-bias hypothesis (1967), coming from a different angle, also suggests that rare responses should be relatively heavily weighted.

Although a number of experimenters have instructed judges to attach more weight to rare correspondences (e.g. Palmer, Khamashta, & Israelson, 1979; Sargent, 1980), the frequencies of mentation occurrence have seldom been calculated. The exceptions are studies by Roll (1971), Roll et al. (1973), and Tart and Smith (1968). In these studies, statements made by a medium about a number of people were weighted inversely according to the number of people in the study about whom the medium made the same statement.

Further work attempting to calculate norms for free-response mentation would need to take into account a number of factors. The setting may be important; in the ganzfeld, for example, the white noise often elicits imagery about waterfalls, beaches and aeroplanes. Some responses are common in certain states of consciousness; for example, faces are a common feature of hypnagogic imagery (Mavromatis, 1987).
Presumably for this reason, Braud and Braud (1974) and Braud, Wood and Braud (1975) instructed their percipient (who later did the judging) to attempt to distinguish target-relevant impressions from those induced by the state itself (in this case, conventional meditation imagery). As well as being dependent on the situation and state of consciousness of the percipients, mentation content frequencies may also vary from percipient to percipient. Most experimenters will probably have come across percipients who, in repeated testing, always mention one or more specific images which occur in each of their trials, while in contrast, Sargent, Bartlett and Moss (1982) reported that experienced participants in their ganzfeld study adopted the strategy of not bothering to report responses which they recognised as habitual. Frequency norms may also vary according to the nature of the target; percipients in a study in which a geographical location is the target may be more inclined to talk about trees than percipients in a study in which aspects of a person are the target.

(iii) Weighting in accordance with the rarity of the mentation item in the target pool

As stated in (ii) above, the probability of an exact match between a mentation item and an element of the target decreases as the probability of the occurrence of the mentation item's content in the target pool decreases. Therefore, the rarer that content is in the target pool, the more heavily the item should be weighted. Jahn, Dunne and Jahn (1980) calculated the a priori probabilities of all values of their thirty binary descriptors in the entire target pool (i.e., the probability that people were present, the probability that people were absent, the probability that movement was present, etc.) to facilitate the heavier weighting of rarer target contents (or absence of content). In judging involving a judging set, frequency of content is usually neither calculated nor weighted.
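The idea of computing a priori probabilities for binary descriptors across a target pool, and weighting rarer values more heavily, can be illustrated with a minimal sketch. It is not the scoring formula used by Jahn, Dunne and Jahn (1980), which the text does not give. The -log2(probability) weight, the function names and the two example descriptors are all assumptions introduced for illustration.

```python
from math import log2


def descriptor_probabilities(target_pool):
    """Return the a priori probability that each binary descriptor is present.

    target_pool is a list of dicts mapping descriptor name -> bool,
    one dict per target, all with the same keys.
    """
    n_targets = len(target_pool)
    names = target_pool[0].keys()
    return {name: sum(t[name] for t in target_pool) / n_targets for name in names}


def weighted_match_score(response, target, p_present):
    """Sum an assumed rarity weight over every descriptor the response gets right.

    A correct "present" is weighted by -log2(P(present)) and a correct
    "absent" by -log2(P(absent)), so rarer values count for more.
    """
    score = 0.0
    for name, p in p_present.items():
        if response[name] != target[name]:
            continue  # mismatches contribute nothing in this sketch
        p_value = p if target[name] else 1.0 - p
        if p_value <= 0.0:
            raise ValueError("target value never occurs in the pool used for the norms")
        score += -log2(p_value)
    return score


# Hypothetical four-target pool coded on two descriptors.
pool = [
    {"people": True, "water": True},
    {"people": False, "water": True},
    {"people": False, "water": False},
    {"people": False, "water": False},
]
probabilities = descriptor_probabilities(pool)
```

In this example pool, people are present in one target out of four, so a correct report of people counts for more than a correct report of water, which is present in half the targets. The same approach applied to mentation transcripts rather than targets would give the mentation frequency norms discussed in (ii).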
(iv) Mentation characteristics

Several experimenters have attempted a formal analysis of which mentation categories, if any, tend to be psi-related. Sargent, Bartlett and Moss (1982), Sargent, Moss and Bartlett (1981), and Sargent, Milton, Payne and Bennet (1982) found that scoring on the basis of "bizarre" mentation was significantly above chance, although scoring on the basis of bizarre imagery was compared to the theoretical chance level, rather than to scoring on the basis of other imagery. Milton (1984) found significant psi-hitting on the basis of "surprising" imagery, and Milton (1985) found significantly below-chance scoring on "fleeting" imagery according to the results of one of two independent judges. A third study by Milton (1987) examining a wide range of mentation categories found no significant results.

White, Krippner, Ullman and Honorton (1977) had one judge place mentation items from dream transcripts into one or more of seventeen categories, and had a second judge compare each item to the target for that night and mark it "telepathy present" or "telepathy absent". Eight categories (listed as waking imagery, hypnagogic and hypnopompic imagery, associations, colour, communication, witness, specificity and elaboration) were associated with target correspondences to a significant degree, the association with waking imagery being a negative one (i.e. waking imagery seemed to be associated with the target less often than chance). Seven categories (including anxiety, experiment-related, hostility-misfortune, penetration of self-boundaries, participation, vividness) yielded non-significant results, and there were insufficient data to test two categories (sex and violence).
However, since it is not mentioned whether the mentation items were edited after being categorised, it may be that these results reflect at least in part the judge's expectations; he or she might have been more inclined to consider telepathy present for a mentation item which fell into a category which he or she expected to be successful.

In a study by Schouten and Merkestein (1985), percipients, already familiar with the target pool, had to record striking experiences during the day and later selected the day's target drawing from the pool on the basis of these experiences. In order to reduce the amount of work involved in the judging task, Merkestein selected for independent judging only those experiences which fell into five specified categories. Only memories yielded significant above-chance results (p = 0.00008), while spontaneous, unexpected inner experiences, dreams and daydreams, experiences related to a topic which the percipient had forgotten was in the target pool, and experiences related to the mood of the day, were not significantly target-related.

In addition to direct experimentation, some experimenters have offered anecdotal observations concerning which sorts of mentation appear to be more psi-related than others. Reporting on an informal research discussion among apparently successful percipients and researchers, Schlitz (1984) notes that many of the participants felt that imagery which was fleeting, novel, or recurrent tended to be psi-related, and that nonvisual impressions, including kinaesthetic, auditory and olfactory images, were of equal or greater importance compared to visual imagery. Honorton and Harper (1974) observed that memory images seemed to be successful in a ganzfeld study; Dunne and Bisaha (1979) commented that logical inferences from an initial impression were unhelpful.
In a ganzfeld experiment by Palmer, Bogart, Jones and Tart (1977), the scores of only one out of two independent judges yielded significant displacement scoring, and among the differences reported by the judges in their strategies was the inclination of the more "successful" judge to pay "more attention to [...] that were unique in the general context of the [...] ongoing train of thought" (p.138).

Other experimenters have instructed judges to pay more attention to particular mentation categories. Thus Sargent (1980) instructed judges to pay more attention to mentations which the percipient reports as being novel, striking, odd, unusual, unexpected or particularly clear, and to pay less attention to mentations clearly linked with an immediate memory (thus not conflicting with the comment of Honorton and Harper (1974) above, which presumably relates mostly to long-term memories). The criteria for deciding whether or not one of Honorton's (1975) binary content categories is present in the mentation include "intensity" and persistence of content-related mentation.

It can be seen that a number of experimenters feel that certain mentation types may be more likely to be psi-related than others, although authors vary in their choices, and few seem to have based their opinions on formal research findings. This would seem to be another aspect of the judging process which would benefit from systematic, direct research, with anecdotal observations as a valuable starting point.

(v) Salient aspects of targets

In an ideal judging situation, those elements of the target material which are most salient for the percipient should be more heavily weighted in the judging than elements known not to be salient. For the purposes of this discussion, 'salient' describes an element of the target about which the percipient tends to give accurate information more often than chance.
Thus if percipients were very often correct about whether or not people were present in a pictorial target, then mentation items dealing with the presence or absence of people should be more heavily weighted than other mentation items. Roll et al. (1973) applied such a weighting to statements made by a sensitive about various people, categorised according to their content. The content categories were those of physical description, health, vocation/education, family, love life, future, wants, interests, needs, personal characteristics, and other, and mentation items in half the data were weighted in accordance with the success of mentation items in those content categories in the other half of the data.

The content categories used by Roll et al. were presumably chosen because most of the sensitive's mentation could conveniently be coded in terms of them, rather than because each of these categories was believed to be highly salient; indeed, the study was partly one of salience. However, in studies where mentation and targets are coded in terms of content categories (e.g. Honorton, 1975; Jahn, Dunne and Jahn, 1980), content categories seem to be chosen not for their salience but for similar pragmatic reasons of allowing a fairly full description of the mentation report. Further research identifying salient content categories, to allow them to be weighted appropriately or used as the basis of coding systems, would be useful.

(vi) Correspondence types

A number of authors have discussed ways in which mentations have appeared to relate to targets in their studies, and some authors instruct their judges to watch out for some of these correspondences.
Those mentioned have included literal, formal (shape), sensory (colour, material), symbolic/metaphorical, associational, emotional and functional correspondences, and it has been suggested that these correspondences may relate either to parts of the target, or to the whole, or both. However, authors differ, and sometimes conflict, in the importance they attach to these correspondence types. Some authors only take into account one or two types of correspondence, while others deal with most of them but weight heavily certain types which other authors feel are unimportant.

For example, Sargent, Bartlett and Moss (1982) in their judging instructions attach most importance to direct (presumably literal) correspondences, and then consider formal, associative, symbolic, and mood/emotive correspondences in order of decreasing importance (a similar order of importance is reflected in Sargent's (1982) judging instructions). These instructions conflict with the opinions of several other researchers, such as Dunne and Bisaha (1979), who consider that correspondences of shape, colour, size, and relation to other shapes, and metaphorical correspondences, are more likely (and presumably more important) than literal correspondences.
Similarly, Targ and Puthoff (1978) feel that correspondences of shape, form, colour and material are likely to be more accurate than correspondences of function or name; Schlitz and Haight (1984) instructed their judges to expect correspondences of shape or association rather than literal correspondences; Gelade and Harvie (1975) commented that accurate descriptions were rare, and that metaphorical and symbolic correspondences were more frequent; Hearne (1986) instructed independent judges to look particularly for symbolic correspondences; and Stanford (1979) used artists as judges on the basis of comments by other researchers indicating that meaning was often distorted in mentation but that the form of the target was often described correctly.

Other researchers, while instructing their judges about the types of possible correspondence, have either urged their judges to give equal weight to all types, or have given instructions in which no type of correspondence was made to seem more important than any other (Moriarty and Murphy, 1967; Musso and Granero, 1973; Palmer, Khamashta and Israelson, 1979; Palmer, Bogart, Jones and Tart, 1977; and Wood, Kirk and Braud, 1977).

These differences among authors could be due to a number of different factors. Firstly, the type of correspondence thought to be most important in judging may relate to the percipient's mode of response, which tends to vary from study to study. Those experimenters who encourage their percipients to make drawings of their imagery have more opportunity to note correspondences (real or spurious) of form than meaning, while the reverse applies to those who encourage their percipients to make verbal responses.
This may account for Sargent's preference of meaning over form in his ganzfeld studies, in which percipients make mostly verbal responses, in contrast to the preference of form over meaning in the studies of, for example, Moriarty and Murphy (1967) and Musso and Granero (1973), which were both picture drawing studies.

A second factor in differences among authors may relate to individual differences between authors themselves, or between the percipients in those authors' studies. Hearne (1986) emphasised symbolic correspondences in his instructions to judges because the single percipient in that study seemed to have obtained such correspondences in earlier testing. Ullman (1966), discussing work on field dependence by Witkin (1965), suggested that the type of correspondence in each percipient's mentation might reflect whether the percipient is field dependent, with field dependent percipients yielding symbolic correspondences. In addition, in a study with both types of correspondence, the types of correspondence noted by the experimenter may depend in part on whether he or she is field dependent. The use of different types of target material may also result in different kinds of correspondence; for example, abstract art prints may yield correspondences of form and sensory qualities, while pictorial representations of archetypes (such as those used by Gertz, 1983) may tend to yield symbolic correspondences.

PERCIPIENT JUDGES VERSUS INDEPENDENT JUDGES

So far, I have discussed the steps to be taken in an idealised judging process. A related issue is that of who is most likely to be suited to such a complex task. Most discussion in the literature on this issue has centred on the relative merits of percipients as judges of their own material, and of independent judges.
The fact that at least one independent judge was used in 58.2% of the 98 studies in the database in which the use of an independent judge would have been appropriate may indicate a preference for independent judges.

Several reasons have been put forward for why independent judges should be preferred. First, the use of independent judges should give a uniformity of judging criteria across trials which may be lacking when percipients judge, resulting in reduced error variance with independent judges (Palmer, Bogart, Jones and Tart, 1977). Second, it should be easier to select or train a few good independent judges than to select numerous experimental participants who will be both good percipients and good judges (Palmer, Bogart, Jones and Tart, 1977). Third, the use of percipients as judges is likely to confound their ESP performance with their judging ability, such that relationships between their ESP score and other variables may be partly with their judging ability rather than their ESP; for example, a correlation between extraversion and the ESP measure may be due to the extravert percipients judging more carefully to please the experimenter and hence increasing their ESP score (Stanford 1978, 1984). Fourth, the use of independent judges means that the percipient need only be shown the target at the end of the trial, which some experimenters feel may reduce the risk of precognitive displacement (Palmer, Bogart, Jones and Tart, 1977; Irwin, 1982). Fifth, independent judges are less likely to be ego-involved in the trial's outcome than the percipients, since it is not their personal chance to demonstrate ESP in front of others, and may therefore be less likely to use such strategies as "going for broke" (i.e. artificially increasing the correspondence rating of a picture once they are sure it is the target, to make it look like a "better" hit) (Stanford and Sargent, 1983) or of deliberately avoiding giving points to a target which is a personal favourite (Sargent, 1980), although Stanford (1984) suggests avoiding telling independent judges that they are assessing ESP data in order to reduce the temptation for them also to "go for broke". Sixth, experienced independent judges may be more familiar with norms for free-response mentation, and may be better able than naive percipients to identify, and hence weight appropriately, mentation items which are unusual. In a study by Sargent, Bartlett and Moss (1982), an independent judge who separated naive percipients' mentations into unusual and common mentations obtained less of a scoring difference between the two than did the percipients, who also categorised their own mentation. However, the judge also obtained lower scoring than the percipients in both categories, indicating that the judge may have been handicapped in the judging task (for example, by the percipients' inability to describe their imagery).

For reasons similar to those for avoiding percipient judging, several experimenters have explicitly recommended combining scores from several independent judges to dilute the effect of idiosyncrasies of each judge, such as the ability to detect only certain types of correspondence (Stanford, 1984) or personal preferences for certain targets or mentation items which might influence the judge unduly (Targ and Targ, 1986). Some experimenters have judges working in consensus (e.g. Targ and Targ, 1986; Jahn, Dunne and Jahn, 1980), presumably for these reasons. Indeed, of those 57 studies in which at least one independent judge was used, only 20 used a single judge; the number used ranged from one to eight.
However, the advantages of independent judges, multiple or otherwise, depend upon their being good judges, whether naturally, as a result of training, or due to the provision of full and appropriate instructions (Stanford, 1984). The need to know what makes good judges and good judging has been stressed in the literature (Honorton and Stump, 1969; Sargent, 1980, 1981).

Only two studies in those surveyed set out to compare the judging skills of judges of varying backgrounds. Roney-Dougal (1987) found that a psychotherapist independent judge with considerable experience in and knowledge of subliminal perception research scored most highly above chance (mean Z = +0.187, n.s.), while a "naive" poet scored slightly above chance (mean Z = +0.127, n.s.). A third judge who was a 'trained "psychic"' scored significantly below chance (mean Z = -0.16, t = 2.155, p = 0.04). This result is difficult to interpret, since we cannot know whether the percipients were "really" scoring above or below chance. In a hypnotic dream study reported as a conference abstract, Keeling (1972) found that only a group of ten clinical psychology graduate students who judged the percipients' data obtained significantly above-chance scoring (p = 0.018, one-tailed), while a group of ten undergraduates in an introductory psychology class, and a group of twenty middle-aged students in a YMCA course on the occult, acting as judges, yielded non-significant results. However, the results of the three groups were not strictly comparable, since the occult students judged different data from the other two groups, and the undergraduate psychology students did the judging in a different order from the other two groups.

Given the differences in scoring between the judges in these two studies, the judges' background and experience would seem to be an important variable in any free-response study.
However, it was made clear in only 10 out of 57 studies using independent judges that the judge had experience relevant to judging (in areas of psychology, the visual or literary arts, etc., which deal specifically with the transformation of subconscious information, or previous experience of free-response judging). No studies concerning the training of judges seem to have been made. However, the finding of Palmer, Bogart, Jones and Tart (1977) that a judge with experience of the ganzfeld gave significant evidence of displacement in a ganzfeld study, while a second judge with no ganzfeld experience did not, might suggest that experience of the experimental procedure used in a study could usefully be included in any judge's training. Results from a study by Maher (1987), in which judges' scores increased with repeated judging of the same material presented in a different format each time, may suggest that simple repeated exposure to the judging task, or increasing familiarity with the judging material, may improve scoring.
The effect of instructions upon judges has similarly been a neglected topic, although Palmer, Khamashta and Israelson (1979) were led to compare the results of judging with and without instructions when they observed that uninstructed percipient judges scored below chance (mean Z = -0.34), while two independent judges with full instructions scored above chance (mean Z = +0.37), the difference being significant (Wilcoxon T = 15, CR = -3.36, p<0.001, two-tailed). They had two more judges judge the data without instructions, yielding a mean Z-score of +0.29, and concluded that the lack of judging instructions in this case probably did not cause the difference between the results of the percipients and the original two independent judges.
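The critical ratio quoted above for the Wilcoxon comparison can be checked against the usual large-sample normal approximation for the signed-rank statistic. The sketch below is illustrative only: the number of non-tied pairs is not given in this survey, and the value of 20 used here is an assumption, chosen because it reproduces the reported CR. The function names are hypothetical.

```python
from math import erfc, sqrt


def wilcoxon_critical_ratio(t_statistic, n_pairs):
    """Normal approximation to the Wilcoxon signed-rank T (no tie or continuity correction)."""
    mean_t = n_pairs * (n_pairs + 1) / 4
    sd_t = sqrt(n_pairs * (n_pairs + 1) * (2 * n_pairs + 1) / 24)
    return (t_statistic - mean_t) / sd_t


def two_tailed_p(z):
    """Two-tailed probability of a standard normal deviate at least as extreme as z."""
    return erfc(abs(z) / sqrt(2))


# T = 15 is the value reported in the text; n_pairs = 20 is an ASSUMPTION.
cr = wilcoxon_critical_ratio(15, 20)
p = two_tailed_p(cr)
print(round(cr, 2))  # -3.36
print(p < 0.001)     # True
```

Under that assumption the approximation gives the reported CR of -3.36 and a two-tailed probability below 0.001, which is consistent with the figures quoted but does not establish the sample size actually used.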
However, the two uninstructed judges had taken part in a discussion of the judging of free-response material in Palmer's graduate class in parapsychology some months earlier, and so were not entirely naive. Instructions were reported as being given to judges without judging experience or knowledge of unconscious processes in only 12 out of 47 studies in which such judges were used. Further research is clearly needed on this topic.
The only reason given against using independent judges has been that only the percipients themselves can have full knowledge of what their imagery really was and would be able to recognise personal symbolism (e.g. Palmer, 1986). This problem might also result in confounding the percipient's ability or inclination to report their imagery fully with their ESP performance if independent judging were used, possibly resulting in misleading relationships with other variables (Stanford, 1984). A number of experimenters have explored the importance of asking percipients for more information about their mentation after the end of the free-response period, by comparing the performance of independent judges provided with transcripts of the initial mentation reports alone, and with the initial transcripts plus additional information provided by the percipients. Stanford (1984) has suggested training percipients in the reporting of imagery, while Palmer, Bogart, Jones and Tart (1977) suggest having an experimenter who is blind to the identity of the target review the percipient's experience with him or her immediately after the response period, and add to the transcript possibly relevant information (such as a full description of certain images, or the unusual qualities of images, phenomenological characteristics, and so on). Along these lines, it may also be advisable to offer percipients the opportunity to draw imagery which may have been difficult to describe verbally, or vice versa, depending on the task.
Sondow (1979) found that independent judging by two experienced judges of the initial transcripts only from participants in a ganzfeld study yielded scoring exactly at chance (15 direct hits in 60 trials), while judging with the addition of the percipients' personal associations to the mentation gave significantly above-chance scoring (23 hits, Z=2.39, p<0.02). Each judge judged half of the initial transcripts only, and half of the transcripts with associations, so that no judge judged the same trial with and without associations. However, the percipients' judging yielded even higher scoring (30 hits in 60 trials). In a dream study by Krippner, Honorton and Ullman (1972), independent judges judged first the initial mentation transcript alone, and then the transcript plus the results of an interview in which the percipient gave details of what mood accompanied the dream, what thoughts or memories it brought to mind, what elements of the dream made no sense in terms of the dreamer's personal life, and what the main theme of the night's dreams had been. On the initial transcripts alone, the judges obtained two hits out of eight trials (with a one in eight chance of success). With the addition of the details of the interview, the judges obtained five hits, a result which was significantly above chance (p=0.0012, one-tailed). The percipient did not do any judging in this study, so no comparison with his scores can be made. A similar procedure was used in a study by Ullman, Krippner and Feldstein (1966). In the interview, the percipient was asked what the dream reminded him or her of, what if anything seemed to be trying to intrude on the dream, and whether there was anything in the dream which was different from the percipient's dreams, such as colour, feeling the dream to be real, or private symbolism.
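The direct-hit figures quoted above for Sondow (1979) and for Krippner, Honorton and Ullman (1972) can be reproduced from the hit counts alone. The sketch below is a check under stated assumptions rather than a reconstruction of the original analyses: it assumes a chance hit rate of 1/4 in Sondow's study (implied by 15 of 60 hits being "exactly at chance"), a normal approximation without continuity correction for the Z-score, and an exact binomial tail for the eight-trial dream study. The function names are hypothetical.

```python
from math import comb, erfc, sqrt


def direct_hit_z(hits, trials, p_hit):
    """Normal approximation to the binomial, with no continuity correction."""
    expected = trials * p_hit
    sd = sqrt(trials * p_hit * (1 - p_hit))
    return (hits - expected) / sd


def two_tailed_p(z):
    """Two-tailed probability of a standard normal deviate at least as extreme as z."""
    return erfc(abs(z) / sqrt(2))


def binomial_upper_tail(hits, trials, p_hit):
    """Exact one-tailed probability of obtaining at least `hits` successes by chance."""
    return sum(
        comb(trials, k) * p_hit**k * (1 - p_hit) ** (trials - k)
        for k in range(hits, trials + 1)
    )


# Sondow (1979): 23 hits in 60 trials, assumed chance rate of 1/4.
sondow_z = direct_hit_z(23, 60, 0.25)
sondow_p = two_tailed_p(sondow_z)
print(round(sondow_z, 2))  # 2.39

# Krippner, Honorton and Ullman (1972): 5 hits in 8 trials at a 1-in-8 chance rate.
krippner_p = binomial_upper_tail(5, 8, 1 / 8)
print(round(krippner_p, 4))  # 0.0012
```

On these assumptions the 23 hits give Z = 2.39 with a two-tailed probability of about 0.017, in line with the reported p<0.02, and the five hits in eight trials give an exact one-tailed probability of 0.0012, matching the value reported.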
On the basis of the initial transcripts alone, the three judges in Ullman, Krippner and Feldstein's study (whose judging experience, if any, was not reported) scored significantly above chance (F=8.30, p<0.01); with the addition of the interview material, scoring was even higher (F=18.14, p<0.001). The percipient judging yielded non-significant results with the initial transcript alone, and results above chance at the p<0.05 level (F=4.41) with the addition of the interview material.
On the basis of these results, it seems that further elaboration by percipients on their initial mentation reports adds useful information, since scores with such elaboration were higher than those without in all three studies discussed above. However, while the percipients still managed to score at a higher level than the experienced judges in Sondow's study even when the judges were provided with their associations, Ullman, Krippner and Feldstein found that their (possibly inexperienced) independent judges scored higher than the percipient judges both with and without associations. This apparent conflict of findings may be in part due to the extra information which Ullman et al. elicited from their percipients during the interview. Clearly, more research needs to be directed to this question.
DISCUSSION
The most striking feature of judging practices in the literature surveyed is their variety and, in some aspects of judging, their contradiction. Descriptions of judging are generally very brief, and it may be that judging practices are much more similar from laboratory to laboratory than appears in print. Similarly, the 4% of studies using independent judges which involved giving the judges full instructions concerning various types of transformation, along with detailed examples, may be an underestimate, since more experimenters may have given their judges equally full instructions without reporting it.
However, either a lack of instructions or a lack of reporting them might imply a lack of importance being attached to the judging process within the field. Since judging is logically a crucial part of any free-response study, more attention both to judging and to its reporting is surely merited. Delanoy (1987) has listed information about judging which should ideally be reported in any free-response study.
Although little direct research has been done on the judging process, the studies surveyed indicated many potentially profitable lines of research. The training of judges (real training with feedback, rather than merely repeated exposure to the judging task) has apparently not been explored, and may be a valuable research strategy in this area. Awareness of individual differences, methods of responding (verbal, pictorial, etc.), setting and target type are among the many variables which need to be considered in further research on judging, as well as aspects of procedure such as the use of rating scales with appropriate ranges and judging sets of an appropriate size for the task. We clearly need to know more about all aspects of judging as part of our efforts to improve the reliability and effectiveness of free-response experimentation in general.
REFERENCES
BOERENKAMP, H. G. (1984). Potential paranormal value of statements of psychics acquired under feedback conditions. European Journal of Parapsychology, 5, 101-124.
BRAUD, L. W., & BRAUD, W. G. (1974). Further studies of relaxation as a psi-conducive state. Journal of the American Society for Psychical Research, 68, 229-245.
BRAUD, W. G., & WOOD, R. (1977). The influence of immediate feedback on free-response GESP performance during Ganzfeld stimulation. Journal of the American Society for Psychical Research, 71, 409-427.
DELANOY, D. (1987).
The reporting of methodology in ESP experiments. Parapsychology Review, 18, 1-4.
DUNNE, B. J., & BISAHA, J. P. (1979). Precognitive remote viewing in the Chicago area: A replication of the Stanford experiment. Journal of Parapsychology, 43, 17-30.
GELADE, G., & HARVIE, R. (1975). Confidence ratings in an ESP task using affective stimuli. Journal of the Society for Psychical Research, 48, 766, 209-219.
GERTZ, J. (1983). Hypnagogic fantasy, EEG, and psi performance in a single subject. Journal of the American Society for Psychical Research, 77, 155-170.
HEARNE, K. M. T. (1986). An analysis of premonitions deposited over one year, from an apparently gifted subject. Journal of the Society for Psychical Research, 53, 804, 376-382.
HONORTON, C. (1975). Objective determination of information rate in psi tasks with pictorial stimuli. Journal of the American Society for Psychical Research, 69, 353-359.
HONORTON, C., & HARPER, S. (1974). Psi-mediated imagery and ideation in an experimental procedure for regulating perceptual input. Journal of the American Society for Psychical Research, 68, 156-168.
HONORTON, C., & STUMP, J. P. (1969). A preliminary study of hypnotically-induced clairvoyant dreams. Journal of the American Society for Psychical Research, 63, 175-184.
IRWIN, C. P. (1982). The role of memory in free-response ESP studies: Is target familiarity reflected in the scores? Journal of the American Society for Psychical Research, 76, 1-22.
JAHN, R. G., DUNNE, B. J., & JAHN, E. G. (1980). Analytical judging procedure for remote perception experiments. Journal of Parapsychology, 44, 207-231.
KEELING, K. R. (1972). Telepathic transmission in hypnotic dreams: An exploratory study. In Roll, W. G., Morris, R.
L., & Morris, J. D. (Eds.), Proceedings of the Parapsychological Association, no. 8, 1971.
KRIPPNER, S., HONORTON, C., & ULLMAN, M. (1972). A second precognitive dream study with Malcolm Bessent. Journal of the American Society for Psychical Research, 66, 269-279.
MAHER, M. (1987). Replication of an "incline" effect in blind judging scores. In Weiner, D. H., & Nelson, R. D. (Eds.), Research in Parapsychology 1986, Scarecrow Press: Metuchen, N.J.
MAREN, A. J. (1987). Representation and performance evaluation approaches in psi free-response tasks. In Weiner, D. H., & Nelson, R. D. (Eds.), Research in Parapsychology 1986, Scarecrow Press: Metuchen, N.J.
MAVROMATIS, A. (1987). Hypnagogia. Routledge & Kegan Paul: London & New York.
MILTON, J. (1984). The effect of the presence of an agent on ESP performance and of the isolation of the target from its controls on displacement in a ganzfeld clairvoyance experiment. In White, R. A., & Broughton, R. S. (Eds.), Research in Parapsychology 1983, Scarecrow Press: Metuchen, N.J.
MILTON, J. (1985). The effect of agent strategies on the percipient's experience in the ganzfeld. In White, R. A., & Solfvin, J. (Eds.), Research in Parapsychology 1984, Scarecrow Press: Metuchen, N.J.
MILTON, J. (1987). Judging strategies to improve scoring in the ganzfeld. In Weiner, D. H., & Nelson, R. D. (Eds.), Research in Parapsychology 1986, Scarecrow Press: Metuchen, N.J.
MORIARTY, A. E., & MURPHY, G. (1967). An experimental study of ESP potential and its relationship to creativity in a group of normal children. Journal of the American Society for Psychical Research, 61, 326-338.
MUSSO, J. R., & GRANERO, M. (1973). An ESP drawing experiment with a high-scoring subject. Journal of Parapsychology, 37, 13-36.
PALMER, J. (1986). Experimental methods in ESP research. Chapter in Edge, H., Morris, R. L., Palmer, J., & Rush, J.
H., Foundations of parapsychology (pp. 111-137), Routledge & Kegan Paul: Boston, London & Henley.
PALMER, J., BOGART, D. N., JONES, S. M., & TART, C. T. (1977). Scoring patterns in an ESP ganzfeld experiment. Journal of the American Society for Psychical Research, 71, 121-145.
PALMER, J., KHAMASHTA, K., & ISRAELSON, K. (1979). An ESP ganzfeld experiment with transcendental meditators. Journal of the American Society for Psychical Research, 73, 333-348.
ROLL, W. G. (1971). Free verbal response and identi-kit tests with a medium. Journal of the American Society for Psychical Research, 65, 185-191.
ROLL, W. G., MORRIS, R. L., DAMGAARD, J. A., KLEIN, J., & ROLL, M. (1973). Free verbal response experiments with Lalsingh Harribance. Journal of the American Society for Psychical Research, 67, 197-207.
RONEY-DOUGAL, S. M. (1987). A comparison of subliminal and psi perception: Exploratory and follow-up studies. Journal of the American Society for Psychical Research, 81, 141-181.
SARGENT, C. L. (1980). Exploring psi in the ganzfeld. Parapsychology Foundation: New York, N.Y.
SARGENT, C. L. (1981). ESP in the twilight zone: State of the art. Parapsychology Review, 12, 1-7.
SARGENT, C. L. (1982). A ganzfeld GESP experiment with visiting subjects. Journal of the Society for Psychical Research, 51, 790, 222-232.
SARGENT, C. L., BARTLETT, H. J., & MOSS, S. P. (1982). Response structure and temporal incline in ganzfeld free-response GESP testing. Journal of Parapsychology, 46, 85-110.
SARGENT, C. L., MILTON, J., PAYNE, J., & BENNET, S. (1982). Unpublished study.
SARGENT, C. L., MOSS, S. P., & BARTLETT, H. J. (1982). Unpublished study.
SCHLITZ, M. (1984). Esalen meetings on psi research. Parapsychology Review, 15, 10-12.
SCHLITZ, M. J., & HAIGHT, J. (1984). Remote viewing revisited: An intrasubject replication. Journal of Parapsychology, 48, 39-49.
SCHOUTEN, S. A., & MERKESTEIN, J. (1985). A free-response study in a real-life setting. European Journal of Parapsychology, 6, 19-32.
SONDOW, N. (1979). Effects of associations and feedback on psi in the ganzfeld: Is there more than meets the judge's eye? Journal of the American Society for Psychical Research, 73, 123-150.
STANFORD, R. G. (1967). Response bias and the correctness of ESP test responses. Journal of Parapsychology, 31, 280-289.
STANFORD, R. G. (1978). Special problem areas in research methodology. In W. G. Roll (Ed.), Research in Parapsychology 1977, Scarecrow Press: Metuchen, N.J.
STANFORD, R. G. (1984). Recent ganzfeld-ESP research: A survey and critical analysis. In Krippner, S. (Ed.), Advances in Parapsychological Research 4. McFarland: Jefferson, N.C.
STANFORD, R. G. (1979). The influence of auditory ganzfeld characteristics upon free-response ESP performance. Journal of the American Society for Psychical Research, 73, 253-272.
STANFORD, R. G., & SARGENT, C. L. (1983). Z scores in free-response methodology: Comments on their utility and correction of an error. Journal of the American Society for Psychical Research, 77, 319-326.
TARG, R., & PUTHOFF, H. E. (1977). Mind-Reach. Dell: New York.
TARG, E., & TARG, R. (1986). Accuracy of paranormal perception as a function of varying target probabilities. Journal of Parapsychology, 50, 17-27.
TART, C. T., & SMITH, J. (1968). Two token object studies with Peter Hurkos. Journal of the American Society for Psychical Research, 62, 143-157.
ULLMAN, M. (1966). A nocturnal approach to psi. In Roll, W. G. (Ed.), Proceedings of the Parapsychological Association, no. 3, 1966.
ULLMAN, M., KRIPPNER, S., & FELDSTEIN, S. (1966). Experimentally-induced telepathic dreams: Two studies using EEG-REM monitoring techniques. International Journal of Parapsychology, 8, 577-603.
WHITE, R.
A., KRIPPNER, S., ULLMAN, M., & HONORTON, C. (1971). Experimentally-induced telepathic dreams with EEG-REM monitoring: Some manifest content variables related to psi operation. In Roll, W. G., Morris, R. L., & Morris, J. D. (Eds.), Proceedings of the Parapsychological Association, no. 5, 1968.
WOOD, R., KIRK, J., & BRAUD, W. (1977). Free response GESP performance following ganzfeld stimulation versus induced relaxation, with verbalised versus nonverbalised mentation: A failure to replicate. European Journal of Parapsychology, 1, 80-93.
WITKIN, H. A. (1965). Psychological differentiation and forms of pathology. Journal of Abnormal Psychology, 70, 317-336.