Orne, M. T., & Holland, C. H. On the ecological validity of laboratory deceptions. International Journal of Psychiatry, 1968, 6, 282-293.


On the Ecological Validity of Laboratory Deceptions

Martin T. Orne, M.D., Ph.D.

Director, Unit for Experimental Psychiatry, Institute of the Pennsylvania Hospital; Department of Psychiatry, University of Pennsylvania

Charles H. Holland, Ph.D.

Unit for Experimental Psychiatry, Institute of the Pennsylvania Hospital; Department of Psychiatry, University of Pennsylvania

"O what a tangled web we weave,

When first we practise to deceive!"

Sir Walter Scott, 1808

In any psychological experiment, the subject's knowledge and beliefs about the study may have significant effects upon his behavior. In order to obtain undistorted responses, it is often felt necessary to disguise the purpose of an experiment, and to do this investigators have used misinformation, confederates, and other forms of deception. Milgram's studies in obedience are analyzed as an example of significant research where the importance of the theoretical, social and moral implications has tended to obscure these methodological issues. How the subject perceives the experiment in general, and how plausible the deception manipulation is for him in particular, must be evaluated before meaningful inference can be drawn from the experiment to life outside the laboratory.

In the last half of this century social psychology has gained increasing significance and importance. In an age when technology has made the sudden extinction of man an all too real possibility, when we are witnessing a world-wide crisis of values and the only remaining social certainty is continuing change, the prospect of bringing relevant social psychological processes under scientific scrutiny is of major concern to all of us. The impressionistic, quasiphilosophical approaches which had long characterized writings about crowd behavior and group processes were not sufficient to form the body of a science, nor could the technology of evaluating attitudes and public opinion, regardless of its methodological sophistication, provide for the development of basic new insights into the nature of man. Rather, the pioneering work, particularly by Lewin, Asch, and Sherif, showed how the techniques of experimental psychology could also be applied to the study of social psychological phenomena. The use of the psychological experiment as a tool has made it possible to systematically manipulate a wide range of variables, and increasing ingenuity has been devoted to the application of this tool to an ever wider range of problems. The experiments by Festinger and his students (e.g., Brehm & Cohen, 1962) in support of the cognitive dissonance theory are particularly ingenious examples of what has recently become known as experimental social psychology.

The substantive work upon which the theoretical outlook presented in this paper was based was supported in part by contract # Nonr 4731 from the Group Psychology Branch of the Office of Naval Research.

We wish to thank our colleagues, Frederick J. Evans, Edgar P. Nace, Emily C. Orne, David A. Paskewitz, and David L. Rosenhan, for their helpful comments and criticisms.

Conceptual and methodological issues that had been skirted by much of the research in this exciting new discipline have been brought to a head by Milgram's studies in obedience. He has addressed himself to one of the most compelling questions of our time: What are the conditions under which man will inflict pain and suffering upon another individual? In a series of apparently crucial experiments Milgram seems to have shown that subjects (hereinafter designated Ss) can be required to inflict pain up to and beyond intensities clearly designated as dangerous merely by legitimizing this behavior as part of a scientific experiment. Such findings are uncomfortably reminiscent of the concentration camp "medical experiments" reported to have been carried out by Schumann, Mengele, and others (Manvell & Fraenkel, 1967). Milgram has also tried to demonstrate that this behavior is lawful in that it can be shown to vary, depending upon the proximity between the S and his victim and the extent to which the experimental situation as a whole is legitimized, i.e., carried out within the confines of a university campus versus a rented office in a slightly disreputable office building.

The implications of Milgram's work are clear. It would appear that with little effort most individuals can be induced to carry out destructive and aggressive actions bringing severe pain, possibly permanent injury or even death to their victim. The fact that the S's behavior is in the name of science provides little reassurance and suggests at the very least a horrifying callousness as a characteristic of modern man. The studies seem to provide convincing empirical support for Freud's belief in the death instinct and the philosophic position on man put forth by Hobbes, Nietzsche and others. One may even conceive of these studies as laboratory analogs to the Genovese murder (Rosenthal, 1964).

As has often been pointed out, the extent to which scientific findings become generally accepted is only partly a function of the care with which they are obtained. In large part acceptance depends upon the extent to which results fit the Zeitgeist and the prejudices of the scientific community. The flair with which Milgram presents his findings and the affect they generate tend to obscure serious questions about their validity. In evaluating research which has broad implications and is of practical importance, there is a tendency to minimize concern for methodological rigor. Yet it is because of its importance that this research demands thoughtful consideration.

Ecological Validity of Deception Studies. In some areas, the bases for evaluating the methodological adequacy of research have been worked out in considerable detail. A judgment is usually made by evaluating the controls, the manner in which data are collected and how they are handled statistically. By these criteria, Milgram's work appears to have been carefully carried out. Unfortunately there is an entirely different set of problems requiring consideration, which Brunswik (1947) has subsumed under the concept of ecological validity.

Experiments are carried out to make inferences to other -- usually nonexperimental -- situations. They make it possible to observe events in a standard situation, ideally holding constant everything other than the particular independent variable under investigation. For this technique to allow valid inference it is essential that the experimental situation adequately reflect the process under investigation. This crucial step is taken when the general process is translated into specific experimental terms by an operational definition. Milgram, for example, has operationally defined the concept of obedience as whether or not in the experiment the S continues to administer ever increasing levels of electric shock. Then by studying the conditions under which the specific behavior may be obtained, the ease with which it can be elicited, the percentage of individuals who obey, etc., he tries to investigate the generic problem of obedience. The validity of his findings for legitimate generalizations to nonexperimental contexts where the concept of obedience applies depends upon the appropriateness of the experimental situation and the adequacy of the operational definition -- questions central to the issue of ecological validity. Unfortunately, while the rules of statistical inference have received a great deal of attention in recent years, no such consensus exists about how to evaluate the ecological validity of research findings.

As a solution to the problem of ecological validity, Brunswik suggested running Ss with differing demographic, personality and I.Q. characteristics and extending the study to a wide variety of contexts. Milgram's work is of special interest because he, more than most other investigators, has systematically tried to vary both subject population and the institutional setting in which his studies are carried out, implicitly recognizing the crux of the issues confronting his work. These attempts do not, however, successfully deal with the two issues addressed in this discussion: the methodological problems common to all psychological deception studies and the unique social psychological attributes of the psychological experiment itself.

Problems of Deception Studies. Conceptually, the Milgram situation is closely related to the conformity situation developed by Asch. In his classic research on conformity, Asch (1952) placed Ss in a group situation, ostensibly to investigate perception. The Ss were required to carry out simple perceptual tasks such as judging the length of lines and to reach agreement about their perceptions. The stimuli were very ambiguous at first, and their perceptual qualities became more and more clear-cut. The situation was so devised, however, that there was only one real S while the other Ss were confederates. The confederates were instructed to agree on perceptions which were in fact inaccurate, and matters were so arranged that the actual S was required to make his judgment after most of the others. In the beginning, with stimuli which were very ambiguous, it was not difficult for him to agree but, as the experiment continued, he found himself confronted with agreement among his peers about perceptions that were clearly at odds with his own, an experience which many Ss found extremely frightening and disturbing. They were forced either to conform to group pressure and deny what they could plainly see or to maintain their perceptual judgment against the group.

Asch used this situation to explore the kinds of factors that determined the S's conformity response. Recognizing the importance of not allowing the S to suspect that the other Ss were actually confederates, he was careful to keep the situation plausible. The stimuli were chosen to be ambiguous at first, and only gradually was the S forced to recognize the increasing discrepancy between his perceptions and those of his peers. The extent to which Ss accepted the situation was checked by careful postexperimental interviews which allowed Asch to evaluate their degree of suspicion. In this way he was able to determine the limits within which the experiment had to be conducted without becoming obvious -- a formidable problem when Swarthmore students were used as Ss!

The development of a new paradigm of this kind is usually followed by a large number of studies using the technique in order to relate conformity to a wide variety of other parameters. While Asch himself paid close attention to the plausibility of his situation, later investigators showed less concern about this problem, often changing the perceptual stimuli abruptly and excessively. Rather than checking carefully whether Ss were taken in by the deception they tended to define conformity in simple behavioral terms, either omitting postexperimental discussion or carrying it out in a perfunctory fashion.

Unless a postexperimental inquiry is carried out with great persistence and sensitivity a "pact of ignorance" tends to develop (Orne, 1962). It is important to Ss that their experimental participation prove useful. If the S sees through the deception in an experiment, he may also realize that this might destroy the value of his performance. Since neither he nor the experimenter (E) wants to discard his data, their interests collaborate to make the S appear naive even after extremely transparent procedures.

The use of deception in social psychological studies has become extremely popular in recent years.1 It is obviously felt that deception is needed to make it possible to explore the process under investigation. Experimenters implicitly realize that Ss are active, sentient beings who are influenced not only by the immediate stimuli in the experimental situation but also by their symbolic meaning in a broader sense: the context in which the studies are carried out, their aims, purposes, and so forth. The deception, then, is an attempt on the part of the investigator to circumvent those cognitive processes of the S which would interfere with his research. When such an experiment is carried out, however, it is vital that the investigator determine whether it is the S or himself who is being deceived!

Milgram's studies use deception to create what seems to be a compelling conflict situation. Because of the ingenuity of the deception, the reader is drawn to assume with Milgram that Ss accepted the situation at face value. It is unfortunate that no data are presented to indicate whether this was in fact the case. Yet the extent to which the deception actually was accepted by the S determines how the results should be interpreted. Had quasi-control procedures been included in the obedience studies, they would have shed light on the adequacy of the deception manipulation, but in the absence of data on this crucial issue, the only way we can evaluate the experiment's plausibility is to analyze the congruence of the cues available to the S.

The Milgram Design. In all the variations of Milgram's obedience research, Ss are told that they are participating in an experiment to study the effect of punishment on memory. Each S always draws straws with the other S (confederate) and by "chance" ends up with the teacher's role. The other "S" is always strapped into an awesome-looking electric chair and both are always shown a shock generator by means of which shocks from a very low voltage up to 450 volts can be administered. This machine is made all the more awesome by labeling the high end of the scale in red letters, "Danger -- Severe Shock -- XXX."

1 For an excellent discussion of the ethical problems as well as other issues concerning the use of deception in psychological studies see Kelman, "On the Human Use of Human Subjects," 1967.

Before beginning the experiment a shock described as 45 volts is administered to the S, presumably so that he knows what the shocks are like. This, incidentally, is an extremely uncomfortable shock. The teacher's role is then explained to the S -- how to present a paired-associates learning task to the learner, how it is the teacher who determines the duration of the punishment, and how a higher level of shock must be administered for each failure. Initially the learner is able to answer correctly two out of three times, but he rapidly begins to fail more and more frequently, requiring the S to administer ever higher levels of shock to the victim. "Starting with 75 volts the learner begins to grunt and moan. At 150 volts he demands to be let out of the experiment. At 180 volts he cries out that he can no longer stand the pain. At 300 volts he refuses to provide any more answers to the memory test, insisting that he is no longer a participant in the experiment and must be freed" (Milgram, 1965, p. 60).

The E throughout this procedure remains calm and instructs the S to continue. If the teacher says that the learner is no longer responding and therefore there is no point in going on, he is told, "Treat the absence of an answer as equivalent to a wrong answer, and follow the usual shock procedure." If the S does not wish to do so, he is told, "You have no other choice, you must go on!" Regardless of the degree of agitation shown by the S or the anguished screams of the victim, the E remains imperturbable and demands that the S continue.

How Plausible Is This Paradigm? Ss participating in psychological experiments have considerable awareness of the implicit rules which govern the situation. They have learned to distrust the E because they know that the true purpose of the experiment may be disguised. Many Ss view their task as a problem-solving situation which requires them to determine the "real" situation and respond appropriately. This process has been analyzed elsewhere (Orne, 1962; Riecken, 1962). Of particular relevance here is that the S's perception of the purpose of an experiment will depend only in part on what he is told explicitly. He will then evaluate this information in terms of his prior knowledge, using whatever cues are available in the situation. These cues include not only the manner in which instructions are communicated but also scuttlebutt about the experiment, the setting in which it is carried out, the person of the E and, most important of all, the experimental procedure itself. The congruence of all of these cues with the instructions that the S is explicitly given will determine the plausibility of the experimental situation. When the procedure suggests one experimental intent and the explicit instructions another, what the S believes becomes difficult to determine and very slight changes in the procedure may lead to radical changes in the S's hypotheses and subsequent behavior. In a conflict situation when the instructions are at odds with other cues, the S is apt, however, to rely preferentially on those cues stemming from the experimental procedure because, as the old adage says, "Actions speak louder than words."

To successfully carry out a deception study is exceedingly difficult because subtle practical problems, often dealt with in some fashion by research assistants, assume crucial importance. In arranging the schedule, for example, the S may inquire whether he might be run at the same time as a friend. Considerable ingenuity is required to explain in a plausible fashion why no suitable time exists for the Ss to be run together in a study apparently requiring two Ss. The task of preventing Ss from communicating with each other is also formidable, especially in an experiment that makes such ideal cocktail party conversation. There are moreover innumerable subtle cues that can give away the true status of a confederate, stemming not only from the confederate's behavior but also from that of the E. (It is exceedingly difficult to treat the confederate and the S in a similar fashion.) In the absence of evidence it does not seem justified to assume that the performance was carried out flawlessly in each instance. Plausible deceptions are not easily achieved, but no hint of difficulties or S disqualifications appears in any of Milgram's reports.

Besides the myriad technical problems, even if we were to assume that everybody played his role to perfection, the experimental procedure itself contains serious incongruities. The experiment is presented as a study of the effect of punishment on memory. The investigator presumably is interested in determining how the victim's rate of learning is affected by punishment, yet there is nothing that he requires of the S (teacher) that he could not as easily do himself. Those Ss who have some scientific training would also be aware that experimental procedures require more care and training in administering stimuli than they have been given. The way in which the study is carried out is certainly sufficient to allow some Ss to recognize that they, rather than the victim, are the real Ss of the experiment.

The most incongruent aspect of the experiment, however, is the behavior of the E. Despite the movie image of the mad scientist, most Ss accept the fact that scientists -- even behavioral scientists -- are reasonable people. No effort is made to emphasize the world-shaking importance of the learning experiment; rather it is presented as a straightforward, simple study. Incongruously the E sits by passively while the victim suffers, demanding that the experiment continue despite the victim's demands to be released and the possibility that his health may be endangered. This behavior of the E, which Milgram interprets as the demands of legitimate authority, can with equal plausibility be interpreted as a significant cue to the true state of affairs -- namely that no one is actually being hurt. Indeed, if the S believes that the experiment is a legitimate study, the very fact that he is being asked to continue a relatively trivial experiment while inflicting extreme suffering upon his victim clearly implies that no such suffering or danger exists.

The incongruity between the relatively trivial experiment and the imperturbability of the E on the one hand, and the awesome shock generator able to present shocks designated as "Danger -- Severe Shock" and the extremity of the victim's suffering on the other, should be sufficient to raise serious doubts in the minds of most Ss.

Another Way to Conceptualize Milgram's Findings. In considering the incongruities of the situation, one may wonder how different this experiment is from the stage magician's trick where a volunteer from the audience is strapped into the guillotine and another volunteer is required to trip the release lever. The magician is careful to do a professional job of deception. He demonstrates that the guillotine will split a head of cabbage and allows the volunteer to satisfy himself about the genuineness of the guillotine. Though releasing the lever will lead to the apparently inevitable decapitation of the victim, he has little difficulty in obtaining "obedience" because the S knows full well that everything is going to be all right. This does not, of course, prevent the S from being somewhat uncomfortable, perhaps showing nervous laughter, when he is actually required to trip the lever, if only because such behavior is appropriate in this context.

The lawfulness which Milgram demonstrates in the relationship of obedience to physical proximity can be accounted for by the cues that different procedures communicate to the S -- albeit implicitly -- that things are not what they seem. Sixty-six per cent obedience is obtained when the victim is in another room and the only communication between him and the teacher is his banging the wall at 300 volts, ostensibly wanting to be let out. When we consider the most striking cues available to the S at this point -- the victim's protestations and the E's calmly continuing to take notes and ignoring these pleas -- it would seem more plausible to assume not that the E is some fanatic with a cause which justifies his behavior but rather that the E is still a responsible scientist whose behavior is clearly communicating that the alleged victim is not really a victim at all. One need not assume undue intellectual ability on the part of the S to postulate that he may be able to surmise the true state of affairs. To us the high rate of compliance would seem less due to the S's "putting the victim out of mind" (as Milgram suggests) than to the relatively low cue value of the victim's behavior when compared with that of the E's unswerving imperturbability. The addition of auditory feedback (the victim's screams) results in a remarkably small decrement in obedience -- to 62.5 per cent. We would suspect this is partly due to technical problems; namely, the victim tends to scream only in response to the immediate punishing stimuli while sitting quietly and continuing to cooperate in response to the verbal stimuli. Regardless of the quality of the screams, the situation is not very plausible. Apparently by using good actors in close proximity it is easier to convince Ss that the situation might be real. Here the degree of compliance drops to 40 per cent, and when the S is required to actually hold the victim's hand, to 30 per cent.

In contrast to many other investigators, Milgram is aware of some of the difficulties inherent in social psychological laboratory research and, for this reason, he used the ingenious modification of renting an office in downtown Bridgeport and making the research appear unrelated to a major university. Under these circumstances the cue value of the E's imperturbability would diminish and, as one might expect, the degree of obedience also diminishes. Even in this situation, however, the E is still carrying out an experiment in an apparently professional manner.

Relevant Data from Other Studies. The problems inherent in this research are illustrated in the longstanding controversy about whether Ss in hypnosis can be compelled to carry out antisocial actions. As early as 1889, Janet reported that before a distinguished group of jurists and medical men a deeply hypnotized patient stabbed individuals with rubber daggers, poisoned their tea with sugar and carried out any other type of murder or mayhem required of her. This demonstration was very impressive, and after the distinguished guests had left, the S was left to be awakened by students who wished to end the experiment on a lighter note. They suggested to the patient that she was alone, about to take a bath, and should undress. Her response to the suggestion was to awaken immediately, greatly disturbed. It is one thing to "kill" people during an experimental situation with means that cannot really do damage; it is quite another to be asked to undress in a context that transcends the experimental situation.

More recently, Rowland (1939) carried out a seemingly definitive experiment by showing that Ss in deep hypnosis could be compelled to carry out antisocial and destructive acts, such as throwing fuming nitric acid at a research assistant, a finding which was subsequently replicated by P. C. Young (1952). In both studies Ss were "obedient" in deep hypnosis, but when asked in the waking state whether they would carry out these actions, they indicated in horror that they would not. In a careful replication of these experiments, Orne and Evans (1965) found that five out of six deeply hypnotized Ss could in fact be compelled to carry out an action as antisocial as throwing fuming nitric acid at another individual. In addition, however, it was found that six out of six non-hypnotized individuals, who had been required to simulate hypnosis for a "blind" E, would also carry out these actions. Depending upon the degree of social pressure, moreover, various degrees of "obedience" were obtained from other groups of Ss who were merely asked to participate in a previously unspecified experiment. The crucial difference seemed to be that instead of asking Ss whether they would carry out the action, the E clearly communicated to them that they were to do so. In an experiment, when it is clearly communicated to the S that he is to carry out an action which appears very destructive and dangerous, it is thereby concurrently communicated that it will be safe to do so. In contrast, when the S is questioned as to whether he would carry out such an action, it does nothing to alter what is patently obvious -- that someone would be severely hurt -- thereby eliciting vehement denial.

Thus far we have only described the part of the experiment which is analogous to the seeming antisocial aspects of Milgram's work. More illuminating perhaps is that Ss were also required to carry out apparently self-destructive actions; in particular, to pick up and place in a bag a snake known to be poisonous and to remove a penny from the jar of fuming nitric acid with their bare fingers. In Milgram's terms, there was little trouble in eliciting obedience from our Ss. Our findings in this regard did not, however, lead us to conclude that outside the experimental situation Ss can be instructed to walk off the roof, in some other way to injure themselves or even to commit suicide. On the contrary, when we asked our Ss about their behavior, they clearly indicated that despite perceptual evidence to the contrary they did not have the slightest doubt that every care had been taken to protect them from serious harm. Without having to be told, Ss were quite aware of the reality constraints governing research in our society and, correctly we might add, assumed that as long as we really intended them to carry out these behaviors, we would have made certain no serious injury would befall anyone -- neither them nor our research assistants.

It was essential for the S to be in an actual subject-experimenter relationship in order to have him carry out these actions; despite repeated attempts, not one of our colleagues could be induced to attempt any of these acts. We would like to attribute their refusal not to a lack of trust in us, but to the absence of the experimental relationship that legitimized the behavior. Our colleagues were also convinced, much like the group of psychiatrists polled by Milgram, that few if any Ss would comply with the demands of the experiment. It would appear that even expertise in nonexperimental human behavior does not allow accurate prediction of what Ss will do in an actual experimental situation.

Holland (1967) extended Milgram's studies in an attempt to explore the demand characteristics of the situation. The substantive findings will be reported elsewhere and only some observations are relevant here. Initially it was intended to replicate precisely the verbal feedback condition where the victim refuses to answer further after having received a 300-volt shock. It became clear, however, that it would be necessary to add further auditory feedback in the form of moans or groans each time a substantial shock was administered because during the pilot studies Ss run in the original condition indicated after the experiment that they had been certain that the victim had either escaped or been released! Even with this added auditory feedback and meticulous attention to perfecting the deception aspects of the situation, three-quarters of the Ss run in the analog of Milgram's situation indicated that they did not really believe the deception when carefully questioned after the experiment.

Holland added two further groups of Ss in his study and a "blind" second E. The first E told one group that they were control Ss and suggested that there would be something "fishy" about the experiment in which they were about to participate. They were instructed not to "let on" but rather to act as though they were regular Ss. In this manner it was hoped to raise their level of suspicion, making it virtually certain that they would become aware of the incongruities of the situation. The other group was told before the experiment that the level of shock which would be administered to the victim was really only one-tenth of what was indicated on the shock generator. Though the experimental manipulation made certain that these latter two groups would be relatively aware that things were not as they seemed, their obedience behavior was not significantly different from that of the group run under the Milgram condition. Other aspects of their behavior also failed to separate these groups: Ss exposed to each treatment appeared to become uncomfortable and show signs of severe anxiety conflict. As a result, the "blind" E was unable to judge at better than chance levels the experimental group to which Ss belonged. These findings suggest that Ss run in the Milgram condition could easily have become aware of the deception and still behave as if they were naive. Neither the extent of the S's obedience nor his objective signs of discomfort necessarily reflect what he experiences. Thus, in the final post-experimental inquiry it became clear that much of the S's disturbed behavior was purposive and occurred because the individual felt that such behavior was demanded by the situation.

It would seem that the simple behavioral response of obeying instructions cannot tell us much about why the S obeys. We are dealing with a highly complex situation where Ss may perceive that no real hurt is being inflicted upon anyone and yet not be certain what constitutes the desired response on their part. They are placed in a dilemma where the only definitely appropriate response seems to be discomfort. Whether continued obedience or conformity is seen as the successful response seems to depend upon many as yet obscure and subtle aspects of the total situation.

It seems that in the Milgram-type experiment one may encounter two groups of Ss that do not necessarily differ in their overt behavior. Usually the majority of Ss assume that the situation is essentially safe while a much smaller group may accept the situation at its face value. Logically these two groups of Ss take part in quite different experiments. Obedience by the first group that is (correctly) convinced the situation is essentially safe allows inference about what they would do in other experimental situations but not about what they would do outside of such a context. For the experiment to have any significance for other contexts, it is essential that the Ss believe in its reality. To ignore the Ss' perceptions and merely focus upon their overt behavior by saying it matters only whether the S is in fact obedient according to an arbitrary operational definition ignores a vital issue: To what are they obedient? Only by answering this question can an experiment of this kind have broader meaning.

The Psychology of the Psychological Experiment. For most psychological studies involving deception it is sufficient to make certain that the situation is accepted, but Milgram's paradigm raises an additional problem. There are some issues which cannot readily be examined in an experimental context since they are context-specific or at least context-related. This is particularly true when an experimental situation is used to study compliance or obedience.

Milgram appropriately points out that we expose our neck to the barber and remove our shoes in the shoe store because these constitute legitimate requests. In everyday life the individual is able to determine what constitutes a legitimate request and rather clearly defined, implicit rules govern what one individual may ask of another. However, the agreement to participate in an experiment gives the E carte blanche about what may legitimately be requested. In asking the S to participate in an experiment, the E implicitly says, "Will you do whatever I ask for a specified period of time? By so doing you may earn a fee, contribute to science, and perhaps even learn something of value to yourself. In return I promise that no harm will befall you. At the completion of the experiment you will be no better or worse off than you are now and though you may experience temporary inconvenience, this is justified by the importance of the undertaking." A corollary to this agreement is that the S may not ask why certain things are required of him. He must assume that these actions are legitimate and appropriate for the needs of the experiment.

The S's willingness to comply with unexplained or unreasonable requests in an experimental context does not permit inference to be drawn beyond this context. For example, a study required Ss to carry out a boring and tedious task -- serial addition. After completing each page of work Ss were instructed to destroy their own product and continue working (Orne, 1962). To our surprise, Ss were willing to carry out this task for long periods of time and do so with a high rate of speed and accuracy. Anyone who believes direct inference about obedience in real life can be drawn from an experimental context should ask his secretary to type a letter, and after making certain there are no errors, ask her to tear it up and retype it. With rare exceptions, two or three such trials should be sufficient to ensure that the E will require a new secretary! It should be noted that the activity required of the secretary is no different in kind from what she is normally required to do. She is paid for the work at her usual rate, no one is hurt and yet as long as she has an option, there is little question about her behavior in real life. Incidentally, the same individual would likely be "obedient" if she had agreed to participate in an experiment. 2

The S's unquestioning compliance with the E's requests depends in part upon his awareness of the total experimental situation and the safeguards built into it. The Milgram paradigm runs, therefore, into an inevitable paradox. There are some things which the S would not do but these are behaviors that he knows the E cannot require of him. Therefore, when the E asks him to carry out an action which would lead to serious harm either to himself or to someone else and communicates that the S is intended to carry out these actions, he inevitably also communicates that these actions will not lead to their apparent consequences. That the S will in an experiment carry out behaviors that appear destructive either to himself or others reflects more upon his willingness to trust the E and the experimental context than on what he would do outside of the experimental situation.

2 A secretary would regard this kind of request as intolerable only in the context of her continuing activities but would be relatively untroubled by such a request if it was "episodic" in the sense of Garfinkel (1967) and legitimized in an appropriate fashion, e.g., necessary for science.

It can be argued that Ss will carry out behaviors which appear dangerous and that an unscrupulous investigator could utilize this to inflict serious harm on either the S or other individuals. This is, of course, true but lest we be concerned about breeding a nation of sheep who will unsuspectingly carry out dangerous actions, it is well to remember that in our complex but still basically cooperative society it is impossible to function without trust in reasonable situations. We take our car to have the brakes repaired and assume without personally checking that they have been put back together properly; we take a plane without personally subjecting the pilot to physical examination, trusting in his competence and soundness of mind; when a physician prescribes green pills, we blithely assume that the medication is for our good although, of course, if he had chosen to give us arsenic we would have taken it and died. Therefore, a demonstration that a situation that can legitimately be expected to contain all possible safeguards for the participants could conceivably be perverted cannot at the same time be used to prove that man is either gullible or, when on guard, easily deceived. Unfortunately, as important as the problem of obedience is in modern society, it is unlikely to be resolved by using the psychological experiment as a tool in a situation where the S can recognize and define it as such.

Conclusion. Ignoring the questions we have raised thus far, one might try to set up a situation where Ss must stop being obedient; for example, having the victim complain of heart trouble and showing signs of having a coronary attack. Certainly no learning experiment would justify continuing at the risk of bringing about the S's death. It is difficult under these circumstances to imagine either an E calmly saying, "Continue -- you must go on -- the experiment requires that you continue" or a S actually continuing to administer shock to a victim who had passed out after complaining of chest pain. If some other investigator reported such a caricature of the Milgram situation it might be considered a scientific practical joke. A finding that in the face of an apparent coronary by the victim, Ss continued to administer shock would have to be explained by assuming that Ss did not believe in the reality of the events and therefore would most likely be dismissed as a poorly executed piece of laboratory work. Yet it is Milgram himself who reports such findings! The fact that he elicits obedience from a significant proportion of Ss even under the threat of an impending heart attack must throw serious doubt on the manner in which the deception was handled in all of his other studies. Thus, by pushing the psychological deception experiment ad absurdum, Milgram forces us to come to terms with issues of ecological validity.

What can be said about the ease with which man can be forced to abuse his fellow man? To show that Milgram's empirical findings do not allow him either to prove or disprove his conclusions does not help us to draw any meaningful inferences about the true nature of man. Rather, the news media, more validly and, alas, more eloquently than any experimental data, attest to the scope of the problem and the urgent need to understand the forces that govern violence. Appropriate ecologically valid techniques must be developed to study this and related problems. The difficulties of research in these areas do not mean that we can afford to abandon either the scientific method or the experimental technique; rather, to attack some problems we will have to devise experiments that are not recognized as such by the Ss.3 In doing so, moreover, we must make certain that this is in fact the case while being careful to keep in mind the ethical strictures which must govern research in a free society.

3 For an extended discussion of these issues, see Orne (1962); Orne (in press). A particularly elegant example of a method which allows systematic scientific study without the S's recognizing that he is participating in an experiment is the lost-letter technique developed by Milgram (1965).

Milgram's studies in obedience, though they fail to provide a viable model for the scientific investigation of violence, are nonetheless a milestone in social psychology. By demonstrating that it is possible to stay within currently accepted scientific conventions and yet push the psychological experiment beyond its limits, the obedience studies force us to consider what these limits are and hasten the day when issues of ecological validity will receive the kind of careful attention currently devoted to statistical inference. Finally, Milgram has dared to attempt the systematic scientific study of an urgent problem currently facing our society. While new means will be required for this purpose, by focusing on vital issues Milgram has provided new impetus to an exciting field.

References

ASCH, S. E. Social Psychology. New York: Prentice-Hall, 1952.

BREHM, J. W. and COHEN, A. R. Explorations in Cognitive Dissonance. New York: Wiley, 1962.

BRUNSWIK, E. Systematic and Representative Design of Psychological Experiments with Results in Physical and Social Perception. (Syllabus Series No. 304) Berkeley: Univ. of California Press, 1947.

GARFINKEL, H. Studies in Ethnomethodology. New York: Prentice-Hall, 1967.

HOLLAND, C. H. Sources of Variance in the Experimental Investigation of Behavioral Obedience. Unpublished Doctoral Dissertation, Univ. of Connecticut, 1967.

JANET, P. L'automatisme psychologique; essai de psychologie experimentale sur les formes inferieures de l'activite humaine. Paris: Alcan, 1889.

KELMAN, H. C. Human Use of Human Subjects: The problems of deception in social psychological experiments. Psychol. Bull., 67:1-11, 1967.

MANVELL, R. and FRAENKEL, H. The Incomparable Crime. New York: Putnam's, 1967.

MILGRAM, S. Some conditions of obedience and disobedience to authority. Human Relations, 18:57-76, 1965.

MILGRAM, S., MANN, L., & HARTER, SUSAN. The Lost-Letter Technique: A Tool of Social Research. Publ. Opin. Quart., 29:437-438, 1965.

ORNE, M. T. Antisocial Behavior and Hypnosis: Problems of Control and Validation in Empirical Studies. In G. H. Estabrooks (Ed.), Hypnosis: Current Problems. New York: Harper & Row, 1962. Pp. 137-192.

ORNE, M. T. On the Social Psychology of the Psychological Experiment: With Particular Reference to Demand Characteristics and Their Implications. Amer. Psychologist, 17:776-783, 1962.

ORNE, M. T. Demand Characteristics and Quasi-Controls. In R. Rosenthal and R. Rosnow (Eds.), Artifact in Social Research. New York: Academic Press, in press.

ORNE, M. T., and EVANS, F. J. Social Control in the Psychological Experiment: Antisocial Behavior and Hypnosis. J. Pers. soc. Psychol., 1:189-200, 1965.

RIECKEN, H. W. A Program for Research on Experiments in Social Psychology. In N. F. Washburne (Ed.), Decisions, Values and Groups. New York: Pergamon Press, 1962. Vol. 2. Pp. 25-41.

ROSENTHAL, A. M. Thirty-Eight Witnesses. New York: McGraw-Hill, 1964.

ROWLAND, L. W. Will Hypnotized Persons Try to Harm Themselves or Others? J. Abnorm. Soc. Psychol., 34:114-117, 1939.

YOUNG, P. C. Antisocial Uses of Hypnosis. In L. M. LeCron (Ed.), Experimental Hypnosis. New York: Macmillan, 1952. Pp. 376-409.


The preceding paper is a reproduction of the following article: Orne, M. T., & Holland, C. H. On the ecological validity of laboratory deceptions. International Journal of Psychiatry, 1968, 6, 282-293. It is reproduced here with the kind permission of Jason Aronson -- An imprint of Rowman & Littlefield Publishers, Inc.