Techniques for creating and using web questionnaires in research and teaching

Jonathan Baron and Michael Siepmann
University of Pennsylvania

Jonathan Baron's web page is at http://www.sas.upenn.edu/jbaron/. He can be contacted by email at baron@psych.upenn.edu, by mail at Department of Psychology, University of Pennsylvania, 3815 Walnut St., Philadelphia, PA 19104-6196, USA, or by telephone at 215-898-6918.


In this chapter, we describe techniques and procedures we have used for putting questionnaires on the web, with particular emphasis on the technical details. Before moving to the web, we used pencil and paper questionnaires. Subjects came to a room that was open at certain hours, and they were paid on the spot for completing questionnaires. Almost all the subjects were students. We have been doing studies on the Web for almost two years as of this writing, and have completed over 130 studies in that time, each typically involving collection of data from about 50 subjects over a period of a few weeks. We monitor the results as they come in, and frequently redesign studies that subjects seem to misunderstand. The first author keeps a list of subjects, which now contains over 700, each of whom has completed an average of about nine studies. All of our research is now done on the web, except research with collaborators who use other methods. The topics include utility elicitation for cost-effectiveness analysis in medicine, moral judgments, judgments of fair allocation of resources, risk, and subjective estimates of bias in probabilistic beliefs. Our questionnaires are available on the web at: http://www.psych.upenn.edu/baron/qs.html.

The use of the Web for research is a topic of some recent interest. The NetLab Workshop (NSF, 1997) recommended encouragement of just this sort of effort. One interesting feature of the web is that it encourages altruism. That is, people are willing to put in a little bit of work to put something on the web, without remuneration, but knowing that millions of people (the numbers growing by the minute) will have access to it. This was a feature of the Internet even in the days before web browsers, but browsers and search engines have made things easier to find. A great deal of web content may be freely copied and reused.

Much of experimental psychology involves presenting subjects with visual or auditory information and asking them to respond manually. Almost any study of this type can be done on the web. Once an experiment is available on the web, others can be invited to copy it and modify it for their own use, although this hasn't happened much so far. The first author's general experimental procedure (which is continuing to evolve) is to randomize the stimuli separately for each subject. Each session contains 20-100 judgments, depending on their difficulty, and pay ranges from $1 to $6, again, depending on the difficulty and time required. In some cases, he presents all the stimuli in one condition before all the stimuli in another condition, and the stimuli are randomized within each condition. Typically, each item is presented separately, in such a way that the subject cannot back up. One current study provides practice items, which teach the subject how to make matching judgments by alternating between numbers that are too high and too low. All designs are within-subject, and the order of major conditions is usually either randomized or counterbalanced. Some advantages we have found web questionnaires to have over paper ones are:

The subjects are more varied, and special efforts can be made to solicit particular groups of subjects. Diversity on the Internet increases each year, as more people use the web.

It is easier to make sure that subjects answer all the questions, give answers in the appropriate range, and meet minimal conditions on ordering.

Interactive questioning can be programmed so that it can be described clearly and replicated. In this regard, the method is similar to computer assisted telephone interviewing, but it is much less expensive. Included here are error checking, challenges to past responses, and use of practice trials with feedback about them.

Some forms of "cheating," such as completing the same study twice, are more difficult than when subjects are paid cash on the spot in a lab, because duplicates can be detected before payment is sent. In our research, subjects must provide their name and address (and, for Americans, their social security number) in order to be paid.

Although random sampling of a population is impossible, as it is in other methods based on convenience samples, the sample is more varied than those that use solely students. Typically, now, most of our respondents are not students, and most are female, despite fears that the web is male dominated. About 10% are not from the U.S., an additional source of variety. Also, we can tell whether any studies are frequently begun and not completed, which could cause a sampling problem, by inspecting access logs on the web server computer. So far, this has not been a problem. Most are completed once begun.

Costs of data entry, supervision of subjects, space for completing questionnaires, paper, printing, and paper storage are eliminated. As a result, the same research can be much less expensive.

The time between planning a study and analyzing data can be reduced considerably, both because more subjects can complete a questionnaire per day (since the web "lab" is open all hours, and can accommodate an unlimited number of subjects), and because the data entry phase is eliminated.

The biggest disadvantage is that, as in other research of this sort, some subjects give little thought to the questions. Error checking can discourage such subjects. (It is annoying to be told repeatedly that your responses are inappropriate.) Even without such checking, we have found the number of nonsensical answers to be lower than before we started using the web.

Do people give different responses on web and paper questionnaires?

In our own research in the field of judgment and decision making, we have several times used both web and paper questionnaires as part of the same series of experiments. In most cases, we used web and paper for different experiments in the same series of experiments. We got the same kinds of effects but could not do any direct comparisons of the magnitude of these effects. Our subjects for paper questionnaires were recruited either from undergraduate courses or from a laboratory that was open during specified hours, in which subjects could complete questionnaires for pay. The laboratory was advertised on the University of Pennsylvania campus and most subjects were students. Our subjects for web questionnaires were recruited initially by posting notices to Usenet newsgroups relevant to the studies being done (such as sci.environment and sci.psychology). Other web subjects came from a nearby college, which required students in introductory psychology to participate in an experiment and counted our studies as meeting the requirement. As word of our web studies spread, several people put links to our site in their web pages, and others discovered our site through search engines. In a couple of cases we have asked people to put links to our site on their web pages.
A study by Baron, Hershey, and Kunreuther (1998) allowed direct comparisons of web and paper questionnaire responses to very similar questionnaires. Of interest here is their Experiment 2, which compared a web questionnaire completed by 49 subjects with a paper questionnaire completed by 42 residents of West Philadelphia in the context of a face-to-face interview. The purpose of this study was to look at the determinants of the desire to reduce risks. For each of 32 different risks, such as "cancer from pesticides" or "bacterial infections from drinking water", subjects answered several questions. The main question was about priority for action. It asked: "If you had more money to spend, which of these risks would you spend it on? Circle the priority you would give to each risk: Hi=high, Med=medium, Lo=low, No=no money at all." There were no significant differences between web and interview subjects in any variables, with one exception. On the web, the difference between the judged probability of the risk for the average American and for the subject was greater (in the direction of lower for the subject) than on paper.
Closer examination of the questionnaires suggested a reason for this. In the paper version, each risk had its own row, and each question had its own column. The "average American" and "subject" answer columns were side by side. Thus, subjects found it easy to give the same answer to both questions, and many did so. On the web page, the two answer columns were presented sequentially rather than side by side, and the subject first answered the questions about average risk for all 32 items, and then the questions about the subject's own risk. This suggests that using the web can affect results indirectly if it affects the form of presentation. (Presenting answer columns side by side on the web is possible, as explained below, but requires more complex techniques than presenting them sequentially.) Using the web also makes it more difficult to hold the form of presentation constant across subjects. For example, many browsers allow users to ignore the colors specified in a web page and choose their own color scheme. Thus, a questionnaire that on paper would be in black ink on white paper for everyone could potentially be in green "ink" on orange "paper" for one subject, in pink on blue for another, and so on. Different computers have different screen sizes and resolutions, so it can be important to test one's questionnaires at different resolutions and try to make them as similar as possible on a 640 by 480 pixel display, for example, as on a 1280 by 1024 pixel display. (Lynx, one of the earliest browsers, has a purely text based interface which is so different from the interfaces of graphical browsers that it would probably be best either to have all subjects use Lynx to complete questionnaires specifically designed for Lynx, or to have no subjects use Lynx.)

Using web questionnaires in teaching

Use of forms in teaching

Web questionnaires make it much easier to have students collect and analyze data, as a teaching exercise. The first author has used web questionnaires in a research experience course on value measurement and decision analysis, and in his undergraduate course on thinking and decision making. Students learn about the topic both by filling out the questionnaires, and by analyzing the data they collectively produce, which are made available to them on the web in a Systat (SPSS Inc., Chicago, IL) file after everyone has completed the questionnaire. They also learn about the possibility of doing research on the web.
Some of the exercises concern utility elicitation. Students find "conjoint analysis" particularly interesting, and web forms using JavaScript (discussed below) are ideal for this method. The questionnaire involves rating a series of objects (e.g., cars) described one by one. The objects vary in a few attributes, such as price, safety, and repair record, with a few levels of each. The idea is to analyze the ratings of the objects to infer the utilities of the attributes. A statistics program such as Systat is used to derive an equation that predicts the ratings from the levels of the attributes. The equation assumes (a) that the ratings of the objects are monotonically related to the utilities of the objects, (b) that the utility of each object is an equally weighted linear combination of the utilities of the object's attributes, and (c) that the utility of an attribute is a function of its level on its dimension.
Examples of programs are available in the course syllabi in the first author's web page, http://www.sas.upenn.edu/jbaron, but those programs are constantly changing. A brief introduction to conjoint analysis in Systat is available at http://www.psych.upenn.edu/systat/CJTEST.SYC and, for analysis across subjects, at http://www.psych.upenn.edu/systat/conjoint.htm.

Making web questionnaires

Below we discuss techniques for making web pages using (a) HTML only, (b) HTML and JavaScript, and (c) HTML, JavaScript, and Java (briefly). HTML (hypertext markup language) is the original language of web pages on the World Wide Web. There are many good references to HTML on the web itself. We will assume that the reader is familiar with them. (See http://www.psych.upenn.edu/cattell/manual.html for a list of these.) JavaScript is a programming language that is built into all the major browsers: Netscape, Internet Explorer, and Opera. (The version of JavaScript in Opera 3.5 does not work for the methods described here. Version 4 of Opera is expected soon.) A JavaScript program or "script," written as part of a web page, is downloaded along with the rest of the page and is executed by the browser. Most browsers can also run Java programs. Java is a much more powerful language. The second author has used it in one questionnaire, and we discuss it briefly below.
Before we discuss JavaScript in more detail, we should mention that many of our comments are of the form, "it is good to...". JavaScript is a wonderful idea, but it is not consistently implemented across different browsers, or even sometimes across the same browser on different operating systems. It "should" work if you merely follow the rules, but does not always do so. It will work most of the time if you follow the advice we give. There may be additional tricks that we haven't learned yet, and we do get occasional complaints about things not working. Flanagan's (1997) book is very helpful not only in explaining the rules but also in explaining some of the tricks.
Part of the problem with JavaScript is that the politics of commercial competition have led to different versions of the language. For example, if you want to write for both Microsoft Internet Explorer and Netscape Navigator you have to resort to a lowest common denominator. Each company has added commands in an effort to persuade people to use their browser, and to encourage the development of web pages compatible with their own but not their competitor's browser. Moreover, these commands appear in later versions but not earlier ones. It is possible to have multiple versions of Netscape Navigator installed on a single computer, allowing one to test one's questionnaire on each of them. Unfortunately it is not possible, at least on the Windows 95 operating system, to have both versions 3 and 4 of Microsoft Internet Explorer installed simultaneously. However, we have been told that this is possible on operating systems other than Windows. Clearly, the wider the variety of operating systems and browsers you have access to, the more thoroughly you can test your questionnaire. However, even if you have the ability to test on many combinations, it may not be worth the time and effort. If a questionnaire works in an older browser like Netscape 3, for example, the chances are good that it will work on most other browsers commonly used today. It may be most efficient to test on Netscape 3 and a version of Internet Explorer, and then find out if problems are occurring on other browsers or operating systems by asking subjects to email you if they encounter problems.
All of the first author's questionnaires work on Netscape 3 and above, and so far he has encountered no browser-specific problems. An alternative approach, discussed by Flanagan (1997), is to have the questionnaire detect which browser the subject is using and then use its capabilities. This complicates programming considerably, and making a questionnaire differ visibly depending on the browser used is typically a bad idea in a research context. However, this approach could be used in ways that do not visibly affect the questionnaire. For example, data about mouse movements could be collected from the subset of subjects using more sophisticated browsers, while keeping the questionnaire usable by, and externally identical for, subjects with less sophisticated browsers.

Making web questionnaires using HTML

Simple questionnaires

The simplest sort of questionnaire uses a form, and everything else in this chapter builds on this basic idea. The HTML <form> tag that begins a form can include the name of a "server-side" program that does something with the data entered into the form when the subject submits it, typically by clicking a "submit" button. (A "server-side" program is one which runs on the server computer on which the web page resides rather than on the client computer on which the web page is being viewed.) The following, for example, names a program called "mailform":

<form method=POST action="/cgi-bin/mailform?baron@psych.upenn.edu">

Our mailform program (which was written by M.-J. Dominus of the University of Pennsylvania) sends the data in an email message to the address after the question mark, in the form of name=value pairs, such as q5=45, where q5 is the name of a form element and 45 is the value assigned to it. Typically, the name is something you make up for the question item (e.g., q5 for question 5) and the value is the subject's answer to that question. Mailform sorts the name=value pairs by name and puts each pair on a separate line.
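To illustrate the name=value format just described, here is a minimal sketch of how the body of such an email could be turned back into a set of named answers for analysis. It is not part of mailform; the function name parseMailformBody and the sample values are hypothetical, and the sketch assumes exactly one pair per line.

```javascript
// Hypothetical helper: parse the body of a mailform-style email into an object.
// Assumes one name=value pair per line, as described in the text.
function parseMailformBody(body) {
  var data = {};
  var lines = body.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var line = lines[i];
    var eq = line.indexOf("=");
    if (eq < 1) continue; // skip blank lines and lines without a name
    var name = line.substring(0, eq);
    var value = line.substring(eq + 1); // any later "=" stays in the value
    data[name] = value;
  }
  return data;
}

// Illustrative input only; real names depend on the questionnaire.
var example = parseMailformBody("q5=45\nq6=12");
console.log(example.q5);
```

Splitting at the first equals sign only means that an answer which itself contains "=" is kept intact.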
Mailform is written in Perl (Wall, Christiansen, & Schwartz, 1996), but many languages can be used for such programs. Schmidt (1997) discusses server-side programs more complex than mailform which can accomplish much of what we use "client-side" JavaScript to accomplish (see below). However, server-side programs must be carefully written in order not to make the server vulnerable to break-ins, so it is best to consult your local system administrator for advice about which program to use. For the same reason, client-side JavaScript is probably a better choice when writing your own programs.
It is also possible to put a web server on a dedicated desktop computer connected to the internet. Servers are available free from http://www.apache.org for most platforms. If the computer is on your institution's network, however, this approach is still subject to policies of your institution concerning security and privacy.
If you do not have access to a web server on which you can run a suitable server-side program such as mailform, you could try a <form> tag like the following:

<form method=post action="mailto:siepmann@psych.upenn.edu" enctype="text/plain">

In Netscape browsers configured to send email, and perhaps in other browsers, this will result in the browser sending you the data as an email message from the subject (or whoever the browser is set up to send email from) with name=value pairs like those sent by our mailform program. However, a major problem with this method is that whether subjects' responses reach you is determined by details of the subject's browser and how it is set up. If this method is your only option, you should probably require subjects to complete a test form to check whether their browser can actually get data to you. They should be warned not to spend time filling out your real questionnaire until you have emailed them to confirm you received their test form data. Otherwise, you could be faced with large numbers of emails saying "Did you get my data? When will I be paid?" from subjects whose data never reached you.
The form ends with a </form> tag. In between the beginning and end tag are text, images, and anything else that can go in a web page. Critically, though, certain other elements are recognized in this context. One is a hidden input tag, such as: <input type="hidden" name="_qnaire" value="medu16b">. This does not appear on the web page, but results in the name=value pair _qnaire=medu16b appearing in the email sent by the mailform program, to identify which questionnaire the data are from. Putting an underscore at the beginning of the name causes it to be listed at the top of the email, before any names that start with a letter, to make it easy to find.
A very useful element is for inputting text, e.g.: <input type="text" name="q5" size=10 maxlength=20>. This presents an input box 10 characters wide that allows an answer of up to 20 characters. If the subject types more than 10 characters, the text window scrolls to the left to accommodate this. We give complete examples including text inputs later.
Buttons are useful in conjunction with JavaScript, which we discuss later. They do something when the subject clicks on them. For example, the following two buttons call a JavaScript function called "prac(x,y)" which changes a displayed number in order to change the relative utilities of two options, A and B.

<input type=button value="A is worse now." onClick="prac(0,3);">
<input type=button value="B is worse now." onClick="prac(1,3);">

For example, A might be paralysis of both legs and B might be a specified probability of death. The "value" is displayed on the button and subjects click on whichever button describes how they feel. Each click causes the prac(x,y) function to adjust the probability of death to get closer to the subject's indifference point. The parameters x and y tell the function prac(x,y) which button was pressed and which practice trial is being done, respectively. Subjects stop clicking when they are indifferent between A and B, because then neither button describes how they feel.
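The adjustment idea can be sketched as follows. This is not the authors' prac(x,y) function, whose code is not shown here; the name makeAdjuster, the starting probability, the step size, and the rule of halving the step whenever the subject reverses direction are all assumptions made for illustration.

```javascript
// Hypothetical sketch of an adjustment routine in the spirit of prac(x,y).
// Not the authors' code: start value, step size, and halving rule are assumptions.
function makeAdjuster(startProbability, startStep) {
  var probability = startProbability; // probability of death shown for option B
  var step = startStep;
  var lastButton = null;
  return function adjust(button) {
    // button 0: "A is worse now." -> make B worse by raising the probability
    // button 1: "B is worse now." -> make B better by lowering the probability
    if (lastButton !== null && button !== lastButton) {
      step = step / 2; // the subject reversed direction, so take smaller steps
    }
    probability += (button === 0 ? step : -step);
    probability = Math.min(1, Math.max(0, probability)); // keep within 0 to 1
    lastButton = button;
    return probability;
  };
}

var adjust = makeAdjuster(0.5, 0.25);
console.log(adjust(0)); // subject says A is worse
console.log(adjust(1)); // subject says B is worse
```

Under these assumptions, alternating clicks move the displayed probability by ever smaller amounts, so the value settles near the point at which neither button describes how the subject feels.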
Finally, the subject must do something to submit the form. The usual way to do this is to include a submit button immediately before the </form> element that ends the form, for example:

<input type=submit value="Click here to submit your answers.">

Other kinds of form elements are radio buttons, checkboxes, lists, and text areas. In some browsers, radio buttons, checkboxes, and lists can only be used with the mouse, and even when they can be used with the keyboard, they are usually better suited to the mouse. It is worth avoiding intermixing large numbers of them with text inputs, since this forces subjects to keep moving back and forth between the keyboard and the mouse. With text inputs, the user can simply type letter keys and then press the Tab key to move from one input field to the next.
Text areas allow several lines of text. If carriage returns are included, the mailform program will not remove them and will spread the data over several lines, like this:

comments=I found this

It is usually undesirable to have carriage returns in a response because each response of each subject should fit in a single cell of a spreadsheet or a single variable in a statistics program. One simple solution is to provide several single-line text inputs instead of a text area. However, text areas may have advantages for open-ended responses since subjects can type and edit their responses freely the same way they do in email or wordprocessor programs. Later we show how to use JavaScript to get rid of carriage returns in text area responses.

Separate display areas:


It is possible with HTML to divide the browser window into several panels, each called a frame. Each frame can contain a separate HTML document. Frames are useful when you want to have the subject answer several questions about each of several items. In the Baron et al. (1998) study mentioned earlier, subjects were presented with a list of 32 risks (ranging from auto accidents to asteroids hitting the earth) and were asked several questions about each. Each question required a scale. For example, the probability questions used the following scale:

4A. What is the lifetime risk of the average American family, for each of the risks listed? Answer with a letter, except for L (less than 1 in 100,000); in that case, write in a decimal or fraction. When something can happen more than once, we want the probability that it happens at least once.

A    1 in 1          Certain to happen
B    1 in 3          A 80 year old dying by 85
C    1 in 10         A 65 year old dying by 70
D    1 in 30         A 51 year old dying by 56
E    1 in 100        A 35 year old dying by 40
F    1 in 300        A 20 year old dying by 23
G    1 in 1,000      A 20 year old ... in the next year
H    1 in 3,000      ... in the next 4 months
I    1 in 10,000     ... in the next 5 weeks
J    1 in 30,000     ... in the next 2 weeks
K    1 in 100,000    ... in the next 4 days
L    less            (specify the probability)

The scale for each question appeared in the right frame, while the risks and the input fields appeared in the left frame. This allowed the subject to see the scale easily while answering the questions, and for their previous answers on the left to scroll off the screen without causing the scale on the right to move.
Implementation of two frames requires three documents. One is called a frameset document and defines the frames. Its only visible content is its title. The other two documents are displayed in the frames. Their titles are not shown, so they generally do not need titles.
In the Baron et al. (1998) study, the frameset document for the questionnaire was:

<html>
<head><title>Questionnaire, prot1</title></head>
<frameset cols="50%,50%">
<frame src="prot1ans.htm" name="answers" scrolling="yes">
<frame src="prot1qus.htm" name="questions" scrolling="yes">
</frameset>
</html>

The frameset element specified that each frame was a column taking up half of the screen. The two files prot1ans.htm and prot1qus.htm were the answer and question documents to be displayed in the left and right frames, respectively. They were separate files in the same directory of the server computer containing the web page. After each set of risks in the answer document, a link was used to bring the next question into view in the frame on the right. The following example shows only the last of a set of risks:

Cancer from food additives <input type=text size=10 name=q12>
<a target=questions href="prot1qus.htm#ques3">
Click here to see question 3; then respond here:</a>

The target=questions part specifies that clicking on the link should affect the frame on the right, which is named "questions" in the frameset document, rather than the left frame which the link itself is in. The #ques3 part makes the link point to the top of question 3, marked with the tag <a name="ques3"> in the question document.

Putting answer columns side by side

It is easy to put columns of answer spaces so that they display side by side, simply by putting them in two columns in a table. However, pressing the tab key will move the cursor to the next answer space to the right, rather than the next one down. This is fine if subjects are supposed to answer one row at a time, but very undesirable if subjects are supposed to complete one column at a time. It is possible to make pressing the tab key move the cursor down by implementing each column as a separate table, embedded within the main table. The following example has two columns and two rows. The first <TR> starts the top row, which contains headings. The second <TR> starts the second row. The first cell of the second row contains the text of the first item. The second cell spans two rows and contains a table within it. That table has two single-cell rows containing the input boxes for questions A1 and A2. The third cell of the second row is just like the second but contains the input boxes for questions B1 and B2. The last <TR> starts the third row. The first cell of the third row contains the text of the second item. It is the only cell of the third row, because the second and third cells of the second row span two rows, so they intrude into the third row.

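<TABLE>
<!-- Heading text is a placeholder -->
<TR><TH>Item</TH><TH>A</TH><TH>B</TH></TR>
<TR><TD VALIGN=TOP HEIGHT="100">1. An item.</TD>
<TD ROWSPAN=2><TABLE>
<TR><TD><INPUT type=text size=3 name=A1></TD></TR>
<TR><TD><INPUT type=text size=3 name=A2></TD></TR></TABLE></TD>
<TD ROWSPAN=2><TABLE>
<TR><TD><INPUT type=text size=3 name=B1></TD></TR>
<TR><TD><INPUT type=text size=3 name=B2></TD></TR></TABLE></TD></TR>
<TR><TD VALIGN=TOP HEIGHT="100">2. Another item.</TD></TR>
</TABLE>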

Having subjects allocate themselves to different versions

It is often desirable to have different subjects complete different versions of a questionnaire, for example to implement a between subjects manipulation or to counterbalance the order of questions in a within subjects design. The simplest way to do this is to have subjects select a version on a basis that approximates random assignment, for example whether they were born on an odd or even day of the month. This method is often quite adequate, but sometimes it may be undesirable for subjects to be aware that there are multiple versions, and it is harder to think of a simple approximately random basis for allocating subjects to more than two versions. To randomly present different versions without involving subjects in the process requires JavaScript, to which we now turn our attention.

Making web questionnaires using JavaScript and HTML

Transparently allocating different subjects to different versions

With JavaScript one can randomly allocate subjects to multiple versions of a questionnaire without having to involve them in the process. The following routine assigns each subject to one of three versions, which differ in the text assigned to the variable text. The variable random0to1 is set to a random number between 0 and 1 using the time in milliseconds, because although JavaScript has a random number function, it does not work on all browsers. The division by 10000 gets rid of trailing zeros, and the %1 gets the remainder after dividing by 1, which is the fractional part. If you repeatedly reload this example in your browser, you will see the different values of random0to1 displayed. The Math.floor expression turns the fractional number from 0 to 1 into an integer from 1 to 3. Further down, the expression "document.myform._qnaire.value += version" adds the version number to the hidden input called _qnaire which will appear in the data emailed to you, letting you know which version the subject did.

<html><!-- THIS IS MULTI.HTM -->
<script language="JavaScript">
// Fractional part of (milliseconds / 10000): from 0 up to, but not including, 1
var random0to1 = ((new Date()).getTime()/10000)%1;
// Integer version number: 1, 2, or 3
var version = Math.floor(3*random0to1)+1;
var text;
if (version == 1) text="This is the text of the first version.";
if (version == 2) text="This is the text of the second version.";
if (version == 3) text="This is the text of the third version.";
</script>
<form name=myform>
<input type=hidden name=_qnaire value="My questionnaire, version ">
<script language="JavaScript">
// Record the version in the hidden input, then display the text and check values
document.myform._qnaire.value += version;
document.write(text);
document.write("<BR><BR>(_qnaire.value is now: '");
document.write(document.myform._qnaire.value + "')");
document.write("<BR>(random0to1 is: " + random0to1 + ")");
</script>
</form>
</html>
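The arithmetic in the routine above can be checked outside the browser. The following sketch wraps the same two expressions in functions that take the clock value as an argument; the names fractionFromMillis and versionFromMillis are hypothetical, introduced only for this check, and the loop simply tallies which version each millisecond of one 10-second cycle would produce.

```javascript
// Hypothetical helpers mirroring the arithmetic of the routine above.
function fractionFromMillis(ms) {
  return (ms / 10000) % 1; // fractional part of ms/10000
}

function versionFromMillis(ms, nVersions) {
  // Integer from 1 to nVersions
  return Math.floor(nVersions * fractionFromMillis(ms)) + 1;
}

// Tally the version chosen for every millisecond in one 10-second cycle.
var counts = [0, 0, 0];
for (var ms = 0; ms < 10000; ms++) {
  counts[versionFromMillis(ms, 3) - 1]++;
}
console.log(counts);
```

Because the fraction repeats every 10 seconds, the allocation is only as random as subjects' arrival times, which seems adequate when subjects arrive at unrelated moments.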

Randomizing question order separately for each subject

It is possible to randomize the order of questions separately for each subject, by extending the techniques we have described so far. The questions could either be all displayed at once, as a single questionnaire page, or presented one at a time, on separate pages, which is the approach we will focus on. Putting questions on separate pages makes it impossible or difficult (depending on the browser) for the subject to go back to previous questions. This is important in within-subject designs that test variables that subjects think should not affect their responses. When subjects can easily compare and revise their answers at different levels of a variable they think should not affect their responses, they will tend to give the same answer at all levels, masking the true effect of the variable. (Internet Explorer 4 does allow returning to previous pages that have since been rewritten, although Netscape 3 and 4 do not allow this to be done easily. Prevention of such backing up in IE4 can be done by disallowing responses that would result in revision of recorded data.)
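As a minimal sketch of how a separate random order might be drawn for each subject, the function below shuffles the item numbers with the standard Fisher-Yates method. It is an illustration, not the authors' routine: the name shuffledOrder is hypothetical, and the default use of Math.random assumes a modern JavaScript engine in which that function is dependable. A different source of numbers between 0 and 1 can be passed in instead.

```javascript
// Hypothetical sketch: return the item numbers 0..nItems-1 in random order.
// rand is any function returning a number from 0 up to, but not including, 1.
// Assumption: Math.random is reliable in the engine running this code.
function shuffledOrder(nItems, rand) {
  rand = rand || Math.random;
  var order = [];
  for (var i = 0; i < nItems; i++) order.push(i);
  // Fisher-Yates: swap each position with a randomly chosen earlier-or-same one.
  for (var j = nItems - 1; j > 0; j--) {
    var k = Math.floor(rand() * (j + 1));
    var tmp = order[j];
    order[j] = order[k];
    order[k] = tmp;
  }
  return order;
}

var order = shuffledOrder(5);
console.log(order);
```

The resulting array can then be walked from start to finish, presenting the item whose number appears at each position.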
In principle, there are many ways to put items on separate pages. You could, for example, use a separate HTML file for each question. This, however, can make it difficult to make modifications of wording that is repeated in every question. According to Flanagan (1997) it is possible to use JavaScript to keep writing over the main window, but we have not gotten this method to work. We use a third method, involving frames, suggested by Flanagan (1997).
The trick is to use two frames, one invisible. The invisible frame contains a form with only hidden input elements. This is done as follows:

<frameset rows="100%,*">
<frame src=visible.htm name=visible>
<frame src=hidden.htm name=hidden>
</frameset>

The top row takes 100% of the screen, and the asterisk indicates that the bottom row takes up whatever is left, which is 0%, so it is invisible. The visible frame contains a form with questions and input elements. A JavaScript program in the frameset document transfers the subject's answers from the inputs in the visible frame to the hidden inputs in the hidden frame before rewriting the visible frame with the next page of questions. At the end, the form in the hidden frame is submitted as if it were the only form.
Below are three files making up a skeleton of a JavaScript questionnaire that presents separate items in a random order, one at a time. The frameset document, program.htm, contains the JavaScript code to run everything. When the questionnaire is first loaded, program.htm sets up some variables it will need and decides on the random order in which it will present the items. After that it does nothing until the subject clicks a button on the introductory page, visible.htm, which calls a JavaScript function in program.htm called PresentItem(). The PresentItem() function then overwrites visible.htm with the first question page. This question page includes a button which, when clicked, calls a function called GetResponse(). The GetResponse() function transfers the subject's response from the form on the visible question page to the hidden inputs in the form in hidden.htm, and then calls the PresentItem() function, which overwrites the current question page with the next question page. This cycle continues until the PresentItem() function detects that all the questions have been asked, at which point it overwrites the last question with a page asking the subject for information about him or herself. The button on this page calls the function GetInfo(), which transfers the subject's information from the form on the visible page to the hidden inputs in the form in hidden.htm, and then submits the hidden form and displays a message thanking the subject.
Here is what the user sees initially, visible.htm. When the user clicks the button, it calls the function PresentItem() in the JavaScript program in program.htm, which is in the visible frame's parent frame-the main browser window. The PresentItem() function will then overwrite this introductory page with the first question.

Introductory text.
<input type=button value="Click if you are ready to go on."
onClick = "parent.PresentItem();">

Here is the hidden document in which data are stored, hidden.htm. The hidden inputs q01, q02, and q03 will hold the responses to the three questions. There must be one hidden input for each response. Most of the hidden inputs provide places to store responses that have not yet occurred, and are initially set to null values (""), but two are preset to identify the questionnaire and payment.

<html><!-- THIS IS HIDDEN.HTM -->
<form name=hiddenForm method=POST>
<input type=hidden name=__qnaire value="skeleton">
<input type=hidden name=q01 value="">
<input type=hidden name=q02 value="">
<input type=hidden name=q03 value="">
<input type=hidden name=_payment value="$5">
<input type=hidden name=_sex value="">
</form>
</html>

In JavaScript, form inputs can be referred to by name or number, and program.htm refers to them by number when transferring responses to the hidden form. That means it is very important that the inputs in hidden.htm are in the order that program.htm expects them to be. They are numbered starting from 0, so __qnaire is input number 0, q01 is input number 1, q02 is input number 2, etc. If, for example, you inserted an input between __qnaire and q01, then program.htm would put the response to question 1 into the new input that you inserted, the response to question 2 into your q01 input, etc. At best, you would later realize what had happened and be able to reconstruct where each datum was supposed to go. At worst, you would either never find out what had happened, or you would not find out until after publishing a paper based on the data.
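One way to reduce the risk just described is to address the hidden inputs by name rather than by number. The following is a minimal sketch of that idea, not the code used in program.htm. The helper names HiddenName() and StoreByName() are hypothetical, and the plain object standing in for the hidden form's elements is there only so that the fragment can be run outside a browser.

```javascript
// HYPOTHETICAL HELPER: BUILD THE NAME OF THE HIDDEN INPUT FOR AN ITEM,
// E.G. 1 -> "q01", 12 -> "q12" (ASSUMES THE NAMING USED IN HIDDEN.HTM)
function HiddenName(itemNumber) {
  return (itemNumber < 10 ? "q0" : "q") + itemNumber;
}

// STORE A RESPONSE IN THE INPUT WITH THAT NAME. elements IS EXPECTED TO
// BEHAVE LIKE hiddenForm.elements, WHICH CAN BE INDEXED BY INPUT NAME.
function StoreByName(elements, itemNumber, response) {
  var name = HiddenName(itemNumber);
  if (!elements[name]) return false;   // NO SUCH INPUT: REPORT FAILURE
  elements[name].value = response;
  return true;
}

// STAND-IN FOR THE HIDDEN FORM (AN ASSUMPTION, FOR RUNNING OUTSIDE A BROWSER)
var fakeElements = { q01: {value: ""}, q02: {value: ""}, q03: {value: ""} };
StoreByName(fakeElements, 2, "4");
```

With this approach, inserting a new hidden input between __qnaire and q01 would not shift where each response is stored, and a missing input is reported rather than silently misfiled.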
Here is the frameset document which contains the JavaScript program, program.htm. It contains four functions, Random(max), PresentItem(), GetResponse(), and GetInfo(). The code outside of the four functions is run only once, when the questionnaire is first loaded into the web browser. It sets up some variables and randomizes the order of the questions, using the Random(max) function. After that, nothing happens until the subject clicks the button on the introductory visible.htm page, which calls PresentItem().

<title>Example of randomizing question order</title>
<script language="JavaScript">
var itmNum = 1 ; // NUMBER OF CURRENT ITEM
var nItms = 3 ; // NUMBER OF ITEMS
itm = new Object(); // ARRAY TO HOLD TEXT OF EACH ITEM
itm[1] = "This is the 1st item.";
itm[2] = "This is the 2nd item.";
itm[3] = "This is the 3rd item.";
now = new Date();
var seed = now.getTime()%714025;

function Random(max) { // RETURNS A RANDOM INTEGER FROM 0 TO max-1
seed = ((seed*4096+150889)%714025);
return Math.floor(max*seed/714025);
}

itmOrder = new Object(); // itmOrder[n] IS THE ITEM PRESENTED nTH
var currentItm, itmToSwap, temporary;
for (currentItm = 1; currentItm <= nItms; currentItm++) {
itmOrder[currentItm] = currentItm;
}
for (currentItm = 1; currentItm <= nItms; currentItm++) {
itmToSwap = Random(nItms+1-currentItm)+currentItm;
temporary = itmOrder[itmToSwap];
itmOrder[itmToSwap] = itmOrder[currentItm];
itmOrder[currentItm] = temporary;
}

function PresentItem() {
if (itmNum <= nItms) {
var itmHTML
= "<html><body><form name=visibleForm>\n"
+ itm[itmOrder[itmNum]]
+ "<P>This is the question to be asked after each item. "
+ "<input type=text size=8 name=response>\n"
+ "<P><input type=button onClick='parent.GetResponse()' "
+ "value='Press TAB, then SPACE, or click'>"
+ "</form></body></html>" ;
junk = parent.visible.document.open();
parent.visible.document.write(itmHTML);
parent.visible.document.close();
}
if (itmNum > nItms) {
var aboutSubj
= "<form name=visibleForm>\n"
+ "<P>Are you male (m) or female (f)?<br>"
+ "<input type=text size=3 name=_sex>\n"
+ "<P>\n" + "Thanks.\n" + "<hr>\n"
+ "<center><input type=button value='Submit responses.'"
+ " onClick='parent.GetInfo()'></center><br>\n"
+ "</form>" ;
junk = parent.visible.document.open();
parent.visible.document.write(aboutSubj);
parent.visible.document.close();
}
}

function GetResponse() {
// HIDDEN INPUT NUMBER k HOLDS THE RESPONSE TO ITEM k
parent.hidden.document.hiddenForm.elements[itmOrder[itmNum]].value
= parent.visible.document.visibleForm.response.value ;
itmNum++;
PresentItem();
}

function GetInfo() {
parent.hidden.document.hiddenForm._sex.value
= parent.visible.document.visibleForm._sex.value;
parent.hidden.document.hiddenForm.submit();
junk = parent.visible.document.open();
parent.visible.document.write("<html><body>Thanks. Your answers "
+ "are being submitted.</body></html>");
parent.visible.document.close();
}
</script>

<frameset rows="100%,*">
<frame src=visible.htm name=visible>
<frame src=hidden.htm name=hidden>
</frameset>

The lines that say junk = parent.visible.document.open() should in principle be able to say simply parent.visible.document.open(), but that does not work in Netscape 3. Adding ``junk ='' makes it work in Netscape 3. The questionnaire should also work in Internet Explorer 3 or above. It does not work with Netscape 2, though there might be a way to make it work. The problem we found with Netscape 2 was that on the question pages that were written into the visible frame by program.htm (as opposed to being an actual HTML file like visible.htm), the JavaScript code in the button did not do anything.
There are some other points to note about program.htm: (a) It is a very good idea, because it works with more browsers, to name the input(s) on the visible form, as we did (e.g., <input type=text size=8 name=response>), and refer to them by name when transferring responses to the hidden form; (b) On the question pages, you might think you can use ``onChange'' in a text input field to call the function that advances to the next item, in order to avoid needing a button. This works in Netscape, but not Internet Explorer. However, there is no need to use the mouse in Netscape 3 and 4 and Internet Explorer 4, since pressing the Tab key and then the Spacebar has the same effect as clicking the button; (c) We had to name the array object itm as we did, rather than name it item, because Internet Explorer 4 did not work if it was named item, although Netscape 3 and 4 did-this highlights the importance of testing on multiple browsers.
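The randomization step in program.htm can also be isolated so that it can be tried out on its own. The following is a minimal sketch under stated assumptions: the names MakeRandom() and ShuffledOrder() are hypothetical, the seed is passed in rather than taken from the clock so that the result is repeatable, and the implementation is assumed to support nested functions. The generator uses the same constants as Random(max).

```javascript
// LINEAR CONGRUENTIAL GENERATOR WITH THE SAME CONSTANTS AS Random(max)
function MakeRandom(seed) {
  return function (max) {
    seed = (seed * 4096 + 150889) % 714025;
    return Math.floor(max * seed / 714025);   // INTEGER FROM 0 TO max-1
  };
}

// RETURN A SHUFFLED ORDER OF THE ITEM NUMBERS 1..nItms (INDEX 0 IS UNUSED)
function ShuffledOrder(nItms, random) {
  var order = [], i, j, temporary;
  for (i = 1; i <= nItms; i++) order[i] = i;
  for (i = 1; i <= nItms; i++) {
    j = random(nItms + 1 - i) + i;            // A POSITION FROM i TO nItms
    temporary = order[j];
    order[j] = order[i];
    order[i] = temporary;
  }
  return order;
}

var order = ShuffledOrder(5, MakeRandom(12345));   // 12345 IS AN ARBITRARY SEED
```

Each position is swapped with a randomly chosen position at or after it, so every item number appears exactly once in the result. In the questionnaire itself the seed would come from the clock, as in program.htm.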

Error checking


Error checking can be added to the GetResponse() function in program.htm (above). The following example checks to see whether the response is a number between 1 and 5 and alerts the subject if it is not between 1 and 5, or is not a number. Although our example program.htm asks for only one response per page, it is possible to check more than one response at a time, so the following example is written to be easily adaptable to do so.

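function GetResponse() {
response = new Object();
response[1] = parseInt(parent.visible.document.visibleForm.response.value);
error = 0; // NO ERRORS FOUND SO FAR
for (i = 1; i <= 1; i++) {
if (!((response[i] > 0) && (response[i] < 6))) error = 1;
}
if (error == 1) {
alert("Each response must be a number from 1 to 5.");
}
else {
parent.hidden.document.hiddenForm.elements[itmOrder[itmNum]].value
= parent.visible.document.visibleForm.response.value ;
itmNum++;
PresentItem();
}
}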

As another example, if you wanted to test whether a response contained one of the letters u, v, or y, you could take out the parseInt() part, and replace the line ``if (!((response[i] > 0) && (response[i] < 6))) error = 1;'' in the above example with:

if ( (response[i].indexOf("u") == -1)
&& (response[i].indexOf("v") == -1)
&& (response[i].indexOf("y") == -1)) error = 1;

indexOf() is a method of the string response[i]. It returns the position of ``u'' in the string response[i], or -1 if ``u'' is absent. The above expression tests whether all three of ``u'', ``v'', and ``y'' are absent and sets the error variable to 1 if so. You would also, of course, want to alter the alert message to something like "You must respond with u, v, or y".
More generally, you can define very complex conditions for giving alerts. You can also store subjects' answers for use in later questions. We will not give examples of these, because, at that point, it is just JavaScript programming. The general rules of programming apply; the more complex the program, the more likely it will have errors.
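For pages with several responses, the range check can be separated from the browser-specific code. The following is a minimal sketch, assuming the responses have already been collected into an array of strings; the function name InvalidResponses() is hypothetical.

```javascript
// RETURN THE POSITIONS (STARTING FROM 1) OF RESPONSES THAT parseInt()
// DOES NOT READ AS A NUMBER FROM min TO max; AN EMPTY ARRAY MEANS NO ERRORS
function InvalidResponses(responses, min, max) {
  var bad = [];
  for (var i = 0; i < responses.length; i++) {
    var n = parseInt(responses[i], 10);
    if (!(n >= min && n <= max)) bad[bad.length] = i + 1;
  }
  return bad;
}

var problems = InvalidResponses(["3", "7", "abc", "1"], 1, 5);
```

Knowing which responses failed makes it possible to write an alert message that tells the subject exactly which answers to correct.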

Removing unwanted character codes from text responses

As we noted earlier, text area responses may include carriage returns, which would prevent these responses from fitting into a single spreadsheet cell. Tab characters can also get into responses somehow and can be undesirable for similar reasons. The following demonstrates using JavaScript to replace carriage returns and tabs with text.

<html><!-- THIS IS REMOVE.HTM -->
<script language="JavaScript">
function PerformSubstitution(before) {
var after="";
for (var position=0 ; position < before.length ; position++) {
if (before.charAt(position) == '\r')
{after += "[ENTER]"; position += 2;}
if (before.charAt(position) == '\t')
{after += "[TAB]"; position++;}
if (position < before.length) after += before.charAt(position);
}
return after;
}
function Substitute() {
window.document.myform.mytext.value
= PerformSubstitution(window.document.myform.mytext.value);
}
</script>
<form name="myform">
<textarea name="mytext" rows=5 cols=40 wrap=soft>
Type here!</textarea><br>
<input type=button value="Substitute" onClick="Substitute();">
</form>
</html>

The position in the response string is advanced by two after inserting [ENTER] because a carriage return consists of two invisible characters. The wrap=soft in the text area tag means that the computer will wrap text to the next line without inserting a carriage return, as in a word processor. The substitution can be done automatically when the form is submitted. For example, the following button does the substitution and then submits the form. Note that it is a regular button and not a submit button. The submission is handled by calling the JavaScript submit() function directly.

<input type=button value="Submit" onClick="Substitute(); window.document.myform.submit();">
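
The same substitution can be written more compactly with regular expressions, where the browser supports them. The following is a sketch of that alternative, not the method used in remove.htm; the function name ReplaceControlChars() is hypothetical. It treats a carriage return followed by a line feed, a carriage return alone, and a line feed alone each as one line break, since browsers on different systems may differ in which they put into a text area.

```javascript
// REPLACE EVERY LINE BREAK AND TAB IN A TEXT RESPONSE WITH VISIBLE MARKERS
function ReplaceControlChars(before) {
  return before
    .replace(/\r\n|\r|\n/g, "[ENTER]")   // CR+LF, CR ALONE, OR LF ALONE
    .replace(/\t/g, "[TAB]");
}

var cleaned = ReplaceControlChars("first line\r\nsecond\tline\nthird");
```

Because nothing depends on how many characters make up a line break, consecutive tabs or line breaks are all replaced.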

Process tracing, analog scales, and timing

You can use JavaScript to find out what subjects are looking at. Payne, Bettman, and Johnson (1993) describe some details and many applications of Mouselab, a method of using computers for finding out what people are looking at, which they developed in the 1980s. Mouselab presents a screen with several boxes, each containing a piece of information. The subject moves the mouse over the box and the information is displayed. The program keeps track of the order and timing of box displays.
Here is a simple way to do this. The idea is to use the input element of a form as the box, and a link that does nothing if clicked but causes a stimulus to be displayed in an adjacent text box whenever the mouse is over the link. The following example simply demonstrates the technique. It counts the number of times the stimulus is displayed and reports the number of milliseconds for which the stimulus was last viewed.

<script language="JavaScript">
var looks = 0, now, timer;
function InOut(which) {
if (which == 'in') {
now = new Date();
timer = now.getTime();
looks += 1;
document.myForm.counter.value = looks;
document.myForm.display.value = "Hi!";
}
if (which == 'out') {
now = new Date();
timer = now.getTime() - timer;
document.myForm.display.value = "";
document.myForm.timer.value = timer;
}
}
</script>
<form name=myForm>&nbsp;
<a href="javascript:void(0)" onMouseOver="InOut('in')"
onMouseOut="InOut('out')">Show:</a> <input type=text
size=5 name=display> You've looked <input type=text
size=5 name=counter value="0"> times. You last looked
for <input type=text size=5 name=timer> ms.
</form>

Some points to note are: (a) this runs in Netscape 2, 3 and 4, and Internet Explorer 4, but in Netscape 2 the onMouseOver event is generated when the mouse is moved within the link, resulting in grossly inflated counts; (b) the now variable is unnecessary if one uses timer = (new Date()).getTime(), but that does not work in Netscape 2; (c) if you just put "" rather than "javascript:void(0)" in the link, then if the subject clicks on the link, it will do something like display the index page or directory listing from this file's directory on the server.
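The demonstration above keeps only the duration of the most recent look. For analysis one would usually want the full sequence. The following is a minimal sketch of one way to record it, with hypothetical names (LookLog, enter, leave, summary). The times are passed in as arguments so that the fragment can be run outside a browser; in a questionnaire they would come from a Date object, as in InOut().

```javascript
// HYPOTHETICAL RECORD OF LOOKS: ONE ENTRY PER LOOK, WITH ITS DURATION
function LookLog() {
  this.looks = [];        // DURATIONS IN MS, IN THE ORDER THEY OCCURRED
  this.startedAt = null;  // TIME THE CURRENT LOOK BEGAN, OR null
}
LookLog.prototype.enter = function (timeMs) {
  this.startedAt = timeMs;
};
LookLog.prototype.leave = function (timeMs) {
  if (this.startedAt === null) return;   // IGNORE "OUT" WITHOUT "IN"
  this.looks[this.looks.length] = timeMs - this.startedAt;
  this.startedAt = null;
};
LookLog.prototype.summary = function () {
  return this.looks.join(",");           // E.G. FOR STORING IN A HIDDEN INPUT
};

var log = new LookLog();
log.enter(1000); log.leave(1250);
log.enter(2000); log.leave(2040);
```

The string returned by summary() could be placed in a hidden input so that it is submitted with the rest of the subject's responses.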
Other techniques used in Mouselab include allowing the subject to use the mouse as an analog input, e.g., to indicate position along a continuum. This too can be accomplished in JavaScript. (For this idea, we are indebted to the software library and demonstrations at Les Lenert's site, http://prefdev.ucsd.edu/ .) The idea is to use an image map (an HTML element), with the component images consisting of thin slivers. A JavaScript function reports which sliver the mouse is on. For an example of this technique, see http://www.psych.upenn.edu/baron/examples/ohp1.htm . This site also contains the image slivers called emptysca.gif and filledsc.gif, which were copied from Lenert's site. The code for the questionnaire contains the routine for controlling the scale. An important trick here is based on the fact that a link tag with an image takes on the dimensions of the image, so the tag can be quite thin, yet the link tag can still contain the usual functions needed to respond to the cursor. Another trick is to use the fact that a URL in a link tag can be a piece of JavaScript code to be executed. Thus, a typical link tag is:

<a href='javascript:top.change(80)'><img src='emptysca.gif' border='0'></a>

Here the function top.change(80) replaces emptysca.gif with filledsc.gif up to the current link, so the scale takes on the color of the latter image.
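
The logic of such a function can be sketched separately from the page. The following is an illustration only, not the routine from the example questionnaire. The function name ScaleImages() is hypothetical, and it assumes that slivers are numbered from 0 and that the selected sliver itself is filled.

```javascript
// RETURN THE IMAGE FILE FOR EACH SLIVER AFTER THE SUBJECT SELECTS ONE:
// FILLED UP TO AND INCLUDING THE SELECTED SLIVER, EMPTY BEYOND IT
function ScaleImages(nSlivers, selected) {
  var images = [];
  for (var i = 0; i < nSlivers; i++) {
    images[i] = (i <= selected) ? "filledsc.gif" : "emptysca.gif";
  }
  return images;
}

var images = ScaleImages(5, 2);
```

In a browser, each name in the result would be assigned to the src property of the corresponding image on the page.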

Making web questionnaires using Java, JavaScript, and HTML

Details of the use of Java applets (Java programs that run in a web page) are beyond the scope of this chapter. However, the second author has used one for a study involving probability judgment. He wrote an applet that displayed white dots at constantly changing positions on a black rectangle. The subject's task was to judge the probability that when the computer (i.e., the Java applet) randomly chose a point on the rectangle, there would be a white dot there.
To write an applet, you need a Java compiler. You can download one free at http://java.sun.com/products. The second author bought Microsoft's Visual J++ for less than $50 (academic price) at our university's computer store. Getting it set up properly was a considerable challenge, but once working it did make the programming process easier. It came with a very helpful introductory book (Davis, 1996).
Once you have written an applet, you can include it in a web page with an HTML tag such as <APPLET name="myApplet" code=bel_9_1.class width=500 height=200>. (Compiled Java programs have the file extension .class.) The JavaScript in the web page can then control the applet by calling its functions. For example, the second author's questionnaire included a command window.document.myApplet.changeChances(300), which set the probability display to 300 in 100,000.

Getting data from web questionnaires into usable form

The mailform program does not return the data in a form usable by any statistics package or spreadsheet. Mailform, and other similar programs we've seen, send an email message with a list of ``name=value'' pairs, such as:

q1=1
q2=yes
q3=98
...


Statistics packages usually require a list of variables in a fixed order, with a particular character between each datum (e.g., space, comma, or tab), with the variable names listed only on the top line, and with one line per subject. For example:

q1 q2 q3 ...
1 yes 98 ...

It is possible to use JavaScript to transform the data into a usable form, but we started using Perl scripts for this purpose. The first author's script performs various checks on the data and translates them into a Systat command file that enters the data into Systat. The second author's script translates the data into a tab-delimited text file, which he then transfers into a spreadsheet program. He uses the spreadsheet program for tasks such as checking that nobody has done the study twice and for working with open-ended responses not suited to a statistics program. He then exports the data from the spreadsheet to Systat.
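To illustrate the kind of transformation involved, here is a minimal JavaScript sketch; it is not either author's Perl script. The function names ParsePairs() and ToTable() are hypothetical, and the sketch assumes that each message has already been reduced to its ``name=value'' lines, separated by newline characters.

```javascript
// TURN ONE MESSAGE OF "name=value" LINES INTO AN OBJECT
function ParsePairs(message) {
  var record = {}, lines = message.split("\n");
  for (var i = 0; i < lines.length; i++) {
    var eq = lines[i].indexOf("=");
    if (eq > 0) {
      record[lines[i].substring(0, eq)] = lines[i].substring(eq + 1);
    }
  }
  return record;
}

// BUILD A TAB-DELIMITED TABLE: NAMES ON THE TOP LINE, ONE LINE PER SUBJECT
function ToTable(messages, names) {
  var rows = [names.join("\t")];
  for (var i = 0; i < messages.length; i++) {
    var record = ParsePairs(messages[i]), cells = [];
    for (var j = 0; j < names.length; j++) {
      var value = record[names[j]];
      cells[j] = (value === undefined) ? "" : value;   // MISSING -> EMPTY CELL
    }
    rows[rows.length] = cells.join("\t");
  }
  return rows.join("\n");
}

var table = ToTable(["q1=1\nq2=yes\nq3=98", "q2=no\nq1=4"], ["q1", "q2", "q3"]);
```

Listing the variable names explicitly fixes the column order, so a message whose pairs arrive in a different order, or with a pair missing, still lines up with the others.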
The first author checks to make sure that nobody has done the study twice before converting the data to a usable form. He does this with a one-line Unix script: grep aemail $1 | sort | uniq -c | grep -v "1 aemail" | more. He has saved this text in a file called echeck and made the file executable. To use it on a file called medu16, for example, he types at the Unix prompt: echeck medu16. Here is how the script works. The grep aemail $1 part outputs only the lines of the email file that contain ``aemail'', the name he uses for the email address that the subject types in. The sort part passes that output to the sort command, which sorts the lines. The uniq -c part passes the sorted output to the uniq -c command, which outputs each distinct line once, preceded by a count of the number of times the line occurred. The grep -v "1 aemail" part takes that output and filters out all lines where the count is 1. Finally, the more part displays the resulting output one screen at a time so that it doesn't scroll off the top of the screen. If someone does the study twice and does not enter exactly the same email address both times, they would not be detected by this method. If you do not have a Unix server, you can get all these commands for Windows 95 from ftp://mirrors.aol.com/pub/cica/pc/win95/sysutil/unix95.zip.
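The same check can be sketched in JavaScript for readers who have no access to these commands. This is an illustration rather than a replacement for the script above. The function name RepeatedLines() is hypothetical, the addresses in the sample are made up, and the results are reported in order of first occurrence rather than sorted.

```javascript
// COUNT HOW OFTEN EACH LINE CONTAINING marker OCCURS AND RETURN THE
// LINES THAT OCCUR MORE THAN ONCE, EACH PRECEDED BY ITS COUNT
function RepeatedLines(text, marker) {
  var counts = {}, order = [], lines = text.split("\n");
  for (var i = 0; i < lines.length; i++) {
    var line = lines[i];
    if (line.indexOf(marker) == -1) continue;
    var key = "line:" + line;   // PREFIX AVOIDS CLASHES WITH BUILT-IN NAMES
    if (counts[key] === undefined) { counts[key] = 0; order[order.length] = line; }
    counts[key]++;
  }
  var repeated = [];
  for (var j = 0; j < order.length; j++) {
    var n = counts["line:" + order[j]];
    if (n > 1) repeated[repeated.length] = n + " " + order[j];
  }
  return repeated;
}

// MADE-UP SAMPLE DATA
var sample = "aemail=a@x.edu\nq1=3\naemail=b@y.edu\nq1=2\naemail=a@x.edu\nq1=5";
var repeated = RepeatedLines(sample, "aemail");
```

As with the Unix version, two submissions are matched only if the address lines are identical, character for character.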

Administrative aspects of using web questionnaires

Informed consent

Informed consent has not been a problem. We simply explain the rules to the subjects. They have to read the rules in order to know what to do (more carefully than they would have to read or listen to a consent form), so doing the questionnaire is equivalent to consenting to do the questionnaire. Here is the first author's current front page. Of course, the details change almost daily.

Jonathan Baron's questionnaires
What you can do now
Symptoms and quality of life (ass4)
Pay: $5
March 25, 1999
Note: This is also a class assignment for Psychology 153, designed to illustrate methods of utility judgment. (But anyone can do it.)
When I expect to have something new
Midnight, GMT, Sunday, March 28 (no promises).
When payment checks are sent
I submitted the last batch of check requests on March 15. I will submit the next batch of check requests on April 19.
What you should know
These questionnaires are part of my research on judgment and decision making. Anyone can complete them. I assume you can read English and give serious answers. They require very careful reading. You can be paid. If you do not wish to be paid and do not provide the information required, all I will see is the part of your email address to the right of the @ sign.
Please try to keep track of which ones you have done. To help you do this, I send an automated email reply, which goes to the address you give me. It includes the questionnaire code and the usual pay.
Pay: To be paid, you must give your name, address, and social security number (SSN, required by the University) at the end of the questionnaire, and you must answer all the questions seriously. (The money is from research grants from the National Science Foundation and the Penn Cancer Center.)
Personal information: Once you have given me your email address and other information as part of a fully completed questionnaire, you do not need to type the other information each time. I have a data base in which you are identified by your email address. This means that you must type that address correctly each time, and you must use the same address (unless you tell me you're changing). Note: This is NOT a secure server, yet. This is extremely unlikely to matter.
You may, if you wish, send your SSN by email to baron@psych.upenn.edu, by post to J. Baron, Psychology, University of Pennsylvania, 3815 Walnut St., Philadelphia PA 19104-6196, USA, or by phone (on my answering machine) at 215-898-6918. I will then add it to the data base. You must identify yourself with your email address.

Non-residents of the United States: You need not submit a social security number. I will send a check in U.S. dollars.

Time required: The pay is based on an estimate of $1 for 10 minutes of very careful reading and responding by a slow reader. So a questionnaire that pays $3 should take at most 30 minutes to complete, usually much less.
Technical stuff: If the cursor is not in the box where the response goes, click on the box. Most of these studies require JavaScript (not Java). They should work on Netscape 3 (or above) and Internet Explorer 3 (or above). Please report any problem. (There are occasional problems. Sometimes they go away if you close the browser and then start over.) I use no cookies.

Paying subjects

The first author's use of the Internet for data collection has advanced gradually. He began by posting short questionnaires to newsgroups and email lists. He has always tried to pay subjects when it is feasible.
At the beginning he paid subjects by entering them in drawings, to reduce the number of payments necessary. He referred to this as a ``lottery'', and one subject argued that it violated state law in one state. (This subject turned out to win the ``lottery'' in question.) Our advice: If you use drawings, do not use the word ``lottery''. It is not a lottery because subjects do not buy tickets and cannot lose money. We feel that drawings are somewhat undesirable, in that they play into a combination of subjects' altruism and misperception of probability, neither of which we want to rely on if we have other options.
The biggest problem (even from the days of paper questionnaires in the lab) has been finding efficient ways to pay subjects. These problems are probably specific to our university and its financial rules, so we won't go into them, except to make one point. Soon, we will have electronic cash (see, for example, http://www.digicash.com). When this becomes widely available, researchers should encourage their respective financial officers to learn to use it. Part of the problem is that financial people are (as they should be) cautious and conservative, reluctant to change anything that works. As a result, many of the methods for handling accounts, sending checks, etc., seem to be carry-overs from the nineteenth century. Although checks are now written by a machine, it took over a year for the first author to convince the university authorities to let him provide an electronic list of recipients and amounts, to avoid the delay (and expense) of retyping the entire list. The good news is that our department's business office recently obtained a copy of Quicken, a computer program that writes checks (Intuit, Menlo Park, CA), and obtained permission from the university to use a separate bank account. The first author now submits a ".qif file," which he produces with a program he wrote in Perl (Wall, Christiansen, & Schwartz, 1996; he will provide the program on request; Perl is available free from http://language.perl.com). He sends out checks once per month. (Some foreign subjects cannot cash U.S. checks at reasonable cost, so he sends them cash. He used to send all subjects cash, but it took a lot of time, and he frequently heard that the cash was "lost" in the mail. Checks, he has found, do not get lost in the mail.)
A similar Perl script keeps a database of subjects and makes a list of payments due. The database is organized by the subject's email address. This may not be the best way to do it. Subjects often change addresses or use different aliases of the same address (e.g., baron@psych.upenn.edu and baron@cattell.psych.upenn.edu both work for the first author). Sometimes a husband and wife use the same email address. Most of the time this works, though. Once the subject types an address and social security number, these are entered into the data base and need not be retyped for subsequent studies. The database itself is in the form of an Excel .csv (comma separated value) file, which is a text file with a fixed number of fields in each row, separated by commas, with one row per subject. The first field is the email address, then the name, etc. The database can thus be edited with Excel, or with a text editor. The details of the Perl scripts are probably too idiosyncratic to be useful to others. The general point is that it is worthwhile to automate the entire operation, though it is also important to monitor the output, check it in various ways, and make corrections manually when necessary. There is always the possibility that subjects will do something that one's script is not prepared to deal with.
The payment list is printed on paper and also converted into a .qif file for writing checks. This file is then submitted to the Quicken program. We include here a note on the format of the .qif file, which is difficult to find in the Quicken documentation and may be useful to others:


Mbribe in return for course grade

The file is an ASCII text file, beginning with !Type:Bank. Each record begins with ^ on a line by itself. The meaning of each line is determined by the first letter. D is the date. T is the amount. (It seems that the minus sign between the T and the amount of the check is helpful, perhaps because a check is a withdrawal from a bank account.) N is the check number, which should be set to "print", which means Quicken will decide on a number when it prints the check. P is the payee, the lines (up to five of them) beginning with A are address fields, and M is the memo to be written on the check. Then the next record begins immediately with ^ on a line (no blank line in between), and so on. The file must have the extension .qif. Make backups of all other Quicken files before you try submitting the file to Quicken.
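The format just described can be generated by a short program. The following JavaScript sketch is an illustration only, not the first author's Perl program. The function names QifRecord() and QifFile() are hypothetical, the payment shown is made up, and the sketch assumes that the record marker is a caret (^) and that dates are passed in already formatted as Quicken expects.

```javascript
// BUILD ONE RECORD FOR THE .qif FILE FROM A PAYMENT
function QifRecord(payment) {
  var lines = ["^",                      // ASSUMED RECORD MARKER
               "D" + payment.date,
               "T-" + payment.amount,    // MINUS SIGN: A CHECK IS A WITHDRAWAL
               "Nprint",                 // LET QUICKEN CHOOSE THE CHECK NUMBER
               "P" + payment.payee];
  for (var i = 0; i < payment.address.length && i < 5; i++) {
    lines[lines.length] = "A" + payment.address[i];   // UP TO FIVE ADDRESS LINES
  }
  lines[lines.length] = "M" + payment.memo;
  return lines.join("\n");
}

// BUILD THE WHOLE FILE: HEADER LINE, THEN ONE RECORD PER PAYMENT
function QifFile(payments) {
  var parts = ["!Type:Bank"];
  for (var i = 0; i < payments.length; i++) {
    parts[parts.length] = QifRecord(payments[i]);
  }
  return parts.join("\n") + "\n";
}

// EVERY VALUE BELOW IS MADE UP FOR ILLUSTRATION
var qif = QifFile([{date: "4/19/99", amount: "5.00", payee: "A. Subject",
                    address: ["1 Example St.", "Anytown, PA 00000"],
                    memo: "questionnaire ass4"}]);
```

The resulting text would be saved in a file with the extension .qif before being submitted to Quicken.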


In our view, web questionnaires have many major advantages over paper questionnaires, and few major disadvantages. The advantages are likely to increase since, for example, a wider and wider range of people will use the web more and more frequently, the range of stimuli it is feasible to display on the web will increase, and electronic cash will simplify the payment process and allow anonymous payments. We have discussed a variety of techniques and procedures we have found useful in creating and using judgment and decision making web questionnaires. We hope others will find these pointers helpful and will build on them.



Baron, J., Hershey, J. C., & Kunreuther, H. (1998). Attitudes toward risk reduction. Manuscript.
Davis, S. R. (1996). Learn Java Now. Redmond, WA: Microsoft Press.
Flanagan, D. (1997). JavaScript: The definitive guide (2nd ed.). Sebastopol, CA: O'Reilly.
NSF (1997). NetLab Workshop Report. http://www.uiowa.edu/grpproc/netlab.htm.
Payne, J. W., Bettman, J. R., & Johnson, E. J. (1993). The adaptive decision maker. New York: Cambridge University Press.
Schmidt, W. C. (1997). World-Wide Web survey research: Benefits, potential problems, and solutions. Behavior Research Methods, Instruments, & Computers, 29 (2), 274-279.
Wall, L., Christiansen, T., & Schwartz, R. L. (1996). Programming Perl (2nd ed.). Sebastopol, CA: O'Reilly.

File translated from TEX by TTH, version 2.00.
On 10 Apr 1999, 09:31.