Inconsistency and bias in thinking about tax reform

1  Introduction

Few if any systems of social and economic control are as large and pervasive in their effects as the United States tax system. Various state, federal and local taxes in America churn some three trillion dollars through government coffers each year. These taxes have dramatic impacts, not only on the distribution of material resources, but also on behaviors, including decisions about household composition, education, work, savings, retirement, charitable giving, and so on. Not surprisingly, given its salience, tax has loomed large in the political history of the United States, from pre-Revolutionary War tax rebellions to the property tax revolts of the 1970s, the Reagan tax-cutting revolution of the 1980s, and George H.W. Bush's unfortunate ``read my lips'' pledge of the 1990s. At the dawn of a new millennium, there are no signs of any abatement in the public's interest in tax.

What is surprising, however, is the extent to which the current tax system suffers from inconsistency and instability. Public finance economists and legal scholars have made powerful arguments for various kinds of tax reform to make tax more efficient, simple and fair. But the reforms are rarely made. We seem stuck with a system that is far less than optimal by any account. Of course, history and rational choice social theory provide reasons why suboptimal tax policies might emerge from a second-best political system with interest groups and politicians pursuing their own agendas (Doernberg & McChesney, 1987a,b; Pollack, 1996; Graetz, 1997). Our research aims to provide a different, and complementary, answer to the puzzle of persistent instability in tax: ordinary citizens want tax policies that are incompatible. For example, on the whole, people want progressive taxes, equal treatment of couples with equal monetary income, and no marriage penalties or bonuses: a triad of goals that are flat-out inconsistent (McCaffery & Baron, 2001a). Changes in tax law are often driven by efforts to patch one problem while ignoring unintended consequences that generate the demand for later, further changes: witness the earned-income tax credit, designed to offset the effects of positive payroll taxes on poor wage earners, but ultimately part of a panoply of taxes creating high marginal tax rates and steep marriage penalties on the lower middle income class (McCaffery, 1997).

We thus have two problems. The first is vacillation and poorly thought out change. The second is resistance to truly beneficial reforms. Why do people support some policies that hurt them and make the law ``worse'' in some meaningful sense, while resisting other changes that might improve the tax law and their own lot within it? We believe that these questions about ordinary evaluation of tax and tax reform complement the more familiar public choice explanations for the shape of the tax system. Absent a consistent, stable consensus on many if not all matters of tax policy, public opinion provides a fickle backdrop for reform. Politicians who are better able to frame their proposals to appeal to one aspect of inconsistent normative judgments will carry the day. But the day's victory will not endure: tax will remain vulnerable to later reform, often in opposing directions, in a never-ending cycle of change.

Our exploration of inconsistency in thinking about tax aims to bring the established and growing body of scholarly work on heuristics and biases in decision making to bear on this important subject of public political interest. In particular, the last thirty years of research on judgment and decision making have suggested that people use various heuristic methods for thinking about decisions (Baron, 2000). The heuristic methods are usually correlated with optimal decisions but in many cases lead to non-optimal ones. More recently, Baron (1994, 1998) has argued that some heuristics are elevated to the status of moral principles, where they lead to intuitive judgments about a range of matters. These judgments lead in turn to the choice of suboptimal outcomes.

The research proposed here extends a research program we have already begun on framing effects in judgments about taxation. The goal of the proposed research is to examine heuristics and biases in judgments about various policy changes in tax using a range of methodologies. We aim, first, to generalize and extend our findings about inconsistency and bias in the normative thinking about tax and then, second, to consider ways out of the problem, as through various ``debiasing'' or educative techniques. We also shall examine, in all studies, demographic factors (income, marital status, etc.) that might affect the attitude toward specific reforms. Although many studies find that political attitudes do not depend much on self-interest (e.g., Brodsky & Thompson, 1993; Sears & Funk, 1991; Shabman & Stephenson, 1994), the issue remains of interest, if only from the methodological need to control some of the variance in responses. We shall also use measures of relevant political ideology, so that we may compare its effects to those based on self-interest according to demographic information.

In addition, we plan to extend our work methodologically by using paper questionnaires as well as studies on the World Wide Web. This will not only allay lingering suspicion among journal reviewers about the validity of Web studies but will also allow us to target very specific groups, such as students who have studied economics or tax law, and people with specific self-interested reasons to support or oppose some reform.

1.1  Heuristics and Biases in General

Simplifying heuristics can be expected to operate more strongly in thinking about public policy than in thinking about personal decisions. This is because judgments about policies rarely have serious consequences for the individuals who make them (even when the individuals are legislators). Policy decisions are made by large groups of people, so that each person's effect on the decision is small. As a result, individual input to such decisions is often ``expressive'' (Brennan & Lomasky, 1993; Brennan & Hamlin, 2000; Baron, 2001). For example, voters often vote their moral views without thinking much about consequences.

Scholars over the last several decades have noticed a wide array of heuristics and biases that generate deviations from ideal decision making in a wide range of contexts. The proposed research draws on and is to extend this body of work.

1.1.1  The effects of current interest

Tax reform often involves replacing one tax or expenditure with another tax or expenditure, e.g., substituting a consumption tax for an income tax, or a direct subsidy for a tax deduction. Such changes have three sorts of possible effects:

  1. They replace one source of income or loss with another source for individuals.
  2. They replace losses for some people with gains for other people.
  3. They change incentives, thus changing people's behavior.

These three effects lead to three hypotheses about why people will resist reform. First, simple replacement will cause resistance from a simple status-quo bias (Samuelson & Zeckhauser, 1988), unless people can integrate the gains and losses (in the sense of Thaler, 1985). Presumably such an effect is the result of loss aversion (Tversky & Kahneman, 1981), although loss aversion itself could have many sources.

Second, the replacement of gains for some with losses for others, in addition to loss aversion, is affected by heuristics and biases dealing with fairness in distributions, such as the ``do no harm'' principle (Baron & Jurney, 1993). For example, people will perceive unequal benefits as unfair, and they will think it is morally wrong to harm some people in order to help others, even if the harm is small relative to the benefit. Such heuristics may operate even when the goal of a reform is to make some ultimate distribution more egalitarian.

These two general hypotheses are supported by some prior evidence. People have intuitions and norms that oppose beneficial change (Baron, 1994). For example, Baron and Jurney (1993) presented subjects with six proposed reforms, each involving some public coercion that would force people to behave cooperatively, that is, in a way that would be best for all if everyone behaved that way, e.g., a gasoline tax. Most subjects thought that things would be better on the whole if the reforms, as described, were put into effect, but many of these subjects said that they would not vote for the reforms. Subjects who voted against proposals that they saw as improvements cited such principles as unfairness in the distribution of benefits or costs, and the fact that the reform would harm some people to benefit others. For other evidence for such a basis of opposition, see Baron (1995, 1996).

The third general hypothesis is that people underestimate incentive effects. They do not think ahead, putting themselves in the position of those affected by new laws. Examples of neglect of incentives are in the underestimation of the effect of a gas tax (Kempton et al., 1995) and in the neglect of the deterrent effects of tort liability (Baron & Ritov, 1993; Baron, Gowda, & Kunreuther, 1993). Thus, people will not appreciate the potential benefits of reforms that change incentives.

Tax laws, such as pollution taxes or deductions for charity, often exist because they provide deterrence or incentive. Understanding of this function is not widespread. In discussions of energy taxes in 1993, Senator Max Baucus declared, ``Either we have an energy tax where everybody pays evenly, or we have no energy tax.'' Of course, such a tax would be impossible in principle: some people use more energy than others; and if the tax were equal for everyone, it would not provide any incentive to reduce energy use.

In part, the failure to understand incentive may result from failing to imagine how the tax might affect behavior in the long run. For example, Kempton et al. (1995, pp. 146-147) found that about half of the respondents in a survey thought their own driving would not change at all in response to an increase in the price of gasoline. They did not think that their next car might be more efficient, for example. Yet, in the past, driving has proved to be quite elastic in response to price changes. (See Baron & Ritov, 1993, and Baron, Gowda, and Kunreuther, 1993, for other examples of neglect of incentive.)

1.1.2  Framing effects (some based on prior NSF support)

We shall also examine effects of framing on proposed changes, extending our current research, carried out under Baron's current NSF grant and additional funds from U.S.C. Law School. We have produced two papers based on this work, one submitted for publication and the other to be submitted shortly. Both will be available at
http://papers.ssrn.com.

The first article (McCaffery & Baron, 2001a) reports the results of four studies of attitudes toward tax policies and their fairness, carried out on the World Wide Web. Each employed a framing manipulation and asked about tax issues relating to marriage and household composition, such as the ``marriage penalty.'' The results reflect a great deal of context sensitivity, most importantly to framing effects, and inconsistent normative judgment. For example, judgments about taxes for singles and for couples with one or two income earners depended on whether the taxes were described as ``per person'' or ``per couple,'' as well as on whether they were described as percents or dollars.

The second article (McCaffery & Baron, 2001b) describes the results of three experiments carried out on the World Wide Web to test attitudes about the appeal of various tax regimes. Subjects were asked questions about their preferred distribution of taxes. A framing manipulation involved whether subjects were asked to design a single, global tax system, or to vary one part of a tax system with the other part held constant. The idea was to replicate the effects of income tax reform given a constant payroll tax system. Subjects revealed a variety of inconsistent normative judgments. For example, they would focus on the fairness of whatever tax they were asked to consider (income tax or payroll tax) even when they could see, immediately, the effect of their judgments on the total tax.

The framing effects in our studies were accounted for by a few general principles of judgment. One well-known principle is that people tend to anchor on the values they are given and underadjust. Thus, when subjects were given a proposal and asked to adjust it until it was best, the starting point affected the end point. If the starting point was a highly graduated tax, for example, the end point was more graduated than if the starting point was a flat tax. Similarly, when people were asked about taxes in dollars, they favored more graduated taxes than when they were asked about taxes as a percentage of income.

Other principles were, as noted, that people wanted to avoid marriage penalties, but they also wanted to maintain couples neutrality, that is, equal treatment of one-earner and two-earner couples. Combined with the assumption of a graduated tax, these are impossible. People seem to focus on whatever they are asked about, tending to ignore the other parts of the system that must be adjusted when that focus is changed.
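
The incompatibility can be illustrated with a small numerical sketch. The two-bracket schedule, the incomes, and the three filing rules below are hypothetical, chosen only to make the arithmetic easy to follow; they are not taken from our studies or from actual tax law. Under this graduated schedule, each rule either treats equal-income couples differently or changes a couple's tax when they marry.

```python
def single_tax(income):
    """Hypothetical graduated schedule: 10% up to 30,000 and 30% above."""
    lower = min(income, 30_000)
    upper = max(income - 30_000, 0)
    return (10 * lower + 30 * upper) // 100


# Three illustrative ways to tax a married couple with earnings a and b.
def individual_filing(a, b):
    return single_tax(a) + single_tax(b)


def joint_same_schedule(a, b):
    return single_tax(a + b)


def income_splitting(a, b):
    return 2 * single_tax((a + b) // 2)


def evaluate(couple_tax):
    """Return (couples_neutral, marriage effect for the one-earner couple,
    marriage effect for the two-earner couple).
    A positive effect is a marriage penalty, a negative one a bonus."""
    one_earner = couple_tax(60_000, 0)
    two_earner = couple_tax(30_000, 30_000)
    effect_one = one_earner - (single_tax(60_000) + single_tax(0))
    effect_two = two_earner - (single_tax(30_000) + single_tax(30_000))
    return one_earner == two_earner, effect_one, effect_two


for rule in (individual_filing, joint_same_schedule, income_splitting):
    print(rule.__name__, evaluate(rule))
```

In this sketch, individual filing has no marriage effect but taxes the two couples differently; the two joint rules treat the couples alike but produce a penalty for one couple or a bonus for the other.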

A related effect is that people want each component to be fair, as well as the total tax. Thus, when taxes are disaggregated into (e.g.) payroll tax and income tax, subjects would adjust whichever tax they were asked about - payroll, income, or total - yielding inconsistent results. From a normative (ideal) point of view, the total taxes are what matter.

Other results of Baron's recent NSF support are not relevant to this proposal and are sketched in an Appendix.

1.2  Examples of the need for reform

Even a casual consideration of the tax law landscape reveals many aspects of it that seem sub-optimal. Examples include:

2  Proposed research

The proposed research will examine attitudes toward reforms of various sorts. It will go beyond previous demonstrations by examining the effects of various debiasing manipulations. These manipulations will be a form of reflection, which might occur during an extended debate about some proposal.

2.1  Methods

Some experiments will be conducted on the World Wide Web. See
http://www.psych.upenn.edu/~baron/qs.html for details. People find this site because it is linked from many other sites (including those that list experiments on the internet and ways to earn money on the internet). Because experiments are done every week, on different topics, subjects also return several times once they find it. (In some cases, an experiment will ``ruin'' a subject for experiments of the same type, and this will be specified in the introductory page. This occurs when the experiments involve training that could change people's preferences.) Use of the Web for research has several advantages over the alternatives for this kind of research (usually students): the subjects are much more varied than those from other convenience samples; expenses connected with data entry and checking are reduced; and, because it is easy to check answers as the subject enters them, fewer responses need to be discarded because they are nonsensical (Baron & Siepmann, 2000). Moreover, the general quality of the data is at least as high as that of data from paper questionnaires, and, in general, substantive results do not differ from those of comparable methods (Birnbaum, 1999, 2000; McGraw et al., 2000). Because subjects are paid, it is possible to track individual identities and be sure that nobody completes the same study twice. Typically, we use 100 subjects per experiment. We try to design it so that answering the questions (which is timed) takes about 20 minutes for the median subject and pays $4. We time responses, without telling the subjects, and eliminate data from subjects who consistently respond too quickly (usually about 5%).

In a recent study, which asked for detailed demographic data, 90% of the respondents were between the ages of 20 and 55, the median education level was a high-school degree (which is the median of the U.S. population), and the median income was in the $20,000-$40,000 range. Most respondents were married and had children. The most unusual thing about our respondents is that the vast majority (75% or more in most studies) are women; we do check for gender differences, though, and we rarely find them.

The proposed research is concerned with general psychological mechanisms. It is of interest to examine the effects of demographic variables, but it is not of interest to attempt a random sample of any particular population, such as U.S. citizens or English speakers around the world. The issues are relevant to the entire world, now and in the future, so a truly representative sample is out of the question. From this perspective ``convenience samples'' of subjects on the World Wide Web are nearly ideal. With more people able to use the Web, it is possible now to get an extremely varied sample, which includes subjects from adolescence to old age, and subjects from many different countries (including middle-class people from poor countries such as India). Baron has been using this method for four years now and has three grants (all small) with funds for payment of Web subjects.

However, to ensure the robustness of our results across methods, we shall carry out as many studies as possible using pencil-and-paper questionnaires. We shall seek different convenience samples for these studies, such as students, jurors waiting for jury duty or people in an airport waiting for their flight. One possible source of subjects consists of prospective jurors at the Philadelphia County Courthouse. From past studies, the typical demographic make-up of these respondents is: mean age of 45, mean of 14 years of education, 33% male, 51% Caucasian, and 41% African-American.

Some of the proposed studies require particular samples based on course work. We shall discuss this later, and also our proposed experiments on debiasing.

Data analysis will be done using simple parametric statistics such as t tests and analysis of variance. Whenever possible, comparisons are made within each subject. Each subject makes judgments about several cases, which differ in variables of interest. Thus, a measure of the effect of each manipulated variable is obtained for each subject (as well as interactions among variables). We also obtain basic demographic data. These data can be used to examine individual differences, which are usually quite large in these kinds of studies. Data analysis uses R, a free program based on the S language, available at http://www.stat.cmu.edu/R/CRAN/ and discussed in the present context by Baron and Li (2001, which gives specific examples of the sorts of analyses that will be done).
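
The within-subject logic can be sketched as follows. This is only an illustration under stated assumptions: the ratings are invented, the function names are hypothetical, and Python is used here for convenience, whereas the analyses themselves will be done in R. Each subject contributes one effect score (the mean rating under one condition minus the mean under the other), and a one-sample t statistic tests whether the mean effect differs from zero.

```python
import math
import statistics


def subject_effect(ratings_a, ratings_b):
    """Effect of the manipulation for one subject: mean rating under
    condition A minus mean rating under condition B."""
    return statistics.mean(ratings_a) - statistics.mean(ratings_b)


def one_sample_t(effects):
    """t statistic testing whether the mean per-subject effect is zero."""
    n = len(effects)
    standard_error = statistics.stdev(effects) / math.sqrt(n)
    return statistics.mean(effects) / standard_error


# Invented ratings for three subjects: (condition A cases, condition B cases).
data = [([5, 6], [4, 5]), ([7, 7], [5, 5]), ([6, 8], [4, 4])]
effects = [subject_effect(a, b) for a, b in data]
t = one_sample_t(effects)
print(effects, round(t, 3), "df =", len(effects) - 1)
```

Because each subject serves as his or her own control, the per-subject effect scores can also be related to demographic variables to examine individual differences.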

2.2  Particular topics

We shall apply these methods to several topics. In each study, we shall examine various effects in both Web studies and paper questionnaires. We shall ask about demographic information relevant to the proposals at issue, and also about political ideology.

2.2.1  Pollution taxes

Economists have argued that pollution taxes are often, perhaps usually, the most efficient way to regulate pollution. By charging polluters for the external effects of their actions, the government discourages polluting activities just when their social costs are greater than their benefits. (An exception to this principle is when the benefits are public goods that are not fully paid for.) Moreover, polluters choose the most cost-effective means of pollution reduction, which is not necessarily the case when the law specifies the technology. Finally, pollution taxes can serve as a source of revenue, offsetting other taxes.

Can individuals appreciate the advantages of pollution taxes? Political reaction to proposals for increased taxes on fuel suggests that the usual difficulties arise, and these are also found in laboratory studies (Baron & Jurney, 1993).

We shall carry out further hypothetical studies in which we ask about whether various pollution taxes should be increased, or, in case they already exist, decreased. In each study, subjects will rate proposals that differ in the pollution tax change (positive or negative), the current level of the tax, the change in specific other taxes, and the distributional effects of the combined increase and decrease (how the change would affect different groups of people, such as drivers and non-drivers, or people of different incomes).

We shall, in some cases, specify the distributional effects. In other cases, we shall leave them unspecified and ask respondents about them (after they have answered other questions). We shall also specify, or ask about, incentive effects, e.g., how much a given increase in the gasoline tax would affect the use of gasoline. In some studies, we shall use examples chosen so that they are comparable to real data.

We expect that people will underestimate incentive effects, compared to economic data. We also expect that people will not think much about incentive effects when they are evaluating the proposals, unless the effects are given before the judgment. Therefore, specification of these effects before the judgment of the proposal will affect judgments more than beliefs of comparable magnitude that are assessed after the judgment. For example, if the subject is told that a tax will reduce pollution by 10% and then asked to judge it, the judgment will be more favorable than if the tax will not reduce pollution at all. But a pre-existing belief in a 10% incentive effect will have little or no effect on the tax judgment, if the belief is assessed after the tax judgment. Thus, presentation of incentive information before the judgment is a kind of debiasing.

2.2.2  Payroll tax, and marriage penalty/bonus

We shall extend our work on these topics in two main ways, as noted. First, we shall examine more systematically the role of demographic factors, using a variety of sampling methods. Most of our web subjects earned relatively low incomes. (They did the studies in part for the small amounts of money we offered.) We shall try to get higher income respondents by going to nearby business conventions, for example.

Second, we shall examine the effects of debiasing, on the Web, by confronting respondents with their own inconsistencies and asking the respondents how they would resolve these inconsistencies. For example, when people favor different rates of graduation as a function of presentation in percent vs. dollars, which comes closer to their ``true opinion'' when they see both? And will such debiasing transfer to new judgments made immediately after the confrontation? (See Baron & Leshner, 2000, for examples of how debiasing of this sort can be done effectively in Web studies, including effects of transfer.)

2.2.3  Taxing imputed income

One way to understand the trouble with the marriage penalty perceptions is that, in clinging to the norm of couples' neutrality, respondents ignore imputed income. Such income comes from self-supplied labor or capital. Hence the couple with a stay-at-home spouse that earns $60,000 in fact has greater ``income'' - more resources available to consume now or later - than a couple with two earners that also earns $60,000. One way to get at this imputed income is to see that the two-earner couple has greater out-of-pocket costs, as for child care, than the one-earner couple: these costs are simply the absence of the imputed income of one-earner couples. To be ``neutral'' in an economic sense, we must tax the imputed income of one-earner couples or allow deductions to two-earner couples.
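
A brief numerical sketch may make the two remedies concrete. The $15,000 value placed on services supplied at home is a hypothetical figure, assumed here to equal what the two-earner couple pays for the same services; the function name is likewise only illustrative.

```python
CASH_INCOME = 60_000     # from the example above
HOME_SERVICES = 15_000   # hypothetical value of care supplied at home


def tax_bases(tax_imputed_income):
    """Return (one-earner base, two-earner base) under either remedy."""
    if tax_imputed_income:
        # Remedy 1: add imputed income to the one-earner couple's base.
        return CASH_INCOME + HOME_SERVICES, CASH_INCOME
    # Remedy 2: let the two-earner couple deduct what it pays for the services.
    return CASH_INCOME, CASH_INCOME - HOME_SERVICES


for remedy in (True, False):
    one_earner, two_earner = tax_bases(remedy)
    print(remedy, one_earner, two_earner, one_earner - two_earner)
```

Either remedy leaves the one-earner couple with a tax base $15,000 larger than that of the two-earner couple; the remedies differ only in the level of the bases, which an adjustment of rates can offset.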

We hypothesize that respondents will have a difficult time equating ``imputed'' income with ``real'' income, or even factoring imputed income into their normative judgments. This is parallel to the failure to consider opportunity costs as ``real'' ones. We also expect to find that the idea of taxing imputed income is less appealing than the idea of giving deductions for two-earner couples, even if the ``base tax rates'' are adjusted so that the total payments are equated.

Imputed income also arises in the capital context. Persons who own their own houses are able to live, tax-free, off the imputed rental stream from their owner-occupied housing; renters and those who do not own their houses outright must pay rent (or principal on their mortgages) with after-tax dollars. Again, we hypothesize that respondents will have difficulty understanding this point and incorporating it into their judgments, as assessed by the same kind of comparison described in the last paragraph (additional taxes vs. deductions, with different base rates of taxes).

2.2.4  Consumption tax

``Income'' is simply the sum of personal consumption and savings - you either spend or do not spend all of your available resources. An income tax is designed as a ``double'' tax on savings, since both the receipt and the subsequent yield to capital are taxed. A consumption tax, in contrast, is any systematic single tax on an individual taxpayer's flow of funds. Standard public finance shows the equivalence of a pre-paid consumption (or equivalently, ``yield exempt'' or wage) tax and a post-paid (or, equivalently, ``cash flow,'' ``qualified account'' or sales) tax, under constant tax rates.
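
The equivalence can be checked with a short sketch. The wage, tax rate, rate of return, and holding period below are hypothetical, and the functions are simplified illustrations that assume a single constant rate and a single deposit that is later withdrawn and spent in full.

```python
import math


def prepaid(wage, rate, r, years):
    """Wage tax: taxed when earned; the yield on savings is exempt."""
    return wage * (1 - rate) * (1 + r) ** years


def postpaid(wage, rate, r, years):
    """Cash-flow tax: savings are deducted, then taxed when withdrawn and spent."""
    return wage * (1 + r) ** years * (1 - rate)


def income_tax(wage, rate, r, years):
    """Income tax: taxed when earned, and each year's yield is taxed again."""
    return wage * (1 - rate) * (1 + r * (1 - rate)) ** years


args = (1_000, 0.25, 0.10, 10)  # hypothetical wage, tax rate, return, years
print(round(prepaid(*args), 2), round(postpaid(*args), 2), round(income_tax(*args), 2))
print(math.isclose(prepaid(*args), postpaid(*args)))
```

Under these assumptions the two consumption-tax forms leave the saver with the same amount to spend, while the income tax leaves less because the yield is taxed as well.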

We hypothesize that respondents will have a hard time understanding this, or understanding that an income tax is, for those who do not save, precisely equivalent to a consumption tax: we hypothesize, that is, that the merely formal labelling of a tax will matter. For example, we shall compare an income tax with a deduction for savings to a consumption tax: we hypothesize that the former would be preferred, even though the rates are the same. Attitudes toward a consumption tax may also be affected by whether the respondent saves money or not. Finally, we hypothesize that a consumption tax would be seen as benefiting the rich, so that, even if the consumption tax were heavily graduated so that the average rich person would pay just as much as in the income tax, people will prefer that the consumption tax be more graduated still (or that the income tax be less graduated).

Debt or borrowed money is simply negative savings, and the two models of consumption taxes differ in their treatment of debt. The prepaid or wage tax taxes individuals when they earn labor income, and allows no deduction for the repayment of debts (it systematically ignores savings); the post-paid or spending model taxes individuals when they engage in consumption, however financed, and then allows for a deduction for repayments of principal (to keep the debt taxable in the period of ultimate private preclusive use). It can be shown, under progressive rates, that the post paid model reduces tax burdens for most ordinary ``life cycle'' earners. We hypothesize that respondents will have difficulty conceptualizing debt as negative savings, and that they will disfavor the postpaid model even though it would reduce their tax burdens over a lifetime.
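
The effect of progressive rates can be sketched with a stylized life-cycle earner. Everything in the example is an assumption made for illustration: a zero interest rate, a hypothetical two-bracket schedule, three periods (youth, working years, retirement), and level tax-inclusive spending financed by borrowing and saving.

```python
def schedule(base):
    """Hypothetical graduated schedule: 10% up to 30,000 and 30% above."""
    lower = min(base, 30_000)
    upper = max(base - 30_000, 0)
    return (10 * lower + 30 * upper) // 100


# Stylized life cycle: borrow when young, earn in mid-life, dissave when old.
earnings = [0, 90_000, 0]
spending = [30_000, 30_000, 30_000]  # tax-inclusive outlays, zero interest

# Prepaid (wage) model: the schedule applies to earnings when earned.
prepaid_total = sum(schedule(e) for e in earnings)

# Postpaid (spending) model: the schedule applies to outlays when spent.
postpaid_total = sum(schedule(s) for s in spending)

print(prepaid_total, postpaid_total)
```

In this sketch the postpaid model applies the graduated schedule to smoothed spending rather than to bunched earnings, so the lifetime tax is lower even though lifetime resources are the same.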

2.2.5  Government health care

Government programs often substitute for private spending. A person can spend money on health insurance, or pay higher taxes so that government can provide the insurance. People may not include their own private spending in their thinking about government programs. More generally, people will respond to information about gross taxes, rather than net taxes that take into account the benefits they receive.

We hypothesize similar results to those we found for integration vs. segregation of payroll and income taxes. One additional factor here is the uncertainty of the benefits or the difficulty of quantifying them. Some benefits, such as protection from crime, disease, or military attack, are uncertain. Yet they can be monetized in terms of willingness to pay for risk reduction, or costs of equivalent non-government services (e.g., private police). Even when they are monetized in these ways, however, we expect that benefits will not fully offset the certainty of taxes, so that people will favor lower taxes even when the tax savings are more than offset by higher expected expenses, or higher actual expenses.

2.2.6  Hidden taxes and employee benefits

Hidden taxes are somewhat the opposite of the situation in which government provides services in response to higher nominal taxes. Here, the nominal taxes are lower, but taxpayers lose in some other way. In the situations of interest, they lose more than they would lose if they paid taxes directly, because of the indirect effects of hidden taxes on (e.g.) production of goods and services.

The corporate tax is fully hidden. As McCaffery (1994) put it, ``deceit precedes receipt.'' Money is diverted from the stream of commerce before it arrives in households. So assume an incidence on a given household. Then with corporate (all ``indirect'') taxes, one sees neither the income nor the tax (nor, perhaps, but independently, the benefit).

Because the tax is hidden, people can be expected to resist a reform that replaces the corporation tax with an increase in some other tax, even if the total tax is lower (or if it has other benefits in terms of job creation).

Employees may see their benefits as truly paid by their employer, although economic data [ref?] indicates that an extra dollar of benefits requires the loss of almost a dollar of salary. This situation is like hidden taxes, but the hidden tax is in the form of a lower paycheck rather than in the cost of goods or services.

In all of these cases, we hypothesize that people will favor the status quo. We shall compare cases in which hidden taxes are the status quo to otherwise equivalent cases in which they are the proposed reform.

An additional hypothesis is based on the distribution of benefits and burdens. In some cases, the winners and losers are the same people, to the same extent, as in the payroll tax. In other cases, the distribution of gains and losses from a hidden tax is slightly different. The corporate tax, for example, saves tax money for people who would have to pay still higher taxes if it were abolished, and it penalizes those who invest in corporations or buy their products. In situations like this, we hypothesize that calling attention to possible differences in distribution will increase the status-quo effect, because people will see the differences as unfair. This will happen even when there is no obvious difference in the deservingness of those most affected in each case. (We will assess this with follow-up questions.)

2.2.7  Incentives

We shall carry out experiments examining the understanding of incentive, modeled after those of Baron & Ritov (1993). In particular, we shall ask for judgments of tax reforms that differ in the extent to which they provide incentives for socially desirable behavior. For example, we shall compare one-time deductions for behavior already done (e.g., research credits for business, credits for installing efficient appliances) with one-time deductions for behavior not yet done. If people understand incentive, they should see that the latter are better than the former, other things being equal (which, of course, they are, in the hypothetical situations we present).

We expect to find that many subjects regard the two situations as equivalent, totally ignoring incentive (as found by Baron & Ritov, 1993). We then plan to debias this effect by asking subjects what they would do if the tax were passed and if they were in a position to undertake the socially beneficial activity. Then we would ask again about their evaluation of the tax proposal.

2.3  Summary of methods and significance of findings

Some of our questions can be addressed with respect to every issue. Others (such as the role of incentives and hidden taxes) are relevant to some issues and require separate studies.

By extending our work on framing effects and other heuristics and biases to new situations, we shall be able to paint a more complete picture of the role of framing effects in thinking about tax and, more generally, trade-offs in public policy. We shall also apply different methods, thus making our work more accessible to those who distrust Web studies alone.

We shall also examine the role of self-interest. Do people judge what is fair according to what is good for them as individuals? Although many studies suggest that ideology is more important than interest, it is not clear what happens in the field of tax. Are subjects more self-interested in regard to tax than they are relative to, say, government spending programs? To test the role of self-interest, we shall, in each study, gather demographic data relevant to the study. For example, in studies involving payroll taxes, we will ask whether each subject pays them. This will help us understand the nature of political support for, or opposition to, various reforms. If attitudes are based on interest, then any reform must be crafted to appeal to the interests of a majority, without hurting the minority too much; of course, this should usually be possible, or else the reform is not really a reform. If, on the other hand, attitudes are based on ideology, then such efforts at balancing effects for the sole purpose of securing political support are superfluous.

We have already suggested several methods of debiasing. We shall also examine transfer of learning. For example, when people go through a series of questions about pollution taxes and learn to attend to incentives, do they then transfer this attention to new cases that involve the same principle, such as the earned-income tax credit?

We shall also seek to examine long-term effects. In particular, we shall examine judgments of students in relevant courses, such as courses in taxation or tax policy in law schools or business schools. To avoid problems of selection, we shall give questionnaires to the students at the beginning and end of the course in question. To control for repetition effects, we shall do the same with students in other classes, checking to see that they have not taken the class in question.
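One way the before-and-after comparison with a control class could be summarized is as a difference in mean change scores. The sketch below is only an illustration under that assumption; the function names and the scores are invented for the example and are not part of the proposed analysis plan.

```python
from statistics import mean

def mean_change(pairs):
    """Average (post - pre) change over a list of (pre, post) score pairs."""
    return mean(post - pre for pre, post in pairs)

def course_effect(course_pairs, control_pairs):
    """Change in the relevant course minus change in the comparison classes.

    Subtracting the control change removes the part of the shift that is
    due merely to answering the same questionnaire twice.
    """
    return mean_change(course_pairs) - mean_change(control_pairs)

# Invented scores, for illustration only.
course = [(2, 5), (3, 4)]    # (beginning of course, end of course)
control = [(2, 3), (4, 4)]   # same questionnaire, unrelated class
effect = course_effect(course, control)
print(effect)
```

A positive value would suggest a change in judgments beyond what repetition alone produces; any real analysis would also need a test of statistical reliability.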

The general issue underlying this research is how it might be possible to forge consensus on truly beneficial reforms, namely reforms that are ``near Pareto improvements'' (Stiglitz, 1998): reforms that do a great deal of good with only minor harms. We need not commit ourselves to any particular reforms to be concerned about this issue.

More generally, our proposal addresses a tension between expert knowledge in economics and related social sciences, on the one hand, and the intuitions of citizens and legislators, on the other. Economists (in the most general sense) have attained new insights in the last 150 years, which could have major beneficial effects on human welfare. But, without public understanding, the application of these insights will be unstable at best. We need to understand the tension between expertise in economics and democratic government.

2.4  Human subjects

2.4.1  Baron's Web studies

Recently, these studies have been exempt from review by Penn's IRB. The general outline is as follows:

Subject recruitment: The studies are all conducted on the World Wide Web. Subjects are paid by the study, usually $3. The usual time to complete such a questionnaire is 15-20 minutes.

I do warn subjects that they must take the studies seriously. If the time is less than 5 minutes, usually, I regard that as a sign of lack of seriousness; 5 minutes is about what it takes to get through the study clicking buttons but not reading anything. I rarely get challenged when people ask me why they were not paid and I remind them of my seriousness rule. (In the end, I would back down, if someone really claimed to take it seriously, but I have never had a protracted discussion about this.)

The current instructions to subjects are in
(http://www.psych.upenn.edu/~baron/qs.html). Because I have been doing these studies for a few years, I do not need to recruit. Links exist to my studies in many web pages maintained by others (and not solicited by me), such as those that keep lists of how to ``make money by surfing the web.'' My studies are also discussed in web discussion lists, and in general I have a good reputation. You can find these by searching in Google for my name and ``questionnaires.''

My main problem is not recruiting subjects but, rather, limiting them. I do not want to just pay less. I don't think that is fair. My studies are quite difficult and demanding. I also require serious answers (and I warn people of that). For example, I do not pay people who simply give the same answer to every question.
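The two seriousness checks described above (a completion time under 5 minutes, and the same answer given to every question) could be expressed as a simple first-pass screen. This is a minimal sketch, not the procedure actually used: the function name and argument format are assumptions, and the text makes clear that the final decision involves judgment.

```python
MIN_SECONDS = 5 * 60  # the 5-minute threshold mentioned in the text

def is_serious(elapsed_seconds, answers):
    """First-pass flag only; a human still reviews disputed cases."""
    # Too fast: roughly the time needed to click through without reading.
    if elapsed_seconds < MIN_SECONDS:
        return False
    # Identical response to every question.
    if len(answers) > 1 and len(set(answers)) == 1:
        return False
    return True

print(is_serious(17 * 60, [1, 3, 2, 5]))  # typical 15-20 minute completion
print(is_serious(4 * 60, [1, 3, 2, 5]))   # under the threshold
print(is_serious(17 * 60, [3, 3, 3, 3]))  # same answer throughout
```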

Past findings: My web page contains the full text of many papers that grew out of this research program, including published papers. There is too much to attempt to summarize here. See
http://www.sas.upenn.edu/~baron.

Research design: Most studies are within-subject, with about 32 trials (screens) varying in a number of variables (e.g., 5, in a 2x2x2x2x2 design).
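A 2x2x2x2x2 design crosses five two-level variables, which yields the 32 screens mentioned above. The sketch below shows one way such a trial list might be generated and shuffled for each subject; the factor names are placeholders (the variables differ from study to study), and the per-subject shuffling is an assumption rather than a documented feature of the studies.

```python
import itertools
import random

# Placeholder factor names; each study defines its own five variables.
FACTORS = {
    "factor_a": (0, 1),
    "factor_b": (0, 1),
    "factor_c": (0, 1),
    "factor_d": (0, 1),
    "factor_e": (0, 1),
}

def build_trials(factors, seed=None):
    """Return every combination of factor levels, in shuffled order."""
    names = list(factors)
    trials = [
        dict(zip(names, levels))
        for levels in itertools.product(*factors.values())
    ]
    random.Random(seed).shuffle(trials)  # assumed: a fresh order per subject
    return trials

trials = build_trials(FACTORS, seed=1)
print(len(trials))  # 2**5 = 32 screens
```

Because every subject sees every combination, each effect can be estimated within subject.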

Potential risks: None beyond the risks of daily life on the internet (computer crashes, misunderstanding, etc.).

Consent procedures: Clicking on the link to the questionnaire amounts to consent. The instructions in the main page are equivalent to a consent form. I have not used a formal consent form in the 26 years I've been at Penn, because I have argued (and still believe) that it is more likely to arouse anxiety than to allay it. But I do very much care about providing people with full information about what they are getting into (and I am constantly revising my page when I notice that I seem not to be getting through to everyone).

Protection of subjects (confidentiality): In addition to doing battle with the business office and Accounts Payable, to try to get the subjects paid in some sort of timely manner, my main concern is confidentiality. In order to get paid, the subjects must provide their name, address, and (if in the U.S.) social security number. They submit this on a registration form. I compile this information into a database, as described in
http://finzi.psych.upenn.edu/~baron. I do keep two copies of this database, one on my home computer and one on my office computer
(finzi.psych.upenn.edu), both in read-protected directories.

Data from individual studies contain only email addresses as identifying information. I use these to link up to my database in order to pay people. I also save one copy of the raw data - kept in a protected directory in cattell.psych.upenn.edu until subjects are paid, and then moved (not copied) to my home computer. I need to keep this in case questions arise about pay (as they often do). It is my most authoritative record of who has done what. Again, there is only one copy of this on a computer that is extremely secure.

After a study is done, I also make a reduced copy of the data for analysis purposes, with all identifying information removed. I make several backups of this, on several different computers (my home computer, my office computer, a laptop, and sometimes on other computers when the study is collaborative). But, again, all identifying information has been removed at this point.
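The reduction step amounts to dropping the identifying field from each record before the analysis copy is made. The following is a sketch under assumed conventions: the field name "email" and the record layout are hypothetical, chosen because the raw data are described as containing only email addresses as identifying information.

```python
# Hypothetical field name; the raw data are said to contain only
# email addresses as identifying information.
IDENTIFYING_FIELDS = frozenset({"email"})

def strip_identifiers(rows, identifying=IDENTIFYING_FIELDS):
    """Return copies of the records with identifying fields removed."""
    return [
        {key: value for key, value in row.items() if key not in identifying}
        for row in rows
    ]

# Invented record, for illustration only.
raw = [{"email": "subject@example.org", "trial": 1, "response": 4}]
reduced = strip_identifiers(raw)
print(reduced)
```

The function builds new records rather than editing in place, so the protected raw file is left untouched.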

Benefits for subjects: In addition to the pay, many subjects say they find my studies interesting, in the way that solving puzzles is interesting. Some subjects find them boring. I assume they do not come back for more. Occasionally, I get involved in lengthy e-mail correspondence with a subject. In one case, the subject in question applied to graduate school with the idea of working with me. (Ultimately, she went elsewhere, although she was admitted.)

Risk-benefit ratio: For the subjects who actually do participate, it seems that the benefits outweigh the risks. Even if they find the study boring, they find the small amount of money I offer (usually $3 per study) to be adequate compensation. I rarely get complaints about my rate of pay.

2.4.2  Other questionnaire studies

In recruiting jurors, we will use a procedure that Baron has used in numerous prior studies (all in collaboration with Peter Ubel). A research assistant arrives at the courthouse waiting room, where, on a typical day, 50 to 100 prospective jurors are waiting to be empanelled and are available to be interviewed. The research assistant announces that anyone who wishes to may participate in a short paper-and-pencil survey in exchange for a candy bar. Each participant takes a copy of the survey instrument, fills it out, and returns it to the research assistant. In this basic procedure, no names or other identifiers are taken.

3  Other results from prior NSF support

Baron's last NSF grant is still in effect. He is writing up the findings, but nothing is as yet presentable. Some summaries are available in Baron (2001, a paper presented in a talk at Harvard Law School). Some of the major findings so far concern debiasing, a topic of the last proposal and the present proposal as well. Here is the abstract of a current working draft of this work (which will contain more studies when it is completed):

We applied two methods of debiasing to four non-consequentialist biases. One method was to expand a scenario with additional information, e.g., about other people than those described in the scenario. The other method was to provide a minimal description in terms of consequences alone. The four biases were omission bias, zero-harm bias, preference for ex-ante equality, and preference for group equality (even when these made consequences worse). The minimal method reduced biases in two experiments, and subjects tended to accept the minimal redescription as a fair summary. The expansion method reduced biases in one experiment but not in another.

Another set of findings (with some summaries included in ``The rational voter'') concerns the moralistic nature of allocation principles such as those mentioned in the abstract just presented, such as preference for harms caused by omission. These principles are biases in the sense that they go against producing the best consequences overall. The findings in question show that these biases are moralistic values, in the following sense: moralistic values are those we are willing to impose on other people, whether or not they favor these values. Even when 100% of the members of a health maintenance organization (HMO) favor a policy that yields the lowest death rate, some respondents who favor omission bias still think that the HMO should reduce deaths from action (leading to a higher death rate).

A final paper, with Simon Kemp, is being revised for re-submission to the Journal of Economic Psychology. Our main result is that attitudes toward free trade are correlated with understanding of the principle of comparative advantage. Failure to understand this principle was one of the topics of Baron's original proposal.


Footnotes:

1 ``Nation,'' CBS TV, June 13, 1993.

