Judgment and Decision Making, vol. 5, no. 2, April 2010, pp. 124–132
Gambler’s fallacy, hot hand belief, and the time of patterns
Yanlong Sun^{*} and Hongbin Wang
The gambler’s fallacy and the hot hand belief have been classified as two exemplars of human misperceptions of random sequential events. This article examines the times of pattern occurrences where a fair or biased coin is tossed repeatedly. We demonstrate that, due to different pattern composition, two different statistics (mean time and waiting time) can arise from the same independent Bernoulli trials. When the coin is fair, the mean time is equal for all patterns of the same length but the waiting time is the longest for streak patterns. When the coin is biased, both mean time and waiting time change more rapidly with the probability of heads for a streak pattern than for a nonstreak pattern. These facts might provide a new insight for understanding why people view streak patterns as rare and remarkable. The statistics of waiting time may not justify the prediction by the gambler’s fallacy, but paying attention to streaks in the hot hand belief appears to be meaningful in detecting the changes in the underlying process.
Keywords: gambler’s fallacy; hot hand belief; heuristics; mean time; waiting time.
The gambler’s fallacy and the hot hand belief have been classified as two exemplars of human misperceptions of random sequential events and widely studied in multiple disciplines such as psychology, sports, behavioral economics and neuroeconomics (e.g., Camerer, Loewenstein, & Prelec, 2005; Gilovich, Griffin, & Kahneman, 2002; Gilovich, Vallone, & Tversky, 1985; Kahneman, 2002; Malkiel, 2003; Rabin, 2002; Rabin & Vayanos, 2010). Often manifested in more intricate forms, these two phenomena can be demonstrated by independent and identically distributed Bernoulli trials. Suppose that a fair coin with equal probabilities of coming up a head ( h ) and a tail ( t ) is tossed repeatedly and the first three outcomes produce three heads ( h,h,h ). In predicting the next outcome, one with the gambler’s fallacy would predict ( h,h,h,t ) — a reversal of the streak. In contrast, one with the hot hand belief would predict ( h,h,h,h ) — a continuation of the streak.
The fact that people exhibit two opposing expectations upon the same past information (negative recency in the gambler’s fallacy and positive recency in the hot hand belief) has been the center of attention in the research on perception of randomness, pattern detection and judgment of uncertainty (for reviews, see Ayton & Fischer, 2004; Oskarsson, Van Boven, McClelland, & Hastie, 2009). Among existing theories, a prevailing account is the representativeness heuristic, which attributes both the gambler’s fallacy and the hot hand belief to a false belief in the “law of small numbers” (Gilovich et al., 1985; Tversky & Kahneman, 1971). By this account, people tend to believe that a local sample should resemble the underlying population and chance is perceived as “a self-correcting process in which a deviation in one direction induces a deviation in the opposite direction to restore the equilibrium” (Tversky & Kahneman, 1974, p. 1125). Thus, in the gambler’s fallacy, a tail is due to reverse a streak of heads. In the hot hand belief, a streak of successes may indicate the existence of a hot hand by which the streak tends to be prolonged (see also Tversky & Gilovich, 1989).
However, the representativeness account has been criticized for its incompleteness and lack of testability (e.g., Ayton & Fischer, 2004; Falk & Konold, 1997; Gigerenzer, 1996; Kubovy & Gilden, 1991). Ayton and Fischer (2004) suggest that the gambler’s fallacy arises from the experience of negative recency in sequences of natural events such as roulette games, but the hot hand belief arises from the experience of positive recency in serial fluctuations in human performance. Similarly, it has been proposed that the hot hand belief can arise when people evaluate the performance of a mutual fund manager rather than the fluctuations of the portfolio (Rabin, 2002; Rabin & Vayanos, 2010), or the gambler’s luck rather than the outcomes of a roulette game (Croson & Sundali, 2005; Sundali & Croson, 2006). Moreover, Burns and Corpus (2004) show that subjects assume positive recency for forecasting scenarios they rated as “nonrandom” and negative recency for scenarios they rated as “random”. Burns (2004) further argues that the hot hand belief is a fast and frugal heuristic to detect changes in the shooting accuracy of basketball players. This argument is consistent with the finding of “residual nonstationarity” in Sun (2004), in which it is suggested that the fluctuations in players’ performance can be obscured by real-time adjustments based on the detection of a hot hand. For example, after making several shots in a row, a player might try a more difficult shot or the opposing players may increase their defensive effort. (For a review of the hot hand literature, see Bar-Eli, Avugos, & Raab, 2006.)
Compared to the representativeness account, the alternative interpretations distinguish the hot hand belief from the gambler’s fallacy by deviations from a random process. When the underlying process is truly random (or statistically impossible to tell apart from independent and stationary Bernoulli trials), both beliefs are considered as biases or misperceptions of randomness. In particular, both beliefs appear to share a common intuition that streak patterns are “rare” and “remarkable”: a streak of heads is unlikely to occur if the coin is fair, or a basketball player is unlikely to make shots in streaks unless he or she has a hot hand. However, the independence assumption of Bernoulli trials states that, for a fair coin, a streak will occur as often as any other pattern of the same length in its exact order (i.e., the equiprobability of “n-grams”, Falk & Konold, 1997, p. 306). Then, what is so special about streak patterns that people normally tend to avoid them and only expect them when they feel “hot”? In the present paper, we show that streak patterns do possess a set of properties that set them apart from other patterns, and these properties may provide an alternative explanation for the particular role of streak patterns in people’s perception and judgment of randomness.
We exemplify by comparing two patterns ( h,h,h,t ) and ( h,h,h,h ). When a fair coin is tossed repeatedly, both patterns have the same probability of occurrence in any four successive trials. However, it takes on average 16 tosses to encounter the first occurrence of ( h,h,h,t ) but 30 tosses to encounter the first occurrence of ( h,h,h,h ). In other words, streak pattern ( h,h,h,h ) has been “delayed” for its first occurrence. The expected number of trials required for the first occurrence of a particular pattern is a statistical property known as “waiting time”, which can be different among patterns due to different pattern compositions (see Gardner, 1988; Graham, Knuth, & Patashnik, 1994). While the probability of occurrence (or frequency) describes how often a pattern occurs, the waiting time describes when a pattern will occur from the time at which monitoring begins. Interestingly, these are different statistical properties and clearly bear different psychological relevance. For example, for a passenger who is waiting for a bus, when the first bus arrives probably is more relevant than how often the bus arrives. It is the goal of this paper to demonstrate a plausible link between the statistics of pattern times and people’s perception of randomness.
It is important to note that the concept of waiting time has recently received attention in the psychology literature (Hahn & Warren, 2009; Sun, Tweney, & Wang, 2010a, 2010b). Hahn and Warren (2009) show that, in a global sequence of moderate length, streak patterns such as ( h,h,h,h ) have higher “probabilities of nonoccurrence” than ( h,h,h,t ). Based on this result, they argue that, given people’s limited exposure to the environment (e.g., the number of coin tosses is limited), misperceptions of randomness such as the gambler’s fallacy might actually emerge as apt reflections of these environmental statistics. Sun, Tweney, and Wang (2010a) criticize Hahn and Warren’s interpretation by clarifying the relationship between the probability of nonoccurrence and waiting time. In particular, Sun et al. argue that the probability of nonoccurrence is a manifestation of waiting time, which is independent of the length of the global sequence, and neither statistic would justify the prediction of the reversal of a streak by the gambler’s fallacy (also see Sun et al., 2010b). Notwithstanding the debate, the argument for treating waiting time as a part of the environmental statistics appears to be quite plausible. Given that different statistics can arise from the same process of coin tossing (or basketball shooting), it is likely that they have been actually experienced by people and have different effects on people’s perception of randomness. In the following, we examine these statistics in detail and discuss their psychological implications.
Let us call the occurrence of a pattern an arrival of the pattern when a coin (fair or biased) is tossed repeatedly. We can define a counting process N( n ), n ≥ 1, where N( n ) denotes the number of arrivals of a pattern by time n (i.e., by the nth toss). The process has parameters µ and σ^{2} as the mean and variance of the time between successive arrivals. (A more detailed treatment is provided in the Appendix. The results presented in this paper are verified by simulations conducted in the R statistics environment and the scripts are available in this issue of the journal.)
For a particular pattern, its interarrival time T is defined as the number of tosses between any two successive occurrences of the pattern, and the first arrival time T^{*} is defined as the number of tosses until the first occurrence since the beginning of the counting process. The mean of interarrival times E [ T ], henceforth referred to as “mean time”, is determined by the individual probabilities of the elements in the pattern. Assuming a fair coin with equal probabilities of heads and tails, p_{h}=p_{t}=1/2,
$$E[T_{h,h,h,t}] = E[T_{h,h,h,h}] = \mu = (p_{h})^{-4} = 16. \tag{1}$$
That is, patterns ( h,h,h,t ) and ( h,h,h,h ) have the same mean time (16 tosses) between successive arrivals. This is equivalent to the statement that ( h,h,h,t ) and ( h,h,h,h ) have the same probability of occurrence. Regardless of the number of coin tosses, a gambler will encounter either pattern equally often. After n tosses, the expected number of encounters for either pattern is the same as in
$$E[N(n)] = (n - 4 + 1)/16. \tag{2}$$
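As a quick check of Equation (2), the following minimal sketch enumerates every fair-coin sequence of a given length and averages the number of (possibly overlapping) occurrences of each pattern. The function names and the choice n = 10 are illustrative assumptions, not part of the authors' R scripts.

```python
from fractions import Fraction
from itertools import product

def count_occurrences(seq, pattern):
    """Count occurrences of pattern in seq, allowing overlaps."""
    r = len(pattern)
    return sum(1 for i in range(len(seq) - r + 1) if seq[i:i + r] == pattern)

def expected_count(pattern, n):
    """Exact E[N(n)] for a fair coin, averaged over all 2**n equally likely sequences."""
    total = sum(count_occurrences("".join(s), pattern)
                for s in product("ht", repeat=n))
    return Fraction(total, 2 ** n)

n = 10  # illustrative sequence length
for pattern in ("hhht", "hhhh"):
    # Both patterns should match (n - 4 + 1)/16 exactly.
    print(pattern, expected_count(pattern, n), Fraction(n - 4 + 1, 16))
```

Exhaustive enumeration is only feasible for short sequences, but it makes the point that overlapping occurrences are counted for the streak pattern, which is why the two expected counts coincide.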
However, the mean of the first arrival time E[T^{*}], the waiting time, can be different due to the different amount of “self-overlap” within a particular pattern (see Figure 1). The amount of self-overlap (s) can be defined as the maximum length of a subpattern that has to occur twice (with or without overlap) to start and finish one occurrence of the original pattern. For example, among all patterns of length 4, pattern ( h,h,h,h ) has the largest amount of self-overlap (s=3), and ( h,h,h,t ) is nonoverlapping (s=0). A direct consequence of self-overlap is that the pattern’s first occurrence will be delayed when s>0. Imagine that one is waiting for an occurrence of ( h,h,h,h ) and has already obtained three heads, that is, a subpattern of length 3, ( h,h,h ). If the 4th toss is a tail, the waiting has to start from scratch and the waiting time spent on the subpattern ( h,h,h ) is wasted. In contrast, when one is waiting for pattern ( h,h,h,t ) and has already obtained three heads, if the 4th toss is a head, the waiting continues but it still has three heads to start with. It can be shown that, for a fair coin, among all patterns of length 4, ( h,h,h,h ) and ( h,h,h,t ) have the longest and shortest waiting times, respectively (also see Table 1),

$$E[T_{h,h,h,h}^{*}] = E[T_{h,h,h}^{*}] + E[T_{h,h,h,h}] = 14 + 16 = 30, \tag{3}$$

$$E[T_{h,h,h,t}^{*}] = E[T_{h,h,h,t}] = 16. \tag{4}$$
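[Figure 1: Overlapping occurrences of ( h,h,h,h ) in the sequence h h h h h h h h h h h h h h h h, compared with nonoverlapping occurrences of ( h,h,h,t ) in the sequence h h h t h h h t h h h t h h h t.]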
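The waiting times quoted above can be reproduced with a short sketch. It relies on a standard identity from the pattern-matching literature (not taken from the authors' scripts): the waiting time equals the sum of the mean times of every prefix of the pattern that is also a suffix, the whole pattern included. The function names and the `p_h` parameter are assumptions made for illustration.

```python
from fractions import Fraction

def pattern_probability(pattern, p_h):
    """Probability that len(pattern) successive tosses produce exactly this pattern."""
    prob = Fraction(1)
    for outcome in pattern:
        prob *= p_h if outcome == "h" else 1 - p_h
    return prob

def mean_time(pattern, p_h=Fraction(1, 2)):
    """Mean interarrival time E[T] = 1 / P(pattern)."""
    return 1 / pattern_probability(pattern, p_h)

def waiting_time(pattern, p_h=Fraction(1, 2)):
    """Mean first arrival time E[T*]: add the mean time of each prefix
    that also appears as a suffix of the pattern."""
    total = Fraction(0)
    for k in range(1, len(pattern) + 1):
        if pattern[:k] == pattern[-k:]:
            total += mean_time(pattern[:k], p_h)
    return total

for pattern in ("hhht", "hhhh"):
    print(pattern, mean_time(pattern), waiting_time(pattern))
```

For ( h,h,h,h ) every prefix is also a suffix, so the sum is 2 + 4 + 8 + 16; for ( h,h,h,t ) only the full pattern qualifies, so the waiting time collapses to the mean time.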
Moreover, it can be shown that waiting time E[ T^{*} ] is almost perfectly correlated with the variance of interarrival times Var ( T ) for patterns of the same length (see Table 1 and Appendix). Intuitively, both E[ T^{*} ] and Var ( T ) are direct consequences of the self-overlapping property, and the amount of self-overlap in the pattern determines the minimum distance by which a consecutive occurrence can follow (i.e., the shortest interarrival time). While consecutive reoccurrences of ( h,h,h,t ) have to be completely separated from each other and are thus more evenly distributed, consecutive reoccurrences of ( h,h,h,h ) can overlap with each other and tend to be clustered (see Figure 1). As a consequence, among all possible patterns of length 4, these two patterns have the smallest and largest variance of interarrival times, respectively:
$$\mathrm{Var}(T_{h,h,h,t}) = 144, \quad \mathrm{SD}(T_{h,h,h,t}) = 12;$$

$$\mathrm{Var}(T_{h,h,h,h}) = 592, \quad \mathrm{SD}(T_{h,h,h,h}) = 24.33.$$
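These values can also be approximated by simulation. The sketch below is written in Python rather than the R environment the authors used and is not their script; the helper names, the seed, and the sample sizes are all illustrative assumptions. It estimates the mean and standard deviation of interarrival times from one long run, and the waiting time from many fresh restarts.

```python
import random
import statistics

def toss(rng, p_h=0.5):
    """One coin toss: 'h' with probability p_h, otherwise 't'."""
    return "h" if rng.random() < p_h else "t"

def interarrival_times(pattern, n_tosses, rng):
    """Gaps between successive (possibly overlapping) arrivals in one long run."""
    r, window, arrivals = len(pattern), "", []
    for n in range(1, n_tosses + 1):
        window = (window + toss(rng))[-r:]  # keep only the last r outcomes
        if window == pattern:
            arrivals.append(n)
    return [b - a for a, b in zip(arrivals, arrivals[1:])]

def first_arrival_time(pattern, rng):
    """Number of tosses until the pattern first appears, starting from scratch."""
    r, window, n = len(pattern), "", 0
    while window != pattern:
        window = (window + toss(rng))[-r:]
        n += 1
    return n

def simulate(pattern, n_tosses=200_000, n_restarts=5_000, seed=1):
    """Return estimates of (mean time, SD of interarrival times, waiting time)."""
    rng = random.Random(seed)
    gaps = interarrival_times(pattern, n_tosses, rng)
    firsts = [first_arrival_time(pattern, rng) for _ in range(n_restarts)]
    return statistics.mean(gaps), statistics.stdev(gaps), statistics.mean(firsts)

for pattern in ("hhht", "hhhh"):
    mean_gap, sd_gap, mean_first = simulate(pattern)
    print(pattern, round(mean_gap, 2), round(sd_gap, 2), round(mean_first, 2))
```

The estimates fluctuate with the seed and sample size, so they should be read as approximations to the exact values rather than as a replacement for them.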
In essence, the contrast between mean time and waiting time lies in the contrast between “frequency” and “delay”. On one hand, the mean time estimates the average distance between consecutive occurrences and equals the inverse of the probability of occurrence; it is therefore a measure of frequency. When the coin is fair, patterns of the same length have the same mean time and thus the same frequency of occurrence (see Equations 1 and 2). On the other hand, waiting time estimates when a pattern will first occur once one starts counting, and this time is delayed relative to the pattern’s mean time: Equations (3) and (4) show that the amount of delay for an overlapping pattern (s>0) equals the waiting time for the repeating subpattern of length s; for a nonoverlapping pattern (s=0), no delay is incurred and its waiting time always equals its mean time (also see Table 1).^{1}
If one assumes that people’s perception of randomness is shaped by the environment (e.g., Ayton & Fischer, 2004; Lopes & Oden, 1987; Pinker, 1997), it is likely that people have actually experienced different statistics from the same process, although they might not be aware of the exact distinction. The contrast, either between mean time and waiting time, or between frequency and delay, might have important implications regarding people’s perception of sequential patterns, particularly in the gambler’s fallacy and the hot hand belief. In the following, we first examine the (ex ante) perception or expectation of patterns as an integrated sequence, then the prediction of a single outcome based on the perception of patterns.
First, due to the largest amount of self-overlap, a streak pattern is the most delayed pattern for its first occurrence, compared with all other patterns of the same length. The amount of delay is considerable even for short streaks (see Table 1), and it grows exponentially as the length of the streak grows. For example, for a streak of 10 heads in tossing a fair coin, its mean time is 1024 tosses, and its waiting time is 2046 tosses, 1022 tosses away from the mean time (which is the waiting time for a streak of 9 heads). Given that the mean time remains the same for all patterns of the same length, it is possible that people’s sense of rareness about streak patterns has stemmed from their experiences of the long waiting times.
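The figures for the streak of 10 heads follow from summing the mean times of the nested streaks of lengths 1 through 10, as in the recursion given in the Appendix. As a worked check for a fair coin:

$$E[T^{*}] = \sum_{k=1}^{10} 2^{k} = 2^{11} - 2 = 2046, \qquad E[T] = 2^{10} = 1024,$$

$$E[T^{*}] - E[T] = 1022 = \sum_{k=1}^{9} 2^{k}.$$

More generally, for a streak of r heads with a fair coin, the mean time is 2^{r}, the waiting time is 2^{r+1} − 2, and the delay is 2^{r} − 2, which is the sense in which the delay grows exponentially with streak length.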
Moreover, the waiting time statistic can manifest itself in many other forms. One example is the probability of occurrence at least once (the probability that a particular pattern occurs at least once when a coin is tossed N times), which is the complement of the probability of nonoccurrence (the probability that a particular pattern will not occur at all in N tosses). The latter probability has been discussed by Hahn and Warren (2009), and Sun et al. (2010a) provide an analytical solution to both probabilities. It can be shown that among all patterns of length 4, the streak pattern ( h,h,h,h ) has the lowest probability of occurrence at least once for any N>4, which is another consequence of the self-overlapping property in the pattern composition (Figure 2 shows the comparison between ( h,h,h,h ) and ( h,h,h,t )). This fact might, from another perspective, explain why streak patterns are underrepresented in people’s perception. That is, because of its clustering tendency, overlapped reoccurrences of a streak pattern may be counted only once or replaced by one count of a longer streak. Such speculation appears to be consistent with the finding in a recent study by Olivola and Oppenheimer (2008): when participants recalled the studied binary sequence, the lengths of streaks embedded in the original sequence were underestimated.
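The probability of occurrence at least once can be computed exactly by tracking how much of the pattern has been matched so far. The sketch below is one possible dynamic-programming approach, not the analytical solution of Sun et al. (2010a); the function names and the state representation are assumptions made for illustration.

```python
from fractions import Fraction

def next_state(pattern, state, outcome):
    """Length of the longest prefix of pattern that is a suffix of
    the already-matched prefix followed by the new outcome."""
    text = pattern[:state] + outcome
    for k in range(min(len(pattern), len(text)), 0, -1):
        if text.endswith(pattern[:k]):
            return k
    return 0

def prob_at_least_once(pattern, n_tosses, p_h=Fraction(1, 2)):
    """Exact probability that pattern occurs at least once in n_tosses tosses."""
    r = len(pattern)
    # dist[k]: probability that k elements are currently matched
    # and the full pattern has not yet occurred.
    dist = [Fraction(0)] * r
    dist[0] = Fraction(1)
    seen = Fraction(0)
    for _ in range(n_tosses):
        new_dist = [Fraction(0)] * r
        for state, mass in enumerate(dist):
            if mass == 0:
                continue
            for outcome, prob in (("h", p_h), ("t", 1 - p_h)):
                nxt = next_state(pattern, state, outcome)
                if nxt == r:
                    seen += mass * prob      # pattern completed for the first time
                else:
                    new_dist[nxt] += mass * prob
        dist = new_dist
    return seen

for n in (4, 5, 20, 50):
    print(n, float(prob_at_least_once("hhhh", n)), float(prob_at_least_once("hhht", n)))
```

With N = 4 the two patterns are equally likely to have occurred; for larger N the streak pattern lags behind, in line with the comparison described for Figure 2.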
Nevertheless, although the long waiting time might provide a statistical basis to justify people’s perception of streak patterns as rare events, it does not justify the prediction of a single outcome to reverse (or to avoid) a streak pattern by the gambler’s fallacy. By the independence assumption of Bernoulli trials, given that one has already obtained three heads in a row, the additional time to encounter ( h,h,h,h ) is E[T_{h,h,h,h}], and the additional time to encounter ( h,h,h,t ) is E[T_{h,h,h,t}], which is the same in the case of a fair coin (see Equations 3 and 4). That is, the statement that the first occurrence of the streak pattern ( h,h,h,h ) is “delayed” is an ex ante expectation when the pattern is treated as a whole as one starts tossing the coin from scratch. However, such a statement does not mean that the first occurrence of the “streak-reversal” pattern ( h,h,h,t ) is “expedited”, since its waiting time cannot be shorter than its mean time. In other words, although waiting time (or probability of occurrence at least once) may depict ( h,h,h,t ) as the most “representative” pattern of the coin tossing process (for its waiting time is equal to its mean time, or its occurrences are most evenly distributed), it does not predict that a streak of heads will soon be reversed by a tail; thus, it does not vindicate the error in the gambler’s fallacy.
Then, what about the prediction of a single outcome to continue a streak by the hot hand belief? The debate over the statistical validity of the hot hand belief has lasted more than twenty years (e.g., Bar-Eli et al., 2006), and it is not likely to be ended by simply introducing a new set of statistics. However, pattern time statistics do seem to support some of the existing theories. In particular, it has been suggested that the hot hand belief arises when people are evaluating human performance, and people pay particular attention to streak patterns in order to detect a change in the performance, for example, fluctuations in the shooting accuracy of basketball players (e.g., Ayton & Fischer, 2004; Burns, 2004; Burns & Corpus, 2004; Sun, 2004). By such an account, the prediction to continue a streak is actually valid on the basis of a higher probability of a single outcome (e.g., a higher shooting accuracy, a higher probability of heads in the case of a biased coin). It can be shown that by the measure of either mean time or waiting time, streak patterns are indeed a good indicator for detecting changes in the probability of single outcomes. Figure 3 shows mean time and waiting time as functions of the probability of heads (p_{h}), respectively. It shows that with a small change in p_{h}, both mean time and waiting time change more rapidly for pattern ( h,h,h,h ) than for pattern ( h,h,h,t ). For example, when p_{h} increases from .5 to .6, E[T_{h,h,h,h}] drops from 16 tosses to 7.7 tosses (Δ = 8.3) and E[ T_{h,h,h,h}^{*} ] drops more rapidly from 30 tosses to 16.8 tosses (Δ = 13.2). In contrast, E[T_{h,h,h,t}] and E[ T_{h,h,h,t}^{*} ] drop only from 16 tosses to 11.6 tosses (Δ = 4.4) (note that for pattern ( h,h,h,t ), its mean time and waiting time are identical at all levels of p_{h}).
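The values at p_{h} = .6 can be checked by hand from the mean time formula in the Appendix (the inverse of the pattern probability) and from the recursion that sums the mean times of the nested streaks:

$$E[T_{h,h,h,h}] = 0.6^{-4} \approx 7.716, \qquad E[T_{h,h,h,h}^{*}] = \sum_{k=1}^{4} 0.6^{-k} \approx 1.667 + 2.778 + 4.630 + 7.716 \approx 16.79,$$

$$E[T_{h,h,h,t}] = E[T_{h,h,h,t}^{*}] = \left(0.6^{3} \times 0.4\right)^{-1} \approx 11.57.$$

These agree with the rounded values 7.7, 16.8, and 11.6 quoted in the text.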
Furthermore, Figure 3 also shows that the mean time and waiting time depict different pictures regarding the occurrences of streak pattern ( h,h,h,h ) at various levels of p_{h}. For example, at p_{h} = .5, ( h,h,h,h ) and ( h,h,h,t ) are indistinguishable by the measure of mean time but quite distinguishable by the measure of waiting time, a fact we have mentioned before. It also shows that ( h,h,h,h ) will occur more frequently than ( h,h,h,t ) (shorter mean time) as soon as p_{h}>.5, but remains delayed (longer waiting time) for its first occurrence until p_{h}>.7. Then, one may wonder: if the mean time and waiting time have different effects on people’s perception, or if people perceive the properties of frequency and delay differently, how can these effects be integrated to produce a single response in the subjective preference over patterns?
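Both crossover points can be derived from the expressions for mean time and waiting time. Writing p for p_{h}, the comparison of mean times reduces to

$$p^{-4} < p^{-3}(1-p)^{-1} \iff 1 - p < p \iff p > 1/2,$$

and the comparison of waiting times reduces to

$$\sum_{k=1}^{4} p^{-k} = \frac{1 - p^{4}}{p^{4}(1-p)} > \frac{1}{p^{3}(1-p)} \iff p^{4} + p < 1.$$

The equation p^{4} + p = 1 has its root at approximately p = 0.724, so under these formulas ( h,h,h,h ) keeps the longer waiting time up to roughly that value, which is consistent with the threshold of about .7 read from Figure 3.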
As mentioned before, waiting time E[ T^{*} ] is in effect an indicator of the variance of the interarrival times Var ( T ), as both are direct consequences of the self-overlapping property of the pattern. From this perspective, the contrast between frequency (mean time) and delay (waiting time) actually reflects the contrast between the mean and variance of the same random variable, the interarrival time: E[ T ] and Var ( T ). To evaluate the combined effect of frequency and delay on people’s perception, a quantitative measure may be provided by the mean-variance paradigm in modern portfolio theory (Markowitz, 1952; Sharpe, 1994) or the coefficient of variation in risk perception (Weber, Shafir, & Blais, 2004).^{2} In this paradigm, to describe the desirability of an option in the tradeoff between return and risk, a Sharpe Ratio (Sharpe, 1994) is calculated as the ratio between the expected return and the standard deviation of the returns. Similarly for pattern occurrences, we can calculate a “Frequency-Delay ratio”, 100/(µσ), to describe the tradeoff between the frequency of occurrence (1/µ) and the delay (σ), where µ=E[T] and σ = SD( T ) are the mean and standard deviation of interarrival times, respectively, and the constant 100 expresses the ratio as a percentage. Analogous to the Sharpe Ratio, by which people would prefer an option with a higher return and a lower risk, the assumption underlying the Frequency-Delay ratio is that people would be more willing to make a prediction on a pattern with a higher frequency of occurrence and a lower amount of delay. In other words, a person can possess both the gambler’s fallacy and the hot hand belief, and which belief arises would depend on how frequency and delay are weighted separately and how they are incorporated. Figure 4 shows the Frequency-Delay ratio at various levels of p_{h}, where pattern ( h,h,h,t ) has a higher ratio when p_{h}<.61, and pattern ( h,h,h,h ) has a higher ratio when p_{h}>.62.
By this ratio, a person would be more willing to predict ( h,h,h,t ) when p_{h}<.61, and more willing to predict ( h,h,h,h ) when p_{h}>.62. That is, “a streak of heads is unlikely to occur if the coin is fair, and a basketball player is unlikely to make shots in streaks unless he or she (really) has a hot hand.”
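The crossover between .61 and .62 can be checked numerically. The sketch below computes the ratio 100/(µσ) with the interarrival variance taken in the form σ² = p⁻²(1 − p) + 2p⁻³ Σ_{j=1}^{r−1} (E[I(r)I(r+j)] − p²), which is how we read Equation (7) in the Appendix; that reading, along with all function names, is an assumption of this example.

```python
import math

def pattern_prob(pattern, p_h):
    """Probability of observing the pattern in len(pattern) successive tosses."""
    prob = 1.0
    for outcome in pattern:
        prob *= p_h if outcome == "h" else 1.0 - p_h
    return prob

def interarrival_variance(pattern, p_h):
    """sigma^2 = p^-2 (1 - p) + 2 p^-3 * sum_j (E[I(r)I(r+j)] - p^2), j = 1..r-1."""
    r = len(pattern)
    p = pattern_prob(pattern, p_h)
    total = 0.0
    for j in range(1, r):
        # Arrivals at times r and r+j are both possible only if the last
        # r-j elements of the pattern equal its first r-j elements.
        if pattern[j:] == pattern[:r - j]:
            joint = p * pattern_prob(pattern[r - j:], p_h)
        else:
            joint = 0.0
        total += joint - p * p
    return (1.0 - p) / p**2 + 2.0 * total / p**3

def frequency_delay_ratio(pattern, p_h):
    """100 / (mu * sigma), with mu and sigma the mean and SD of interarrival times."""
    mu = 1.0 / pattern_prob(pattern, p_h)
    sigma = math.sqrt(interarrival_variance(pattern, p_h))
    return 100.0 / (mu * sigma)

for p_h in (0.50, 0.61, 0.62, 0.70):
    print(p_h,
          round(frequency_delay_ratio("hhht", p_h), 3),
          round(frequency_delay_ratio("hhhh", p_h), 3))
```

At p_{h} = .5 the variance function returns 144 and 592 for the two patterns, matching the values reported earlier, which gives some reassurance that the formula is being applied as intended.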
We presented a set of statistics on the time of pattern occurrences in Bernoulli trials. In particular, we demonstrated that, due to the different pattern composition, different statistical properties can arise. The mean time measures the frequency of pattern occurrences, and the waiting time measures the amount of delay in pattern occurrence times. While previous research on perception of random patterns has focused on the mean time or the frequency of occurrences (e.g., Budescu, 1987; Falk & Konold, 1997; Gilovich et al., 1985; Lopes & Oden, 1987; Nickerson, 2002; Oskarsson et al., 2009), the waiting time and the property of delay have not been addressed until recently. It is likely that people are not able to precisely differentiate these statistics. However, given that these statistics can be observed from the same process, it is possible that they all have played roles in shaping people’s perception of random patterns. It has long been argued that people may have an accurate sense of randomness but fail to reveal it in their behavior (e.g., Pinker, 1997; for an overview of different opinions, see Rapoport & Budescu, 1992). The distinction between mean time and waiting time may provide a new perspective in the studies on people’s perception of random events. For instance, the mean time does not differentiate any patterns in the case of tossing a fair coin; if one assumes people have acquired accurate experiences from the environment, it is quite possible that people’s notion of streak patterns as rare events is guided by the waiting time.
It should be noted that the statistics of pattern times can manifest themselves in many forms and each manifestation may receive different interpretations, depending on the specific assumptions about human perception of sequential events and the specific task environment. For example, regarding the distinction between frequency and delay, people may be more sensitive to one property than to the other, or one property may be more important than the other in different situations. For a passenger waiting for a bus, if he or she is concerned only with catching the first bus (or at least one bus), the waiting time (delay) is certainly more important. However, if the passenger is interested in estimating the number of bus arrivals in a certain period of time, the mean time should be the statistic of choice. Another example is that, if we model actual basketball shooting as tossing a coin with a fixed probability of heads (as the null hypothesis), the hot hand belief would be judged as a fallacy. However, if we assume that the basketball player’s shooting accuracy is initially unknown and can fluctuate, paying attention to the occurrence of streaks may actually be an effective way to detect the change.
Even in simple cases like coin tossing, people’s perception of randomness surely cannot be reduced to a certain set of statistics. Many other perceptual and cognitive mechanisms can come into play, such as perception of proportion and symmetry (Rapoport & Budescu, 1997), subjective complexity (Falk & Konold, 1997), or working memory capacity (Kareev, 1992). Nevertheless, the statistics of pattern times appear to qualify as a useful toolset since they provide objective measures in situations where sequential information is essential. For example, people respond differently depending on whether they view sequences all at once on paper or they actually observe sequences unfolding over time (e.g., Olivola & Oppenheimer, 2008). They habitually look for sequential patterns and their perception of patterns influences their responses in single experimental trials even when the sequence of trials is completely independent (Barraclough, Conroy, & Lee, 2004; Huettel, 2006; Huettel, Mack, & McCarthy, 2002). Moreover, the dissociation of frequency and delay might have important implications in studies on people’s intertemporal choices. It has also been suggested that people are sensitive to time discounting while the behavioral and neural effects of delays are independent of probability (e.g., Luhmann, Chun, Yi, Lee, & Wang, 2008; McClure, Laibson, Loewenstein, & Cohen, 2004). In these aspects, we conjecture that examination of the pattern time statistics, combined with empirical experiments, might be a fruitful approach in future investigations on pattern detection, perception of randomness, and judgment and decision-making under uncertainty.
Let X_{1}, X_{2}, ... be independent variables with P{X_{i}=j} = p(j), j ≥ 0. In the case of coin tossing, j=0,1, and p(0)=p_{h} and p(1)=p_{t} represent the probabilities of a head and a tail, respectively. For a pattern ( x_{1},...,x_{r} ) of length r, we say that an arrival (renewal) occurs at time n, n ≥ r, if ( X_{n − r + 1},...,X_{n} ) = ( x_{1},...,x_{r} ). Let N(n) denote the number of arrivals by time n. N( n ), n ≥ 1, is a counting process in which the first arrival time has a different distribution than the other interarrival times. Then, N( n ), n ≥ 1, is said to be a general or delayed renewal process with parameters µ and σ^{2} as the mean and variance of the time between successive arrivals.
Define indicator variables I(i), I(i)=1 if there is an arrival at time i and I( i )=0 otherwise, i ≥ r. I( i ) are Bernoulli random variables with parameter p, where
$$p = \prod_{i=1}^{r} p(x_{i}). \tag{5}$$
Then, the mean interarrival time is given by
$$\mu = 1/p, \tag{6}$$
and the variance of interarrival times is given by
$$\sigma^{2} = p^{-2}(1 - p) + 2p^{-3} \sum_{j=1}^{r-1} \left( E[I(r)I(r+j)] - p^{2} \right). \tag{7}$$
An overlap index s is defined as the maximum number of elements at the end of the pattern that can be used as the beginning part of the next arrival,
$$s = \max \left\{\, j < r : (x_{r-j+1}, \ldots, x_{r}) = (x_{1}, \ldots, x_{j}) \,\right\}. \tag{8}$$
We further define s=0 when no equality can be found in Equation (8). Then, s=0 for (h,h,h,t), s=2 for (h,t,h,t), and s=3 for ( h,h,h,h ).
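The overlap index can be computed directly from its verbal definition: the largest number of elements at the end of the pattern that also form its beginning. The sketch below is a minimal illustration; the function name `overlap_index` is an assumption.

```python
def overlap_index(pattern):
    """Largest j < len(pattern) such that the last j elements of the
    pattern equal its first j elements; 0 if no such j exists."""
    r = len(pattern)
    for j in range(r - 1, 0, -1):
        if pattern[r - j:] == pattern[:j]:
            return j
    return 0

for pattern in ("hhht", "htht", "hhhh"):
    print(pattern, overlap_index(pattern))
```

Searching from the longest candidate downward means the first match found is the maximum, so no further bookkeeping is needed.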
Since the first arrival time can have a different distribution, let T denote the interarrival time of the process when reoccurrences are included, and T^{*} denote the first arrival time. For pattern ( h,h,h,t ), r=4 and s=0, so that N( n ),n ≥ 1 is an ordinary renewal process. Therefore, for a fair coin, p_{h}=p_{t}=1/2, from Equation (6),
$$E[T_{h,h,h,t}] = E[T_{h,h,h,t}^{*}] = \mu = 1/p = 16.$$
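Since two arrivals of ( h,h,h,t ) cannot occur within a distance less than r (r=4) of each other, it follows that I( r )I( r + j ) = 0 when 1 ≤ j ≤ r−1, and,

$$\sum_{j=1}^{r-1} \left( E[I(r)I(r+j)] - p^{2} \right) = -(r-1)p^{2}.$$

Assuming a fair coin, from Equation (7), we obtain

$$\mathrm{Var}(T_{h,h,h,t}) = p^{-2}(1-p) - 2(r-1)p^{-1} = 240 - 96 = 144.$$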
For pattern ( h,h,h,h ), r=4 and s=3, and
$$T_{h,h,h,h}^{*} = T_{h,h,h}^{*} + T_{h,h,h,h}, \tag{9}$$

where T_{h,h,h,h}^{*} is the first arrival time for pattern ( h,h,h,h ), T_{h,h,h}^{*} is the first arrival time for pattern ( h,h,h ), and T_{h,h,h,h} is the interarrival time for ( h,h,h,h ). Since the coin is tossed independently, we have

$$E[T_{h,h,h,h}^{*}] = E[T_{h,h,h}^{*}] + E[T_{h,h,h,h}]. \tag{10}$$

For a fair coin, from Equation (6), we have

$$E[T_{h,h,h,h}] = \mu = 16.$$
Then, E[ T_{h,h,h,h}^{*} ] can be obtained by recursively applying Equation (10) s times, starting from the shortest nonoverlap pattern ( h ). That is,
$$E[T_{h}] = E[T_{h}^{*}] = 1/p_{h} = 2;$$

$$E[T_{h,h}^{*}] = E[T_{h}^{*}] + E[T_{h,h}] = 2 + 4 = 6;$$

...

$$E[T_{h,h,h,h}^{*}] = 2 + 4 + 8 + 16 = 30.$$
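For the variance of interarrival times, since no two arrivals of ( h,h,h,h ) can occur within a distance (r−s−1) of each other, it follows that I( r )I( r + j )=0 if 1 ≤ j ≤ (r−s−1). Therefore, from Equation (7), we have

$$\sigma^{2} = p^{-2}(1-p) - 2(r-1)p^{-1} + 2p^{-3} \sum_{j=r-s}^{r-1} E[I(r)I(r+j)]. \tag{11}$$

The overlapped arrivals can happen in the following ways,

$$E[I(r)I(r+1)] = P\{h,h,h,h,h\} = p_{h}^{5};$$

$$E[I(r)I(r+2)] = P\{h,h,h,h,h,h\} = p_{h}^{6};$$

$$E[I(r)I(r+3)] = P\{h,h,h,h,h,h,h\} = p_{h}^{7}.$$

Thus, for a fair coin,

$$\mathrm{Var}(T_{h,h,h,h}) = 240 - 96 + 2 \cdot 16^{3} \left( 2^{-5} + 2^{-6} + 2^{-7} \right) = 240 - 96 + 448 = 592.$$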
From Equations (10) and (11), it can be seen that the overlapped arrivals will extend both the mean of the first arrival time and the variance of interarrival times. Thus, these two variables are correlated, and both are positively determined by the overlap index s. The procedure for calculating the variance of the first arrival times is omitted here.