
sense because 7 is the most likely sum to roll out of the possible outcomes. The standard deviation indicates that the uncertainty associated with that expected value is near 2.4. Inferring from the shape of the distribution, which has most of the probability mass concentrated near the center, one can conclude that on any given roll the outcome will most likely fall between five and nine.

      The distribution just shown is symmetric about the mean, but probability distributions are often asymmetric. To quantify the degree of asymmetry for a distribution, the third moment is used.

      Skew (third moment): This is a measure of the asymmetry of a distribution. A distribution's skew can be positive, negative, or zero, depending on whether the tail to the right of the mean is larger (positive skew), the tail to the left is larger (negative skew), or the tails are equal on both sides (zero skew). Unlike mean and standard deviation, which have units defined by the random variable, skew is a pure number that quantifies the degree of asymmetry according to the following formula:

      $$\text{Skew} = E\left[\left(\frac{X - \mu}{\sigma}\right)^{3}\right] \qquad (1.9)$$

Figure 1.2 A histogram for 100,000 simulated dice rolls with fair dice. Included is the mean of the distribution (solid line) and the standard deviation of the distribution on either side of the mean (dotted line), both calculated using the observations from the simulated experiment. The average of this distribution was 7.0 and the standard deviation was 2.4, consistent with the theoretical estimates.
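The simulated experiment described in the caption can be approximated in a few lines. This is a minimal sketch, not the code used to produce the figure: the function name `simulate_dice_sums` and the fixed seed are assumptions made for illustration, and only the Python standard library is used.

```python
import random
import statistics


def simulate_dice_sums(n_rolls, seed=0):
    """Roll two fair six-sided dice n_rolls times and return the sums."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    return [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls)]


sums = simulate_dice_sums(100_000)
sample_mean = statistics.fmean(sums)   # estimate of the expected value
sample_std = statistics.pstdev(sums)   # estimate of the standard deviation

print(f"mean = {sample_mean:.2f}, standard deviation = {sample_std:.2f}")
```

With this many rolls, the sample statistics should land very close to the theoretical values of 7.0 and roughly 2.4.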

The concept of skew and its applications can be best understood with a modification to the dice rolling example. Suppose that the dice are biased rather than fair. Let's consider two scenarios: a pair of unfair dice with a small number bias (two and three more likely) and a pair of unfair dice with a large number bias (four and five more likely). The probabilities of each number appearing on each die for the different cases are shown in Table 1.2.

Table 1.2 The probability of each number appearing on each die in the three different scenarios, one fair and two unfair.

When rolling the fair pair and plotting the histogram of the possible sums, the distribution is symmetric about the mean and has a skew of zero. However, the distributions when rolling the unfair dice are skewed, as shown in Figures 1.3(a) and (b).

      The skew of a distribution is classified according to where the majority of the distribution mass is concentrated. Remember that the positive side is to the right of the mean and the negative side is to the left. The histogram in Figure 1.3(a) has a longer tail on the positive side and has the most mass concentrated on the negative side of the mean: This is an example of positive skew (skew = 0.45). The histogram in Figure 1.3(b) has a longer tail on the negative side and has the majority of the mass concentrated on the positive side of the mean: This is an example of negative skew (skew = –0.45).
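The sign of the skew in each scenario can be checked by computing the standardized third moment exactly. The sketch below is illustrative only: the face probabilities in `SMALL_BIAS` and `LARGE_BIAS` are hypothetical values chosen to favor (2, 3) and (4, 5) respectively, not the entries of Table 1.2, and the helper names `sum_pmf` and `skew` are assumptions.

```python
from itertools import product


def sum_pmf(face_probs):
    """Exact distribution of the sum of two independent, identical dice."""
    pmf = {}
    faces = list(enumerate(face_probs, start=1))  # (face value, probability)
    for (a, pa), (b, pb) in product(faces, repeat=2):
        pmf[a + b] = pmf.get(a + b, 0.0) + pa * pb
    return pmf


def skew(pmf):
    """Third central moment divided by the cube of the standard deviation."""
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    third = sum((x - mean) ** 3 * p for x, p in pmf.items())
    return third / var ** 1.5


FAIR = [1 / 6] * 6
SMALL_BIAS = [0.1, 0.3, 0.3, 0.1, 0.1, 0.1]  # hypothetical: 2 and 3 favored
LARGE_BIAS = [0.1, 0.1, 0.1, 0.3, 0.3, 0.1]  # hypothetical: 4 and 5 favored

for name, probs in [("fair", FAIR), ("small bias", SMALL_BIAS), ("large bias", LARGE_BIAS)]:
    print(f"{name}: skew = {skew(sum_pmf(probs)):+.2f}")
```

Because the two hypothetical biased dice are mirror images of each other, their skews are equal in magnitude and opposite in sign, while the fair pair has a skew of zero.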

      When a distribution has skew, the interpretation of standard deviation changes. In the example with fair dice, the expected value of the experiment is 7.0 ± 2.4, suggesting that any given trial will most likely have an outcome between five and nine. This is a valid interpretation because the distribution is symmetric about the mean and most of the distribution mass is concentrated around it. However, consider the distribution in the unfair example with the large number bias. This distribution has a mean of 7.8 and a standard deviation of 2.0, naively suggesting that the outcome will most likely be between six and nine with the outcomes on either side being equally probable. However, because the majority of the occurrences are concentrated on the positive side of the mean (roughly 60% of occurrences), the uncertainty is not symmetric. This concept will be discussed in more detail in a later chapter, as distributions of financial instruments are commonly skewed, and there is ambiguity in defining risk under those circumstances.
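To see why a symmetric reading of the standard deviation can mislead, one can measure how much probability sits on each side of the mean. The sketch below assumes a hypothetical large number bias (`LARGE_BIAS`, with faces 4 and 5 three times as likely as the others). These probabilities are an illustration consistent with the mean and standard deviation quoted above, not values taken from Table 1.2.

```python
from itertools import product

LARGE_BIAS = [0.1, 0.1, 0.1, 0.3, 0.3, 0.1]  # hypothetical face probabilities

# Exact distribution of the sum of two such dice.
pmf = {}
faces = list(enumerate(LARGE_BIAS, start=1))
for (a, pa), (b, pb) in product(faces, repeat=2):
    pmf[a + b] = pmf.get(a + b, 0.0) + pa * pb

mean = sum(x * p for x, p in pmf.items())
std = sum((x - mean) ** 2 * p for x, p in pmf.items()) ** 0.5

# Total probability above the mean, and the split of the one-sigma band.
mass_above = sum(p for x, p in pmf.items() if x > mean)
within_below = sum(p for x, p in pmf.items() if mean - std <= x < mean)
within_above = sum(p for x, p in pmf.items() if mean < x <= mean + std)

print(f"mean = {mean:.1f}, standard deviation = {std:.1f}")
print(f"mass above the mean: {mass_above:.2f}")
print(f"one-sigma band: {within_below:.2f} below vs {within_above:.2f} above")
```

Under these assumed probabilities, the outcomes inside the one-sigma band (six through nine) are not evenly split around the mean: more of that mass lies above it than below it.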


Figure 1.3 (a) A histogram for 100,000 simulated dice rolls with unfair dice, biased such that smaller numbers (2 and 3) are more likely to appear on each die. (b) A histogram for 100,000 simulated dice rolls with unfair dice, biased such that larger numbers (4 and 5) are more likely to appear on each die.

Mathematicians and scientists have encountered some probability distributions repeatedly in theory and applications. These distributions have, in turn, received a great deal of study. Assuming the underlying distribution of an experiment resembles a well‐known form can often greatly simplify statistical analysis. The normal distribution (also known as the Gaussian distribution or the bell curve) is arguably one of the most well‐known probability distributions and is foundational in quantitative finance. It describes countless different real‐world systems because of a result known as the central limit theorem. This theorem says, roughly, that if a random variable is made by adding together many independent random pieces, then, regardless of what those pieces are, the result will be approximately normally distributed. For example, the distribution in the two‐dice example is fairly non‐normal, being relatively triangular and lacking tails. If one considered the sum of more and more dice, each of which is an independent random variable, the distribution would gradually take on a bell shape. This is shown in Figure 1.4.
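The convergence toward a bell shape can also be checked numerically rather than visually. The following sketch builds the exact distribution of the sum of n fair dice and reports the largest gap between that distribution and a normal density with the same mean and variance. The helper names `dice_sum_pmf` and `max_gap_to_normal`, and the choice of the largest pointwise gap as the measure of closeness, are assumptions made for illustration.

```python
import math


def dice_sum_pmf(n_dice, sides=6):
    """Exact distribution of the sum of n_dice fair dice, built one die at a time."""
    pmf = {0: 1.0}
    for _ in range(n_dice):
        nxt = {}
        for total, p in pmf.items():
            for face in range(1, sides + 1):
                nxt[total + face] = nxt.get(total + face, 0.0) + p / sides
        pmf = nxt
    return pmf


def max_gap_to_normal(pmf):
    """Largest pointwise gap between the pmf and a matching normal density.

    Outcomes are spaced one unit apart, so the probability of each sum can be
    compared directly with the density evaluated at that sum.
    """
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())

    def normal_pdf(x):
        return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

    return max(abs(p - normal_pdf(x)) for x, p in pmf.items())


for n in (2, 4, 6):
    print(f"{n} dice: max gap = {max_gap_to_normal(dice_sum_pmf(n)):.4f}")
```

The gap shrinks as dice are added, which is the numerical counterpart of the histograms becoming more bell shaped.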

The normal distribution is a symmetric, bell‐shaped distribution, meaning that equidistant events on either side of the center are equally likely and the skew is zero. The distribution is centered around the mean, and outcomes further away from the mean are less likely. The normal distribution has the intriguing property that roughly 68% of occurrences fall within ±1σ of the mean, 95% within ±2σ, and 99.7% within ±3σ, where σ is the standard deviation. Figure 1.5 plots a normal distribution.
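These three percentages follow directly from the normal distribution's cumulative probabilities. As a minimal sketch using only the standard library, the probability of landing within k standard deviations of the mean is erf(k/√2), which holds for any mean and standard deviation. The function name `prob_within` is an assumption.

```python
import math


def prob_within(k_sigma):
    """Probability that a normal variable lies within k_sigma standard deviations of its mean."""
    return math.erf(k_sigma / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sigma: {prob_within(k):.4f}")
# within 1 sigma: 0.6827
# within 2 sigma: 0.9545
# within 3 sigma: 0.9973
```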

      These probabilities can be used to roughly contextualize distributions with similar geometry. For example, in the fair dice model, the expected value was 7.0 and the standard deviation was 2.4. With the assumption of normality, one would infer there is roughly a 68% chance that future outcomes will fall between five and nine. The true probability is 66.67% for this random variable, indicating that the normality assumption is not exactly correct but can be used for the purposes of approximation. As more dice are added to the example, this approximation becomes increasingly accurate.
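The true probability quoted above can be verified by enumerating all 36 equally likely outcomes of two fair dice and counting those with a sum from five through nine. This is a small illustrative check using exact fractions. The variable names are assumptions.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely sums of two fair six-sided dice.
outcomes = [a + b for a, b in product(range(1, 7), repeat=2)]

# Count the sums that fall between five and nine, inclusive.
hits = sum(1 for s in outcomes if 5 <= s <= 9)
exact = Fraction(hits, len(outcomes))

print(exact, f"{float(exact):.2%}")  # 2/3 66.67%
```

The exact value of 2/3 sits a little below the 68% implied by the normality assumption, which is the size of the approximation error in the two-dice case.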


Figure 1.4 A histogram for 100,000 simulated rolls with a group of fair, six‐sided dice numbering (a) 2, (b) 4, or (c) 6.

      Understanding distribution statistics and the properties of the normal distribution is incredibly useful in quantitative finance. The expected return of a stock is usually estimated by the mean return, and the historic risk is estimated with the standard deviation of returns (historical
