Probability 1
2009-01-18 23:57

Marr might have distinguished between two reasons to learn about probability:

  1. It underlies a formal definition of rationality, which may give rise to a computational model of perception.
  2. It specifies many algorithms for probabilistic inference, which may give rise to an algorithmic model of perception.

The Monty Hall problem and a gross simplification of medical diagnosis are two examples that motivate the following mathematical topics:

No matter which programming language(s) you use to build your probabilistic simulations, it is important to distinguish between doubling the result of flipping one (fair) coin—

(let ([flip (random 2)])
  (+ flip flip))

—and adding the results of flipping two coins.

(+ (random 2) (random 2))

Here (random 2) returns either 0 or 1 with equal probability. Similarly, (random 6) returns one of the integers 0 through 5, each with equal probability. Also, (random) returns a uniformly random real number between 0 and 1. Of course, these are all just pseudo-random numbers, but that’s good enough for our purposes.
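The difference between the two programs shows up clearly if we tally their results over many runs. The following is only a sketch, assuming Racket or a compatible Scheme; the names double-one-coin, add-two-coins, and tally are made up for this illustration.

```scheme
;; Double the result of flipping one coin: the result is 0 or 2, never 1.
(define (double-one-coin)
  (let ([flip (random 2)])
    (+ flip flip)))

;; Add the results of flipping two coins: the result is 0, 1, or 2.
(define (add-two-coins)
  (+ (random 2) (random 2)))

;; Run the procedure `trial` n times and count how often it returns `outcome`.
(define (tally trial outcome n)
  (cond [(= n 0) 0]
        [(= outcome (trial)) (+ 1 (tally trial outcome (- n 1)))]
        [else (tally trial outcome (- n 1))]))
```

Evaluating (tally double-one-coin 1 10000) always gives 0, whereas (tally add-two-coins 1 10000) should come out somewhere near 5000, because exactly one of two fair coins shows 1 with probability 1/2.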

We are now using a programming language in which, as in most languages, evaluating the same expression twice may yield different results. For example, evaluating the expression (random 2) may yield 0 one time and 1 the next time. In other words, we are now using a language with a side effect. In a language with no side effect, there is no reason to want to define a function that takes no argument: after all, instead of defining the function (taking no argument and returning a number)

(define (f) (+ 2 2))

we might as well define the number

(define f (+ 2 2))

In contrast, in a language with a side effect, we may well find it useful to define a “function” that takes no argument. For example, we can say

(define (coin) (random 2))

so as to abbreviate the two coin-flipping programs above to

(let ([flip (coin)])
  (+ flip flip))

and

(+ (coin) (coin))

I put quotation marks around “function” above because we are no longer defining a function in the mathematical sense: mathematical functions always return the same value when given the same arguments. Rather, we are more properly said to be defining procedures. Executing the same procedure twice may yield different results.

In Scheme, a procedure is never executed when it is defined, only when it is invoked. So the definition of coin above does not flip a coin, but the calls to coin above flip coins. In contrast, when a function call is evaluated, its arguments are always evaluated; similarly, when a let is evaluated, its “right hand sides” are always evaluated. That is why the first coin-flipping program above means to flip only one coin and double the result, never to flip two coins and add the results.
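To make the distinction between defining and invoking concrete, here is a small sketch, again assuming Racket or a compatible Scheme. It repeats the definition of coin so that it stands alone; the names fixed-flip, fresh-flip, same-value?, and two-flips are hypothetical.

```scheme
(define (coin) (random 2))

;; The right-hand side is evaluated once, when this definition is evaluated.
;; So fixed-flip names a single number, 0 or 1, that never changes afterwards.
(define fixed-flip (coin))

;; The body is evaluated each time the procedure is invoked.
;; So every call (fresh-flip) flips a new coin.
(define (fresh-flip) (coin))

;; Always true, because both operands are the same stored number.
(define same-value? (= fixed-flip fixed-flip))

;; 0, 1, or 2, because the two calls flip independently.
(define two-flips (+ (fresh-flip) (fresh-flip)))
```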

Besides random choice, two other useful side effects are input and output. In Scheme, evaluating the expression (read) asks the user for some input and returns that input. Hence the program

(+ (read) (read))

asks the user for two input numbers and adds them. In contrast, the program

(let ([input (read)])
  (+ input input))

asks the user for one input number and doubles it. In Scheme, evaluating the expression (display 5) prints 5 to the screen and returns some garbage result. (The garbage is returned just so that all these “functions” return something.) Hence the program

(let ([garbage (display 5)])
  (display 6))

prints 5 and then 6 on the screen. Because let is often used for such sequencing of side effects, Scheme lets us abbreviate this program to

(begin (display 5)
       (display 6))

Similarly, the program

(+ 1 (begin (read) (read)))

asks the user for two input numbers, throws away the first one, and increments the second one.

Here’s an example of a recursive program that uses random choice: the following “function”, which takes no argument, keeps flipping a coin until heads comes up. It returns the number of coin flips it took (which may vary from run to run—hence the quotation marks around “function”).

(define (until-heads)
  (if (= 0 (random 2))
      1
      (+ 1 (until-heads))))
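Because the result varies from run to run, it is natural to average it over many runs. The sketch below does that, assuming Racket or a compatible Scheme; total-flips and average-flips are names invented for this illustration, and until-heads is repeated so that the code stands alone.

```scheme
(define (until-heads)
  (if (= 0 (random 2))
      1
      (+ 1 (until-heads))))

;; Add up the results of n independent runs of until-heads.
(define (total-flips n)
  (if (= n 0)
      0
      (+ (until-heads) (total-flips (- n 1)))))

;; Average number of flips per run, as an exact rational number.
;; Assumes n is a positive integer.
(define (average-flips n)
  (/ (total-flips n) n))
```

With a fair coin the expected number of flips is 2, so (exact->inexact (average-flips 10000)) should usually land close to 2.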

With all these additional Scheme features in place (choose the “Advanced Student” language if you are using DrScheme), we can build the following simulation of the Monty Hall game with a uniformly random host. First, we define a utility function to remove a given element from a given list.

(define (remove-one element l)
  (cond [(empty? l) empty]
        [(equal? element (first l)) (rest l)]
        [else (cons (first l) (remove-one element (rest l)))]))

Second, we define a utility “function” to pick an element from a non-empty list, so that each element has the same probability of being picked. This definition uses the Scheme functions length (to compute the length of a list) and list-ref (to extract an element of a list at a given index), which are easy to define.

(define (uniform l)
  (list-ref l (random (length l))))
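For readers who want to see why length and list-ref are easy to define, here is one possible sketch. The names my-length, my-list-ref, and my-uniform are made up here to avoid clashing with the built-in functions, and the code assumes Racket or a compatible Scheme.

```scheme
;; Length of a list: 0 for the empty list,
;; otherwise 1 more than the length of the rest.
(define (my-length l)
  (if (empty? l)
      0
      (+ 1 (my-length (rest l)))))

;; Element at index i, counting from 0.
;; Assumes 0 <= i < (my-length l).
(define (my-list-ref l i)
  (if (= i 0)
      (first l)
      (my-list-ref (rest l) (- i 1))))

;; The same uniform choice as above, built on these two definitions.
(define (my-uniform l)
  (my-list-ref l (random (my-length l))))
```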

Finally, we specify the steps in a round of the Monty Hall game as a cascade of lets.

(define (monty-hall)
  (let ([prize (random 3)])
    (let ([initial (read)])
      (let ([open (uniform (remove-one prize (remove-one initial '(0 1 2))))])
        (let ([garbage (display open)])
          (let ([final (read)])
            (list prize (= prize final))))))))

We can use let* to make this code look a bit prettier.

(define (monty-hall)
  (let* ([prize (random 3)]
         [initial (read)]
         [open (uniform (remove-one prize (remove-one initial '(0 1 2))))]
         [garbage (display open)]
         [final (read)])
    (list prize (= prize final))))

Or we can use begin to avoid naming the garbage variable.

(define (monty-hall)
  (let ([prize (random 3)])
    (let ([initial (read)])
      (let ([open (uniform (remove-one prize (remove-one initial '(0 1 2))))])
        (begin (display open)
               (let ([final (read)])
                 (list prize (= prize final))))))))
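To estimate how often each strategy wins, we can replace the two calls to read by code, so that many rounds can be played automatically. The following is a sketch under stated assumptions rather than part of the game above: the initial choice is made uniformly at random, and play-round and count-wins are hypothetical names. The helpers remove-one and uniform are repeated so that the code stands alone in Racket or a compatible Scheme.

```scheme
(define (remove-one element l)
  (cond [(empty? l) empty]
        [(equal? element (first l)) (rest l)]
        [else (cons (first l) (remove-one element (rest l)))]))

(define (uniform l)
  (list-ref l (random (length l))))

;; Play one round without user input. The flag switch? says whether the
;; player moves to the other unopened door. Returns true if the player wins.
(define (play-round switch?)
  (let* ([prize (random 3)]
         [initial (random 3)]
         [open (uniform (remove-one prize (remove-one initial '(0 1 2))))]
         ;; After removing the initial and opened doors, exactly one remains.
         [final (if switch?
                    (first (remove-one open (remove-one initial '(0 1 2))))
                    initial)])
    (= prize final)))

;; Count the wins in n rounds played with the given strategy.
(define (count-wins switch? n)
  (cond [(= n 0) 0]
        [(play-round switch?) (+ 1 (count-wins switch? (- n 1)))]
        [else (count-wins switch? (- n 1))]))
```

Staying wins with probability 1/3 and switching with probability 2/3, so (count-wins #f 9000) should usually land near 3000 and (count-wins #t 9000) near 6000.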

Be sure to read Bertsekas and Tsitsiklis’s chapter!