Exhaustive search over infinite binary trees

標籤 Tags: english

2008-11-23 01:17

Martín Escardó writes a seemingly impossible functional program: for any total function f from infinite bit sequences to Bool, his total function find maps f to an infinite bit sequence, such that if f returns True on any sequence then find f is the least such sequence. It’s a good read and I recommend it. (Update: he wrote a follow-up article that defines a monad of searchable sets.)

Essentially find performs an exhaustive search by depth-first backtracking. It’s impressive that find can be programmed totally and without inspecting the source code of its argument f, even though there are an uncountable number of infinite bit sequences. Escardó makes find go faster using lazy evaluation in Haskell: his find_iii uses an infinite sequence of lazily evaluated bits. But it wasn’t clear to me what role lazy evaluation plays in find_vii, which is his final, fastest version of find.

Contrary to what some people inexplicably seem to think, I don’t know from category theory, and I haven’t read Escardó’s paper or code at LICS 2007. Still, I suspect the/an essence of find_vii is to replace an infinite sequence of lazily evaluated bits by a lazily evaluated binary tree of bits. To confirm this hypothesis, I reimplemented his algorithm as follows. You can download this post as a program.

module Exhaustive where

We represent a function from natural numbers to some type a by an infinite binary tree whose nodes are labeled by a. To be more precise: the root node stores what the function returns at 0; the left branch stores the return values at positive odd integers, and the right branch stores the return values at positive even integers.

data T a = T a (T a) (T a)

Of course, the container T is a functor.

instance Functor T where
    fmap f (T z o e) = T (f z) (fmap f o) (fmap f e)

Given a value a and a tree representing a function f, it is easy to build another tree representing the function g defined by g(0) = a and g(n+1) = f(n).

cons :: a -> T a -> T a
cons a (T z o e) = T a (cons z e) o

It is also easy to represent the identity function.

infinite :: (Num i) => T i
infinite = T 0 o (fmap (1+) o) where o = fmap (\n -> n * 2 + 1) infinite

The index function converts a tree back to the function on natural numbers that it represents.

index :: (Integral i) => T a -> i -> a
index _ i | i < 0 = error "index: out of range"
index (T z _ _) 0 = z
index (T _ o e) i
    | k == 0    = index e (j-1)
    | otherwise = index o j
    where (j,k) = divMod i 2

Finally, given a natural number i and two trees representing the functions f and g, we can build another tree representing the function h defined by h(n) = f(n) if n<i, or g(n−i) otherwise.

takeAppend :: (Integral i) => i -> T a -> T a -> T a
takeAppend i _ _ | i < 0 = error "takeAppend: out of range"
takeAppend 0 _ t = t
takeAppend i (T z1 o1 e1) (T z2 o2 e2)
    | k == 0    = T z1 (takeAppend j o1 o2 ) (takeAppend (j-1) e1 ze2)
    | otherwise = T z1 (takeAppend j o1 ze2) (takeAppend j     e1 o2 )
    where (j,k) = divMod i 2
          ze2   = cons z2 e2

Now we can define find. It uses the tree representation above to skip constructing long unused parts of the infinite bit sequence.

find :: (T Bool -> Bool) -> T Bool
find p = b where
    b = fmap (\n -> p (takeAppend n b (cons True (find (\a ->
                    p (takeAppend n b (cons True a)))))))
             infinite

That’s it!

Escardó tests find by comparing computable functions on integer sets.

type Cantor = Integer -> Bool

forsome, forevery :: (Cantor -> Bool) -> Bool
forsome p = p (index (find (p . index)))
forevery p = not (forsome (not . p))

equal :: (Eq y) => (Cantor -> y) -> (Cantor -> y) -> Bool
equal f g = forevery (\a -> f a == g a)

coerce :: Bool -> Integer
coerce True  = 0
coerce False = 1

f',g',h' :: Cantor -> Integer
f' a = a'(10*a'(3^80)+100*a'(4^80)+1000*a'(5^80))
    where a' i = coerce (a i)
g' a = a'(10*a'(3^80)+100*a'(4^80)+1000*a'(6^80))
    where a' i = coerce (a i)
h' a = a' k
    where i = if a'(5^80) == 0 then 0    else 1000
          j = if a'(3^80) == 1 then 10+i else i
          k = if a'(4^80) == 0 then j    else 100+j
          a' i = coerce (a i)

On my laptop, this code takes only several times more time and space than Escardó’s to compare f', g', and h' to each other.

*Exhaustive> :set +s
*Exhaustive> equal f' g'
False
(38.20 secs, 2786375560 bytes)
*Exhaustive> equal f' h'
True
(14.10 secs, 1055343780 bytes)
*Exhaustive> equal g' h'
False
(39.12 secs, 2762261544 bytes)
*Exhaustive> equal f' f'
True
(13.87 secs, 1055610904 bytes)
*Exhaustive> equal g' g'
True
(13.65 secs, 972683784 bytes)
*Exhaustive> equal h' h'
True
(14.20 secs, 1054789196 bytes)

I take this experiment to show that a main essence of Escardó’s algorithm is to represent a set of integers by a lazily evaluated binary tree of bits. Maybe it’s worth defining a type class Find?

class Find a where
    find :: (a -> Bool) -> a

Cheers!