Exhaustive search over infinite binary trees

2008-11-23 01:17

Martín Escardó writes a seemingly impossible functional program: for any total function `f` from infinite bit sequences to `Bool`, his total function `find` maps `f` to an infinite bit sequence, such that if `f` returns `True` on any sequence then `find f` is the least such sequence. It’s a good read and I recommend it. (Update: he wrote a follow-up article that defines a monad of searchable sets.)

Essentially `find` performs an exhaustive search by depth-first backtracking. It’s impressive that `find` can be programmed totally and without inspecting the source code of its argument `f`, even though there are an uncountable number of infinite bit sequences. Escardó makes `find` go faster using lazy evaluation in Haskell: his `find_iii` uses an infinite sequence of lazily evaluated bits. But it wasn’t clear to me what role lazy evaluation plays in `find_vii`, which is his final, fastest version of `find`.

Contrary to what some people inexplicably seem to think, I don’t know from category theory, and I haven’t read Escardó’s paper or code at LICS 2007. Still, I suspect the/an essence of `find_vii` is to replace an infinite sequence of lazily evaluated bits by a lazily evaluated binary tree of bits. To confirm this hypothesis, I reimplemented his algorithm as follows. You can download this post as a program.

````module Exhaustive where`
```

We represent a function from natural numbers to some type `a` by an infinite binary tree whose nodes are labeled by `a`. To be more precise: the root node stores what the function returns at 0; the left branch stores the return values at positive odd integers, and the right branch stores the return values at positive even integers.

````data T a = T a (T a) (T a)`
```

Of course, the container `T` is a functor.

``````instance Functor T where
fmap f (T z o e) = T (f z) (fmap f o) (fmap f e)```
```

Given a value `a` and a tree representing a function f, it is easy to build another tree representing the function g defined by g(0) = `a` and g(n+1) = f(n).

``````cons :: a -> T a -> T a
cons a (T z o e) = T a (cons z e) o```
```

It is also easy to represent the identity function.

``````infinite :: (Num i) => T i
infinite = T 0 o (fmap (1+) o) where o = fmap (\n -> n * 2 + 1) infinite```
```

The `index` function converts a tree back to the function on natural numbers that it represents.

``````index :: (Integral i) => T a -> i -> a
index _ i | i < 0 = error "index: out of range"
index (T z _ _) 0 = z
index (T _ o e) i
| k == 0    = index e (j-1)
| otherwise = index o j
where (j,k) = divMod i 2```
```

Finally, given a natural number `i` and two trees representing the functions f and g, we can build another tree representing the function h defined by h(n) = f(n) if n<`i`, or g(n`i`) otherwise.

``````takeAppend :: (Integral i) => i -> T a -> T a -> T a
takeAppend i _ _ | i < 0 = error "takeAppend: out of range"
takeAppend 0 _ t = t
takeAppend i (T z1 o1 e1) (T z2 o2 e2)
| k == 0    = T z1 (takeAppend j o1 o2 ) (takeAppend (j-1) e1 ze2)
| otherwise = T z1 (takeAppend j o1 ze2) (takeAppend j     e1 o2 )
where (j,k) = divMod i 2
ze2   = cons z2 e2```
```

Now we can define `find`. It uses the tree representation above to skip constructing long unused parts of the infinite bit sequence.

``````find :: (T Bool -> Bool) -> T Bool
find p = b where
b = fmap (\n -> p (takeAppend n b (cons True (find (\a ->
p (takeAppend n b (cons True a)))))))
infinite```
```

That’s it!

Escardó tests `find` by comparing computable functions on integer sets.

``````type Cantor = Integer -> Bool

forsome, forevery :: (Cantor -> Bool) -> Bool
forsome p = p (index (find (p . index)))
forevery p = not (forsome (not . p))

equal :: (Eq y) => (Cantor -> y) -> (Cantor -> y) -> Bool
equal f g = forevery (\a -> f a == g a)

coerce :: Bool -> Integer
coerce True  = 0
coerce False = 1

f',g',h' :: Cantor -> Integer
f' a = a'(10*a'(3^80)+100*a'(4^80)+1000*a'(5^80))
where a' i = coerce (a i)
g' a = a'(10*a'(3^80)+100*a'(4^80)+1000*a'(6^80))
where a' i = coerce (a i)
h' a = a' k
where i = if a'(5^80) == 0 then 0    else 1000
j = if a'(3^80) == 1 then 10+i else i
k = if a'(4^80) == 0 then j    else 100+j
a' i = coerce (a i)```
```

On my laptop, this code takes only several times more time and space than Escardó’s to compare `f'`, `g'`, and `h'` to each other.

``````*Exhaustive> :set +s
*Exhaustive> equal f' g'
False
(38.20 secs, 2786375560 bytes)
*Exhaustive> equal f' h'
True
(14.10 secs, 1055343780 bytes)
*Exhaustive> equal g' h'
False
(39.12 secs, 2762261544 bytes)
*Exhaustive> equal f' f'
True
(13.87 secs, 1055610904 bytes)
*Exhaustive> equal g' g'
True
(13.65 secs, 972683784 bytes)
*Exhaustive> equal h' h'
True
(14.20 secs, 1054789196 bytes)```
```

I take this experiment to show that a main essence of Escardó’s algorithm is to represent a set of integers by a lazily evaluated binary tree of bits. Maybe it’s worth defining a type class `Find`?

``````class Find a where
find :: (a -> Bool) -> a```
```

Cheers!