- Recent Changes 新聞
- History 歷史
- Preferences 喜好
- Discussion 討論
The present message recasts the recent Martín Escardó’s article on satisfaction of total computable predicates over all Cantor numbers in terms of explicit streams and depth-limited depth-first search. This alternative point of view may make it easier to see why the satisfaction problem is decidable. We show how to add transparent memoization to stream predicates: without changing their code, we can cause our predicates to apply themselves to partially available data and yield the residual. For dense predicates our code is about 9 times more efficient than the most sophisticated Martín Escardó’s code. Our approach is particularly fast at reporting dissatisfaction.
The complete code for this article is available at http://okmij.org/ftp/Haskell/StreamPEval.hs
In a recent article Martín Escardó wrote about decidable
checking of satisfaction of a total computable predicate on Cantor
numerals. The latter represent infinite bit strings, or real
numbers within [0,1]
.
http://math.andrej.com/2007/09/28/seemingly-impossible-functional-programs/
Martín Escardó’s technique can tell, in finite time, if a given total computable predicate is satisfied over all possible infinite bit strings. (The cardinality of the set of Cantor numerals is greater than that of natural numbers). Furthermore, for sparse predicate (defined below), Martín Escardó’s technique is very fast.
The mystery of Martín Escardó article is revealed by noticing
that the predicate in question must be total and computable. We can
represent any predicate p
on Cantor numbers (i.e., on
infinite bit strings) as a function [Bit] -> Maybe
Bool
. Given a finite prefix of the infinite bit string, the
predicate returns True
or False
– or
requests to be given a longer prefix. We shall assume all
predicates to be monotone in that once the predicate returned
Just True
or Just False
on some prefix,
it will return the same result on any longer prefix. Undecidability
of p
comes in two forms: First, a predicate may fail
to return any result, by looping or raising a run-time error like
stack overflow. Second, the predicate may keep returning
Nothing
asking for longer and longer prefix. A total
computable predicate has none of the above problems. Given a
prefix, it always (in finite time) returns the answer. Furthermore,
for each predicate there exists a number N
so that the
predicate will return either Just True
or Just
False
for any prefix of the length N
. In other
words, given progressively longer prefixes, the total computable
predicate will give the answer Nothing
only finitely
many times.
It is clear then we can check the satisfaction of the total
computable predicate over all Cantor numerals, using, for
example, iterative deepening strategy. Since we are guaranteed
there exists a depth (prefix length) so that the predicate returns
either True
or False
and no further
increase in depth changes the result, the strategy is complete.
Some “obvious” predicates are alas not in the class of total computable. For example, equality of two Cantor numerals. Mathematically speaking, we can tell if any two Cantor numbers are the same or different. One can hardly build any mathematical theory if one cannot identify its objects. Computationally speaking, equality is only semi-decidable. If two Cantor numbers are different, they differ in at least one bit at a finite index. That difference can be detected in finite time. Deciding if two numbers (infinite bit streams) are the same is not generally computable. More precisely, the paper
G. Rosu, Equality of streams is a Pi^0_2-Complete Problem, ICFP06
proved that deciding equality of two infinite (bit) streams is
equivalent to the Halting problem. In the context of Martín
Escardó’s article, we can define a binary equality predicate
ceq
and an (unary) predicate badp1
:
ceq c1 c2 = foldr (\i r -> r && c1 i == c2 i) True [0..] badp1 a = ceq (Zero # a) a
If we add this code to the one in Martín Escardó’s article and
use his forsome
quantifier to check the satisfaction
of badp1
– (forsome badp1)
– we get the
stack overflow error. However, mathematically speaking
badp1
is satisfiable, by the Cantor number represented
by the infinite string of zero bits. We have encountered
incompleteness – the satisfaction of badp1
is true
but we can’t compute it.
The original Martín Escardó’s article represented Cantor numbers as functions from a bit index to the bit value. We chose a different approach, a pair of a finite prefix and a function to extend this prefix. Rather than using a list of bits, we chose a finite map (from bit indices to their values) to represent the finite prefix.
data Bit = Zero | One deriving (Eq,Show) type Prefix = M.IntMap Bit data Cantor = Cantor Prefix (Prefix -> Cantor)
Here is one Cantor number, an alternating stream of one and zero bits.
twothirds = twothirds1 0 M.empty where twothirds0 i pr = Cantor (M.insert i Zero pr) (twothirds1 (succ i)) twothirds1 i pr = Cantor (M.insert i One pr) (twothirds0 (succ i))
As was mentioned already, a function on Cantor numbers takes a finite prefix and either produces a value or reports “failure” (hence the monadic type):
type CantorFN m a = Prefix -> m a
The failure means the given data are not sufficient to compute
the result and so the function ought to be called with a longer
prefix. If the monad in question is the Maybe monad (with
Nothing
indicating failure), we define the application
of a function to a Cantor number as follows:
capply1 :: CantorFN Maybe a -> Cantor -> a capply1 f (Cantor pr next) = maybe (capply1 f (next pr)) id (f pr)
Quite literally, we feed the function a progressively longer prefix until the function succeeds and yields the value. For a total computable function, that moment shall happen sooner or later.
We can define a few sample functions on Cantor numbers. To make
them easier to write, we assume (m a)
is in
Num
provided that a
is. We further define
a method to extract a bit from the prefix, of fail trying.
class Monad m => GetBit m where getbit :: Prefix -> Int -> m Natural
For the Maybe monad, we have
instance Num a => Num (Maybe a) where (+) = liftM2 (+) (-) = liftM2 (-) (*) = liftM2 (*) fromInteger = return . fromInteger instance GetBit Maybe where getbit pr i = liftM coerce (M.lookup i pr)
The first sample function on Cantor numbers is
proj i = \pr -> getbit pr i
which returns the i-th bit of a number. More sophisticated are the following functions, originally introduced in Martín Escardó’s article. We keep their original notation in comments, to contrast his and our notations.
-- f, g, h :: Cantor -> Integer f,g,h :: (GetBit m, Num (m Natural)) => CantorFN m Natural -- f a = coerce(a(7 * coerce(a 4) + 4 * (coerce(a 7)) + 4)) f a = getbit a =<< (7 * (getbit a 4) + 4 * (getbit a 7) + 4) -- g a = coerce(a(coerce(a 4) + 11 * (coerce(a 7)))) g a = getbit a =<< (getbit a 4 + 11 * (getbit a 7)) -- h a = if a 7 == Zero -- then if a 4 == Zero then coerce(a 4) else coerce(a 11) -- else if a 4 == One then coerce(a 15) else coerce(a 8) h a = ifm ((== 0) `liftM` getbit a 7) (ifm ((== 0) `liftM` getbit a 4) (getbit a 4) (getbit a 11)) (ifm ((== 1) `liftM` getbit a 4) (getbit a 15) (getbit a 8)) ifm test tb eb = do b <- test; if b then tb else eb
The following function is not present in Martín Escardó’s
article. It converts the prefix of the width w
of a
Cantor number to an Integer. The function is, too, total, and
dense: it uses all of the bits from its argument, up to a
certain limit. In contrast, (proj i)
is an ultimate
“sparse” function: of all the bits in the given prefix it uses only
one for its result.
-- ton w a = foldl (\ac i -> 2*ac + coerce (a i)) 0 [0..w-1] ton w a = foldl (\ac i -> 2*ac + getbit a i) 0 [0..w-1]
For example, capply1 (ton 5) twothirds
gives the
answer 21 (which is 10101 in binary).
We now come to quantification: given a total computable
predicate, CantorFN Maybe Bool
, we check to see if it
is satisfiable. If so, we return the witness, a Cantor number that
makes the predicate succeed with a True
value. The
following predicate uses the standard recursive depth-first search,
with no memoization.
search1 :: CantorFN Maybe Bool -> Maybe Cantor search1 p = s0 M.empty 0 where s0 pr i = let pr' = M.insert i Zero pr i' = succ i in case p pr' of Just True -> Just $ Cantor pr' (zeros i') Nothing | c@(Just _) <- s0 pr' i' -> c _ -> s1 pr i s1 pr i = let pr' = M.insert i One pr i' = succ i in case p pr' of Just True -> Just $ Cantor pr' (zeros i') Nothing -> s0 pr' i' Just False -> Nothing
Definitions of more conventional quantifiers from Martín Escardó’s paper is straightforward:
forsome1,forevery1 :: CantorFN Maybe Bool -> Bool forsome1 p = maybe False (const True) (search1 p) forevery1 p = not (forsome1 (liftM not . p)) -- equal :: Eq y => (Cantor -> y) -> (Cantor -> y) -> Bool equal1 f g = forevery1(\a -> liftM2 (==) (f a) (g a))
Time for some tests:
tests11 = equal1 (proj 14) (proj 14) -- Martín Escardó's find_i result: (1.82 secs, 39783148 bytes) -- ours: (0.42 secs, 22456632 bytes) tests12 = equal1 (proj 14) (proj 15) -- Martín Escardó's find_i result: (3.64 secs, 79934000 bytes) -- ours: (0.00 secs, 0 bytes) tests13 = equal1 (ton 14) (ton 14) -- Martín Escardó's find_i result: (5.98 secs, 153746524 bytes) -- ours: (2.96 secs, 245394268 bytes) tests14 = equal1 (ton 14) (ton 15) -- Martín Escardó's find_i result: (12.85 secs, 325268744 bytes) -- ours: (0.00 secs, 0 bytes) tests15 = equal1 (ton 15) (ton 15) -- Martín Escardó's find_i result: (13.61 secs, 342230580 bytes) -- ours: (6.03 secs, 524111536 bytes)
The tests are satisfying in one respect. Still, the inefficiency
is obvious. If we enable tracing of each bit retrieval and evaluate
capply1 (ton 3) twothirds
, we see
bit 0: 1 no bit 1 bit 0: 1 bit 1: 0 no bit 2 bit 0: 1 bit 1: 0 bit 2: 1 5
The predicate ton 3
requested bit 0 and got it. It
then requested bit 1 and didn’t get, and so failed. The Cantor
number computed bit 1 as requested, and the predicate is
re-invoked. It again requests the bit 0, and bit 1. Now it needs
bit 2, and so it fails. The Cantor numeral computes bit 2,
re-invokes the predicate, which proceeds to retrieve the bits it
retrieved already. Memoization would help.
As it turns out, we can impose memoization transparently, without changing the code of the functions. We can cause the function to compute the result of their partial application to available bits. We refine the Maybe monad to return not just the indicator of failure (meaning, the request for a longer prefix). Rather, we return a function that takes the longer prefix; that new function – the continuation – is in general specialized to the already seen prefix:
data Inc a = Done a | More (Prefix -> Inc a)
The following instances show that once some data is available, they can be used right away:
instance Monad Inc where return = Done (Done x) >>= k = k x (More f) >>= k = More (\pr -> f pr >>= k) instance GetBit Inc where getbit pr i = maybe (More (\pr -> getbit pr i)) (return . coerce) (M.lookup i pr)
We should modify our previous capply
to use the
residualized, partially applied predicate. The changes compared to
capply1
are trivial:
capply2 :: CantorFN Inc a -> Cantor -> a capply2 f (Cantor pr next) = case f pr of Done x -> x More f -> capply2 f (next pr)
More importantly though, none of the previously defined
functions on Cantor numbers, e.g., f
, h
,
ton
, proj
, need to be re-defined. They
can be used as they are. We may evaluate capply2 (ton 3)
twothirds
and see the difference:
bit 0: 1 no bit 1 bit 1: 0 no bit 2 bit 2: 1 5
Once bit i has been requested and received, it is no longer retrieved again.
After we modify search1
along the lines of
capply2
(please see the complete code), we repeat the
stress tests:
tests23 = equal2 (ton 14) (ton 14) -- Martín Escardó's find_i result: (5.98 secs, 153746524 bytes) -- Ours: (0.70 secs, 27485480 bytes) tests24 = equal2 (ton 14) (ton 15) -- Martín Escardó's find_i result: (12.85 secs, 325268744 bytes) -- ours: (0.00 secs, 0 bytes) tests25 = equal2 (ton 15) (ton 15) -- Martín Escardó's find_i result: (13.61 secs, 342230580 bytes) -- ours: (1.41 secs, 55052820 bytes)
Incidentally, even with the most sophisticated
find_vii
, Martín Escardó’s code equal (ton 14)
(ton 14)
still takes (6.87 secs, 186692916 bytes) Thus, our
technique performs about 9 times faster on “dense” functions.