- Recent Changes 新聞
- History 歷史
- Preferences 喜好
- Discussion 討論

We return to ITA Software’s “word numbers” problem, like the following:

If the integers from 1 to 999,999,999 are written as words, sorted alphabetically, and concatenated, what is the 51 billionth letter?

In the first part, we came up
with a problem statement. We defined a term `ten9`

of
type `(Seminearring a, Character a) => a`

; that is, a
value of any type where we know how to add, multiply, and what to
do with characters. For instance, `ten9`

could take a
values in `[String]`

:

`*WordNumbers1> take 5 (ten9 :: [String]) ["one","two","three","four","five"]`

and the answer to the above problem is easy to write:

`*WordNumbers1> concat (List.sort (ten9 :: [String])) !! 51000000000 *** Exception: Prelude.(!!): negative index`

As you can see, though, this specification has no chance of
working: the indexing function `(!!)`

takes an
`Int`

as its second argument, which one my computer
cannot represent numbers bigger than 2^{31}, which is only
around 2 billion. This limitation merely represents a practical
limitation, that you are unlikely to be able to work with lists
that are so colossally big, and indeed the computation above would
take much too long to complete.

In this post we will start to solve such problems efficiently.
We start by counting the total length of all strings in
`ten9`

. The key idea is to think of converting lists of
strings to polynomials, where each character gets mapped to the
variable *x*, a string of length *n* gets mapped to
*x ^{n}*, and a list of strings get mapped to a sum.
Thus instead of a list starting

`["one","two","three","four","five",...]`

we get a polynomial whose first few terms are

x^{3}+x^{3}+x^{5}+x^{4}+x^{4}+ …

If we evaluate this polynomial at *x* = 1, we get the total
number of terms, which is not very interesting. If we instead take
the derivative first, we get

3

x^{2}+ 3x^{2}+ 5x^{4}+ 4x^{3}+ 4x^{3}+ …

Now evaluation at *x* = 1 gives

3 + 3 + 5 + 4 + 4 + …

which is the total length of all these strings, as desired.

To implement this efficiently, we use the idea of *automatic
differentiation*: we can make the value and first derivative of
a function around *x* = 1 into a seminearring supporting
addition and multiplication. (We could include subtraction and
division if we wanted, but we do not need it here.) Conceptually,
it is best to think about this as the first two terms in the Taylor
series expansion, as we will explain more below.

We now turn to the implementation. As before, you can download this post as a program. First we import what we did previously.

`{-# OPTIONS -W -fglasgow-exts #-} module WordNumbers2 where import WordNumbers1 import Prelude hiding ((+), (*), sum, product) import qualified Prelude as P`

As a warm-up, we first count the total number of strings, using the fact that natural numbers are a nearring.

`newtype Nat a = Nat a deriving (Eq, Ord) type Count = Nat Integer instance (Show a) => Show (Nat a) where show (Nat x) = show x instance (Num a) => Monoid (Nat a) where zero = Nat 0 Nat a + Nat b = Nat (a P.+ b) instance (Num a) => Seminearring (Nat a) where one = Nat 1 Nat a * Nat b = Nat (a P.* b)`

We convert every character into the number 1. This corresponds
to the evaluation at *x* = 1 as explained above.

`instance (Num a) => Character (Nat a) where char _ = one`

We check our work by counting the total number of terminals in
the grammar for `ten9`

.

`*WordNumbers2> ten9 :: Count 999999999`

Note that we get 10^{9} - 1 since ‘zero’ is not produced
by the grammar.

To compute the total count of all letters, we need a kind of
derivative, as mentioned above. In general, we want to keep track
of the value of a function and its first derivative. If we are
working with several variables (as we will later), the first
derivative becomes a vector of partial derivatives. In general, the
derivatives need to be a *module* over the scalars (the base
seminearring), with addition (a monoid structure), and
multiplication by scalars—

`class (Seminearring r, Monoid m) => Module r m where (*>) :: r -> m -> m (*>) = flip (<*) (<*) :: m -> r -> m (<*) = flip (*>)`

—satisfying associativities ```
(x*y) *> m = x *
(y*>m)
```

, `m <* (x*y) = (m<*x) * y`

, and
`(x*>m) <* y = x *> (m<*y)`

along with
various distributive laws. (It’s not clear to me precisely which
distributive laws should hold for a general seminearring.)

(This structure is more properly called a *bimodule*,
because we allow multiplication by scalars on both sides.)

Now a derivative as a pair of a value and a derivative—

`data Deriv r m = Deriv r m deriving (Eq, Show)`

—satisfying the usual law for addition
*d*(*f*+*g*) = *df* + *dg*—

`instance (Monoid r, Monoid m) => Monoid (Deriv r m) where zero = Deriv zero zero Deriv c1 m1 + Deriv c2 m2 = Deriv (c1 + c2) (m1 + m2)`

—while for multiplication we use Leibniz’s rule,
*d*(*f*·*g*) = *f*·*dg* +
*df*·*g*.

`instance (Module r m) => Seminearring (Deriv r m) where one = Deriv one zero Deriv c1 m1 * Deriv c2 m2 = Deriv (c1 * c2) (c1 *> m2 + m1 <* c2)`

We convert characters to derivatives component-wise.

`instance (Character r, Character m) => Character (Deriv r m) where char c = Deriv (char c) (char c)`

To actually use these derivatives, we introduce a wrapper type to keep track of the units, with some obvious instances.

`newtype Wrap s a = Wrap a deriving (Eq, Ord) instance (Monoid a) => Monoid (Wrap s a) where zero = Wrap zero Wrap a + Wrap b = Wrap (a + b) instance (Show a) => Show (Wrap s a) where show (Wrap x) = show x instance (Seminearring a) => Seminearring (Wrap s a) where one = Wrap one Wrap a * Wrap b = Wrap (a * b) instance (Seminearring r) => Module r (Wrap s r) where r *> Wrap m = Wrap (r * m) Wrap m <* r = Wrap (m * r)`

In our case, the units are numbers of characters, which we call
`Volume`

.

`data V type Volume = Wrap V (Nat Integer)`

Every character has length one:

`instance Character Volume where char _ = one`

We could change this definition to count, for instance, the total number of pen strokes used to write all these letters.

Now we can quickly count the total number of characters in the English words from one to a billion.

`*WordNumbers2> ten9 :: Deriv Count Volume Deriv 999999999 70305000000`

For a sanity check, we check the count to a million…

`*WordNumbers2> ten6 :: Deriv Count Volume Deriv 999999 44872000`

… which agrees with our previous result, but completes much more quickly.

It’s worth thinking about how this works. In computing

`*WordNumbers2> ten2 :: Deriv Count Volume Deriv 99 854`

for instance, the total length of the strings from
`one`

to `nine`

is used a total of nine
times, but we don’t want to count their length that many times.
Instead we count how many times they will be used, and multiply
using Leibniz’s rule.

In the next installment we will use this setup to find the 51 billionth letter of this sequence (omitting the sorting), and then we will move on to see how to sort the sequence while maintaining efficiency.

**Update:** Added one associativity law that I
forgot earlier.