Proper Treatment 正當作法/ blog/ tags/ blog

This is Chung-chieh Shan’s English blog. All posts are archived.

Higher-order shell

Many Unix utilities execute their arguments as a command line in turn. Some well-known examples are nice, sudo, xargs, and find. These programs are sometimes called meta-commands, but I think of them as higher-order programs by analogy to higher-order functions.

Surely I was not the first to write the following two higher-order programs, given how useful they are.

tmp :: String -> (FilePath -> IO ()) -> IO ()

The first program is tmp. It puts standard input in a temporary file, then invokes the arguments as a command with the name of the temporary file appended.

This program is useful because some programs require input in the form of an ordinary file rather than a pipe. For example, say that we want to retrieve a URL and display the contents using xdvi. It would be nice to be able to say

curl http://www.diku.dk/~andrzej/papers/RC.dvi | xdvi -

or

xdvi <(curl http://www.diku.dk/~andrzej/papers/RC.dvi)

but my xdvi only works on an ordinary file named on the command line:

$ curl http://www.diku.dk/~andrzej/papers/RC.dvi | xdvi -
xdvi.bin: Fatal error: -: No such file,
          and -.dvi doesn't exist either.

$ xdvi <(curl http://www.diku.dk/~andrzej/papers/RC.dvi)
xdvi.bin: Fatal error: /dev/fd/63: File has zero size,
          and /dev/fd/63.dvi doesn't exist either.

Instead, we can use tmp to adapt xdvi to our purpose.

$ curl --silent http://www.diku.dk/~andrzej/papers/RC.dvi |
  tmp xdvi

When the command is done, tmp deletes the temporary file.

$ echo foo | tmp cat
foo

$ echo foo | tmp echo | xargs cat
cat: /tmp/uyocf2fHQx: No such file or directory

Often I use tmp when I need to process the same data twice. For example, it takes two invocations of psselect to print duplex using manual feed on a printer without duplex.

$ a2ps -o- Higher-order_shell.mdwn |
  tmp sh -c 'psselect -e -r $0 | lpr;
             read </dev/tty;
             psselect -o $0 | lpr'

The use of $0 above is documented in the sh man page:

-c

Read commands from the command_string operand. Set the value of special parameter 0 (see Special Parameters) from the value of the command_name operand and the positional parameters ($1, $2, and so on) in sequence from the remaining argument operands. No commands shall be read from the standard input.

We can also use sh -c in the same way to nest multiple invocations of tmp together. For example, the following command uses lynx to convert two HTML files into plain text, then uses xxdiff to show how the results differ. We need to invoke tmp twice because xxdiff requires both of its arguments to name ordinary files rather than pipes.

$ lynx -dump Control-Monad-State-Lazy.html |
  tmp sh -c 'lynx -dump Control-Monad-State-Strict.html |
             tmp xxdiff $0'

Because xxdiff takes the argument - to mean standard input, the following command works too.

$ lynx -dump Control-Monad-State-Lazy.html |
  tmp sh -c 'lynx -dump Control-Monad-State-Strict.html |
             xxdiff $0 -'

keep :: (forall m. MonadIO m => m ()) -> IO a

The second program is keep. It invokes the arguments as a command and monitors the files the command accesses. Whenever any of the files change, keep invokes the command again. Most of the time, I use keep to invoke LaTeX when I don’t want to bother writing a Makefile:

$ keep pdflatex talk.tex

Internally, keep uses strace (another higher-order program) to find out which files are accessed by the command, and Linux’s inotify facility to watch the files for changes. The incron utility also uses inotify to run a command when files change, but keep discovers which files to watch automatically.

Errata

Thanks to bennymack on reddit for pointing out that file can work on standard input. I changed the example from file to xdvi.

I also fixed the out-of-scope type variable in the faux signature of keep.

Shadow cities

The essay collection Letters of Transit: Reflections on Exile, Identity, Language, and Loss, published after a lecture series at the New York Public Library, includes “Shadow Cities” by André Aciman. An excerpt follows.

¶ And, indeed, there was something physically central about Straus Park. This, after all, was where Broadway and West End Avenue intersected, and the park seemed almost like a raised hub on West 106th Street, leading to Riverside Park on one side and to Central Park on the other. Straus Park was not on one street but at the intersection of four. Suddenly, before I knew why, I felt quite at home. I was in one place that had at least four addresses.

img 3503-edit.jpg

¶ Here you could come, sit, and let your mind drift in four different directions: Broadway, which at this height had an unspecified Northern European cast;

img 3500-edit.jpg

West End, decidedly Londonish;

img 3507-edit.jpg

107th, very quiet, very narrow, tucked away around the corner, reminded me of those deceptively humble alleys where one finds stately homes along the canals of Amsterdam.

img 3516-edit.jpg

And 106th, as it descended toward Central Park, looked like the main alley of a small town on the Italian Riviera,

img 3526-edit.jpg

where, after much trundling in the blinding light at noon as you take in the stagnant odor of fuel from the train station where you just got off, you finally approach a sort of cove, which you can’t make out yet but which you know is there, hidden behind a thick row of Mediterranean pines,

img 3531-edit.jpg

over which, if you really strain your eyes, you’ll catch sight of the tops of striped beach umbrellas jutting beyond the trees, and beyond these, if you could just take a few steps closer, the sudden, spectacular blue of the sea.

img 3537-edit.jpg

¶ To the west of Straus Park, however, the slice of Riverside and 106th had acquired a character that was strikingly Parisian,

img 3487-edit.jpgimg 3495-edit.jpg

and with the fresh breeze which seems to swell and subside all afternoon long, you sensed that behind the trees of Riverside Park,

img 3489-edit.jpg

serene and silent flowed an elusive Seine,

img 3491-edit.jpg

and beyond it, past the bridges that were to take you across, though you couldn’t see any of it yet, was not the Hudson, not New Jersey, but the Left Bank—

img 3563-edit.jpg

not the end of Manhattan, but the beginning of a whole bustling city waiting beyond the trees—as it waited so many decades ago when, as a boy, dreaming of Paris, I would go to the window, look out to the sea at night, and think that this was not North Africa at all, but the Ile de la Cité. Perhaps what lay beyond the trees was not the end of Manhattan, or even Paris, but the beginnings of another, unknown city, the real city, the one that always beckons, the one we invent each time and may never see and fear we’ve begun to forget.

¶ There were moments when, despite the buses and the trucks and the noise of people with boom boxes, the traffic light would change and everything came to a standstill and people weren’t speaking, and the unrelenting sun beat strong on the pavement, and I would almost swear this was an early summer afternoon in Italy, and that what lay behind Riverside Park was not just my imaginery Seine, but the Tiber as well. What made me think of Rome was that everything here reminded me of the kind of place all tourists know well: that tiny, empty piazza with a little fountain, where, thirsty and tired with too much walking all day, you douse your face, then unbuckle your sandals, sit on the scalding marble edge of a Baroque fountain, and simply let your feet rest a while in what is always exquisitely clear, non-drinkable water.

img 3554-edit.jpg

¶ Depending on where I sat, or on which corner I moved to within the park, I could be in any of four or five countries and never for a second be in the one I couldn’t avoid hearing, seeing, and smelling. This, I think, is when I started to love, if love is the word for it, New York. I would return to Straus Park every day, because returning was itself now part of the ritual of remembering the shadow cities hidden there—so that I, who had put myself there, the way squatters put themselves somewhere and start to build on nothing, with nothing, would return for no reason other than perhaps to run into my own footprints. This became my habit, and ultimately my habitat. Sometimes finding that you are lost where you were lost last year can be oddly reassuring, almost familiar. You may never find yourself; but you do remember looking for yourself. That too can be reassuring, comforting.

img 3561-edit.jpg

¶ How uncannily appropriate, therefore, to find out fifteen years later that the statue that helped me step back in time was not that of a nymph, but of Memory herself.

img 3553-edit.jpg

In Greek, her name is Mnemosyne, Zeus’s mistress, mother of the Muses. I had, without knowing it, been coming to the right place after all. This is why I was so disturbed by the imminent demolition of the park: my house of memories would become a ghost park. If part of the city goes, part of us dies as well.

Differentiating regions

(An alternative title for this post is, what is the type of differentiation? Hint: it’s not quite (ℝ→ℝ)→(ℝ→ℝ), because how would you make sure the input function is differentiable?)

You can download this post as a literate Haskell program.

{-# LANGUAGE Rank2Types             #-}
{-# LANGUAGE MultiParamTypeClasses  #-}
{-# LANGUAGE FlexibleInstances      #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE UndecidableInstances   #-}
{-# LANGUAGE OverlappingInstances   #-}

module Differentiation where

Automatic differentiation

The overloading approach to automatic differentiation has become more popular recently among typed functional programmers. The basic idea is to overload arithmetic operators such as +, ×, and sin so that they work on not just floating-point numbers but pairs (or more generally, sequences) of them, which track quantities along with their rates of change. The overloaded operators are easy to define because, unlike with integration, the rules of differentiation are compositional: you know, d(x + y) = dx + dy, d(x × y) = dx × y + x × dy, d(sin x) = cos x × dx, and so on. Here they are in Haskell.

data D a = D a a
  deriving Show

lift :: Num a => a -> D a
lift x = D x 0

infinitesimal :: Num a => D a
infinitesimal = D 0 1

instance Eq a => Eq (D a) where
  D x _ == D y _ = x == y

instance Ord a => Ord (D a) where
  compare (D x _) (D y _) = compare x y

instance Num a => Num (D a) where
  D x x' + D y y' = D (x + y) (x' + y')
  D x x' * D y y' = D (x * y) (x' * y + x * y')
  negate (D x x') = D (negate x) (negate x')
  abs    (D x x') = D (abs x) (signum x * x')
  signum (D x _)  = lift (signum x)
  fromInteger x   = lift (fromInteger x)

instance Fractional a => Fractional (D a) where
  recip (D x x') = D (recip x) (-x'/x/x)
  fromRational x = lift (fromRational x)

The two components of a D value are a quantity and its derivative, so the lift function ‘lifts’ a number to a constant quantity, and infinitesimal is a quantity with value 0 and derivative 1. Were abs defined by abs x = signum x * x by default, we wouldn’t have to define abs for D above.

Let us use these definitions to model how a parabola reflects light.

picture-1p.png

Suppose that a light ray (red above) enters a parabolic mirror y = x²/4 from above. Where does the reflected ray cross the y axis? In the diagram above, if the point where the ray hits the parabola is (x,y), then the derivative of y with respect to x is tan θ, and the y coordinate where the reflected ray crosses the y axis is y + x/tan 2θ, which is equal to y + x (1/tan θ − tan θ)/2. We can compute this coordinate by automatically differentiating y with respect to x:

curve x = x^2/4

reflect x = let D y y' = curve (lift x + infinitesimal)
            in y + x * (recip y' - y') / 2

As expected, the parabola reflects all incoming rays from above to the focal point (0,1).

*Differentiation> map reflect [1..5]
[1.0,1.0,1.0,1.0,1.0]

To be sure, this code does not work by symbolically differentiating y = x²/4 to yield y′ = x/2. Rather, it computes y alongside y′ for one particular x at a time, so the curve could just as well be defined by a more complex program that uses if and recursion. For example, a ray tracer usually deals with scenes much more complex than a single parabola. This code also does not work by numerically comparing the values of y at nearby values of x. Instead of approximating dy/dx by Δy/Δx where Δx is a very small real number, we compute with an actually infinitesimal dx.

Differentiation is a higher-order function

Before we continue, let us abstract the pattern for differentiation in reflect above into a new higher-order function d, which differentiates any given function at the input 0. For convenience, d also returns the value of the given function at 0.

d :: Num a => (D a -> D b) -> (b, b)
d f = let D y y' = f infinitesimal
      in (y,y')

reflect :: Fractional a => a -> a
reflect x = let (y,y') = d (\h -> curve (lift x + h))
            in y + x * (recip y' - y') / 2

Even though the reflect function uses differentiation internally, we can still differentiate it. Such differentiate is said to be nested. For the parabola, we can confirm that the reflected ray hits the focal point not just at but also around x = 3, because the derivative computed below is zero.

*Differentiation> d (\k -> reflect (3 + k))
(1.0,0.0)

For other curves and surfaces, the derivative is typically not zero and tells us the density of light energy that falls around each point on the y axis. Hence, as Dan Piponi noted, this kind of calculation is performed by ray tracers and other programs that sample from probability distributions.

Another application of automatic differentiation is to find roots of a function using Newton’s method. Therefore, we can use nested automatic differentiation to find local extrema and saddle points of a function using Newton’s method.

The danger of confusing infinitesimals

Jeff Siskind and Barak Pearlmutter pointed out a kind of programmer mistake that makes nested differentiation give the wrong result. As they show, this kind of mistake is easy to make in the framework defined above, because all it takes is putting a call to the lift function in the wrong place. The definition of reflectBug below is only slightly different from reflect, but the result is very different and very wrong.

reflectBug x = let (y,y') = d (\h -> lift (curve (x + h)))
               in y + x * (recip y' - y') / 2

*Differentiation> d (\k -> reflectBug (3 + k))
(Infinity,NaN)

They demonstrate this problem using running code in Haskell and Scheme that computes

picture-2g.png

to be 2 rather than the correct answer 1.

The essence of this problem is that the two nested invocations of differentiation use two different infinitesimals, which a mathematician would denote by dh and dk. These two infinitesimals should not be confused, just as years and feet and persons should not be confused.

image1.jpg

Using types to check units

Björn Buckwalter showed that we can use a generic type system to prevent such confusion statically, just as Haskell uses state threads to distinguish pointers into different memory regions. Recall that the type ST s a in Haskell represents a monadic computation that yields a result of type a using mutable cells in the state thread represented by the phantom type s. To construct such a computation, we can use primitive operations such as

newSTRef :: a -> ST s (STRef s a)

and the fact that ST s is a monad. To run such a computation, we must use the primitive function

runST :: (forall s. ST s a) -> a

in which forall s forces different state threads, created by different calls to runST, to be represented by different phantom types s. One way to understand this rank-2 type is that it makes the type checker generate a new phantom type s for each argument to runST. Analogously, Buckwalter redefines the type constructor D to take a phantom-type argument s, which represents an infinitesimal unit.

data D s a = D a a
  deriving Show

Accordingly, the other definitions above change in their types, but not in their terms or behavior. The most important change is the new rank-2 type of d, which forces different infinitesimals, created by different invocations of differentiation, to be represented by different phantom types s.

d :: Num a => (forall s. D s a -> D s a) -> (a, a)
d f = let D y y' = f infinitesimal
      in (y,y')

lift :: Num a => a -> D s a
lift x = D x 0

infinitesimal :: Num a => D s a
infinitesimal = D 0 1

instance Eq a => Eq (D s a) where
  D x _ == D y _ = x == y

instance Ord a => Ord (D s a) where
  compare (D x _) (D y _) = compare x y

instance Num a => Num (D s a) where
  D x x' + D y y' = D (x + y) (x' + y')
  D x x' * D y y' = D (x * y) (x' * y + x * y')
  negate (D x x') = D (negate x) (negate x')
  abs    (D x x') = D (abs x) (signum x * x')
  signum (D x _)  = lift (signum x)
  fromInteger x   = lift (fromInteger x)

instance Fractional a => Fractional (D s a) where
  recip (D x x') = D (recip x) (-x'/x/x)
  fromRational x = lift (fromRational x)

Because the phantom type s is part of the type of a number, and because arithmetic operations such as + require the arguments and the return value to have the same type, it is a type error to add numbers denominated in different infinitesimals. In particular, the erroneous definition reflectBug above is now a type error, as desired.

Occurs check: cannot construct the infinite type: t = D s t
  Expected type: t
  Inferred type: D s t
In the first argument of `lift', namely `(curve (x + h))'
In the expression: lift (curve (x + h))

For this checking of infinitesimal units to be sound, this library for automatic differentiation should not export the values D and infinitesimal to its users, though of course the type constructor D and its type-class instances need to be exported, along with the functions d and lift.

(In the discussion that ensued, David Roundy noted that the same kind of static safety can be achieved by exporting just a function

d :: Num a => (forall b.
      Num b => (a -> b) -> b -> b) -> (a, a)

for differentiation. The type b makes it unnecessary and useless to export the type constructor D, even though d is still implemented using D. Also, the new argument of type a -> b makes it unnecessary and useless to export the lift function. Finally, the type-class context Num b makes it unnecessary and useless to export the Eq, Show, and Num instances for D. However, as Chris Smith lamented, we need additional differentiation functions of the types

Fractional a => (forall b.
 Fractional b => (a -> b) -> b -> b) -> (a, a)

Floating a => (forall b.
 Floating b => (a -> b) -> b -> b) -> (a, a)

in order to differentiate functions that use Fractional or Floating operations. Oleg Kiselyov used similar types to express symbolic differentiation.)

Automatic lifting

Although the type system now prevents us from putting calls to lift in the wrong place, it is still annoying to have to invoke lift manually—especially for nested differentiation, a useful case as discussed above. Depending on ‘how constant’ a quantity is, we need to feed it through a composition of exactly the right number of lifts. This manual coding is frustrating because the unique right number of lifts to apply is obvious from the input and output types desired: to convert a type a to the type D s a, apply lift once; to convert a to D s (D s' a), apply lift twice; and so on. We want the compiler to manage these subtyping coercions automatically.

tourist.png

An analogous situation arises with state threads, which can be organized into a hierarchy of memory regions. As part of a monadic computation that uses mutable cells in a parent region, we can create a child region and perform a subcomputation that allocates and accesses mutable cells in both regions. After the subcomputation completes, the child region is destroyed en bloc, but we can still use the parent region and observe any effect on it brought about by the subcomputation. To allow the subcomputation to use the parent region, we want every region to be a subtype of its descendents. Matthew Fluet and Greg Morrisett’s implementation of nested regions in Haskell uses explicit subtyping coercions just like our lift: depending on ‘how senior’ a region is, we need to compose exactly the right number of region coercions.

In a pending submission to the Haskell symposium, Oleg and I show how to automate region subtyping coercions using type classes. One might hope to apply that approach to lifting in automatic differentiation. Indeed we can, but I only know how to automate counting lifts, not how to automate placing them. That is, instead of feeding each use of an input quantity to the lift function exactly the right number of times, we can feed it to a new function once. The new function, called lifts, belongs to a new type class Lifts, which takes two type parameters. The constraint Lifts a b holds if and only if the type b is the result of applying zero or more type constructors D s to the type a.

class Lifts a b where
  lifts :: a -> b

More concretely, the following instances incompletely approximate the intended meaning of Lifts.

instance Lifts a a where
  lifts = id

instance Num a => Lifts a (D s a) where
  lifts = lift

instance Num a => Lifts a (D s (D s' a)) where
  lifts = lift . lift

instance Num a => Lifts a (D s (D s' (D s'' a))) where
  lifts = lift . lift . lift

The definition above of reflect in terms of d applies lift once to one occurrence of x but not to other occurrences of x and h. The same function can be expressed using Lifts, by applying lifts once to each occurrences of the input variables x and h.

reflectAuto :: Fractional a => a -> a
reflectAuto x
  = let (y,y') = d (\h -> curve (lifts x + lifts h))
    in y + lifts x * (recip y' - y') / 2

Expressing reflect in this new way frees us from counting how many times to lift the inputs x and h each time they are used.

How to implement Lifts? On one hand, as an implementation of Lifts, the approximate instances above are incomplete and unsatisfactory in theory, in that they restrict how many lift each lifts can stand for. They are perfectly useful in practice, however, and rely on no extension to Haskell other than rank-2 types and multiparameter type classes with flexible instances. On the other hand, a complete implementation is possible using the TypeCast class for type improvement (originally used by Kiselyov, Lämmel, and Schupke to implement heterogeneous collections), but it requires more Haskell extensions: functional dependencies, overlapping instances, and undecidable instances. Without further ado, below is the complete implementation.

instance Lifts a a where
  lifts a = a

instance (TypeCast (D s b') b, Num b', Lifts a b')
  => Lifts a b where
  lifts = typeCast . lift . lifts

class TypeCast a b | a -> b, b -> a where
  typeCast :: a -> b
class TypeCast' t a b | t a -> b, t b -> a where
  typeCast' :: t -> a -> b
class TypeCast'' t a b | t a -> b, t b -> a where
  typeCast'' :: t -> a -> b
instance TypeCast' () a b => TypeCast a b where
  typeCast x = typeCast' () x
instance TypeCast'' t a b => TypeCast' t a b where
  typeCast' = typeCast''
instance TypeCast'' () a a where
  typeCast'' _ x  = x

Although Lifts makes it easier to use automatic differentiation, this implementation is heavy lifting. I wonder if it is easier to express this combination of subtyping and rank-2 polymorphism in a language like Scala?

美國二卡 金綠雙輝 Two American membership cards

美國航空頒發的金卡、以及美國政府頒發的綠卡,張張都攸關大家在美國的生活。它們就如同深藏不露的礦泉水,讓不肖之徒伺機正中我們飛安的要害,又如同免費提供的行李牌,幫地勤人員千里尋回我們失散的骨肉。在這鳳凰花開、四面驪歌的時節,您是否也在金綠之間猶豫不決呢?與其腳踏兩條船,不如把雞蛋集中以便管理。為方便您睿智的抉擇,記者謹將兩卡的十二優劣點整理如下。

The gold card issued by American Airlines and the green card issued by the American government are each an integral part of our American lives. They are as powerful as the hidden water bottles that strip us of flight safety; they are as handy as the free luggage tags that unite us with abducted children. In this season of blooming Delonix regia and booming Auld Lang Syne, are you also hesitating to choose between gold and green? Decide wisely which basket to put your eggs in, with the help of this 12-point summary of the two cards’ pros and cons.

美國航空金卡
American Airlines gold card
美國政府綠卡
American government green card
其實是塑膠的
Actually plastic
其實是白色的
Actually white
使用 Helvetica 字體印製,中性無襯線
Typeset using Helvetica, a neutral sans-serif typeface
gld-card-crop.jpg
使用 Arial 字體印製,仿製沒水準
Typeset using Arial, a cheap knock-off typeface
i-551-back-edit.png
附送精美彩色印刷《新會員指南》一本,十二頁無索引
Comes with a guide for new members printed in 12 pages of fine color but without index
Gold Member Guide.jpg
附送精美彩色印刷《新移民指南》一本,一百頁附索引
Comes with a guide for new immigrants printed in 100 pages of fine color and with index
M-618.jpg
有搭機、住宿、購物等多種參加方式
Qualify in a variety of ways: flights, accommodations, shopping, etc.
有就業、依親、投資等多種參加方式
Qualify in a variety of ways: employment, family, investment, etc.
接受會員挑戰,三個月衝刺後拿到身份
Get it in 3 months when you enter a Membership Challenge
要求補送證據,三個月期限後失去身份
Lose it in 3 months when you receive a Request for Evidence
快速通過出境安全檢查
Avoid long security lines at departure
080526 2008 p154-crop.jpg
快速通過入境邊防檢查
Avoid long immigration lines at arrival
36172.jpg
選位、客服優先
Priority for seats and customer service
謀職、徵兵優先
Priority for jobs and military service
每飛行萬哩,可在網上免費獲取升級證
Earn complimentary upgrades online every 10,000 miles
每居留兩年,可在網上付費申請回美證
Apply for reentry permits online every 2 years
旅館、租車公司等合作夥伴提供優待
Special offers on hotels and cars from partner companies
國家、過境機場等合作夥伴提供優待
Special offers on transit and visas from partner countries
票選傑出空服人員
Vote for great flight attendants
Gold Member Guide-vote.jpg
票選傑出校務委員
Vote for great school board members
M-618-vote.jpg
一年換一張新卡
Renew every year
十年換一張新卡
Renew every decade
在貴賓室裡靜待登機
Lounge while waiting for your flight
在移民監裡靜待入籍
Lounge while waiting for your naturalization
Louder and louder

In response to the daytime jackhammering across the street, I put up the sign below at the entrance to our apartment building.

Did you know that you can call 311 to complain about excessive construction noise?

Sangre de mi sangre

Title: Sangre de Mi Sangre (Blood of My Blood), or Padre Nuestro

Overall rating: A (strong accept; will champion film)

Reviewer’s confidence: Y (I am knowledgeable in the area, though not an expert.)

General review:

By weaving together four deep instances of detachment, this well-organized film demonstrates how to derive immigrants from persons mechanically and vice versa. These derivations are intuitively obvious and should not bear any tedious demonstration. Yet, perhaps because they are so painfully obvious, they are often neglected—I often neglect them—when interacting with an immigrant. This neglect injures life, liberty, and the pursuit of happiness as we eat immigrant food, wear immigrant clothes, inhabit immigrant tenements, take immigrant transport, learn immigrant business, have immigrant fun. For example, just the other day I had the urge to run down some jaywalking immigrants as they crossed Canal Street in front of my bike. Or was it in Times Square?

This cinematic pearl thus serves as a rational reconstruction of motion, a timely and entertaining reminder that immigrants are people and people are immigrants. It points the way to a wide variety of practical applications, such as snapshotting the dialects of the Spanish diaspora, establishing my love for New York, and advocating open immigration alongside free trade. I recommend that y’all go see it (in New York this week and Los Angeles next week). Especially if you know how many passports Jason Bourne has. Less if you feel cheated at the end of “The Perfect Human”.

Comments for other reviewers:

Several reviews criticize this movie, on two fronts. On one hand, the story is too clear. The plot is too plotted, the morals too moralistic, and the characters too characteristic: “As a character Pedro never develops beyond a credulous, good-hearted cipher.” On the other hand, the story is too obscure: “As you watch the movie, questions accumulate… Little about the connection between Pedro and Magda makes sense.”

These fronts answer each other. (Pondering the questions develops the characters.) This movie is not simplistic and complicated but simple yet complex. Drawing “characters directly out of Hispanic-cliche central casting” reduces distraction, like studying continuation-passing style (named the canonical program transformation by Olivier Danvy). If “Sangre de Mi Sangre”, taking advantage of a conventional narrative structure as this review tries to, manages to pull tropes out of the ashes of an I-94 form in Williamsburg, then so much the better for a mechanical derivation and its inverse: as those viewers fixated on “impoverished Mexican illegal immigrants” show, it is all too easy to forget that we are all in the image of immigration. A moving target, so to speak.

Questions for the authors:

Why does this trailer credit neither the actress Paola Mendoza nor the actor Eugenio Derbez?

Do you have a shoe fetish?

Tidbits 1
hokkey, a Japanese art student, with a mask and a cold

Have you noticed that the tact filter applies to not just verbal behavior but also physical behavior? They’re installed in opposite directions in New York and Tokyo—except at Tsukijishijo (築地市場) perhaps? The most obvious example of a physical tact filter is the mouth mask that sick Japanese people wear in public, ostensibly to avoid infecting others.

The Suica is addictive. It’s like Cosmo’s access card in Sneakers, except of course the Suica charges you, and I bet Cosmo’s card doesn’t open his locker at the gym.

The leftover container problem is to partition a partial order into as few total orders as possible.

Pat Hanrahan wrote that “The BRDF is, in general, anisotropic” (on page 29 of “Rendering concepts”, chapter 2 of Radiosity and Realistic Image Synthesis by Cohen and Wallace, 1993). This sentence is the negation of “The BRDF is, in general, isotropic”! If “in general” in both sentences means “without exception” (which it does not in the true reading of “Primes are odd in general”), then the fact that these two sentences are the negation of each other is reminiscent of sentences such as “Hanna sucht kein Buch”, “Ze mogen geen eenhoorn zoeken”, and “You don’t need to bring any dessert” (Jacobs 1980, Theoretical Linguistics 7:121–136; de Swart 2000, in Reference and anaphoric relations, 118–142; inter alia). However, the sentences “The BRDF is always anisotropic” and “The BRDF is always isotropic” are not the negation of each other.

An elegantly simple red and black mobile in the style of Alexander Calder

One of Olivier Danvy’s great examples of program transformation and continuation passing is to test in linear time whether a Calder mobile is balanced. For a mobile to be balanced is for each submobile to locate its center of mass right at its root. Real mobiles are of course never exactly balanced, yet the rods stay roughly horizontal—because even the rigid rods are not straight! Below is a program that rotates the submobiles of a two-dimensional mobile so that each submobile locates its center of mass right below its root. (Exercises left to the reader include handling higher-dimensional mobiles and using continuations to avoid repeatedly constructing and taking apart pairs.)

module Calder where
import Complex

data Mobile a = Point a | Rigid [(Complex a, Mobile a)]

balance :: RealFloat a => Mobile a -> (a, Mobile a)
balance (Point a) = (a, Point a)
balance (Rigid zms) = (sum as, Rigid zms') where
  (as, azs, zm's) = unzip3 [ (a, (a :+ 0) * z, (z, m'))
                           | (z, m) <- zms
                           , let (a, m') = balance m ]
  center = sum azs
  zms' | center == 0 = zm's
       | otherwise = [ (z / signum center, m') | (z, m') <- zm's ]

“Hendrik Lenstra claims that he can no more suggest to his students what problems they should work on than who they should marry.” (Update 2008-06-10: Vladimir Arnol’d said in 1995: “I never assign a thesis topic to my students. This is like assigning them a spouse.”)

Has anyone started designing the all-important logo for a joint Hillary-Obama/Obama-Hillary campaign?

What symbol(s) for disjoint union do you use—one of “+|∐⊔⊍⊎⊕” or something else? Do you use a different symbol for the presupposed disjoint union (which is undefined when applied to {{}} and {{}}) than for the tagged disjoint union (which yields a two-element set when applied to {{}} and {{}})? Those who work with sets up to isomorphism, such as topologists, are exempt from answering the latter question.

An excerpt from Lipman Bers's Czechoslovakian passport

It is interesting to read “Nationalität: Jude” on the cover of the August 2007 AMS Notices. Inside the issue is more information about the cover and about Lipman Bers’s immigration.

Strengthening in Logical Frameworks (Twelf) oleg at pobox.com

Many logic systems admit a weakening principle: a derived conclusion continues to hold when more assumptions are added. Strengthening is the opposite: proving that the established conclusion holds given fewer assumptions. Strengthening certainly is not generally valid and has to be proven case-by-case. It seems strengthening is often needed when proving meta-theoretical properties such as totality of non-surjective endomorphisms of higher-order terms.

The article on Twelf Wiki <http://twelf.plparty.org/wiki/Strengthening> shows an example of the need for strengthening and describes one solution, which relies on defining term equality. In this article we demonstrate simpler examples requiring strengthening, and two surprisingly trivial solutions. Quite literally, one solution deals with the problem head-on, whereas the second one avoids the problem in a round-about way. The examples are inspired by staging calculi. At the end we describe the real development, which gives perhaps the first formal proof of type preservation of primitive reductions in an expressive staged calculus with escapes, run and cross-staged persistence. The type-preservation proof used both of our solutions and required two strengthening lemmas. Throughout we use Twelf – properly: we use the higher-order abstract syntax (HOAS) for higher-order terms, without ever resorting to concrete syntax for “variables”. The complete code is available at <http://okmij.org/ftp/Computation/staging/> .

The simplest example requiring strengthening

Perhaps the simplest example to illustrate the need of strengthening is based on lambda-calculus with terms marked by either 1 or 0. The challenge is to implement a function that converts terms marked by 1 into those marked by 0 – and show its totality.

Our calculus is elementary: we define two marks, 0 and 1

mark: type.
0: mark.
1: mark.

and marked terms, built of applications and abstractions:

exp: mark -> type.              %name exp E.
a: exp M -> exp M -> exp M.
l: (exp M -> exp M) -> exp M.

Here is a sample term, marked with 1:

t1 : exp 1 = a (l [x] (l [y] (a x y))) (l [x] x).

(in Haskell notation, it is (\x -> \y -> x y) (\x -> x) ). The definition of exp makes it clear that all subterms of a marked term have the same mark.

Our goal is to write a transformation taking a term marked by 1 and producing a term marked by 0:

cp1: exp 1 -> exp 0 -> type.
%mode cp1 +E1 -E2.

It is almost like copying terms and hence straightforward:

cp1-a: cp1 (a E1 E2) (a E1' E2')
         <- cp1 E2 E2'
         <- cp1 E1 E1'.
cp1-l: cp1 (l E1) (l E1')
         <- ({x}{x'} cp1 x x' -> cp1 (E1 x) (E1' x')).
%block cp1-l-bl : block {x:exp 1} {x':exp 0} {D:cp1 x x'}.
%worlds (cp1-l-bl) (cp1 _ _).
%total {E1} (cp1 E1 _).  %% Does not pass!

The clause cp1-a inductively handles applications. To transform an abstraction, we assume a term x marked with 1, a term x' marked with 0, and assume that x' is the result of transforming x, see cp1 x x'. Under those assumptions, we transform the body of the abstraction, cp1 (E1 x) (E1' x'). The notation (E1' x') means that E1' is a higher-order term applied to x'. That is, E1' is of the type exp 0 -> exp 0. The block declaration cp1-l-bl lists our set of assumptions; the regular worlds declaration asserts that cp1 is performed given no assumptions, or one set assumption cp1-l-bl, or two sets of such assumptions (with the different x and x' across the sets), etc.

Our transformation procedure is obviously total: we can take any 1-marked term, and, in finite time, always deliver the corresponding 0-marked term. We wish to state this formally, hence the %total assertion. Twelf verifies it first by checking that the transformation is terminating. It then checks that all pattern-matching is exhaustive: we have not missed a case. Alas, the totality check fails and Twelf reports an error:

Totality: Output of subgoal not covered
Output coverage error --- missing cases:
{E1:exp 1 -> exp 1} {E2:exp 1 -> exp 0 -> exp 0}
   |- {x:exp 1} {x':exp 0} cp1 x x' -> cp1 (E1 x) (E2 x x').

In the clause cp1-l, we have introduced three assumptions: x, x' and cp1 x x'. By writing (E1' x') we asked the Twelf system to “factor out” the dependence on x'. Implicitly we stated that the result of transforming (E1 x) depends only on x' but not on x or cp1 x x' (more formally, the independence is seen from the fact E1' is used outside of the scope of x introduced via the universal quantification {x} in cp1-l). Twelf can see that the assumption cp1 x x' cannot possibly enter the set of terms because the definition of the type family exp (with constructors a and l) shows that no term can contain anything of the type cp1. But terms may occur in terms, and so the result of transforming the body of the abstraction generally depends on both x and x'. We have not handled that general case, and so our transformation is not total.

Twelf’s complaint is quite insightful: (E1 x) is the body of the abstraction that includes some term x (“the variable”). We transform the body replacing all those occurrences of x with x' – and so the result should include only x' but no x. Thus Twelf demands the proof that we indeed replaced all those x with x'.

Upon consideration, we can see that the result of transforming (E1 x) cannot really depend on, or contain, x. After all, x is an expression marked by 1 and the result of transforming (E1 x) is an expression marked by 0. Any sub-expression of a marked expression is marked with the same mark. Alas, Twelf cannot see this proof by itself and we have to write it down explicitly. The reason Twelf cannot see that proof by itself is because Twelf tracks the dependencies (so-called subordination) using only the names of the type families (e.g., exp) but not their parameters (e.g., exp 0 or exp 1).

Astonishingly, the most trivial way of writing this independence proof works. We state that whenever a term marked by 0 appears to depend on a term marked by 1, it actually doesn’t. Or, constructively, we assert the following proposition

strength : (exp 1 -> exp 0) -> exp 0 -> type.
%mode strength +H -E.

For any implication exp 1 -> exp 0, we can always constructively produce exp 0. Here is the proof, the detailed procedure of producing that exp 0 term:

-: strength ([x] E) E.
-: strength ([x] (a (E1 x) (E2 x))) (a E1' E2')
   <- strength E1 E1'
   <- strength E2 E2'.
-: strength ([x] (l [x'] (E x' x))) (l E')
   <- {x':exp 0} strength (E x') (E' x').

The first clause handles the case when E patently does not contain anything of the type exp 1. The rest of the clauses do straightforward induction on terms; the last clause requires hypothetical reasoning to examine the body of the abstraction, and for that we introduce the assumption x' of the type exp 0. Perhaps it is because of that hypothetical reasoning that Twelf cannot see the strengthening property on its own. We assert that so defined procedure is total, i.e., it is a constructive proof of the strength proposition:

%block strength-bl : block {x':exp 0}.
%worlds (strength-bl | cp2-l-bl) (strength _ _).
%covers strength +H -E.
%total {E} (strength E _).

The label cp2-l-bl describes the set of assumptions of the improved cp2 procedure below (in the actual code, it is defined in a forward declaration). Twelf has verified the totality of strength, and so we can use it to write the improved transformation procedure, cp2:

cp2: exp 1 -> exp 0 -> type.
%mode cp2 +E1 -E2.
cp2-a: cp2 (a E1 E2) (a E1' E2')
         <- cp2 E2 E2'
         <- cp2 E1 E1'.
cp2-l: cp2 (l E1) E1''
         <- ({x}{x'} cp2 x x' -> cp2 (E1 x) (E1' x x'))
         <- strength ([x] (l (E1' x))) E1''.
%block cp2-l-bl : block {x:exp 1} {x':exp 0} {d:cp2 x x'}.
%worlds (cp2-l-bl) (cp2 _ _).
%total {E1} (cp2 E1 _).

The only difference from cp1 is in the clause cp2-l: by writing (E1' x x') we state that the result of transforming the body of the abstraction may depend not only on x' but also on x. We build the transformed abstraction under that additional assumption: ([x:exp 1] (l (E1' x))). Using strengthening, we then prove that (l (E1' x)) does not actually depend on x. Now the totality assertion succeeds: cp2 is the total transformation of 1-marked terms.

The complete code is available at <http://okmij.org/ftp/Computation/staging/strengthen.elf>

A more realistic example: type preservation

We have encountered the problem of strengthening in staged calculi, e.g., when proving type preservation. We outline the problem and the solution; please see the complete code <http://okmij.org/ftp/Computation/staging/translation-problem.elf> for the full details.

We introduce a simple staged language, where each expression is annotated with the numeric stage label A. The language has three forms of expressions: natural literals, addition, and fix (i.e., recursive functions).

exp : nat -> type.              %name exp E.
n : nat -> exp A.               %prefix 110 n.
+ : exp A -> exp A -> exp A.    %infix left 154 +.
fix : (exp A -> exp A) -> exp A.

We introduce a trivial type system with only one type, int, and the equally simple type-checking relation of E T:

of : exp A -> tp -> type.       %name of Q.
%mode of +E *T.

Please see the code for details. As in the previous section we write the procedure to downgrade the level of an expression

downgrade : exp (1 N) -> exp N -> type.
%mode downgrade +E1 -E2.

We are not that interested in proving downgrade is total. Rather, we wish to prove a meta-theorem that downgrade is type preserving: if the original expression was typed, the downgraded expression shall be typed too. We state that property as a constructive proposition:

dn-pres : downgrade E1 E2 -> of E1 T -> of E2 T -> type.
%mode dn-pres +PD +PTI -PTO.

That is, given any downgradeable expression E1 and its type derivation of E1 T (the evidence that E1 is typed), we produce the typing derivation for the corresponding downgraded expression E2. The most interesting case is the following:

- : dn-pres (dn-f PEH) (of-f PTH) PTH''
        <- ({e:exp (1 A)} 
            {pt:of e int}
            {e':exp A}
            {dn:downgrade e e'}
            {pt':of e' int}
            dn-pres dn pt pt' ->
            dn-pres (PEH e e' dn) (PTH e pt) (PTH' e pt e' pt'))
        <- strength-of ([e] [pt] (of-f  (PTH' e pt))) PTH''.

Here, dn-f PEH of the type downgrade (fix E1) (fix E2) is the derivation of downgrading (fix E1) into (fix E2). We assume a term e of the type exp (1 A), and further assume it is typed (the assumption pt:of e int). Likewise we assume a typed term e' that is the result of downgrading e. Finally, we assume that the downgrading from e to e' was type preserving. Under those assumptions, we obtain the type derivation of the body of the downgraded fix, by inductively invoking dn-pres. The resulting type derivation (PTH' e pt e' pt') generally depends not only on the assumption e' and its typeability pt' – but also on e and pt, which are in scope and could in principle occur in type derivations. We must thus prove that we can remove the latter dependency. Again, the formal statement of that proposition is surprisingly simple:

strength-of : ({e:exp (1 N)} of e T -> of (E:exp N) T) -> 
              of (E:exp N) T -> type.
%mode strength-of +H -P.

We assert that the typing derivation of E T for an expression E at the level N cannot depend on any N+1-level expression e and the typing derivation of the latter. Our proof is again constructive: from the corresponding implication we produce its conclusion without assuming the hypotheses. The complete code has all details.

Other examples

The need for strengthening does occur (although not frequently) in other circumstances. Here are the example from Twelf distribution. One is “Representation of Mini-ML expressions in Mini-ML*” by David Swasey, file exercises/opt-eval/opt-rep1.elf. It defines a transformation from one higher-order language to another. Proving meta-theoretical properties of that translation (totality, type preservation, etc) requires strengthening.

The second, indirect solution

The second solution to strengthening is, too, trivial, albeit round-about. We use the same problem as in Section 1: in lambda-calculus whose terms are marked with 0 or 1, implement a function that converts terms marked by 1 into those marked by 0 – and show its totality.

Rather than transforming exp 1 to exp 0 directly, we introduce terms expi M isomorphic to exp M, turn exp 1 to expi 1 and then transform expi 1 to exp 0. The advantage is that terms exp M cannot appear in expi M and vice versa, so exp M cannot depend on assumptions of the type {x:expi M} and vice versa. We avoid strengthening altogether.

Here are a few details: we define intermediary, also marked, terms

expi: mark -> type.             %name expi _Ei.
a2: expi M -> expi M -> expi M.
l2: (expi M -> expi M) -> expi M.

that are clearly isomorphic to the terms exp M of the original calculus. We can state the isomorphism formally; however, we only need one direction of it: given any term exp M, we can always obtain the corresponding expi M:

exp-expi: exp M -> expi M -> type.
%mode exp-expi +E1 -E2.

We show only the more complex case of transforming abstractions:

-: exp-expi (l E1) (l2 E1')
   <- {e:exp M} {e2:expi M}
      exp-expi e e2 ->
      exp-expi (E1 e) (E1' e2).
%block bl-exp-expi : 
       some {M:mark} 
       block {e:exp M} {e2:expi M} {t:exp-expi e e2}.
%worlds (bl-exp-expi) (exp-expi _ _).
%total {E} (exp-expi E _).

Now Twelf verifies the totality assertion without any complaints that E1' could depend (that is, include) the assumption e, which is also present in the LF context. Twelf now sees that E1' cannot include e. Compared to Section 1, converting an expression of the type exp M gives us the term of the type expi M. The latter terms can only be constructed by l2 and a2; neither constructor can take any argument of the type exp M or contain exp M elsewhere. Strengthening is no longer needed.

The label-changing transformation now takes expi 1 rather than exp 1 as the source a term:

cpaux: expi 1 -> exp 0 -> type.
%mode cpaux +E1 -E2.

Here is the interesting case

cpa-l: cpaux (l2 E1) (l E1')
       <- ({x:expi 1}{x':exp 0} cpaux x x' -> cpaux (E1 x) (E1' x')).
%block bl-cpaux : block {x:expi 1} {x':exp 0} {t:cpaux x x'}.
%worlds (bl-cpaux) (cpaux _ _).
%total {E1} (cpaux E1 _).

Again, Twelf accepts the totality assertion without complaint because clearly no term of the type expi M may appear in the term exp M – in fact, expi was not even defined when exp was introduced. Thus (E1' x') cannot depend on x.

What remains is a mere composition of the two above transformations

cpi: exp 1 -> exp 0 -> type.
%mode cpi +E1 -E2.
-: cpi E1 E2 <- exp-expi E1 E1' <- cpaux E1' E2.
%worlds () (cpi _ _).
%total {} (cpi _ _).

The desired transformation now includes two steps: transforming the source expression into the isomorphic form, and changing the mark of the latter. Albeit indirect, the solution avoids strengthening altogether.

A real example: type-preservation in an expressive staged calculus

The file <http://okmij.org/ftp/Computation/staging/lambda-am1.elf> is the formalization of the two-level call-by-value calculus lambda-a1v based on lambda-a of Taha and Nielsen’s “Environment Classifiers” (POPL03). The calculus supports manipulation and splicing in of the open code, running of the closed code, and cross-staged persistence. The lambda-a1v calculus has been introduced in the paper Closing the Stage: From Staged Code to Typed Closures written with Yukiyoshi Kameyama and Chung-chieh Shan (PEPM08).

The file includes (maybe the first) formal proof that any non-value term can be decomposed into a pre-value and the context. This is quite challenging given that the calculus supports the evaluation under lambda and so the context can cross the arbitrary number of dependent binders: variable binders include classifiers that must be bound, too, at that point.

In our staged calculus, all terms are marked by their stage, present and future. This is quite like the examples in the earlier Sections. There is also an operation to demote a term, which takes a future-stage-marked term and changes its mark so the term can be evaluated at the present stage. Demotion is how a future-stage computation can be run. Demotion rebuilds a term as it traverses it, changing the marks along the way. The only special case is handling a cross-staged persistent (CSP) value. The latter is a present-stage expression that was “lifted” to the future stage, i.e., was encapsulated unevaluated. The demotion operation merely removes the encapsulation, splicing CSP into the demoted code.

The type-preservation proof is at the end of the development. The most complex part is proving that demotion is type preserving, in particular, that demoting an abstraction (l [x:exp (1q AL)] Body) from the future stage (marked by (1q AL)) to the present, marked as qt, stage yielding (l [x':exp qt] Body') is type preserving. As in the simpler examples earlier, the proof proceeds by introducing assumptions {x:exp (1q AL)}, {x':exp qt} and that x' is the result of demoting x. The challenge is to prove that Body' does not actually depend on x. That is no longer as simple as before: a future-stage expression exp (1q AL) could actually occur in the demoted expression – because of CSP. For example, consider the following future-stage term

(l [x:exp (1q AL)] (csp (AL > x)))

where AL > x is future-stage-x marked by classifier AL (in the PEPM08 paper, we write (l [x^AL] %(<x>^AL)). An attempt to run such future-stage computations does not type in MetaOCaml; accordingly, the lambda-a1v calculus rejects such expressions as mis-typed, due to a special typing rule of CSP. Our preservation proof must take into account that the demoted expression is typed and so possible CSPs are restricted in form. Only then strengthening can be proven.

In our proof, we had to use both approaches to proving strengthening described in this article. We had to prove a direct strengthening lemma

strength-tp-dem : csp-ok1 E ->
                  ({x:exp (1q AL)} of x TV -> of E T) -> 
                  of E T -> type.
%mode strength-tp-dem +CS +H -P.

asserting that the typing of a present-stage expression (of (E:exp qt) T) cannot depend on a future-stage expression exp (1q AL) and the latter’s typing derivation (of (_:exp (1q AL)) T)provided that E is the result of a demotion. We also use the indirect approach and introduce a CSP restriction predicate csp-ok1 isomorphic to the predicate csp-ok. The latter may occur in typing derivations but the former may not. Finally, we had to prove the strengthening lemma for csp-ok1.

Continuation fest

Please consider submitting a paper to this year’s International Conference on Functional Programming (ICFP) and Conference on Generative Programming and Component Engineering (GPCE). The respective deadlines are April 2 and May 12–19. Both conferences solicit experience reports in addition to regular papers. As you might expect, ICFP also solicits functional pearls—“elegant, instructive, and fun essays on functional programming”.

Less formal is a Continuation Fest that a bunch of us are organizing. If you can be in Tokyo on April 13, please consider participating! Talk proposals are due on March 31. This event is scheduled on the day before the Symposium on Functional and Logic Programming (FLOPS); you can travel from Tokyo to Ise on the morning of April 14.

Continuations are used, and constantly re-discovered, in web programming, logic, event-based concurrency, linguistics, transactional systems, optimization, program generation and verification. Programming language systems … are (re-)implementing continuations, generators, iterators and related control structures. Continuations are also being studied for their own sake.

We call for an informal gathering of people working in all these areas—language designers, users, and researchers who study or use continuations in academic or practical work or as a hobby. The goal is to exchange ideas, terminology, new results, experience reports, works in progress, open problems, and friendly feedback—as well as meet each other and strengthen the community.

We plan a combination of 7–8 broadly accessible talks and 1–2 invited talks, with plenty of breaks for socializing and informal conversations. An evening event is planned. Participants are invited to submit working notes, source files, and abstracts for distribution to the attendees, but as the workshop has no formal proceedings, contributions may still be submitted for publication elsewhere.

We solicit talk proposals, about 2 paragraphs long … The topics of interest include, but are not at all limited to:

  • descriptions of uses of continuations in program generation, linguistics, proof theory, model checking,
  • system demonstrations
  • libraries using continuations, such as generators
  • verifying programs that use continuations explicitly or implicitly (exceptions, concurrency, backtracking)
  • type systems of continuations and their formalizations
  • open problems

It seems appropriate to translate “Continuation Fest” into Japanese as “継続祭”. As for Mandarin Chinese, how about “續繼節” (“续继节”)? It was 穆信成 Shin-Cheng Mu who taught me to noun a bisyllabic Chinese verb, such as “繼續” (“继续”, “continue”), by exchanging the two syllables.

Your point of view

hsbc.png