- Recent Changes 新聞
- History 歷史
- Preferences 喜好
Discussion 討論
This is Chung-chieh Shan’s English blog. All posts are archived.
The essay collection Letters of Transit: Reflections on Exile, Identity, Language, and Loss, published after a lecture series at the New York Public Library, includes “Shadow Cities” by André Aciman. An excerpt follows.
¶ And, indeed, there was something physically central about Straus Park. This, after all, was where Broadway and West End Avenue intersected, and the park seemed almost like a raised hub on West 106th Street, leading to Riverside Park on one side and to Central Park on the other. Straus Park was not on one street but at the intersection of four. Suddenly, before I knew why, I felt quite at home. I was in one place that had at least four addresses.

¶ Here you could come, sit, and let your mind drift in four different directions: Broadway, which at this height had an unspecified Northern European cast;

West End, decidedly Londonish;

107th, very quiet, very narrow, tucked away around the corner, reminded me of those deceptively humble alleys where one finds stately homes along the canals of Amsterdam.

And 106th, as it descended toward Central Park, looked like the main alley of a small town on the Italian Riviera,

where, after much trundling in the blinding light at noon as you take in the stagnant odor of fuel from the train station where you just got off, you finally approach a sort of cove, which you can’t make out yet but which you know is there, hidden behind a thick row of Mediterranean pines,

over which, if you really strain your eyes, you’ll catch sight of the tops of striped beach umbrellas jutting beyond the trees, and beyond these, if you could just take a few steps closer, the sudden, spectacular blue of the sea.

¶ To the west of Straus Park, however, the slice of Riverside and 106th had acquired a character that was strikingly Parisian,


and with the fresh breeze which seems to swell and subside all afternoon long, you sensed that behind the trees of Riverside Park,

serene and silent flowed an elusive Seine,

and beyond it, past the bridges that were to take you across, though you couldn’t see any of it yet, was not the Hudson, not New Jersey, but the Left Bank—

not the end of Manhattan, but the beginning of a whole bustling city waiting beyond the trees—as it waited so many decades ago when, as a boy, dreaming of Paris, I would go to the window, look out to the sea at night, and think that this was not North Africa at all, but the Ile de la Cité. Perhaps what lay beyond the trees was not the end of Manhattan, or even Paris, but the beginnings of another, unknown city, the real city, the one that always beckons, the one we invent each time and may never see and fear we’ve begun to forget.
¶ There were moments when, despite the buses and the trucks and the noise of people with boom boxes, the traffic light would change and everything came to a standstill and people weren’t speaking, and the unrelenting sun beat strong on the pavement, and I would almost swear this was an early summer afternoon in Italy, and that what lay behind Riverside Park was not just my imaginery Seine, but the Tiber as well. What made me think of Rome was that everything here reminded me of the kind of place all tourists know well: that tiny, empty piazza with a little fountain, where, thirsty and tired with too much walking all day, you douse your face, then unbuckle your sandals, sit on the scalding marble edge of a Baroque fountain, and simply let your feet rest a while in what is always exquisitely clear, non-drinkable water.

¶ Depending on where I sat, or on which corner I moved to within the park, I could be in any of four or five countries and never for a second be in the one I couldn’t avoid hearing, seeing, and smelling. This, I think, is when I started to love, if love is the word for it, New York. I would return to Straus Park every day, because returning was itself now part of the ritual of remembering the shadow cities hidden there—so that I, who had put myself there, the way squatters put themselves somewhere and start to build on nothing, with nothing, would return for no reason other than perhaps to run into my own footprints. This became my habit, and ultimately my habitat. Sometimes finding that you are lost where you were lost last year can be oddly reassuring, almost familiar. You may never find yourself; but you do remember looking for yourself. That too can be reassuring, comforting.

…
¶ How uncannily appropriate, therefore, to find out fifteen years later that the statue that helped me step back in time was not that of a nymph, but of Memory herself.

In Greek, her name is Mnemosyne, Zeus’s mistress, mother of the Muses. I had, without knowing it, been coming to the right place after all. This is why I was so disturbed by the imminent demolition of the park: my house of memories would become a ghost park. If part of the city goes, part of us dies as well.
(An alternative title for this post is, what is the type of differentiation? Hint: it’s not quite (ℝ→ℝ)→(ℝ→ℝ), because how would you make sure the input function is differentiable?)
You can download this post as a literate Haskell program.
{-# LANGUAGE Rank2Types #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE FunctionalDependencies #-} {-# LANGUAGE UndecidableInstances #-} {-# LANGUAGE OverlappingInstances #-} module Differentiation where
Automatic differentiation
The overloading approach to automatic differentiation has become more popular recently among typed functional programmers. The basic idea is to overload arithmetic operators such as +, ×, and sin so that they work on not just floating-point numbers but pairs (or more generally, sequences) of them, which track quantities along with their rates of change. The overloaded operators are easy to define because, unlike with integration, the rules of differentiation are compositional: you know, d(x + y) = dx + dy, d(x × y) = dx × y + x × dy, d(sin x) = cos x × dx, and so on. Here they are in Haskell.
data D a = D a a deriving Show lift :: Num a => a -> D a lift x = D x 0 infinitesimal :: Num a => D a infinitesimal = D 0 1 instance Eq a => Eq (D a) where D x _ == D y _ = x == y instance Ord a => Ord (D a) where compare (D x _) (D y _) = compare x y instance Num a => Num (D a) where D x x' + D y y' = D (x + y) (x' + y') D x x' * D y y' = D (x * y) (x' * y + x * y') negate (D x x') = D (negate x) (negate x') abs (D x x') = D (abs x) (signum x * x') signum (D x _) = lift (signum x) fromInteger x = lift (fromInteger x) instance Fractional a => Fractional (D a) where recip (D x x') = D (recip x) (-x'/x/x) fromRational x = lift (fromRational x)
The two components of a D value are a quantity and
its derivative, so the lift function ‘lifts’ a number
to a constant quantity, and infinitesimal is a
quantity with value 0 and derivative 1. Were abs
defined by abs x = signum x * x by default, we
wouldn’t have to define abs for D
above.
Let us use these definitions to model how a parabola reflects light.

Suppose that a light ray (red above) enters a parabolic mirror y = x²/4 from above. Where does the reflected ray cross the y axis? In the diagram above, if the point where the ray hits the parabola is (x,y), then the derivative of y with respect to x is tan θ, and the y coordinate where the reflected ray crosses the y axis is y + x/tan 2θ, which is equal to y + x (1/tan θ − tan θ)/2. We can compute this coordinate by automatically differentiating y with respect to x:
curve x = x^2/4 reflect x = let D y y' = curve (lift x + infinitesimal) in y + x * (recip y' - y') / 2
As expected, the parabola reflects all incoming rays from above to the focal point (0,1).
*Differentiation> map reflect [1..5] [1.0,1.0,1.0,1.0,1.0]
To be sure, this code does not work by symbolically
differentiating y = x²/4 to yield
y′ = x/2. Rather, it computes y
alongside y′ for one particular x at a time, so
the curve could just as well be defined by a more complex program
that uses if and recursion. For example, a ray tracer
usually deals with scenes much more complex than a single parabola.
This code also does not work by numerically comparing the values of
y at nearby values of x. Instead of
approximating dy/dx by Δy/Δx where Δx is a very small real number,
we compute with an actually infinitesimal dx.
Differentiation is a higher-order function
Before we continue, let us abstract the pattern for
differentiation in reflect above into a new
higher-order function d, which differentiates any
given function at the input 0. For convenience, d
also returns the value of the given function at 0.
d :: Num a => (D a -> D b) -> (b, b) d f = let D y y' = f infinitesimal in (y,y') reflect :: Fractional a => a -> a reflect x = let (y,y') = d (\h -> curve (lift x + h)) in y + x * (recip y' - y') / 2
Even though the reflect function uses
differentiation internally, we can still differentiate it. Such
differentiate is said to be nested. For the parabola, we
can confirm that the reflected ray hits the focal point not just at
but also around x = 3, because the derivative
computed below is zero.
*Differentiation> d (\k -> reflect (3 + k)) (1.0,0.0)
For other curves and surfaces, the derivative is typically not zero and tells us the density of light energy that falls around each point on the y axis. Hence, as Dan Piponi noted, this kind of calculation is performed by ray tracers and other programs that sample from probability distributions.
Another application of automatic differentiation is to find roots of a function using Newton’s method. Therefore, we can use nested automatic differentiation to find local extrema and saddle points of a function using Newton’s method.
The danger of confusing infinitesimals
Jeff Siskind
and Barak Pearlmutter pointed out a kind of programmer mistake that
makes nested differentiation give the wrong result. As they
show, this kind of mistake is easy to make in the framework defined
above, because all it takes is putting a call to the
lift function in the wrong place. The definition of
reflectBug below is only slightly different from
reflect, but the result is very different and very
wrong.
reflectBug x = let (y,y') = d (\h -> lift (curve (x + h))) in y + x * (recip y' - y') / 2 *Differentiation> d (\k -> reflectBug (3 + k)) (Infinity,NaN)
They demonstrate this problem using running code in Haskell and Scheme that computes
to be 2 rather than the correct answer 1.
The essence of this problem is that the two nested invocations of differentiation use two different infinitesimals, which a mathematician would denote by dh and dk. These two infinitesimals should not be confused, just as years and feet and persons should not be confused.

Using types to check units
Björn
Buckwalter showed that we can use a generic type system to prevent
such confusion statically, just as Haskell uses state
threads to distinguish pointers into different memory
regions. Recall that the type ST s a in Haskell
represents a monadic computation that yields a result of
type a using mutable cells in the state thread
represented by the phantom type s. To construct
such a computation, we can use primitive operations such as
newSTRef :: a -> ST s (STRef s a)
and the fact that ST s is a monad. To run such a
computation, we must use the primitive function
runST :: (forall s. ST s a) -> a
in which forall s forces different state threads,
created by different calls to runST, to be represented
by different phantom types s. One way to
understand this rank-2 type is that it makes the type checker
generate a new phantom type s for each argument
to runST. Analogously, Buckwalter redefines the type
constructor D to take a phantom-type
argument s, which represents an infinitesimal
unit.
data D s a = D a a deriving Show
Accordingly, the other definitions above change in their types,
but not in their terms or behavior. The most important change is
the new rank-2 type of d, which forces different
infinitesimals, created by different invocations of
differentiation, to be represented by different phantom
types s.
d :: Num a => (forall s. D s a -> D s a) -> (a, a) d f = let D y y' = f infinitesimal in (y,y') lift :: Num a => a -> D s a lift x = D x 0 infinitesimal :: Num a => D s a infinitesimal = D 0 1 instance Eq a => Eq (D s a) where D x _ == D y _ = x == y instance Ord a => Ord (D s a) where compare (D x _) (D y _) = compare x y instance Num a => Num (D s a) where D x x' + D y y' = D (x + y) (x' + y') D x x' * D y y' = D (x * y) (x' * y + x * y') negate (D x x') = D (negate x) (negate x') abs (D x x') = D (abs x) (signum x * x') signum (D x _) = lift (signum x) fromInteger x = lift (fromInteger x) instance Fractional a => Fractional (D s a) where recip (D x x') = D (recip x) (-x'/x/x) fromRational x = lift (fromRational x)
Because the phantom type s is part of the type
of a number, and because arithmetic operations such as
+ require the arguments and the return value to have
the same type, it is a type error to add numbers denominated in
different infinitesimals. In particular, the erroneous definition
reflectBug above is now a type error, as desired.
Occurs check: cannot construct the infinite type: t = D s t Expected type: t Inferred type: D s t In the first argument of `lift', namely `(curve (x + h))' In the expression: lift (curve (x + h))
For this checking of infinitesimal units to be sound, this
library for automatic differentiation should not export the values
D and infinitesimal to its users, though
of course the type constructor D and its type-class
instances need to be exported, along with the functions
d and lift.
d :: Num a => (forall b. Num b => (a -> b) -> b -> b) -> (a, a)
for differentiation. The type b makes it
unnecessary and useless to export the type
constructor D, even though d is
still implemented using D. Also, the new argument
of type a -> b
makes it unnecessary and useless to export the lift
function. Finally, the type-class context Num b makes
it unnecessary and useless to export the Eq,
Show, and Num instances
for D.
However, as Chris Smith lamented, we need additional
differentiation functions of the types
Fractional a => (forall b. Fractional b => (a -> b) -> b -> b) -> (a, a) Floating a => (forall b. Floating b => (a -> b) -> b -> b) -> (a, a)
in order to differentiate functions that use
Fractional or Floating operations.
Oleg
Kiselyov used similar types to express symbolic
differentiation.)
Automatic lifting
Although the type system now prevents us from putting calls to
lift in the wrong place, it is still annoying to have
to invoke lift manually—especially for nested
differentiation, a useful case as discussed above. Depending on
‘how constant’ a quantity is, we need to feed it through a
composition of exactly the right number of lifts. This
manual coding is frustrating because the unique right number of
lifts to apply is obvious from the input and output
types desired: to convert a type a to the type
D s a, apply lift once; to convert
a to D s (D s' a), apply
lift twice; and so on. We want the compiler to manage
these subtyping coercions automatically.
An analogous situation arises with state threads, which can be
organized into a hierarchy of memory regions. As
part of a monadic computation that uses mutable cells in a
parent region, we can create a child region and
perform a subcomputation that allocates and accesses mutable cells
in both regions. After the subcomputation completes, the
child region is destroyed en bloc, but we can still use the parent
region and observe any effect on it brought about by the
subcomputation. To allow the subcomputation to use the parent
region, we want every region to be a subtype of its
descendents. Matthew Fluet
and Greg Morrisett’s implementation of nested regions in Haskell
uses explicit subtyping coercions just like our
lift: depending on ‘how senior’ a region is, we need
to compose exactly the right number of region coercions.
In a
pending submission to the Haskell symposium, Oleg and I show how to
automate region subtyping coercions using type classes. One
might hope to apply that approach to lifting in automatic
differentiation. Indeed we can, but I only know how to automate
counting lifts, not how to automate placing them. That
is, instead of feeding each use of an input quantity to the
lift function exactly the right number of times, we
can feed it to a new function once. The new function, called
lifts, belongs to a new type class Lifts,
which takes two type parameters. The constraint Lifts a
b holds if and only if the type b is the
result of applying zero or more type constructors D s
to the type a.
class Lifts a b where lifts :: a -> b
More concretely, the following instances incompletely
approximate the intended meaning of Lifts.
instance Lifts a a where lifts = id instance Num a => Lifts a (D s a) where lifts = lift instance Num a => Lifts a (D s (D s' a)) where lifts = lift . lift instance Num a => Lifts a (D s (D s' (D s'' a))) where lifts = lift . lift . lift
The definition above of reflect in terms
of d applies lift once to one
occurrence of x but not to other occurrences of
x and h. The same function can be
expressed using Lifts, by applying lifts
once to each occurrences of the input variables x
and h.
reflectAuto :: Fractional a => a -> a reflectAuto x = let (y,y') = d (\h -> curve (lifts x + lifts h)) in y + lifts x * (recip y' - y') / 2
Expressing reflect in this new way frees us from
counting how many times to lift the inputs
x and h each time they are used.
How to implement Lifts? On one hand, as an
implementation of Lifts, the approximate instances
above are incomplete and unsatisfactory in theory, in that they
restrict how many lift each lifts can
stand for. They are perfectly useful in practice, however, and rely
on no extension to Haskell other than rank-2 types and
multiparameter type classes with flexible instances. On the other
hand, a complete implementation is possible using the
TypeCast class for type improvement (originally used by Kiselyov,
Lämmel, and Schupke to implement heterogeneous collections),
but it requires more Haskell extensions: functional dependencies,
overlapping instances, and undecidable instances. Without further
ado, below is the complete implementation.
instance Lifts a a where lifts a = a instance (TypeCast (D s b') b, Num b', Lifts a b') => Lifts a b where lifts = typeCast . lift . lifts class TypeCast a b | a -> b, b -> a where typeCast :: a -> b class TypeCast' t a b | t a -> b, t b -> a where typeCast' :: t -> a -> b class TypeCast'' t a b | t a -> b, t b -> a where typeCast'' :: t -> a -> b instance TypeCast' () a b => TypeCast a b where typeCast x = typeCast' () x instance TypeCast'' t a b => TypeCast' t a b where typeCast' = typeCast'' instance TypeCast'' () a a where typeCast'' _ x = x
Although Lifts makes it easier to use automatic
differentiation, this implementation is heavy lifting. I wonder if
it is easier to express this combination of subtyping and rank-2
polymorphism in a language like Scala?
美國航空頒發的金卡、以及美國政府頒發的綠卡,張張都攸關大家在美國的生活。它們就如同深藏不露的礦泉水,讓不肖之徒伺機正中我們飛安的要害,又如同免費提供的行李牌,幫地勤人員千里尋回我們失散的骨肉。在這鳳凰花開、四面驪歌的時節,您是否也在金綠之間猶豫不決呢?與其腳踏兩條船,不如把雞蛋集中以便管理。為方便您睿智的抉擇,記者謹將兩卡的十二優劣點整理如下。
The gold card issued by American Airlines and the green card issued by the American government are each an integral part of our American lives. They are as powerful as the hidden water bottles that strip us of flight safety; they are as handy as the free luggage tags that unite us with abducted children. In this season of blooming Delonix regia and booming Auld Lang Syne, are you also hesitating to choose between gold and green? Decide wisely which basket to put your eggs in, with the help of this 12-point summary of the two cards’ pros and cons.
| 美國航空金卡 American Airlines gold card |
美國政府綠卡 American government green card |
||
|---|---|---|---|
| ✗ | 其實是塑膠的 Actually plastic |
✗ | 其實是白色的 Actually white |
| ✓ | 使用 Helvetica 字體印製,中性無襯線 Typeset using Helvetica, a neutral sans-serif typeface ![]() |
✗ | 使用 Arial 字體印製,仿製沒水準 Typeset using Arial, a cheap knock-off typeface ![]() |
| ✗ | 附送精美彩色印刷《新會員指南》一本,十二頁無索引 Comes with a guide for new members printed in 12 pages of fine color but without index ![]() |
✓ | 附送精美彩色印刷《新移民指南》一本,一百頁附索引 Comes with a guide for new immigrants printed in 100 pages of fine color and with index ![]() |
| ✓ | 有搭機、住宿、購物等多種參加方式 Qualify in a variety of ways: flights, accommodations, shopping, etc. |
✓ | 有就業、依親、投資等多種參加方式 Qualify in a variety of ways: employment, family, investment, etc. |
| ✓ | 接受會員挑戰,三個月衝刺後拿到身份 Get it in 3 months when you enter a Membership Challenge |
✗ | 要求補送證據,三個月期限後失去身份 Lose it in 3 months when you receive a Request for Evidence |
| ✓ | 快速通過出境安全檢查 Avoid long security lines at departure ![]() |
✓ | 快速通過入境邊防檢查 Avoid long immigration lines at arrival ![]() |
| ✓ | 選位、客服優先 Priority for seats and customer service |
▵ | 謀職、徵兵優先 Priority for jobs and military service |
| ✓ | 每飛行萬哩,可在網上免費獲取升級證 Earn complimentary upgrades online every 10,000 miles |
✓ | 每居留兩年,可在網上付費申請回美證 Apply for reentry permits online every 2 years |
| ✓ | 旅館、租車公司等合作夥伴提供優待 Special offers on hotels and cars from partner companies |
✓ | 國家、過境機場等合作夥伴提供優待 Special offers on transit and visas from partner countries |
| ✓ | 票選傑出空服人員 Vote for great flight attendants ![]() |
✓ | 票選傑出校務委員 Vote for great school board members ![]() |
| ✓ | 一年換一張新卡 Renew every year |
✓ | 十年換一張新卡 Renew every decade |
| ✓ | 在貴賓室裡靜待登機 Lounge while waiting for your flight |
✓ | 在移民監裡靜待入籍 Lounge while waiting for your naturalization |
In response to the daytime jackhammering across the street, I put up the sign below at the entrance to our apartment building.

Title: Sangre de Mi Sangre (Blood of My Blood), or Padre Nuestro
Overall rating: A (strong accept; will champion film)
Reviewer’s confidence: Y (I am knowledgeable in the area, though not an expert.)
General review:
By weaving together four deep instances of detachment, this well-organized film demonstrates how to derive immigrants from persons mechanically and vice versa. These derivations are intuitively obvious and should not bear any tedious demonstration. Yet, perhaps because they are so painfully obvious, they are often neglected—I often neglect them—when interacting with an immigrant. This neglect injures life, liberty, and the pursuit of happiness as we eat immigrant food, wear immigrant clothes, inhabit immigrant tenements, take immigrant transport, learn immigrant business, have immigrant fun. For example, just the other day I had the urge to run down some jaywalking immigrants as they crossed Canal Street in front of my bike. Or was it in Times Square?
This cinematic pearl thus serves as a rational reconstruction of motion, a timely and entertaining reminder that immigrants are people and people are immigrants. It points the way to a wide variety of practical applications, such as snapshotting the dialects of the Spanish diaspora, establishing my love for New York, and advocating open immigration alongside free trade. I recommend that y’all go see it (in New York this week and Los Angeles next week). Especially if you know how many passports Jason Bourne has. Less if you feel cheated at the end of “The Perfect Human”.
Comments for other reviewers:
Several reviews criticize this movie, on two fronts. On one hand, the story is too clear. The plot is too plotted, the morals too moralistic, and the characters too characteristic: “As a character Pedro never develops beyond a credulous, good-hearted cipher.” On the other hand, the story is too obscure: “As you watch the movie, questions accumulate… Little about the connection between Pedro and Magda makes sense.”
These fronts answer each other. (Pondering the questions develops the characters.) This movie is not simplistic and complicated but simple yet complex. Drawing “characters directly out of Hispanic-cliche central casting” reduces distraction, like studying continuation-passing style (named the canonical program transformation by Olivier Danvy). If “Sangre de Mi Sangre”, taking advantage of a conventional narrative structure as this review tries to, manages to pull tropes out of the ashes of an I-94 form in Williamsburg, then so much the better for a mechanical derivation and its inverse: as those viewers fixated on “impoverished Mexican illegal immigrants” show, it is all too easy to forget that we are all in the image of immigration. A moving target, so to speak.
Questions for the authors:
Why does this trailer credit neither the actress Paola Mendoza nor the actor Eugenio Derbez?
Do you have a shoe fetish?
Have you noticed that the tact filter applies to not just verbal behavior but also physical behavior? They’re installed in opposite directions in New York and Tokyo—except at Tsukijishijo (築地市場) perhaps? The most obvious example of a physical tact filter is the mouth mask that sick Japanese people wear in public, ostensibly to avoid infecting others.
The Suica is addictive. It’s like Cosmo’s access card in Sneakers, except of course the Suica charges you, and I bet Cosmo’s card doesn’t open his locker at the gym.
The leftover container problem is to partition a partial order into as few total orders as possible.
Pat Hanrahan wrote that “The BRDF is, in general, anisotropic” (on page 29 of “Rendering concepts”, chapter 2 of Radiosity and Realistic Image Synthesis by Cohen and Wallace, 1993). This sentence is the negation of “The BRDF is, in general, isotropic”! If “in general” in both sentences means “without exception” (which it does not in the true reading of “Primes are odd in general”), then the fact that these two sentences are the negation of each other is reminiscent of sentences such as “Hanna sucht kein Buch”, “Ze mogen geen eenhoorn zoeken”, and “You don’t need to bring any dessert” (Jacobs 1980, Theoretical Linguistics 7:121–136; de Swart 2000, in Reference and anaphoric relations, 118–142; inter alia). However, the sentences “The BRDF is always anisotropic” and “The BRDF is always isotropic” are not the negation of each other.
One of Olivier Danvy’s great examples of program transformation and continuation passing is to test in linear time whether a Calder mobile is balanced. For a mobile to be balanced is for each submobile to locate its center of mass right at its root. Real mobiles are of course never exactly balanced, yet the rods stay roughly horizontal—because even the rigid rods are not straight! Below is a program that rotates the submobiles of a two-dimensional mobile so that each submobile locates its center of mass right below its root. (Exercises left to the reader include handling higher-dimensional mobiles and using continuations to avoid repeatedly constructing and taking apart pairs.)
module Calder where import Complex data Mobile a = Point a | Rigid [(Complex a, Mobile a)] balance :: RealFloat a => Mobile a -> (a, Mobile a) balance (Point a) = (a, Point a) balance (Rigid zms) = (sum as, Rigid zms') where (as, azs, zm's) = unzip3 [ (a, (a :+ 0) * z, (z, m')) | (z, m) <- zms , let (a, m') = balance m ] center = sum azs zms' | center == 0 = zm's | otherwise = [ (z / signum center, m') | (z, m') <- zm's ]
“Hendrik Lenstra claims that he can no more suggest to his students what problems they should work on than who they should marry.” (Update 2008-06-10: Vladimir Arnol’d said in 1995: “I never assign a thesis topic to my students. This is like assigning them a spouse.”)

Has anyone started designing the all-important logo for a joint Hillary-Obama/Obama-Hillary campaign?
What symbol(s) for disjoint union do you use—one of “+|∐⊔⊍⊎⊕” or something else? Do you use a different symbol for the presupposed disjoint union (which is undefined when applied to {{}} and {{}}) than for the tagged disjoint union (which yields a two-element set when applied to {{}} and {{}})? Those who work with sets up to isomorphism, such as topologists, are exempt from answering the latter question.
It is interesting to read “Nationalität: Jude” on the cover of the August 2007 AMS Notices. Inside the issue is more information about the cover and about Lipman Bers’s immigration.
Many logic systems admit a weakening principle: a derived conclusion continues to hold when more assumptions are added. Strengthening is the opposite: proving that the established conclusion holds given fewer assumptions. Strengthening certainly is not generally valid and has to be proven case-by-case. It seems strengthening is often needed when proving meta-theoretical properties such as totality of non-surjective endomorphisms of higher-order terms.
The article on Twelf Wiki <http://twelf.plparty.org/wiki/Strengthening> shows an example of the need for strengthening and describes one solution, which relies on defining term equality. In this article we demonstrate simpler examples requiring strengthening, and two surprisingly trivial solutions. Quite literally, one solution deals with the problem head-on, whereas the second one avoids the problem in a round-about way. The examples are inspired by staging calculi. At the end we describe the real development, which gives perhaps the first formal proof of type preservation of primitive reductions in an expressive staged calculus with escapes, run and cross-staged persistence. The type-preservation proof used both of our solutions and required two strengthening lemmas. Throughout we use Twelf – properly: we use the higher-order abstract syntax (HOAS) for higher-order terms, without ever resorting to concrete syntax for “variables”. The complete code is available at <http://okmij.org/ftp/Computation/staging/> .
The simplest example requiring strengthening
Perhaps the simplest example to illustrate the need of strengthening is based on lambda-calculus with terms marked by either 1 or 0. The challenge is to implement a function that converts terms marked by 1 into those marked by 0 – and show its totality.
Our calculus is elementary: we define two marks, 0
and 1
mark: type.
0: mark.
1: mark.
and marked terms, built of applications and abstractions:
exp: mark -> type. %name exp E.
a: exp M -> exp M -> exp M.
l: (exp M -> exp M) -> exp M.
Here is a sample term, marked with 1:
t1 : exp 1 = a (l [x] (l [y] (a x y))) (l [x] x).
(in Haskell notation, it is (\x -> \y -> x y) (\x
-> x) ). The definition of exp makes it
clear that all subterms of a marked term have the same mark.
Our goal is to write a transformation taking a term marked by
1 and producing a term marked by 0:
cp1: exp 1 -> exp 0 -> type.
%mode cp1 +E1 -E2.
It is almost like copying terms and hence straightforward:
cp1-a: cp1 (a E1 E2) (a E1' E2')
<- cp1 E2 E2'
<- cp1 E1 E1'.
cp1-l: cp1 (l E1) (l E1')
<- ({x}{x'} cp1 x x' -> cp1 (E1 x) (E1' x')).
%block cp1-l-bl : block {x:exp 1} {x':exp 0} {D:cp1 x x'}.
%worlds (cp1-l-bl) (cp1 _ _).
%total {E1} (cp1 E1 _). %% Does not pass!
The clause cp1-a inductively handles applications.
To transform an abstraction, we assume a term x marked
with 1, a term x' marked with 0, and assume that
x' is the result of transforming x, see
cp1 x x'. Under those assumptions, we transform the
body of the abstraction, cp1 (E1 x) (E1' x'). The
notation (E1' x') means that E1' is a
higher-order term applied to x'. That is,
E1' is of the type exp 0 -> exp 0. The
block declaration cp1-l-bl lists our set of
assumptions; the regular worlds declaration asserts that
cp1 is performed given no assumptions, or one set
assumption cp1-l-bl, or two sets of such assumptions
(with the different x and x' across the
sets), etc.
Our transformation procedure is obviously total: we can take any
1-marked term, and, in finite time, always deliver the
corresponding 0-marked term. We wish to state this formally, hence
the %total assertion. Twelf verifies it first by
checking that the transformation is terminating. It then checks
that all pattern-matching is exhaustive: we have not missed a case.
Alas, the totality check fails and Twelf reports an error:
Totality: Output of subgoal not covered
Output coverage error --- missing cases:
{E1:exp 1 -> exp 1} {E2:exp 1 -> exp 0 -> exp 0}
|- {x:exp 1} {x':exp 0} cp1 x x' -> cp1 (E1 x) (E2 x x').
In the clause cp1-l, we have introduced three
assumptions: x, x' and cp1 x
x'. By writing (E1' x') we asked the Twelf
system to “factor out” the dependence on x'.
Implicitly we stated that the result of transforming (E1
x) depends only on x' but not on x
or cp1 x x' (more formally, the independence is seen
from the fact E1' is used outside of the scope of
x introduced via the universal quantification
{x} in cp1-l). Twelf can see that the
assumption cp1 x x' cannot possibly enter the set of
terms because the definition of the type family exp
(with constructors a and l) shows that no
term can contain anything of the type cp1. But terms
may occur in terms, and so the result of transforming the body of
the abstraction generally depends on both x and
x'. We have not handled that general case, and so our
transformation is not total.
Twelf’s complaint is quite insightful: (E1 x) is
the body of the abstraction that includes some term x
(“the variable”). We transform the body replacing all those
occurrences of x with x' – and so the
result should include only x' but no x.
Thus Twelf demands the proof that we indeed replaced all
those x with x'.
Upon consideration, we can see that the result of transforming
(E1 x) cannot really depend on, or contain,
x. After all, x is an expression marked
by 1 and the result of transforming (E1 x) is an
expression marked by 0. Any sub-expression of a marked expression
is marked with the same mark. Alas, Twelf cannot see this proof by
itself and we have to write it down explicitly. The reason Twelf
cannot see that proof by itself is because Twelf tracks the
dependencies (so-called subordination)
using only the names of the type families (e.g., exp)
but not their parameters (e.g., exp 0 or exp
1).
Astonishingly, the most trivial way of writing this independence proof works. We state that whenever a term marked by 0 appears to depend on a term marked by 1, it actually doesn’t. Or, constructively, we assert the following proposition
strength : (exp 1 -> exp 0) -> exp 0 -> type.
%mode strength +H -E.
For any implication exp 1 -> exp 0, we
can always constructively produce exp 0. Here is the
proof, the detailed procedure of producing that exp 0
term:
-: strength ([x] E) E.
-: strength ([x] (a (E1 x) (E2 x))) (a E1' E2')
<- strength E1 E1'
<- strength E2 E2'.
-: strength ([x] (l [x'] (E x' x))) (l E')
<- {x':exp 0} strength (E x') (E' x').
The first clause handles the case when E patently
does not contain anything of the type exp 1. The rest
of the clauses do straightforward induction on terms; the last
clause requires hypothetical reasoning to examine the body of the
abstraction, and for that we introduce the assumption
x' of the type exp 0. Perhaps it is
because of that hypothetical reasoning that Twelf cannot see the
strengthening property on its own. We assert that so defined
procedure is total, i.e., it is a constructive proof of the
strength proposition:
%block strength-bl : block {x':exp 0}.
%worlds (strength-bl | cp2-l-bl) (strength _ _).
%covers strength +H -E.
%total {E} (strength E _).
The label cp2-l-bl describes the set of assumptions
of the improved cp2 procedure below (in the actual
code, it is defined in a forward declaration). Twelf has verified
the totality of strength, and so we can use it to write the
improved transformation procedure, cp2:
cp2: exp 1 -> exp 0 -> type.
%mode cp2 +E1 -E2.
cp2-a: cp2 (a E1 E2) (a E1' E2')
<- cp2 E2 E2'
<- cp2 E1 E1'.
cp2-l: cp2 (l E1) E1''
<- ({x}{x'} cp2 x x' -> cp2 (E1 x) (E1' x x'))
<- strength ([x] (l (E1' x))) E1''.
%block cp2-l-bl : block {x:exp 1} {x':exp 0} {d:cp2 x x'}.
%worlds (cp2-l-bl) (cp2 _ _).
%total {E1} (cp2 E1 _).
The only difference from cp1 is in the clause
cp2-l: by writing (E1' x x') we state
that the result of transforming the body of the abstraction may
depend not only on x' but also on x. We
build the transformed abstraction under that additional assumption:
([x:exp 1] (l (E1' x))). Using strengthening, we then
prove that (l (E1' x)) does not actually depend on
x. Now the totality assertion succeeds:
cp2 is the total transformation of 1-marked terms.
The complete code is available at <http://okmij.org/ftp/Computation/staging/strengthen.elf>
A more realistic example: type preservation
We have encountered the problem of strengthening in staged calculi, e.g., when proving type preservation. We outline the problem and the solution; please see the complete code <http://okmij.org/ftp/Computation/staging/translation-problem.elf> for the full details.
We introduce a simple staged language, where each expression is
annotated with the numeric stage label A. The language
has three forms of expressions: natural literals, addition, and fix
(i.e., recursive functions).
exp : nat -> type. %name exp E.
n : nat -> exp A. %prefix 110 n.
+ : exp A -> exp A -> exp A. %infix left 154 +.
fix : (exp A -> exp A) -> exp A.
We introduce a trivial type system with only one type,
int, and the equally simple type-checking relation
of E T:
of : exp A -> tp -> type. %name of Q.
%mode of +E *T.
Please see the code for details. As in the previous section we write the procedure to downgrade the level of an expression
downgrade : exp (1 N) -> exp N -> type.
%mode downgrade +E1 -E2.
We are not that interested in proving downgrade is total.
Rather, we wish to prove a meta-theorem that downgrade
is type preserving: if the original expression was typed, the
downgraded expression shall be typed too. We state that property as
a constructive proposition:
dn-pres : downgrade E1 E2 -> of E1 T -> of E2 T -> type.
%mode dn-pres +PD +PTI -PTO.
That is, given any downgradeable expression E1 and
its type derivation of E1 T (the evidence that
E1 is typed), we produce the typing derivation for the
corresponding downgraded expression E2. The most
interesting case is the following:
- : dn-pres (dn-f PEH) (of-f PTH) PTH''
<- ({e:exp (1 A)}
{pt:of e int}
{e':exp A}
{dn:downgrade e e'}
{pt':of e' int}
dn-pres dn pt pt' ->
dn-pres (PEH e e' dn) (PTH e pt) (PTH' e pt e' pt'))
<- strength-of ([e] [pt] (of-f (PTH' e pt))) PTH''.
Here, dn-f PEH of the type downgrade (fix E1)
(fix E2) is the derivation of downgrading (fix
E1) into (fix E2). We assume a term
e of the type exp (1 A), and further
assume it is typed (the assumption pt:of e int).
Likewise we assume a typed term e' that is the result
of downgrading e. Finally, we assume that the
downgrading from e to e' was type
preserving. Under those assumptions, we obtain the type derivation
of the body of the downgraded fix, by inductively invoking dn-pres.
The resulting type derivation (PTH' e pt e' pt')
generally depends not only on the assumption e' and
its typeability pt' – but also on e and
pt, which are in scope and could in principle occur in
type derivations. We must thus prove that we can remove the latter
dependency. Again, the formal statement of that proposition is
surprisingly simple:
strength-of : ({e:exp (1 N)} of e T -> of (E:exp N) T) ->
of (E:exp N) T -> type.
%mode strength-of +H -P.
We assert that the typing derivation of E T for an
expression E at the level N cannot depend
on any N+1-level expression e and the
typing derivation of the latter. Our proof is again constructive:
from the corresponding implication we produce its conclusion
without assuming the hypotheses. The complete code has all
details.
Other examples
The need for strengthening does occur (although not frequently)
in other circumstances. Here are the example from Twelf
distribution. One is “Representation of Mini-ML expressions in
Mini-ML*” by David Swasey, file
exercises/opt-eval/opt-rep1.elf. It defines a
transformation from one higher-order language to another. Proving
meta-theoretical properties of that translation (totality, type
preservation, etc) requires strengthening.
The second, indirect solution
The second solution to strengthening is, too, trivial, albeit round-about. We use the same problem as in Section 1: in lambda-calculus whose terms are marked with 0 or 1, implement a function that converts terms marked by 1 into those marked by 0 – and show its totality.
Rather than transforming exp 1 to exp
0 directly, we introduce terms expi M
isomorphic to exp M, turn exp 1 to
expi 1 and then transform expi 1 to
exp 0. The advantage is that terms exp M
cannot appear in expi M and vice versa, so exp
M cannot depend on assumptions of the type {x:expi
M} and vice versa. We avoid strengthening altogether.
Here are a few details: we define intermediary, also marked, terms
expi: mark -> type. %name expi _Ei.
a2: expi M -> expi M -> expi M.
l2: (expi M -> expi M) -> expi M.
that are clearly isomorphic to the terms exp M of
the original calculus. We can state the isomorphism formally;
however, we only need one direction of it: given any term exp
M, we can always obtain the corresponding expi
M:
exp-expi: exp M -> expi M -> type.
%mode exp-expi +E1 -E2.
We show only the more complex case of transforming abstractions:
-: exp-expi (l E1) (l2 E1')
<- {e:exp M} {e2:expi M}
exp-expi e e2 ->
exp-expi (E1 e) (E1' e2).
%block bl-exp-expi :
some {M:mark}
block {e:exp M} {e2:expi M} {t:exp-expi e e2}.
%worlds (bl-exp-expi) (exp-expi _ _).
%total {E} (exp-expi E _).
Now Twelf verifies the totality assertion without any complaints
that E1' could depend (that is, include) the
assumption e, which is also present in the LF context.
Twelf now sees that E1' cannot include e.
Compared to Section 1, converting an expression of the type
exp M gives us the term of the type expi
M. The latter terms can only be constructed by
l2 and a2; neither constructor can take
any argument of the type exp M or contain exp
M elsewhere. Strengthening is no longer needed.
The label-changing transformation now takes expi 1
rather than exp 1 as the source a term:
cpaux: expi 1 -> exp 0 -> type.
%mode cpaux +E1 -E2.
Here is the interesting case
cpa-l: cpaux (l2 E1) (l E1')
<- ({x:expi 1}{x':exp 0} cpaux x x' -> cpaux (E1 x) (E1' x')).
%block bl-cpaux : block {x:expi 1} {x':exp 0} {t:cpaux x x'}.
%worlds (bl-cpaux) (cpaux _ _).
%total {E1} (cpaux E1 _).
Again, Twelf accepts the totality assertion without complaint
because clearly no term of the type expi M may appear
in the term exp M – in fact, expi was not
even defined when exp was introduced. Thus (E1'
x') cannot depend on x.
What remains is a mere composition of the two above transformations
cpi: exp 1 -> exp 0 -> type.
%mode cpi +E1 -E2.
-: cpi E1 E2 <- exp-expi E1 E1' <- cpaux E1' E2.
%worlds () (cpi _ _).
%total {} (cpi _ _).
The desired transformation now includes two steps: transforming the source expression into the isomorphic form, and changing the mark of the latter. Albeit indirect, the solution avoids strengthening altogether.
A real example: type-preservation in an expressive staged calculus
The file <http://okmij.org/ftp/Computation/staging/lambda-am1.elf>
is the formalization of the two-level call-by-value calculus
lambda-a1v based on lambda-a of Taha and Nielsen’s
“Environment Classifiers” (POPL03). The calculus supports
manipulation and splicing in of the open code, running of the
closed code, and cross-staged persistence. The
lambda-a1v calculus has been introduced in the paper
Closing
the Stage: From Staged Code to Typed Closures written with
Yukiyoshi Kameyama and Chung-chieh Shan (PEPM08).
The file includes (maybe the first) formal proof that any non-value term can be decomposed into a pre-value and the context. This is quite challenging given that the calculus supports the evaluation under lambda and so the context can cross the arbitrary number of dependent binders: variable binders include classifiers that must be bound, too, at that point.
In our staged calculus, all terms are marked by their stage, present and future. This is quite like the examples in the earlier Sections. There is also an operation to demote a term, which takes a future-stage-marked term and changes its mark so the term can be evaluated at the present stage. Demotion is how a future-stage computation can be run. Demotion rebuilds a term as it traverses it, changing the marks along the way. The only special case is handling a cross-staged persistent (CSP) value. The latter is a present-stage expression that was “lifted” to the future stage, i.e., was encapsulated unevaluated. The demotion operation merely removes the encapsulation, splicing CSP into the demoted code.
The type-preservation proof is at the end of the development.
The most complex part is proving that demotion is type preserving,
in particular, that demoting an abstraction (l [x:exp (1q
AL)] Body) from the future stage (marked by (1q
AL)) to the present, marked as qt, stage
yielding (l [x':exp qt] Body') is type preserving. As
in the simpler examples earlier, the proof proceeds by introducing
assumptions {x:exp (1q AL)}, {x':exp qt}
and that x' is the result of demoting x.
The challenge is to prove that Body' does not actually
depend on x. That is no longer as simple as before: a
future-stage expression exp (1q AL) could actually
occur in the demoted expression – because of CSP. For example,
consider the following future-stage term
(l [x:exp (1q AL)] (csp (AL > x)))
where AL > x is future-stage-x
marked by classifier AL (in the PEPM08 paper, we write
(l [x^AL] %(<x>^AL)). An attempt to run such
future-stage computations does not type in MetaOCaml; accordingly,
the lambda-a1v calculus rejects such expressions as
mis-typed, due to a special typing rule of CSP. Our preservation
proof must take into account that the demoted expression is typed
and so possible CSPs are restricted in form. Only then
strengthening can be proven.
In our proof, we had to use both approaches to proving strengthening described in this article. We had to prove a direct strengthening lemma
strength-tp-dem : csp-ok1 E ->
({x:exp (1q AL)} of x TV -> of E T) ->
of E T -> type.
%mode strength-tp-dem +CS +H -P.
asserting that the typing of a present-stage expression
(of (E:exp qt) T) cannot depend on a future-stage
expression exp (1q AL) and the latter’s typing
derivation (of (_:exp (1q AL)) T) – provided
that E is the result of a demotion. We also use the
indirect approach and introduce a CSP restriction predicate
csp-ok1 isomorphic to the predicate
csp-ok. The latter may occur in typing derivations but
the former may not. Finally, we had to prove the strengthening
lemma for csp-ok1.
Please consider submitting a paper to this year’s International Conference on Functional Programming (ICFP) and Conference on Generative Programming and Component Engineering (GPCE). The respective deadlines are April 2 and May 12–19. Both conferences solicit experience reports in addition to regular papers. As you might expect, ICFP also solicits functional pearls—“elegant, instructive, and fun essays on functional programming”.
Less formal is a Continuation Fest that a bunch of us are organizing. If you can be in Tokyo on April 13, please consider participating! Talk proposals are due on March 31. This event is scheduled on the day before the Symposium on Functional and Logic Programming (FLOPS); you can travel from Tokyo to Ise on the morning of April 14.
Continuations are used, and constantly re-discovered, in web programming, logic, event-based concurrency, linguistics, transactional systems, optimization, program generation and verification. Programming language systems … are (re-)implementing continuations, generators, iterators and related control structures. Continuations are also being studied for their own sake.
We call for an informal gathering of people working in all these areas—language designers, users, and researchers who study or use continuations in academic or practical work or as a hobby. The goal is to exchange ideas, terminology, new results, experience reports, works in progress, open problems, and friendly feedback—as well as meet each other and strengthen the community.
We plan a combination of 7–8 broadly accessible talks and 1–2 invited talks, with plenty of breaks for socializing and informal conversations. An evening event is planned. Participants are invited to submit working notes, source files, and abstracts for distribution to the attendees, but as the workshop has no formal proceedings, contributions may still be submitted for publication elsewhere.
We solicit talk proposals, about 2 paragraphs long … The topics of interest include, but are not at all limited to:
- descriptions of uses of continuations in program generation, linguistics, proof theory, model checking,
- system demonstrations
- libraries using continuations, such as generators
- verifying programs that use continuations explicitly or implicitly (exceptions, concurrency, backtracking)
- type systems of continuations and their formalizations
- open problems
It seems appropriate to translate “Continuation Fest” into Japanese as “継続祭”. As for Mandarin Chinese, how about “續繼節” (“续继节”)? It was 穆信成 Shin-Cheng Mu who taught me to noun a bisyllabic Chinese verb, such as “繼續” (“继续”, “continue”), by exchanging the two syllables.
“Perhaps, as many neuroscientists believe, the brain escapes the limitations and requirements that computer scientists believe are imposed by mathematics, logic, and physics.”
“Such an architecture is prodigally wasteful of material resources. It is nakedly exposed to combinatorial explosions that lurk behind every tree in the computational forest.”
—Gallistel, C. R. 2008. Learning and representation. In Learning and memory: A comprehensive reference, behavioral approaches, ed. Randolf Menzel and John Byrne. Amsterdam: Elsevier Science.
Are associative theories of learning the Viet Nam of cognitive science?












