Really understanding effects
2010-12-19 06:35

To: Mike Solomon
Cc: Chris Barker, Jim Pryor

Thanks for starting this great discussion, whether the start was this month, this year, or this millennium.

Let me first invoke the usual hypothesis of cognitive science: the mind is a computer/program/algorithm, so let the study of the mind be informed by the study of computers/programs/algorithms. A lot of Mike’s comments have direct analogues in programming languages and software engineering: lots of people question why monads and continuations are useful, and point out that mutation and GOTO are intuitively easy to understand. So, it’s not clear to me that time spent writing typical programs on a computer is more beneficial or convincing than the same time spent writing, say, Montagovian fragments with paper and pencil.

The concern that is more convincing, and that functional programming, monads, and continuations help address, is how to write and maintain large and reliable programs. The claim that a given program or part of a program works, like the claim that a given Montagovian fragment or lexical entry works, is basically a theorem. It takes effort to formulate and justify such theorems, especially if we want to do it not only by example but also by rigorous reasoning. As we change and grow our program over time, we want to avoid repeating what is essentially the same effort for parts that have not changed or lexical entries that are analogous. For example, we might want to prove one lemma about all values of a certain type, or all transitive verbs. In sum, we want modular programming, modular testing, and modular reasoning.

By “module”, I just mean that you could swap one out and stick another one in without affecting what we care about in the rest of the system. Whenever two pieces of code or testing or reasoning are similar, we want to put the similar parts in a module so that we can reuse our work as we repeat and change the pattern. Monads are one kind of module that emerged when people got tired of writing similar programs and proofs over and over again. It’s just like in the rest of math: groups, vector spaces, etc.
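
To make the “module” talk concrete, here is a minimal Haskell sketch of my own (not something from the discussion): the Maybe monad packages up the check-for-failure-at-every-step pattern once, so that programs which thread possible failure, and proofs about them, can reuse that one piece of work.

    -- Hypothetical example: add two numbers parsed from strings, where parsing may fail.
    import Text.Read (readMaybe)

    -- Without the module: the failure check is spelled out at every step.
    parseSum :: String -> String -> Maybe Int
    parseSum s1 s2 =
      case readMaybe s1 of
        Nothing -> Nothing
        Just x  -> case readMaybe s2 of
                     Nothing -> Nothing
                     Just y  -> Just (x + y)

    -- With the module: the Maybe monad supplies the check once and for all,
    -- and one lemma about (>>=) covers every program written in this style.
    parseSum' :: String -> String -> Maybe Int
    parseSum' s1 s2 = do
      x <- readMaybe s1
      y <- readMaybe s2
      return (x + y)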

On 2010-12-14T12:49:18-0500, Mike Solomon wrote:

> I would agree that monads give a potentially unifying perspective. I was challenging the claim that without monads you can’t “really” understand mutation/dynamic update. Or, that functional programming gives the best way to understand mutation. This just seems wrong to me. Monads are notoriously difficult to understand; mutation is intuitively easy to understand.

> That is, I would say: we already understood mutation; now we look at it from a different perspective to try to better understand how to combine it with other effects.

Of course what I just wrote may not be what it means to ‘“really” understand mutation/dynamic update’. Nevertheless, I would suggest that to “really” understand something entails proving theorems about it that go beyond particular usage examples—at least, not just “this program gives the correct output on this sample input” but “this program part works on all inputs regardless of the rest of the program”.

As soon as we humans who get bored easily try to prove such theorems, we want to capture repeated patterns in our reasoning so as to avoid repeating them. (Hoare logic for reasoning about imperative programs is a great and successful example of such capture, even though it treats mutation specifically and not as a special case of a side effect.) Maybe understanding mutation even entails being confident in a program part that uses mutation and no other side effect, regardless of what side effects occur in the rest of the program.
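
(For concreteness, here is the flavor of that capture, in textbook Hoare-logic notation: the assignment axiom is justified once and for all, and the sequencing rule lets us reason about each program part separately and then compose the results.)

    {P[E/x]}  x := E  {P}                        (assignment axiom)
    {x = n}  x := x + 1  {x = n + 1}             (one instance of it)

    {P} S1 {Q}    {Q} S2 {R}
    ------------------------                     (sequencing rule)
         {P} S1; S2 {R}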

> One question I asked yesterday (last night) is whether the analogy between quantification and shift/reset yielded insight independent of what kind of semantics one gave for shift/reset: shift/reset can be understood operationally, as a reduction calculus. Can quantification? But maybe this is just what QR is. In general, is the transformation to LF a kind of operational semantics?

I would answer yes, yes, yes.

I take at least 3 lessons from programming-language semantics into natural-language semantics: First, there are different kinds of semantics, not only denotational but also operational and axiomatic, each good for different purposes. Operational makes it more obvious how much time and space a program will take; denotational makes it easier to substitute equals for equals in a larger program. Second, we should relate the different kinds of semantics. For example, I’m a big fan of Olivier Danvy’s work on mechanically turning denotational semantics (well, definitional interpreters) into operational semantics or back. Imagine putting towers into a vending machine and getting QR out or vice versa! Finally, it is worth applying specific ideas such as monads in denotational semantics and evaluation contexts in operational semantics.
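
To give a flavor of what “mechanically turning” means, here is a toy sketch in Haskell (mine, not one of Danvy’s actual examples): start from a definitional interpreter, CPS-transform it, and the continuations are then one defunctionalization step away from the evaluation contexts of an abstract machine, that is, an operational semantics.

    -- A definitional interpreter for a toy expression language (denotational in flavor).
    data Expr = Lit Int | Add Expr Expr

    eval :: Expr -> Int
    eval (Lit n)     = n
    eval (Add e1 e2) = eval e1 + eval e2

    -- The same interpreter in continuation-passing style: evaluation order is
    -- now explicit, one step along the way from denotational to operational.
    evalK :: Expr -> (Int -> r) -> r
    evalK (Lit n)     k = k n
    evalK (Add e1 e2) k = evalK e1 (\n1 -> evalK e2 (\n2 -> k (n1 + n2)))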

> Is it only from the point of view of continuations that one can understand (best) a programming language that contains control and state?

Continuations are the best denotational account of control. They make it easy to prove that the following expressions can be substituted for each other:

    (shift c (and (c (mother 'john)) (c (mother 'mary))))
    ; John and Mary's mother

    (mother (shift c (and (c 'john) (c 'mary))))
    ; John's mother and Mary's mother
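
Here is the same point as a sketch in Haskell’s Cont monad (shift and reset as in the transformers package; the entities and the one-place predicate are made up for illustration): both phrases denote the same function from continuations to truth values, so either can be substituted for the other inside any reset context.

    import Control.Monad.Trans.Cont (Cont, evalCont, reset, shift)

    data Entity = John | Mary | MotherOf Entity deriving (Eq, Show)

    mother :: Entity -> Entity
    mother = MotherOf

    -- (shift c (and (c (mother 'john)) (c (mother 'mary))))
    phrase1 :: Cont Bool Entity
    phrase1 = shift (\c -> return (c (mother John) && c (mother Mary)))

    -- (mother (shift c (and (c 'john) (c 'mary))))
    phrase2 :: Cont Bool Entity
    phrase2 = mother <$> shift (\c -> return (c John && c Mary))

    -- Plug either phrase into the same (made-up) sentence context:
    -- both runs produce the same truth value.
    inContext :: Cont Bool Entity -> Bool
    inContext phrase = evalCont (reset (fmap sleeps phrase))
      where sleeps e = e /= Mary   -- a hypothetical one-place predicate

    -- inContext phrase1 == inContext phrase2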

> Do continuations help us understand/reason about order of evaluation per se, or do they help us reason about order of evaluation in the context of a compositional semantics?

The latter, and thus the former, no? :)

> In general, should the monad/continuation approach give us insights into natural language meaning (empirically relevant insights), or should it give us insights into how to analyze natural language meaning in a certain way, compositional semantics?

The latter, and thus the former, no? :) :)

> It seems to me that the analogy between computational side effects and “apparently noncompositional” linguistic phenomena could be insightful in the first way, and that these insights would be conceptually prior to any implementation in terms of monads/continuations.

Sure, if I were patenting all this then I would claim an implementation in terms of monads/continuations as “a preferred embodiment” of the analogy.

> Take weak crossover. The insight is that evaluation order matters. You and Ken used continuations to capture this compositionally. But now your account of order-sensitivity is order-independent. This is what we need for traditional, bottom-up compositional semantics. But why do we still want this? Or, what is the status of such a semantics? It is often said that a compositional semantics is necessary for humans to understand novel utterances. But I think it is apparent that weak crossover arises not because humans learn a compositional semantics which simulates order-sensitivity, but rather because of the actual order-sensitivity of processing (we hear from left to right). What kind of semantic system captures this directly?

You probably wouldn’t find any denotational account of evaluation order “direct” in your sense. The “kind of semantic system” you want is probably operational semantics. Indeed, the way I would argue that some denotational semantics accounts for evaluation order is to show that it corresponds to an operational semantics that obviously accounts for evaluation order.
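
(Schematically, and not tied to any particular calculus: a small-step semantics with left-to-right evaluation contexts wears the order on its sleeve, because the reduction sequence itself unfolds from left to right.)

    (f (g 1) (h 2))
    →  (f v1 (h 2))     ; any effects of g happen at this step,
    →  (f v1 v2)        ; before any effects of h
    →  ...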

> Here’s another attempt to put things generally. Is natural language a language with side effects (e.g. C, OCaml), which really does work impurely/noncompositionally, but such that, at a level of abstraction, its behavior can be described compositionally? Or is natural language a pure functional language (e.g. Haskell) which just happens to contain lexical items whose (pure functional) meanings simulate side effects?

> If it is the former, which seems plausible, how does language “really” do what it does?

What’s the difference? In other words, what does “really” mean?

Haskell programs can perform IO and manipulate mutable state just fine. The “semantics/pragmatics interface” (aka “run-time system”) of Haskell lets you define a value “main”, of type “IO ()”, which gets performed when you eventually run the program. It’s like, how does a Hamblin set of alternatives get actually asked?
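
(A two-line illustration of that last point, not from the original message: “main” is an ordinary pure value of type IO (); actually performing it is the run-time system’s job.)

    -- The program merely denotes this action; running the compiled program is what performs it.
    main :: IO ()
    main = putStrLn "Who is John's mother?"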