- Recent Changes 新聞
- History 歷史
- Preferences 喜好
- Discussion 討論
Quines are normally defined as “self-reproducing programs”, or programs that output the copy of themselves: http://www.nyx.net/~gthompso/quine.htm
If we define an interpreter IN
as a function that
takes an encoding ET
of a term t
to
t
’s value, then quine q
is a term such
that its value is its encoding
q = EQ
or,
IN EQ = EQ
from which follows that a quine is a fix-point of the interpreter.
A quine is a fix-point, usually not the least fix-point. Indeed, in Scheme or other CBV languages, the least fix-point of eval is bottom – which is not that interesting.
Let’s consider Scheme first [footnote: our eval
here is (define (our-eval term) (eval term
r5rs-environment))
. Many Scheme systems, e.g., Petite, offer
such a single-argument eval]. The shortest fix-point of
eval
is 1 (the numeral one). Indeed, (eval
1)
is 1. The same holds for other numbers, as well as
characters, strings, and booleans. These are often called
self-evaluating data – the very name self-evaluating suggests they
are the fix-points of eval.
Can we find a fix-point of eval that actually “does something”? The answer comes from the expression for the core of the fix-point combinator (so-called “the omega term”)
((lambda (f) (f f)) (lambda (f) (f f)))
Clearly one step of reducing this expression gives the same expression back, so the evaluation never finishes. We do wish the evaluation finish however, to give us a value to pass to eval (whose result can again be passed to eval, etc). We just need to place quotes in the right places:
(define ff ((lambda (f) `(,f ',f)) '(lambda (f) `(,f ',f))))
Indeed,
> ff
((lambda (f) `(,f ',f)) '(lambda (f) `(,f ',f)))
> (eval ff)
((lambda (f) `(,f ',f)) '(lambda (f) `(,f ',f)))
> (equal? ff (eval ff))
#t
The repeated re-evaluations of ff
are similar to
the stepping through the evaluation of the omega term.
We can re-write ff
using only quasi-quote. That is
not entirely trivial because now we have to mind the levels of
quotation and quasi-quotation. But that expression will be more
useful in other staged languages, which only have quasi-quotation
and which insists on properly matching unquotes with
quasi-quotes.
(define ff1
((lambda (g) `(,g (,`quasiquote ,g)))
`(lambda (g) `(,g (,`quasiquote ,g)))))
> ff1
((lambda (g) `(,g (,`quasiquote ,g)))
`(lambda (g) `(,g (,`quasiquote ,g))))
> (eval ff1)
((lambda (g) `(,g (,`quasiquote ,g)))
`(lambda (g) `(,g (,`quasiquote ,g))))
> (equal? ff1 (eval ff1))
#t
Let us now turn to MetaOCaml, where we immediately run into
problem with types. In MetaOCaml, the evaluator is an operation
.!
of the type ('a,'b) code -> 'b
. Its
fix-point should have 'b = ('a,'b)
code, which
impossible without equi-recursive types. There is the second
problem in the above equation: The evaluator .!
insists that 'a
be a free type variable. We need
first-class polymorphism (to “generate” these free type variables,
informally speaking). We arrive at the first approach. We define an
iso-recursive type with first-class polymorphism
type rc = {rc : 'c. ('c, rc) code};;
and a custom run
function that does appropriate
record unwrapping for us:
let runrc t = .! (t.rc);;
val runrc : rc -> rc = <fun>
Now the type of runrc
is such that it may indeed
have a fix-point.
Our first approach to the fix-point of runrc
uses
the value recursion and cross-stage persistent values (CSP). It is
cheating as we shall see – yet the expression is so simple it is
hard to pass on.
let rec q1 () = {rc = .<q1 ()>.};;
val q1 : unit -> rc = <fun>
Now,
let tq1 = q1 ();;
val tq1 : rc =
{rc = .<(((* cross-stage persistent value (as id: q1) *)) ())>.}
let tq2 = runrc tq1;;
val tq2 : rc =
{rc = .<(((* cross-stage persistent value (as id: q1) *)) ())>.}
etc.
The result shows why using CSP is cheating: the encoding,
tq1
, does not make sense in the empty environment. Our
run-time system should have the custom binding of q1
.
We cannot “pack” tq1
and transmit to another MetaOCaml
system running on a separate host. That system will have no idea
about what q1
is being associated with.
One may informally say that q1 ()
creates an
non-founded code value: code value that refers to itself. That is
reminiscent of an old trick of writing loops in Lisp
interpreter by creating such cyclic code (e.g., in
macros). This technique was popularized by Olin Shivers
Stylish Lisp programming techniques
http://www.ccs.neu.edu/home/shivers/newstyle.html
Technique n: passing circular list structures to the evaluator
We may attempt to port the ff1
function of Scheme
to MetaOCaml. At first, porting seems easy: MetaOCaml’s brackets
are the equivalent of a quasi-quote, and MetaOCaml’s escapes are
like unquote. But we run across two difficulties: we can’t (quasi-)
quote quasiquote in MetaOCaml. Brackets always “take effect” and
can’t be quoted. We can attempt to emulate Scheme’s quasi-quote by
defining a form lift:
let lift body = .<body ()>.;;
But that will only introduce CSP (variable
body
).
This points out the major difficulty with writing quines in
MetaOCaml. Many (most?) approaches to quines rely on being able to
“inline” a value inside quotation marks. For example, if
s:string
is the encoding of some term, we can insert
that s
replacing $x
within the string
"here $x"
and "here \"$x\""
. In fact, it
is this approach that is used when writing quines in C and other
such languages.
Likewise, the first Scheme solution relied on inlining
f
at two places within `(,f ',f)
, the
latter is under the quotation. But in MetaOCaml, we can’t inline
under the brackets without resorting to CSP. If g
is a
code value in MetaOCaml, we can inline it here
(fun g -> .<begin .~g; () end>.) .<1::[]>.;;
but not under brackets here
(fun g -> .<begin .<.~g>.; () end>.) .<1::[]>.;;
(one level of brackets disable escape, giving us a CSP
g
) or here
(fun g -> .<begin .<.~.~g>.; () end>.) .<1::[]>.;;
(type error)
Indeed, the common approach to quine construction relies on the
conversion of a value of the type ('v code)
into the
value of the type (('v code) code)
– or, to persist a
value from one stage onto the latter stage. But the this is the
very definition of CSP.
There might be one way to avoid CSP: that is, writing a code that takes an interpreter. By supplying different interpreters, we can generate different quoted versions of the same code. Here’s a simple illustration:
(* a sample term *)
let t1 = (fun (l,a) -> l (fun f -> a f));;
(* an immediate interpreter *)
let i0 = ((fun x -> x), (fun f -> begin f; f; () end));;
t1 i0;;
- : '_a -> unit = <fun>
(* a lifting interpreter *)
let i1 = ((fun f -> .<fun x -> .~(f .<x>.)>.),
(fun f -> .<begin .~f; .~f; () end>.));;
t1 i1;;
- : ('a, '_b -> unit) code = .<fun x_1 -> x_1; x_1; ()>.
(* a twice lifting interpreter *)
let i2 = ((fun f -> .<.<fun x -> .~.~(f .<.<x>.>.)>.>.),
(fun f -> .<.<begin .~.~f; .~.~f; () end>.>.));;
t1 i2;;
- : ('a, ('b, '_c -> unit) code) code =
.<.<fun x_1 -> .~(.<begin .~(.<x_1>.); .~(.<x_1>.); () end>.)>.>.
.! (t1 i2);; (* is same as t1 i1 *)
To use this approach in writing quines, we need to be generating functions that take interpreters – poly-interpreted terms, so to speak. This approach may indeed work out, but I haven’t pursued it in detail.
We finish with a more straightforward encoding of
ff1
in MetaOCaml, which does, alas, use CSP (as
explained above) – albeit theoretically avoidable CSP:
type af = {af : 'c. ('c, af -> rc) code};;
let t1 =
(fun f -> {rc = .<.~(f.af) f>.}) {af = .<(fun f -> {rc = .<.~(f.af) f>.})>.};;
val t1 : rc =
{rc =
.<((fun f_1 -> {rc = .<((.~(f_1.af)) f_1)>.})
(* cross-stage persistent value (as id: f) *))>.}
let t2 = runrc t1;;
This expression has type ('a, rc) code but is here used with type
('b, rc) code
Exception: Trx.TypeCheckingError.
We get a type-checking error at run-time! Something is odd,
which might be indicative of an issue with MetaOCaml. One can see
however that manually running t1
indeed gives the same
code as t1
…
The cross-stage persistent value af
is the closed
code, which is serializable and so can be “copied” rather than made
available by reference. Mutable values or closures can’t generally
be copied and so it is their reference that cross-stage persists.
For values of primitive or code types and their tuples and records
– values that can be serialized and migrated – one may persist
either the reference or the value itself. In the latter case, we
inline the cross-staged persistent value into the host AST. The
current version of MetaOCaml inlines only integers, booleans,
characters and doubles – but not tuples of integers, for example.
Therefore, the appearance of the CSP in t1
above is an
artefact of the MetaOCaml implementation. That CSP is theoretically
avoidable.