I'm giving a talk on this in May for London Clojurians, and I'll also be talking about it at Lambda Days in Krakow in June: https://www.meetup.com/london-clojurians/events/306843409/
The project was initially inspired by Dr John Sturdy's thesis "A Lisp through the Looking Glass", which is all about interpreter towers and the ability to modify them in both directions.
This is all just for fun; I've yet to think of anything truly useful you can do with this. If you have any cool ideas please let me know, they might even make it into my talk!
jlarocco
In Common Lisp, at least, it's pretty typical for the compiler or interpreter to have packages for interacting with its internals. For example, the `sb-c`, `sb-vm`, and `sb-ext` packages in SBCL, and the `ccl` package in Clozure Common Lisp.
That ability also shows up in the inspector - it's possible to inspect types, classes, functions, etc., and see the internal data structures the compiler uses to represent them.
With SBCL it's even possible to modify the "virtual ops" the compiler uses, and emit assembly code that SBCL doesn't support out of the box.
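For example, a few read-only pokes (my own examples, not anything Autology-specific; output is implementation- and version-dependent):

    ;; See the machine code SBCL compiled for a closure:
    (disassemble (lambda (x) (+ x 1)))

    ;; Walk a function object's internal structure interactively:
    (inspect #'mapcar)

    ;; Ask the compiler what it recorded about a function:
    (require :sb-introspect)
    (sb-introspect:function-lambda-list #'mapcar)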
It's a really convenient feature, and it would be nice to see other languages pick it up.
Thank you! If you have any SBCL-related snippets as well, let me know!
"The Common Lisp Condition System" is a great book, by the way, thanks. :)
abeppu
I haven't read any of the code to see how it compares, but this work reminds me of research from Nada Amin & Tiark Rompf on "towers of interpreters", where a stack of meta-circular interpreters uses some explicit staging operations that make it possible to specialize for the interpreter code in a way that removes overhead.
https://www.cs.purdue.edu/homes/rompf/papers/amin-popl18.pdf
This is just Lisp, or more broadly any "I'm an interpreted language" situation; Python, you know, has access to much of its interpreter runtime too. And if you think you can't do wild stuff in Python, you can (hint: you can use ctypes on libpython to do way more than you think you can).
bandrami
Isn't having access to the interpreter (and scanner, and parser, and compiler) kind of what makes something Lisp rather than something else with a lot of parentheses?
drob518
Typically, Lisp is quite flexible via macros and reader macros, but the smallest core is still fixed in function (the part that implements special forms, order of evaluation, etc.; basically eval and apply). This project makes even that core replaceable on a form-by-form basis. Interesting from a theoretical perspective, but not very useful in practice, and performance-killing, as the author points out.
tines
You can already kind of replace that core on a form-by-form basis in plain Lisp by redefining EVAL and APPLY, and passing whatever forms you do not care about redefining to the original versions of them.
https://flownet.com/ron/lisp/combination-hook.lisp
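A minimal sketch of that delegation pattern (using a wrapper instead of actually clobbering CL:EVAL, since package locks make that implementation-specific; SWAPPED-IF is a made-up form for illustration):

    ;; Keep a handle to the host evaluator and delegate to it by default.
    (defparameter *original-eval* #'eval)

    (defun my-eval (form)
      (if (and (consp form) (eq (first form) 'swapped-if))
          ;; Our one custom form: SWAPPED-IF flips the branches,
          ;; just to show we now control evaluation.
          (destructuring-bind (test then &optional else) (rest form)
            (if (my-eval test)
                (my-eval else)
                (my-eval then)))
          ;; Everything we don't care about goes to the original EVAL.
          (funcall *original-eval* form)))

    ;; (my-eval '(swapped-if (> 1 2) :then :else)) => :THEN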
chabska
That requires dynamic scoping, right? It wouldn't work with lexical scoping.
kazinator
Once upon a time that was true. The original Lisp interpreter could be extended by writing new FEXPR routines, which are given their argument code as source to interpret.
In the FEXPR era, Lisp didn't have lexical scopes, so FEXPR routines did not require an environment parameter.
FEXPRs were eventually abandoned in favor of macros. That was largely due to practical reasons; Lisp people were pressuring themselves or else being pressured to make Lisp perform better. It was not because FEXPRs were exhaustively researched.
A modern treatment of FEXPRs will almost certainly be lexically scoped, so that FEXPRs will have a lexical environment parameter to pass to the recursive eval calls.
Autology makes a different choice here by passing a parameter *i* (implicitly) which is not simply the lexical environment, but an object for interpretation, which can be selectively customized. This makes a categorical difference.
It's immediately obvious to me that this allows possibilities that are impossible under FEXPRs.
Here is why: when a FEXPR recursively evaluates code, whether through the standard eval provided by the host Lisp, or its own custom evaluation routine, it will encounter invocations of other FEXPRs. And those FEXPRs are "sealed off": they are coded to invoke a certain eval and that's it.
Concretely, suppose we are writing an if operator as a FEXPR. We get the if form unevaluated, and destructure it into the pieces: condition, consequent, alternative. What our FEXPR does then is recursively call eval to evaluate the condition. If that is true, we recursively call eval to evaluate consequent, otherwise we call eval on the alternative. (In all eval calls we pass down the lexical environment we were given).
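As a toy sketch of that recipe (my own table-driven evaluator; the host Lisp here has no real FEXPRs):

    ;; Operators in this table get their argument forms unevaluated,
    ;; plus the environment, exactly as described above.
    (defparameter *fexprs* (make-hash-table))

    (defun toy-eval (form env)
      (cond ((symbolp form) (cdr (assoc form env)))  ; variable lookup
            ((atom form) form)                       ; self-evaluating
            (t (funcall (or (gethash (first form) *fexprs*)
                            (error "unknown operator: ~S" (first form)))
                        (rest form) env))))

    ;; IF as a "FEXPR": destructure, then recurse through TOY-EVAL,
    ;; passing the environment down in every call. Note the body is
    ;; hard-wired to TOY-EVAL -- this is the "sealed off" problem.
    (setf (gethash 'my-if *fexprs*)
          (lambda (args env)
            (destructuring-bind (condition consequent &optional alternative) args
              (if (toy-eval condition env)
                  (toy-eval consequent env)
                  (toy-eval alternative env)))))

    ;; (toy-eval '(my-if x 1 2) '((x . t))) => 1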
Now suppose someone is writing a FEXPR in which they want to heavily customize the Lisp dialect with their custom evaluator, custom-eval. Problem is, what happens when custom-eval sees an if form that is programmed using our if FEXPR? It just calls that FEXPR, whose body calls eval; it doesn't know that the code is in a customized dialect which requires custom-eval!
It looks as if the approach in Autology handles (or if not, could easily handle) this situation. The FEXPR which customizes the Lisp dialect can pass down a modified interpreter via *i*. Then when interpretation encounters if, it will call the if FEXPR, and that FEXPR will use the customized interpreter on the condition, consequent, and alternative.
Now (non-hygienic) macros can already support this, in theory. If you've written a DSL as a macro, and in that DSL you choose to expand regular macros defined in the host Lisp, at least some of them with simple behaviors like the if example will also Just Work. This is because macros don't (usually!) call eval, but generate code. That code just ends up being pasted into the expression of whatever dialect is being processed, and will therefore be interpreted as that dialect. A macro arranges for evaluation by inserting the pieces of code into the template such that they are in the right kind of syntactic context to be evaluated. For this to work across dialects, the dialects have to be similar. E.g. an if macro might rewrite to cond. If the target dialect has a sufficiently compatible cond construct, then everything is golden: the if to cond rewrite will work fine.
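For instance, the if-to-cond rewrite sketched above is just:

    ;; No EVAL anywhere: the macro only generates code, and the
    ;; condition/branch forms get pasted into whatever dialect
    ;; later processes the expansion.
    (defmacro my-if (condition consequent &optional alternative)
      `(cond (,condition ,consequent)
             (t ,alternative)))

    ;; (macroexpand-1 '(my-if (> x 0) :pos :neg))
    ;; => (COND ((> X 0) :POS) (T :NEG))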
So one way to look at dynamic interpreter customization is that it brings one aspect of FEXPRs closer to macros. Macros can do some things that FEXPRs cannot, and vice versa.
jdougan
Obligatory reference to John Shutt and Kernel whenever FEXPRs are mentioned.
Link for "Kernel" because it's a lousy name to have to search for: https://web.cs.wpi.edu/~jshutt/kernel.html
Access to eval, plus a generic recursive data type as the internal structure, means the language is closed on itself: it can introspect, manipulate, and create more of itself through the same means it's programmed with (see Norvig's lispy, with Python lists instead of parens).
kazinator
But eval is opaque; it implements a particular dialect and that's what you get when you recurse into it. It looks like Autology provides a way to customize the dialect that eval will process recursively. Interpreter extensions that handle subexpressions via eval will then work in the context of the altered dialect (that being provided by a parent interpreter extension itself).
agumonkey
Fair point, but the usual eval core is so small that I couldn't see it being split apart, and then layering new traits can still be done (still opaque though).
ludston
You can actually get completely stupid with this in Common Lisp using something called a "Reader Macro" which lets you temporarily take complete control over the interpreter.
For example, I have this joke project that defines a DSL for fizzbuzz: https://github.com/DanielKeogh/fizzbuzz
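For a smaller taste of the trick than the fizzbuzz project, a sketch (my own; the #[ ... ] syntax is invented for illustration):

    ;; Install a handler on the #[ dispatch sequence; until the
    ;; matching ], *we* are the reader, not Lisp.
    (set-dispatch-macro-character
     #\# #\[
     (lambda (stream subchar arg)
       (declare (ignore subchar arg))
       (coerce (loop for ch = (read-char stream)
                     until (char= ch #\])
                     collect ch)
               'string)))

    ;; After this, #[hello world] reads as the string "hello world".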
Since it's written in Clojure, which runs on the JVM, why not a *jvm* variable also? ;)
sargstuff
Lisp take on generating an embedded VM at runtime / on an ad hoc basis (vs. polymorphic coding)?
How would using the most recent 'trie of lisp git diffs' differ from a command to do a system-call fork of lisp / duplicate lisp lists / apply 'diffs' to the forked lisp?
Perhaps, lisp concurrency support without a system()/debugger. [1]
(humor) Too bad iLisp would be confused with other brand name(s).
[1] Evil Scheduler: Mastering Concurrency Through Interactive Debugging: http://aoli.al/blogs/deadlock-empire/
So the best thing I could come up with (assuming a theoretical perfect implementation that wasn't horribly slow and expensive) was the idea that since many languages have their own niches and categories of problems that they are best used to express/solve, you could conceivably work on a complex problem where different parts of the computation are most elegantly expressed in different languages. So having a metalanguage that allows you to switch syntax in a lexical scope would be useful!?
Highly doubtful that this benefit would outweigh the egregious complexity cost of using a language like Autology, but a fun thought experiment.
johnisgood
It is definitely worth pursuing. :)
jampekka
Calling code across languages at least is very widespread. And it typically requires custom, often clunky, solutions for each language pair.
johnisgood
So is this like a "better FFI", in some sense of the term?
immibis
Looks like just automated FFI, putting some glue around your function and then invoking the appropriate compiler for that language.
I don't really see the difference between (with-*i* "c" foo) and a macro that expands to (with-c "foo")
souenzzo
It's awesome to see the amount of ideas you can explore with ~1k lines of Clojure!
Cieric
I was working on something similar to this a while ago, but the goal of it was experimenting with what a language that could modify its own syntax would look like. I was going to write something as basic as possible and then have example scripts transforming the syntax into other languages. I was probably just going to do Brainfuck and C in the end, but I wanted something like that to be possible. I couldn't figure out how to make a language modify a tree structure for tweaking the AST of the interpreter, but I guess Lisp fits the bill there.
stevekemp
You could be inspired by FORTH and not have an AST at all...
rpcope1
This is basically just Forth or Factor.
timonoko
Mind blown when you write a better language using the language itself. Why did I waste so much time in assembly?
Then you remember there are certain restrictions with regard to time travel.
lifthrasiir
I think CosmicOS had the same idea [1] to seamlessly introduce special forms without defining the implicit interpreter in advance.
[1] https://cosmicos.github.io/message.html#section18
What is "" and so forth? It does not even render properly here on HN after having copy pasted it from the website. Custom font or what? I will check later if no one replies.
lifthrasiir
The CosmicOS message has multiple possible encodings, including the base-4 representations shown below (e.g. the message starts with 121001031211132233...). The very first page contains glyphs giving their compact graphic representations. I think there was some description of these, but I couldn't find it, so I will just recall my understanding here. The base-4 representation uses four symbols 0-3, which are derived from textual tokens:
---
12 (0|1)+ 3 -- Names. A particular name `xxx` is only available after the `intro xxx;` declaration. Note that the name `intro` itself is the first token ever and never explicitly defined.
02 (0|1)+ 3 -- Binary numbers. Note that unary numbers used initially are encoded using a special function "unary" with ordinary binary numbers 0 (0203) and 1 (0213) as bits.
2 without preceding 0 or 1 -- Opening parenthesis.
3 without preceding 0 or 1 -- Closing parenthesis.
023 -- Auto-closing opening parenthesis, denoted `|` in the textual form. `(a | b | c d)` is the same as `(a (b (c d)))`.
123 -- Single-item parentheses, denoted `$` in the textual form. `$x` means `(x)` and is typically used in variable getters.
2233 -- Marks the end of the current expression. Each expression in the CosmicOS message should evaluate to true.
---
Now, keep in mind that every name is effectively a specially marked number. I think each line segment in glyphs corresponds to a particular bit position, and numbers have short overline and underline to distinguish themselves from names.
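A rough tokenizer for that base-4 scheme as I understand it (my own sketch, not official CosmicOS tooling; the 2233 end marker is left to a later parsing pass, where it shows up as two opens and two closes):

    ;; Walk a base-4 digit string and emit tokens per the rules above.
    (defun tokenize-cosmic (digits)
      (let ((i 0) (tokens '()))
        (flet ((next () (prog1 (char digits i) (incf i))))
          (loop while (< i (length digits))
                do (let ((c (next)))
                     (case c
                       ((#\0 #\1)       ; 02...3 = number, 12...3 = name
                        (next)          ; consume the 2 after the tag digit
                        (let ((bits (loop until (char= (char digits i) #\3)
                                          collect (next))))
                          (next)        ; consume the terminating 3
                          (push (cond ((null bits)  ; 023 = |, 123 = $
                                       (if (char= c #\1) :dollar :bar))
                                      ((char= c #\1)
                                       (list :name (coerce bits 'string)))
                                      (t
                                       (list :number (coerce bits 'string))))
                                tokens)))
                       (#\2 (push :open tokens))      ; bare 2 = (
                       (#\3 (push :close tokens)))))) ; bare 3 = )
        (nreverse tokens)))

    ;; (tokenize-cosmic "121001031211132233")
    ;; => ((:NAME "10010") (:NAME "111") :OPEN :OPEN :CLOSE :CLOSE)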
One interesting bit that remains unappreciated about Kernel is that environments are reified and are "copy-on-write". So, if you make an environment change, it propagates forward, but previous calls only see the previous environment unless you explicitly give them access.
It's a little unusual, but it works out pretty well. You can call the "ground environment" which is immutable and lets you compile things. You can access the "lexical environment" which works like normal. And you can call a "dynamic environment" that lets you capture things.
It's interesting in that it contains the scope of "dynamic environments" and still allows things to be compiled (which was the big downside of dynamic environments circa 1970s).
One downside I found is that the obvious implementation thrashes the hell out of your garbage collector. "Environments" really want a data structure that is more complicated and copy-friendly than "cons pairs".
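For concreteness, the naive cons-pair layout of those copy-on-write environments (my sketch), where "mutation" is just a new frame sharing the old tail; every extension allocates, which is exactly the GC pressure described:

    ;; Extending never touches existing structure, so anyone holding
    ;; the old environment keeps seeing the old bindings.
    (defun env-extend (name value env)
      (acons name value env))

    (defun env-lookup (name env)
      (let ((hit (assoc name env)))
        (if hit (cdr hit) (error "unbound: ~S" name))))

    ;; (let* ((e1 (env-extend 'x 1 '()))
    ;;        (e2 (env-extend 'x 2 e1)))  ; "modifies" x; e1 unaffected
    ;;   (list (env-lookup 'x e1) (env-lookup 'x e2)))
    ;; => (1 2)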
shadowgovt
Interesting!
Racket has the capacity to declare DSLs that are then applied at the file level, but this is even finer-grained.
behnamoh
So it's hot code reloading on steroids? Really neat idea; I'm working on a similar Lisp-style language and will probably adopt this idea (and cite you, of course). In my language, you can redefine any symbol, even numbers, so (def 10 12) is valid code (so is (def def 42), which breaks `def`!).
I wonder what use cases there are for such extreme flexibility, aside from fun and games!
v9v
Redefining 'def' is a trick used in Tcl to make "safe" configuration languages exposed to the end user. You define all of your configuration options as functions and then undefine everything else (including "proc", which corresponds to defun), allowing you to evaluate users' config files as code with this safe interpreter.
I can't find a good source for this practice of manually undefining functions, but here is a link to the relevant built-in functionality: https://www.tcl-lang.org/man/tcl8.6/TclCmd/safe.htm
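The same shape sketched in Lisp terms (SET-TITLE and SET-THEME are made-up config options): rather than undefining things, only whitelisted operators ever get applied, so the user's config can't reach the rest of the language:

    ;; Hypothetical configuration vocabulary.
    (defun set-title (s) (format t "title: ~A~%" s))
    (defun set-theme (s) (format t "theme: ~A~%" s))

    (defparameter *allowed* '(set-title set-theme))

    ;; Evaluate config forms with a closed-world interpreter:
    ;; anything outside the whitelist is rejected outright.
    (defun eval-config (forms)
      (dolist (form forms)
        (if (and (consp form) (member (first form) *allowed*))
            ;; Arguments are taken as literals -- no nested evaluation.
            (apply (first form) (rest form))
            (error "not allowed in config: ~S" form))))

    ;; (eval-config '((set-title "hi") (set-theme "dark")))  ; ok
    ;; (eval-config '((delete-file "/tmp/x")))               ; => error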