Skip to content

Qi Meeting Jun 7 2024

Siddhartha Kasivajhula edited this page Nov 7, 2024 · 2 revisions

Another Step on a Winding Staircase

Qi Meeting Jun 7 2024

Adjacent meetings: Previous | Up | Next [missing notes: Jun 14]

Summary

We discussed Dominik's progress on the "state consing" approach to deforesting stateful transformers like take. We started to "continuously deform" the structure of pattern matching in deforestation to extract the matching logic to the expander layer. We discussed what alternative backends we should consider implementing for Qi, and caught up on news of the latest rabbit holes.

Background

Our initial implementation of deforestation in Qi 4 did not include any notion of "state" in transformers (such as map and filter) as we hadn't encountered any up to that point that were stateful. As take involves state (it must somehow remember how many elements have been taken so far), it requires a modified approach. After discarding the initial approach of using mutation, we felt that state consing (similar to Haskell's state tupling) would be a promising approach to try.

Last time, we discussed the potential for various problems arising from doing literal matching in the compiler on unvalidated (host language) syntax, and agreed on a solution.

State Consing

Dominik showed a preliminary implementation of state consing for the take transformer which resembles the original CPS transformer implementation, but includes "inline consing," analogous to the inline-compose macro that serves a similar purpose for concatenating stream components without an explicit composition. We discussed that Haskell's "state tupling" is somewhat different in that it includes a primitive stream and "conses up" to create a bigger state (TODO: clarify). We felt that it was a more generic approach, but not one that is necessitated by the data we've seen so far.

We found a couple of bugs in edge cases and wrote regression tests to validate them, and then wrote fixes and validated them using the regression tests.

Currently it appears that the stream would only be augmented with state if there is actually a take present. But we wondered what the cost would be in adding take in the case where it wouldn't affect the result. For instance:

(~>> (range 10) (filter odd?) (take 5))

vs

(~>> (range 10) (filter odd?))

Here, the result is the same in either case since there are only five elements in the result. We felt that we would need to benchmark these against one another (with much larger lists, probably) to measure the impact of annotating with state.

Sound Optimizations

We considered how to go about implementing the remedy we'd decided on to avoid matching host language syntax in optimizations.

Topological Refactoring

In general, there are many ways to refactor code. One way would be to just get deep into the guts of the implementation and make all the changes we see as necessary, changing things to their intended final form with abandon, and then testing things until they work as expected.

While this way is the most direct and could therefore in principle be fastest, it is also the most risky. If things don't work the first time, the scope of the error is the entire set of changes, and error messages may not easily identify the cause. It could end up taking much longer to debug as a result. Intuitively, this path is short but it is dynamically unstable.

Another way is to "continuously deform" the implementation by making small sets of changes that are each self-consistent, and which can each therefore be validated incrementally by the tests. This deforms the existing implementation to the desired one without introducing too much complexity at any stage through a series of fully-functional intermediate stages.

We felt that this second way would be prudent to follow in implementing the solution.

Step 1: Match Qi Bindings Instead of Racket Bindings

Currently, as we noted last time, the following happens in production:

#lang racket

(require qi
         qi/list)

(define (filter f lst)
  'hello)

(~>> () (range 1000) (filter odd?) (map sqr) (foldl + 0)) ;=> 166666500 (!)

The essential problem here is that in applying optimizations in the compiler, we should be doing datum matching of core language syntax which has been validated by the expander. Instead, as the core language is not rich enough to express the optimization we seek to perform (deforestation), we are forced here to match host language syntax where datum matching is insufficient as the syntax has not been validated.

Thus, we are tempted to use literal matching to remedy this, when the real problem is the limitation of the core language.

For now, the in-between step we identified here is to match host language syntax literally, but at least, ensure that these are bindings provided by Qi rather than by Racket.

We accomplished this by simply defining bindings in Qi such as map and filter that refer to the corresponding Racket ones. Although they ultimately refer to the same functions, they are still treated as distinct literal bindings by the Racket expander, so that serves our needs well, to decouple Qi's deforestation from racket/list and contain it within qi/list.

Step 2: Add a New Core Form for Deforestation

The next step is to enrich the core language so that it can express deforestation.

To do this, we will need to add a syntax production (i.e. core language form), but in addition, we will also need to rewrite uses of deforestable bindings to uses of this new core form, requiring a rewrite production.

Something like:

    (#%deforest (f arg ...))
    (~> (f:deforestable arg ...)
        #'(#%deforest (f ...)))

We could have the deforestable pattern matching class reject all syntax initially, and the code generation for the #%deforest form could be left unimplemented (or could compile to a runtime error if necessary).

Step 3: Move Validation to the Expander Layer

At this point, we are still doing literal matching in the compiler, but at least we are matching native bindings. Additionally, the new #%deforest form should allow the core language to express the optimizations we want, but it is still unused, and most of the validation logic still remains in the compiler, annotating the input syntax with prepare properties and so on.

So the next step would be to move this validation up to the expander via the new core form, while keeping the logic the same.

This change simplifies the logic in the compiler as it does not need to do any validation or other preparation of the syntax prior to applying the optimization anymore, as all that is moved up to the expander. Additionally, the compiler can now safely use datum literal matching on the core language.

Of course, we would still be doing literal matching in the expander, which brings us to the next step.

Step 4: Add Macros for Extending Deforestation

Before we fully incorporated Syntax Spec into Qi's implementation, we mimicked its implementation of macro-extensibility inside the Qi codebase. We can follow the same approach here to implement extensibility of deforestation. The exact interface remains to be defined, but it could be something like the following.

First, a new module for the purpose would define a compile-time datatype, say, (struct deforestable-component [done skip yield]). This would be provided publicly by qi-lib.

Then, we would define macros resembling define-qi-deforestable-transformer, define-qi-deforestable-producer, etc. that allow deforestation to be extended. These macros would produce instances of the deforestable-component datatype, and include all the necessary information, such as done, skip, and yield implementations in the respective attributes of this object.

In the expander, the deforestable syntax class could now be modified so that the matching criteria is (deforestable-component? (syntax-local-value stx)) and this eliminates the literal matching. We could also potentially offload much of the logic from this syntax class to the extension macros themselves so that we don't have to infer anything and it can just be included in the input syntax already by the extension macro that's used (e.g. define-qi-deforestable-transformer vs -consumer).

There shouldn't be any changes required in the compiler at this stage, unless the transition to using the deforestable-component datatype implies some simplifications to the implementation somehow.

The former implementation of macro-extensibility in Qi may be a useful reference to follow.

Lastly, given the analogy to macro-extensibility, it's worth considering whether Syntax Spec could similarly abstract any part of this for extending DSL semantics in general.

Step 5: Extract qi-list as a Separate Package

Finally, at this point, the core language includes only the abstract deforestation primitives, and it includes everything that is necessary for deforestation of the list APIs in racket/list, by design.

So we can now reimplement all of the qi/list bindings that were defined in Step 1 using these new extension macros.

This allows qi/list (analogous to racket/list) to be made up of extensions only, and thus, it could be safely removed from the compiler in qi-lib to a new qi-list package. This package would be included in the qi package (e.g. raco pkg install qi would include qi-list) but not in qi-lib.

qi-list would depend on qi-lib, so any utilities defined in the core library would be available to it, which should mean that qi-lib could still include a handful of deforestable forms for lists such as range, map, filter and foldl/foldr, in the same way that it includes built-in macros such as switch, while more specialized extensions could be housed in qi-list without duplication.

Documenting qi/list Bindings

As qi/list has different semantics than racket/list (i.e. it is deforested and has a different order of effects), these bindings need to be documented somewhere.

We considered autogenerating the docs from the source using scribble/srcdoc, but decided against it since it would introduce a dependency on Scribble into qi-lib.

We will just document it in qi-doc for now, and later move it depending on what we do with qi-list. For example, if we follow Step 5 mentioned earlier so that it is a separate package, then the docs for this package would be cross-linked from the main Qi docs.

Other Backends for Qi

We discussed writing a proof of concept compiling Qi to another backend besides ordinary functions, and considered which backends would be worthwhile to explore.

Places and Futures

We felt that unless we imagine Qi as a language for writing operating systems, we don't want to go down to abstraction levels as low as OS threads, green threads, etc., and probably just a futures backend, or a places backend, should be fine. It's likely that using futures or places would suffice in settings where we are likely to use Qi to orchestrate distributed computations at least on a local machine.

High Performance Computing

We also talked about High Performance Computing in a distributed setting. In general, writing synchronization primitives (e.g. for composing increments, such as a compare-and-swap operation) for large architectures is hard, and it seems doubtful that compiling to futures or places alone would allow the implementation to scale to HPC settings. Then again, Qi's theory of effects favors local effects and avoiding the use of shared memory in complicated ways, so it may prove to scale well if problems can be expressed in this way. Additionally, nonlocal effects may still be OK if they are commutative (as in the case of increments).

While every algorithm may not scale using Qi in an HPC setting, it's conceivable that some can be redesigned (e.g. in relation to well-ordering and effect locality) so that they would.

Shell Scripting

Qi could also be good as a shell scripting language as an alternative to Perl or Unix shell pipelines.

Rabbit Holes

We (ahem, some of us more than others) are perpetually in unstable equilibria navigating N-dimensional rabbit hole space (an N-d Pantůček manifold). The latest rabbit hole from which Dominik has emerged is writing an object system purely in R5RS, despite having been sagely dissuaded from this quest by Sam TH and many others. Going down rabbit holes is easy (that's their very nature) but consistently emerging from them unscathed and with something to show for it, that's talent! (or maybe insanity).

Next Steps

(Some of these are carried over from last time)

  • Continue "deforming" to sound and extensible deforestation.
  • Decide on the syntax of Qi's map, filter, etc.
  • Deforest other racket/list APIs via qi/list
  • Revisit extending deforestation to multi-valued settings.
  • Write a proof-of-concept compiling Qi to another backend (such as threads or futures), and document the recipe for doing this.
  • Ask Ben to review tests from the perspective of Frosthaven Manager.
  • Review Cover's methodology for checking coverage.
  • Document the release model in the user docs and announce the performance regression (and the remedy) to users.
  • Improve unit testing infrastructure for deforestation.
  • Discuss and work out Qi's theory of effects and merge the corresponding PR.
  • Decide on appropriate reference implementations to use for comparison in the new benchmarks report and add them.
  • Decide on whether there will be any deforestation in the Qi core, upon (require qi) (without (require qi/list))
  • Continue investigating options to preserve or synthesize the appropriate source syntax through expansion for blame purposes.

Attendees

Dominik, Hendrik, Michael, Sid

Clone this wiki locally