💦 clive: distributed spreadsheets of code+data

On a journey of self-pity and self-indulgence, I have spent the last 5+ years obsessing about how to improve productivity and comfort for software and reliability engineering. I have explored systems like Git and Nix, various distributed stores, artifact repositories, and build/test/package/deployment systems for a variety of languages. I built, maintained and/or operated dozens of tools and infrastructure components, including a distributed CI for distributed systems and a monorepo environment with IDE and cloud integration. This is the one pragmatic step I could conceive to jump-start a revolution in the user experience of technology workers: a solid, scalable, distributed architecture for spreadsheets of code+data.

Platform

Architecture diagram

A ❤Nix-like system turned reactive and implemented through a launcher/manager/client CLI (Go) and a service (JVM).
Mostly Kotlin instead of the Nix language and a tall layer of C++.
jgit on top of Xodus as an object and “filesystem” store, instead of the Nix store on top of the POSIX filesystem.
Live reactions instead of individual derivations.
High-level abstractions for code generation, builds, tests, asset pipelines, service and task execution, deployments, etc.
SSH-based communication with Git-based data propagation. Maximized interoperability with IntelliJ IDEA.
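
For readers who have not used Nix, the model this list compares against can be sketched in a few lines: a build step is described purely by its inputs, and its output is stored under a hash of those inputs, so an unchanged step is never rebuilt. This is only an illustration of that idea, not clive's or Nix's actual code; `Derivation`, `Store`, `keyOf` and `realise` are hypothetical names.

```kotlin
import java.security.MessageDigest

// Hypothetical sketch: Derivation, Store, keyOf and realise are illustrative names.

/** A build step described only by its inputs: a name, a recipe, and the keys of its dependencies. */
data class Derivation(val name: String, val recipe: String, val inputs: List<String> = emptyList())

/** An in-memory stand-in for a store that is addressed by a hash of the build inputs. */
class Store {
    private val outputs = mutableMapOf<String, String>()

    /** Number of times a build function actually ran. */
    var buildCount = 0
        private set

    /** SHA-256 (hex) over everything that is allowed to influence the output. */
    fun keyOf(d: Derivation): String {
        val material = (listOf(d.name, d.recipe) + d.inputs).joinToString("\u0000")
        val hash = MessageDigest.getInstance("SHA-256").digest(material.toByteArray(Charsets.UTF_8))
        return hash.joinToString("") { "%02x".format(it) }
    }

    /** Returns the stored output for this derivation, building it only on a cache miss. */
    fun realise(d: Derivation, build: (Derivation) -> String): String =
        outputs.getOrPut(keyOf(d)) {
            buildCount++
            build(d)
        }
}

fun main() {
    val store = Store()
    val lib = Derivation("lib", "kotlinc lib.kt")
    // Depending on lib's key means any change to lib also changes app's key.
    val app = Derivation("app", "kotlinc app.kt", inputs = listOf(store.keyOf(lib)))

    store.realise(lib) { "lib.jar" }
    store.realise(app) { "app.jar" }
    store.realise(app) { "app.jar" } // same key as before, so the build function is skipped
    println("builds performed: ${store.buildCount}")
}
```

In this sketch nothing happens until somebody calls `realise`. The "live reactions" item replaces that pull step with changes that push themselves to whatever consumes them.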

Steps to self-hosting

  1. Implement all the non-unicorn bits in the architecture diagram. Now everything happens in Kotlin.
  2. Implement a self-reproducing unicorn. Now we have a living and breathing dev toolchain unicorn.
  3. Evolve the unicorn so it can build and publish the non-unicorn bits. Now anybody can adopt a baby unicorn and live in a harmonious relationship with it.

Unicorn design draft

As the crazy in my head crystallises,
everything that follows needs rewriting over better-defined abstractions.
Abstractions and their layering are emerging as follows:
- A Git over Xodus database.
- On top, an auto-reloading classloader with precise invalidation tracking
  (not yet loaded ⇒ no reload, possibly in-place bytecode replacements).
- A functional object model, fully serializable and persistable in the database
  using Kryo, where entities can exist and evolve as references.
  Entities are maintained by arbitrary code through Git references and can contain
  full object graphs which can refer to other live references. Some references
  propagate their changes back to any consuming code.
  The full Git reference store is ACID.
  In that functional object model:
  - Trust/delegation models using SSH keypairs.
  - On top of said trust/delegation model, an execution dispatch model across services,
    including through P2P / clusters / cloud providers or config GUIs on laptops.
  - A distributed execution engine, which continuously rebuilds parts of the model
    as changes to their inputs are propagated.
    That execution engine could, for example:
    - Start and stop other execution engines from anywhere in any DB over time,
      including as the result of tasks from other models,
    - Watch local filesystems and propagate changes into the object model,
    - Stream parts of the object model back to local filesystems,
      or various external services as they change,
    - Run arbitrary services, recurring tasks, etc.
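
The two ideas that recur in this list, references that propagate their changes to consuming code and an engine that rebuilds parts of the model when inputs change, can be sketched in memory as follows. This is a minimal sketch under heavy assumptions: `Ref`, `Derived`, `update` and `onChange` are hypothetical names, there is no Git, Xodus, Kryo or distribution involved, and a value with two paths to the same input would be recomputed once per path.

```kotlin
// Hypothetical sketch: Ref, Derived and their members are illustrative names, not clive's API.

/** A reference to an immutable value; consumers subscribe to be told when it changes. */
open class Ref<T>(initial: T) {
    private val listeners = mutableListOf<() -> Unit>()

    var value: T = initial
        private set

    /** Replaces the value and notifies consumers, unless the value is unchanged. */
    fun update(newValue: T) {
        if (newValue == value) return // unchanged input: nothing to propagate
        value = newValue
        listeners.toList().forEach { it() }
    }

    fun onChange(listener: () -> Unit) {
        listeners += listener
    }
}

/** A reference whose value is recomputed whenever one of its inputs changes. */
class Derived<T>(vararg inputs: Ref<*>, private val compute: () -> T) : Ref<T>(compute()) {
    /** Number of recomputations triggered by input changes (the initial computation is not counted). */
    var recomputations = 0
        private set

    init {
        inputs.forEach { input ->
            input.onChange {
                recomputations++
                update(compute())
            }
        }
    }
}

fun main() {
    val sources = Ref(listOf("Main.kt"))
    val flags = Ref("-O1")
    val buildPlan = Derived(sources, flags) {
        "compile ${sources.value.joinToString(" ")} ${flags.value}"
    }

    println(buildPlan.value)
    sources.update(sources.value + "Util.kt") // pushes a recomputation into buildPlan
    println(buildPlan.value)
}
```

Because a `Derived` is itself a `Ref`, derived values can feed other derived values, which is the spreadsheet behaviour the project name alludes to.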

Outdated notes

Example of model declaration code: /demo/server/clive.kt

// We parse imports to figure out which modules to bring onto the classpath.
// For example, `import thirdParty.ratpack` loads /thirdParty/clive.kt
// (or /thirdParty/ratpack/clive.kt if it exists).

import clive.model.jvm.artifacts.docker
import clive.model.jvm.artifacts.macosApp
import clive.model.jvm.artifacts.windowsSelfExtract
import clive.model.jvm.Artifacts // assumed package for the Artifacts builder used below
import clive.model.jvm.Dependencies
import clive.model.jvm.JVM
import clive.model.jvm.JVMArgs
import clive.model.kotlin.backendModule
import thirdParty.ratpack

backendModule {
  dependencies = Dependencies {
    compile {
      ratpack.classes
    }
    runtime {
      ratpack.transitive.exclude { thirdParty.hadoop }
    }
  }

  jvm = JVM {
    args = JVMArgs {
      memory = 16.GB
    }
  }

  artifacts = Artifacts {
    docker
    windowsSelfExtract
    macosApp(jvm = this@backendModule.jvm.copy(provider = JVM.Provider.Oracle))
  }
}
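
The import-to-module lookup described in the comment at the top of this example could look roughly like the sketch below. It generalises the one documented case (`thirdParty.ratpack` resolving to /thirdParty/ratpack/clive.kt if it exists, otherwise /thirdParty/clive.kt) into "most specific directory first", which is my assumption rather than a documented rule. `parseImports`, `candidateFiles` and `resolveModuleFile` are hypothetical names, and the `exists` predicate stands in for a lookup in the object store.

```kotlin
// Hypothetical sketch: all function names here are illustrative only.

// Matches `import a.b.c`, ignoring any trailing `.*` or `as alias`.
private val importLine = Regex("""^\s*import\s+([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)""")

/** Extracts the dotted names from the `import` lines of a Kotlin source text. */
fun parseImports(source: String): List<String> =
    source.lineSequence()
        .mapNotNull { line -> importLine.find(line)?.groupValues?.get(1) }
        .toList()

/**
 * Lists candidate model files for a dotted import, most specific first:
 * thirdParty.ratpack -> /thirdParty/ratpack/clive.kt, then /thirdParty/clive.kt.
 */
fun candidateFiles(importName: String): List<String> {
    val segments = importName.split('.')
    return segments.indices.reversed().map { i ->
        "/" + segments.take(i + 1).joinToString("/") + "/clive.kt"
    }
}

/** Returns the first candidate that exists, or null when the import maps to no model file. */
fun resolveModuleFile(importName: String, exists: (String) -> Boolean): String? =
    candidateFiles(importName).firstOrNull(exists)

fun main() {
    val source = """
        import clive.model.kotlin.backendModule
        import thirdParty.ratpack
    """.trimIndent()

    // Stand-in for the repository contents.
    val repo = setOf("/thirdParty/clive.kt")

    for (name in parseImports(source)) {
        println("$name -> ${resolveModuleFile(name) { it in repo }}")
    }
}
```

Imports that resolve to no file come back as `null` here. How built-in `clive.model.*` imports are actually treated is not specified in these notes.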