💦 clive
: distributed spreadsheets of code+data
On a journey of self-pity and self-indulgence, I have spent the last 5+
years obsessing about how to improve productivity and comfort for software
and reliability engineering. I have explored systems like git or Nix,
various distributed stores, artifact repositories,
build/test/package/deployment systems for a variety of languages. I built,
maintained and/or operated dozens of tools and infrastructure components,
built a distributed CI for distributed systems & a monorepo
environment with IDE and cloud integration. This is the one pragmatic step
I could conceive of to jump-start a revolution in the user experience of
technology workers: a solid, scalable, distributed architecture for
spreadsheets of code+data.
Platform
A ❤ Nix-like system turned reactive
and implemented through a launcher/manager/client CLI (Go) and a service
(JVM).
Mostly Kotlin instead of the Nix language and a tall layer of C++.
jgit on top of
Xodus as an object and
“filesystem” store, instead of the Nix store on top of the POSIX
filesystem.
Live reactions instead of individual derivations.
High-level abstractions for code generation, builds, tests, assets
pipelines, service and task execution, deployments, etc.
SSH-based communication with Git-based data propagation. Maximized
interoperability with IntelliJ IDEA.
Steps to self-hosting
-
Implement all the non-unicorn bits in the architecture diagram. Now
everything happens in Kotlin.
-
Implement a self-reproducing unicorn. Now we have a living and breathing
dev toolchain unicorn.
-
Evolve the unicorn so it can build and publish the non-unicorn bits. Now
anybody can adopt a baby unicorn and live in a harmonious relationship.
Unicorn design draft
Everything that follows needs rewriting over better-defined abstractions.
Abstractions and their layering are emerging as follows:
- A Git over Xodus database.
- On top, an auto-reloading classloader, with precise invalidation tracking
(not yet loaded ⇒ no reload, possibly in-place bytecode replacement).
- A functional object model, fully serializable and persistable in the database
using Kryo, where entities can exist and evolve as references.
Entities are maintained by arbitrary code through Git references and can contain
full object graphs which can refer to other live references. Some references
propagate their changes back to any consuming code.
The full Git reference store is ACID.
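The entity model above can be sketched in a few lines. This is a hypothetical toy, not Clive's API: real entities are full object graphs serialized with Kryo into jgit over Xodus, whereas here blobs are plain strings, and every name (`ObjectStore`, `updateRef`) is made up for illustration.

```kotlin
import java.security.MessageDigest

// Toy sketch of the entity model: immutable objects keyed by content hash,
// plus mutable named references that move as entities evolve, mirroring Git.
class ObjectStore {
    private val blobs = mutableMapOf<String, String>()   // content hash -> immutable content
    private val refs = mutableMapOf<String, String>()    // ref name -> content hash

    // Storing the same content twice yields the same key (idempotent writes).
    fun put(content: String): String {
        val hash = MessageDigest.getInstance("SHA-256")
            .digest(content.toByteArray())
            .joinToString("") { "%02x".format(it) }
        blobs[hash] = content
        return hash
    }

    fun get(hash: String): String = blobs.getValue(hash)

    // A ref gives an entity a stable identity while its content evolves.
    fun updateRef(name: String, hash: String) { refs[name] = hash }
    fun resolve(name: String): String = get(refs.getValue(name))
}
```

Because objects are content-addressed, caching and distribution come for free: identical contents collapse to one blob, and moving a ref is the only mutation.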
In that functional project model:
- Trust/delegation models using SSH keypairs.
- On top of said trust/delegation model, an execution dispatch model across services,
including through P2P / clusters / cloud providers or config GUIs on laptops.
- A distributed execution engine, which continuously rebuilds parts of the model
as changes to their inputs are propagated.
Amongst other tasks, that execution engine could, for example:
- Start and stop other execution engines from anywhere in any DB over time,
including as the result of tasks from other models,
- Watch local filesystems and propagate changes into the object model,
- Stream parts of the object model back to local filesystems,
or various external services as they change,
- Run arbitrary services, recurring tasks, etc.
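The "continuously rebuilds parts of the model as inputs change" behaviour can be illustrated with a minimal in-memory sketch, assuming a single process and string-valued nodes; none of these names come from Clive, and the real engine is distributed and persistent.

```kotlin
// Toy reactive build graph: derived nodes recompute only when a transitive
// input actually changed; unchanged reads are served from cached values.
class ReactiveGraph {
    private val values = mutableMapOf<String, String>()            // current value per node
    private val deps = mutableMapOf<String, List<String>>()        // derived node -> inputs
    private val rules = mutableMapOf<String, (List<String>) -> String>()
    private val dirty = mutableSetOf<String>()
    var rebuilds = 0                                               // how many rule evaluations ran
        private set

    fun setInput(id: String, value: String) {
        if (values[id] == value) return                            // unchanged: nothing to invalidate
        values[id] = value
        invalidate(id)
    }

    fun derive(id: String, inputs: List<String>, rule: (List<String>) -> String) {
        deps[id] = inputs
        rules[id] = rule
        dirty += id
    }

    // Mark every node that transitively depends on `id` as needing a rebuild.
    private fun invalidate(id: String) {
        for ((node, inputs) in deps) {
            if (id in inputs && dirty.add(node)) invalidate(node)
        }
    }

    fun get(id: String): String {
        if (id in dirty) {
            values[id] = rules.getValue(id)(deps.getValue(id).map { get(it) })
            rebuilds++
            dirty -= id
        }
        return values.getValue(id)
    }
}
```

A `src → classes → jar` chain, for example, rebuilds both derived nodes on the first read, zero nodes on a repeated read, and both again only after the input changes.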
Outdated notes
What I think an adult unicorn looks like
Peaceful, specialized, living in big herds bound in harmony by
functional dynamic networks of trust
-
Clive is a distributed ecosystem of expressions for a Redux-like
architecture (where the resolvers are replaced with Kotlin code built
on top of layers of higher-and-higher level APIs) + React-like
architecture (where the virtual DOM is replaced with Git objects and
trees). After bootstrapping or seeding from an existing workspace, one
drives and they feed each other. During those reaction chains, both
arbitrary computations (such as code generation, compilation,
packaging, code relocation, arbitrary execution all interleaved for
environments like TypeScript where needed type definitions pour in and
compilation doesn't depend on the source code itself) and their
results can be cached, persisted and distributed programmatically.
-
Any directory with a Clive.toml becomes a Clive workspace. A living and
breathing workspace started by a simple command line tool with zero
dependencies, maybe starting from a $vcs clone and co-existing alongside
it (the only required integration being to ignore .clive/).
-
In that workspace, remote and local code and data materialize by
purpose; they can be modified, replaced and fetched, evaluated,
propagated on the fly. For example, one might want to start modifying
just a couple of files in a published module, or fully replace it with a
local checkout. Maybe they'd rather read some pre-built version from
their vendor's store. Arbitrary local checkouts, arbitrarily attached to
the workspace, can be manipulated with tools like version control
systems, git included, without any disruption to the platform's live
operation.
-
As anything changes, whether the current list of targets for the
service (live-deploy this backend, run all the client tests against a
local copy of this backend, etc.) or any of their transitive source
materials (including local files monitored on the fly), all the
corresponding state derivations and side-effects can be applied as fast
as possible, either by reading a local or remote cache (git over HTTPS
and/or SSH), or by computing locally or remotely (remote Clive instance
over HTTPS or SSH), from a long-lived service with hot caches and
through long-lived connections and Git sessions.
-
The end result is soft-real-time code generation, builds, tests,
publishing of artifacts, restarts of services in one's local debugger
or development cluster, etc., from any change to some file on one's
laptop, either offline or through a cloud worker, all within seconds
for most small changes.
-
What's more, the full history of the workspace is recorded into
sessions. Check out any point in time in a session and the full state
might be ready in the local cache: the IDE instantly sees all its
config, the generated code, compiled classes, fetched remote
dependencies, pre-built node_modules hierarchies, failed compilations
and test results as fast as the service can copy from this hot Git
repository to its local filesystem.
-
A full history of developer sessions would be valuable as a cloud
service; whenever the service on their laptop connects to the
Internet, it automatically pushes all state transitions including file
saves, streaming live when possible. This starting point would be
quite close to the beginnings of real-time continuous integration and
continuous deployment as a service. Secrets management, including for
cloud clusters, would be one of the interesting problems to address
through Clive's highly dynamic nature.
-
Every workspace has an SSH private key associated with it in its VFS.
This is a key starting point for most of the clustering and
peer-to-peer networking functionality. That private SSH key could come
from your Unix user, or be derived from a config option in the VFS. It
starts propagating into your developer session, and can spread to
copies you personally trust. More keys can easily be inserted into the
VFS through arbitrary logic which lets reactors maintain a set of
active credentials on the fly for the systems which they access. At
the networking level, those workspace SSH keys work to authenticate
both clients and servers. But the trust delegation model is fully
dynamic and the tricks for peer-to-peer don't satisfy all requirements
for an enterprise setup today. For those environments, trust of
GitHub Enterprise's SSH host key when fetching from repositories
there can be provisioned, eg through shared plugin configuration in
workspaces, and architected around VFS derivation rule expressions.
This incredible level of flexibility, combined with strong conventions
throughout the standard modules, is the key to a secure distributed
platform, where solid patterns for authentication and role delegation
allow for fine-grained access control policies, a should-have for a
professional tooling platform but unfortunately lacking today.
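The chain-of-trust idea above can be sketched with JDK Ed25519 signatures (available since JDK 15): a key we already trust signs the public key of a new key, extending trust link by link. Real SSH certificates and Clive's propagation rules carry far more context; the names and the bare "sign the raw public key" statement here are illustrative assumptions only.

```kotlin
import java.security.KeyPair
import java.security.PublicKey
import java.security.Signature

// One link in a delegation chain: `signerKey` vouches for `childKey`.
class Delegation(val childKey: PublicKey, val signerKey: PublicKey, val signature: ByteArray)

// The signer signs the child's encoded public key, delegating trust to it.
fun delegate(signer: KeyPair, child: PublicKey): Delegation {
    val sig = Signature.getInstance("Ed25519").apply {
        initSign(signer.private)
        update(child.encoded)
    }.sign()
    return Delegation(child, signer.public, sig)
}

// Walk the chain from a trusted root: each link must be signed by the key
// trusted so far, and each valid link extends trust to its child key.
fun trusted(root: PublicKey, chain: List<Delegation>): Boolean {
    var current = root
    for (link in chain) {
        if (!link.signerKey.encoded.contentEquals(current.encoded)) return false
        val valid = Signature.getInstance("Ed25519").apply {
            initVerify(link.signerKey)
            update(link.childKey.encoded)
        }.verify(link.signature)
        if (!valid) return false
        current = link.childKey
    }
    return true
}
```

With this shape, a workspace key can delegate to a session key, which delegates to a cloud worker's key, and any peer holding the root public key can verify the whole chain offline.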
Inside most unicorns
-
A VFS reactor configured dynamically from the project reactor, hence
fully programmatic, but with, amongst other native features:
-
Live writeback to the filesystem (mostly in .clive subtrees),
-
Live ingest from the filesystem (excludes .clive everywhere),
-
A mount-point system so a session could easily mount from a local
copy of a module rather than point to a remote and vice-versa,
-
Overlay mounts, to override small parts of external projects for
example;
-
A project reactor hooked to the VFS reactor with:
-
Live modelling of the full project model in a single timeline;
-
Functional model for the project, persisted as an immutable
input-addressable object graph, LRU-persisted in the DB;
-
A set of components to materialize the functional model into a
filesystem, eg:
- Remote artifact fetchers,
- Code generators,
- Compilers,
- IDE configuration generator,
- Test executors,
- Service runners;
-
“Targets” in the project functional model tracked as “alive” to
drive execution of components, eg:
- Keep all generated code up-to-date for the IDE,
-
Run client tests against the latest version of services,
- Live-deploy the last tested server to my cloud stack,
-
On the cloud worker for master, live-publish static assets to S3
and invalidate CDN caches in soft-realtime;
-
Powerful distribution models for the execution of some or all of
those components, eg:
-
Only run those tasks when a device is running on mains power or is
disconnected, otherwise distribute them to a local cloud of
workers,
-
From those, forward builds requiring a Mac operating system to
a separate cluster of workers (useful to build iOS apps or
Node native modules for example);
-
Pretty strict conventions throughout, eg maximizing out-of-the-box
integration in IDEA, mostly through config auto-generation within
the Clive service rather than through an IDEA plugin, and
carefully optimized behaviours as clive.kt is iterated on (including
very eager dynamic mounting of import paths in the VFS, which can
help with autocomplete and offer error checking instantly in IDEs
and live service state views).
-
An SSH server offering a git-like interface, extended with session
management;
-
A web server offering a rich UI, session management, a session
browser and controller;
-
For clustered environments,
Atomix-based coordination for
parts of the VFS reactions.
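The mount-point and overlay bullets above can be illustrated with a toy resolver: the VFS walks a stack of mounts, later (overlay) mounts shadowing earlier ones, so a local checkout can replace just a subtree of a published module. Names and paths here are made up for illustration, not Clive's API.

```kotlin
// A mount maps a path prefix to a set of files (a local checkout, a remote
// module, an overlay patch, ...).
class Mount(val prefix: String, val files: Map<String, String>)

class Vfs(private val mounts: List<Mount>) {
    // Check the most recently stacked mount first; the first hit wins.
    fun read(path: String): String? {
        for (mount in mounts.asReversed()) {
            if (path.startsWith(mount.prefix)) {
                mount.files[path.removePrefix(mount.prefix)]?.let { return it }
            }
        }
        return null
    }
}
```

Stacking an overlay for `/thirdParty/ratpack/` over a base mount then serves the overlaid files from the local copy while every other path falls through to the base.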
Example of model declaration code: /demo/server/clive.kt
// We parse imports to figure out which modules to bring in the classpath.
// For example this loads /thirdParty/clive.kt (or /thirdParty/ratpack/clive.kt if it exists)
import clive.model.jvm.artifacts.docker
import clive.model.jvm.artifacts.macosApp
import clive.model.jvm.artifacts.windowsSelfExtract
import clive.model.jvm.Dependencies
import clive.model.jvm.JVM
import clive.model.jvm.JVMArgs
import clive.model.kotlin.backendModule
import thirdParty.ratpack
backendModule {
    dependencies = Dependencies {
        compile {
            ratpack.classes
        }
        runtime {
            ratpack.transitive.exclude { thirdParty.hadoop }
        }
    }
    jvm = JVM {
        args = JVMArgs {
            memory = 16.GB
        }
    }
    artifacts = Artifacts {
        docker
        windowsSelfExtract
        macosApp(jvm = this@backendModule.jvm.copy(provider = JVM.Provider.Oracle))
    }
}