Following the lead in A Design Doc for Design Docs, this is the beginning of documentation on the software design of nteract.
Status: DRAFT
There are multiple facets to our design, from software to intent to the look-and-feel of our applications.
Mission: to build great experiences for interacting with compute + data
We originally kicked off this mission by creating a desktop application that works well for students and analysts, especially those who are more keyboard driven. We still have more to do to make that a deeper reality.
As we've expanded our scope to include web applications that help people collaborate more easily (commuter) or get fast feedback (play), our library and application design has changed too.
nteract originally started out with a separate repo for each individual module and component we wanted to build. This proved extremely painstaking for both us and collaborators whenever we wanted to publish and upgrade a dependency that other packages relied on. We now keep both packages and applications in a monorepo.
- packages/ includes all libraries and components (that aren't app specific)
- applications/ includes the desktop app, the jupyter extension, play, and commuter
Much of the complexity in a frontend application is handling asynchronous code and events. A common way to manage this is Redux, which enforces several principles:
- Central state tree, encapsulated in a store
- The next state is computed from the previous state and an event (action) by a function called a reducer
- All events go through the same dispatch function
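The three principles above can be sketched without any library. This is a minimal illustration, not the Redux API: `createTinyStore`, `counterReducer`, and the counter actions are hypothetical names invented for the example.

```typescript
// Hypothetical state and actions for illustration; not nteract's real shapes.
type CounterState = { readonly count: number };
type CounterAction = { type: "INCREMENT" } | { type: "ADD"; amount: number };

// A reducer computes the next state from the previous state and an action.
function counterReducer(state: CounterState, action: CounterAction): CounterState {
  switch (action.type) {
    case "INCREMENT":
      return { count: state.count + 1 };
    case "ADD":
      return { count: state.count + action.amount };
    default:
      return state;
  }
}

// A tiny hand-rolled store, standing in for what a Redux store provides.
function createTinyStore<S, A>(reducer: (state: S, action: A) => S, initial: S) {
  let state = initial; // the central state tree lives in one place
  return {
    getState: (): S => state,
    // Every event goes through this single dispatch function.
    dispatch: (action: A): void => {
      state = reducer(state, action);
    },
  };
}

const counterStore = createTinyStore(counterReducer, { count: 0 });
counterStore.dispatch({ type: "INCREMENT" });
counterStore.dispatch({ type: "ADD", amount: 2 });
```

Because the reducer is a pure function, the same sequence of actions always produces the same state.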
To manage external asynchronous APIs (websockets, REST APIs, TCP sockets, etc.), we use redux-observable and, by extension, RxJS. The core primitive we use is an epic, which has the signature
(action$, state$) => action$
Given a stream of actions and state, we compute new actions for the central store to consume. Anything that passes through the central dispatch also passes through action$.
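To make the signature concrete, here is a rough sketch of the shape of an epic. It assumes a hand-rolled `TinySubject` class as a stand-in for an RxJS stream so the example stays self-contained; the action names and `launchKernelEpic` are hypothetical, and a real epic would be written with RxJS operators and would usually do asynchronous work.

```typescript
// Minimal stand-in for an RxJS stream: pushes each value to its subscribers.
class TinySubject<T> {
  private listeners: Array<(value: T) => void> = [];
  subscribe(listener: (value: T) => void): void {
    this.listeners.push(listener);
  }
  next(value: T): void {
    this.listeners.forEach((listener) => listener(value));
  }
}

// Hypothetical actions and state, not nteract's real ones.
type KernelAction =
  | { type: "LAUNCH_KERNEL"; kernelName: string }
  | { type: "LAUNCH_KERNEL_SUCCESSFUL"; kernelName: string };
type SketchState = { launched: string[] };

// Same shape as the signature above: (action$, state$) => action$
type TinyEpic = (
  action$: TinySubject<KernelAction>,
  state$: TinySubject<SketchState>
) => TinySubject<KernelAction>;

// Listens for LAUNCH_KERNEL and emits a follow-up action for the store.
const launchKernelEpic: TinyEpic = (action$, _state$) => {
  const output$ = new TinySubject<KernelAction>();
  action$.subscribe((action) => {
    if (action.type === "LAUNCH_KERNEL") {
      // A real epic would talk to an external API here before emitting.
      output$.next({ type: "LAUNCH_KERNEL_SUCCESSFUL", kernelName: action.kernelName });
    }
  });
  return output$;
};

const action$ = new TinySubject<KernelAction>();
const state$ = new TinySubject<SketchState>();
const emitted: KernelAction[] = [];
launchKernelEpic(action$, state$).subscribe((action) => emitted.push(action));
action$.next({ type: "LAUNCH_KERNEL", kernelName: "python3" });
```

In this sketch the emitted actions are only collected in an array; with redux-observable they are dispatched back to the store.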
How we approach building epics is covered in its own separate document.
Each state is an immutable object. We've chosen to use immutable data structures for these reasons:
- Minimizes the need to copy or cache data
- Comparison between immutable objects is fast
- Enforces strict contracts between functions (keep them pure, no side effects on state)
We've chosen to use Immutable.JS for our immutable data structures.
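The points about cheap comparison and pure functions can be illustrated with plain frozen objects. This sketch deliberately does not use Immutable.JS, so that it has no dependencies; `NotebookSketch` and `renameNotebook` are hypothetical names, and Immutable.JS collections offer the same idea through their own API.

```typescript
// Stand-in using plain readonly objects instead of Immutable.JS collections.
type SketchCell = { readonly source: string };
type NotebookSketch = {
  readonly cells: { readonly [id: string]: SketchCell };
  readonly title: string;
};

const before: NotebookSketch = Object.freeze({
  cells: Object.freeze({ a: Object.freeze({ source: "1 + 1" }) }),
  title: "Untitled",
});

// A pure update: returns a new state and leaves the old one untouched.
function renameNotebook(state: NotebookSketch, title: string): NotebookSketch {
  return { ...state, title };
}

const after = renameNotebook(before, "Analysis");

// The untouched branch is shared between both states, so checking whether
// the cells changed is a single reference comparison, not a deep walk.
const cellsUnchanged = before.cells === after.cells;
const stateChanged = before !== after;
```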
This document attempts to specify the goal for an nteract core package that can provide generic/core state management to all notebook-y applications. Right now this is focused on these apps:
- ref: an internal reference to an entity upon recognition, e.g. kernels, hosts, kernelspec collections, etc.
- id: likely an external identifier, e.g. with /api/kernels/9092, 9092 is the id
We use the term recognition over creation because we want to have a way to
reference an entity before we get a response from some API. A good example is
having a ref for an active kernel before the kernel has been launched with a
Jupyter notebook server. Since there will be a proliferation of id-strings, the
internal ones are called refs and they are only meant for use inside the
application, i.e., they have no meaning externally. The external id-string that
will typically be found is called id.
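The distinction between recognition and creation might look like the following sketch. Everything named here is an assumption made for illustration: the counter-based `createKernelRef`, the `recognizeKernel` and `attachKernelId` helpers, and the two status values. Plain objects are used in place of Immutable.JS.

```typescript
type KernelRef = string;

type KernelEntry = {
  readonly ref: KernelRef; // internal, assigned on recognition
  readonly id: string | null; // external, unknown until the server responds
  readonly status: "launching" | "launched"; // hypothetical status values
};

type KernelsByRef = { readonly [ref: string]: KernelEntry };

// Hypothetical ref generator; a real application might use UUIDs instead.
let refCounter = 0;
function createKernelRef(): KernelRef {
  refCounter += 1;
  return `kernel-ref-${refCounter}`;
}

// Recognition: record the kernel under its ref before any API response exists.
function recognizeKernel(kernels: KernelsByRef, ref: KernelRef): KernelsByRef {
  const entry: KernelEntry = { ref, id: null, status: "launching" };
  return { ...kernels, [ref]: entry };
}

// Later, when the server responds, attach the external id to the same ref.
function attachKernelId(kernels: KernelsByRef, ref: KernelRef, id: string): KernelsByRef {
  const existing = kernels[ref];
  if (existing === undefined) {
    return kernels; // unknown ref: nothing to update
  }
  const launched: KernelEntry = { ...existing, id, status: "launched" };
  return { ...kernels, [ref]: launched };
}

const newKernelRef = createKernelRef();
const recognized = recognizeKernel({}, newKernelRef);
const launchedKernels = attachKernelId(recognized, newKernelRef, "9092");
```

The rest of the application can hold on to `newKernelRef` from the first step onward, regardless of when (or whether) the external id arrives.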
Following the Redux docs' Normalizing State Shape, we set up our application
state as collections of entries built in a relational fashion. We have been
doing something similar with cellMap (map of cell id to cell) and cellOrder
(list of cell ids). This takes it to the next level.
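As a small sketch of the cellMap / cellOrder pattern mentioned above: entities live in a map keyed by id, and ordering is stored separately as a list of ids. The `NormalizedNotebook` type and both helper functions are hypothetical and use plain objects and arrays rather than Immutable.JS.

```typescript
type CellId = string;
type NotebookCell = { readonly source: string };

type NormalizedNotebook = {
  readonly cellMap: { readonly [id: string]: NotebookCell }; // cell id -> cell
  readonly cellOrder: ReadonlyArray<CellId>; // display order, ids only
};

const normalized: NormalizedNotebook = {
  cellMap: {
    a: { source: "import pandas" },
    b: { source: "df.head()" },
  },
  cellOrder: ["a", "b"],
};

// Resolve the ordered ids back into cells, skipping ids with no entry.
function cellsInOrder(nb: NormalizedNotebook): NotebookCell[] {
  const cells: NotebookCell[] = [];
  for (const id of nb.cellOrder) {
    const cell = nb.cellMap[id];
    if (cell !== undefined) {
      cells.push(cell);
    }
  }
  return cells;
}

// Reordering only touches cellOrder; the cells themselves are shared as-is.
function reverseOrder(nb: NormalizedNotebook): NormalizedNotebook {
  return { ...nb, cellOrder: [...nb.cellOrder].reverse() };
}

const reversed = reverseOrder(normalized);
const reversedSources = cellsInOrder(reversed).map((cell) => cell.source);
```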
At a high level,
type core = {
  // The core state is meant to be document-centric. So, we basically use the
  // currently selected document to set the context for the rest of the app.
  // This model may need to change when we have things like split panes and
  // support objects in core that are not really considered *documents*.
  selectedContentRef: Ref,
  // These are the actual data that we get back from
  // * API Calls
  // * User input
  // * Kernel output
  entities: {
    // Each host implementation has a set of kernels which may be activated
    // The way to run the kernel, the "spec", is called a kernelspec
    kernelspecs: {
      byRef: {
        [ref: Ref]: {
          defaultKernelName: string,
          hostRef: Ref,
          byName: {
            [name: string]: {
              argv: Array<string>,
              displayName: string,
              env: Object,
              language: string,
              interruptMode: string,
              resources: Object
            }
          },
        }
      },
      refs: List<KernelspecsRef>
    },
    hosts: {
      // On desktop we have the one built-in local host that connects to
      // zeromq directly. On jupyterhub backed apps, you should be able to switch to
      // different hosts.
      byRef: {
        [ref: Ref]: {
          id: string,
          type: ("local" | "jupyter"),
          token: string,
          serverUrl: string,
          crossDomain: boolean
        }
      },
      refs: Array<Ref>
    },
    // A notebook may have one active kernel (but we allow multiple to allow smooth
    // transitions between switching kernels). This also allows us to have multiple kernels
    // on the page if we ever expand scope to allow it.
    kernels: {
      byRef: {
        [ref: Ref]: {
          type: ("local" | "jupyter"), // same as server, literal, unchanging
          hostRef: HostRef,
          name: string,
          lastActivity: Date,
          channels: rxjs$Subject,
          status: string,
          id: Id, // jupyter only
          spawn: ChildProcess, // local only
          connectionFile: string, // local only
          cwd: string, // current working directory, absolute on local, relative to server on jupyter
        }
      }
    },
    sessions: {
      byRef: {
        [ref: Ref]: {
          id: Id,
          name: string, // This is just a display name.
          type: string, // TODO: this should be an enum.
          kernelRef: Ref
        }
      },
      refs: Array<Ref>
    },
    contents: {
      byRef: {
        [ref: Ref]: {
          type: "directory" | "notebook" | "file",
          mimetype: ?string, // file-type only.
          path: string,
          name: string,
          created: Date,
          lastSaved: Date,
          modified: boolean,
          writable: boolean,
          format: null | "json" | "text" | "base64", // "json" for dir / nb
          // The model is a little confusing. Think of it as the in-memory, app
          // version of the content string that you get back from the contents
          // api. So, for a plain file, which we don't necessarily know how to
          // handle, the model will just be a string still. However, for a
          // notebook, we basically flesh out all the references to cells in
          // here.
          model: ?Object, // null | DirectoryModel | NotebookModel | FileModel
          // The sessionRef is nullable here because we don't necessarily need
          // the session to be running to display the document. This allows us
          // to render a document and start up a session in parallel.
          sessionRef: ?SessionRef
        }
      }
    },
    notifications: {
      byRef: {
        [ref: Ref]: {
          message: string,
          // TODO: Figure out our structure here
        }
      },
      refs: Array<Ref>
    }
  }
}
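One consequence of this relational layout is that finding the kernel for the current document means following refs from one collection to the next. The sketch below assumes a heavily trimmed-down copy of the state (`CoreSketch`) built from plain objects, and a hypothetical `selectCurrentKernel` function; it is meant to show the lookup chain, not the actual selectors.

```typescript
type Ref = string;

// A trimmed-down, plain-object version of the core state, for illustration only.
type CoreSketch = {
  selectedContentRef: Ref;
  entities: {
    contents: { byRef: { [ref: string]: { path: string; sessionRef: Ref | null } } };
    sessions: { byRef: { [ref: string]: { name: string; kernelRef: Ref } } };
    kernels: { byRef: { [ref: string]: { name: string; status: string } } };
  };
};

// Follow selectedContentRef -> content -> session -> kernel, or return null
// when any link is missing (e.g. the session has not been started yet).
function selectCurrentKernel(state: CoreSketch): { name: string; status: string } | null {
  const content = state.entities.contents.byRef[state.selectedContentRef];
  if (content === undefined || content.sessionRef === null) {
    return null;
  }
  const session = state.entities.sessions.byRef[content.sessionRef];
  if (session === undefined) {
    return null;
  }
  return state.entities.kernels.byRef[session.kernelRef] ?? null;
}

// Sample data with made-up refs and values.
const sampleCore: CoreSketch = {
  selectedContentRef: "content-1",
  entities: {
    contents: { byRef: { "content-1": { path: "analysis.ipynb", sessionRef: "session-1" } } },
    sessions: { byRef: { "session-1": { name: "analysis", kernelRef: "kernel-1" } } },
    kernels: { byRef: { "kernel-1": { name: "python3", status: "idle" } } },
  },
};

const currentKernel = selectCurrentKernel(sampleCore);
```

Returning null for a missing session mirrors the nullable sessionRef above: a document can be rendered before its session exists.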
We've chosen React as our core library for building and sharing components. It's declarative, easy to use, and widely adopted.
TODO: Fill this in 😊