🚧 This documentation is for the developer preview of m-ld.
This documentation presents the implementation-independent technical specification of m-ld. It covers internal and external APIs and protocols. Architecture, principles, use-cases, a list of clone engine implementations, and guiding narrative can be found on the m-ld documentation portal.
Term | Definition |
---|---|
domain | A logical, decentralised set of shared data |
clone | One of many physical containers of the domain data, typically embedded in the app |
(clone) engine | A clone implementation, which presents the clone API to an app and enacts the clone protocol |
app | The domain-specific application code that uses m-ld for data sharing |
subject | A resource, represented as a JSON object, that is part of the domain data |
This narrative and the associated API definitions use TypeScript as an abstract specification language, for familiarity. m-ld itself is language- and platform-agnostic. Clone engine implementations provide language-specific bindings which will differ from these definitions, both syntactically and in the completeness with which the engine implements them.
A m-ld clone must be initialised before it is ready to participate in data transactions. To initialise, it must be configured with (at least):
- A domain name, for example test.m-ld.org. This is used to establish communication with other clones.
m-ld natively uses JSON-LD as its data syntax, to ensure the widest possible applicability and ease of integration with existing systems. However, users do not generally need intimate knowledge of JSON-LD and Linked Data, except in advanced use-cases.
The clone API comprises three primary methods for interacting with data: read, write and follow.
In addition, it is possible to react to clone status via the status property.
During initialisation, a clone will determine its initial 'online' status, and if possible, rev-up with recent updates from the domain. See the Clone Protocol for more details.
The read and write APIs take a single parameter, a JSON object which declaratively describes the transaction. The read method returns an observable stream of subjects, which represent query results.
The transaction request JSON object and each returned subject are a json-rql Pattern and Subject, respectively. json-rql is a superset of JSON-LD, designed for query expressions. The following provides an informal introduction to the syntax. Note that clone engines may legitimately offer a limited subset of the full json-rql syntax. Check the engine documentation for details.
The simplest transaction inserts some data. In this case the transaction description is just the data, a JSON subject, such as:
{
"@id": "fred",
"name": "Fred"
}
No data is returned.
The subject is identified in the domain by the keyword property @id. The value of this property must be unique.
This property is defined to be an IRI, but by default, m-ld will scope a relative IRI to the domain. For example, if the domain is test.m-ld.org, this subject's identity will actually be http://test.m-ld.org/fred. This scoping is not significant in most use-cases, since queries for this data also use and retrieve the un-scoped identity, as shown below.
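The scoping rule can be pictured with a small sketch. The helper names `scopeId` and `unscopeId` are hypothetical and not part of any m-ld API. The sketch assumes the `http` scheme used in the example above, and treats any identity that starts with a URI scheme as already absolute.

```typescript
// Hypothetical helpers, for illustration only: not part of the m-ld API.

// An identity is treated as absolute if it starts with a scheme such as "http:".
const ABSOLUTE_IRI = /^[a-z][a-z0-9+.-]*:/i;

// Scope a relative identity to the domain, as in the example above.
function scopeId(domain: string, id: string): string {
  return ABSOLUTE_IRI.test(id) ? id : "http://" + domain + "/" + id;
}

// Recover the un-scoped identity that queries use and retrieve.
function unscopeId(domain: string, iri: string): string {
  const base = "http://" + domain + "/";
  return iri.indexOf(base) === 0 ? iri.slice(base.length) : iri;
}

const scoped = scopeId("test.m-ld.org", "fred");
console.log(scoped); // http://test.m-ld.org/fred
console.log(unscopeId("test.m-ld.org", scoped)); // fred
```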
To retrieve this data subject, a Query JSON object is used as the transaction description. A query uses a keyword property, in this case @describe, to indicate the data filter and return format:
{
"@describe": "fred"
}
The return stream contains a single subject:
{
"@id": "fred",
"name": "Fred"
}
In this case the response to the query returns an identical subject to that first inserted. In general though, the inserted subject can be an arbitrarily nested JSON object, but a describe query will only return the top-level attributes.
A key difference between m-ld and typical JSON stores is that in m-ld, the JSON is a representation of a graph, and there is no storage of the original structure of any subject.
This affects how write transactions are processed. All raw subject transactions are treated as insertions to the data that already exists. For example, following the above transactions with:
{
"@id": "fred",
"age": 40
}
results in data that will be Described as:
{
"@id": "fred",
"name": "Fred",
"age": 40
}
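The insert semantics described above can be modelled with a toy in-memory graph. This is only a sketch: `insertSubject`, `describeSubject` and the storage layout are hypothetical, it handles flat subjects only, and it says nothing about how a real clone engine stores data.

```typescript
// A toy in-memory model of the insert semantics described above.
// All names here are hypothetical; this is not a clone engine.
type Atom = string | number | boolean;
type FlatSubject = { "@id": string; [property: string]: Atom | Atom[] };

// subject id -> property -> values (kept free of duplicates)
const graph: Record<string, Record<string, Atom[]>> = {};

// Writing a raw subject only ever adds values; nothing is overwritten.
function insertSubject(subject: FlatSubject): void {
  const id = subject["@id"];
  const props = graph[id] || (graph[id] = {});
  for (const key of Object.keys(subject)) {
    if (key === "@id") continue;
    const raw = subject[key];
    const values = Array.isArray(raw) ? raw : [raw];
    const stored = props[key] || (props[key] = []);
    for (const value of values) {
      if (stored.indexOf(value) < 0) stored.push(value);
    }
  }
}

// Return the subject's current properties, or undefined if it is unknown.
function describeSubject(id: string): FlatSubject | undefined {
  const props = graph[id];
  if (!props) return undefined;
  const result: FlatSubject = { "@id": id };
  for (const key of Object.keys(props)) {
    const values = props[key];
    result[key] = values.length === 1 ? values[0] : values.slice();
  }
  return result;
}

insertSubject({ "@id": "fred", "name": "Fred" });
insertSubject({ "@id": "fred", "age": 40 });
console.log(JSON.stringify(describeSubject("fred")));
// {"@id":"fred","name":"Fred","age":40}
```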
In order to update a subject with changed data, it is necessary to explicitly remove unwanted old data. This can be done with the more verbose Update syntax, for example:
{
"@delete": {
"@id": "fred",
"name": "Fred"
},
"@insert": {
"@id": "fred",
"age": 40
}
}
See the Data Semantics section below for more detail of subject representation.
The need for explicit removal of prior data can lead to unexpected data structure changes if not accounted for. Some clone engines provide an explicit PUT- or UPDATE-like API to reduce verbosity. However, similar situations can also arise due to concurrent data changes, so it is important for an app to be aware of this characteristic.
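Where an engine offers no such convenience API, an app could build the Update itself. The following is a minimal sketch of such a helper; `replaceValue` is a hypothetical name, and the helper assumes the app already knows the old value to remove.

```typescript
// Hypothetical helper: builds an Update that swaps one property value for another.
type Atom = string | number | boolean;
type FlatSubject = { "@id": string; [property: string]: Atom };
type Update = { "@delete": FlatSubject; "@insert": FlatSubject };

function replaceValue(id: string, property: string, oldValue: Atom, newValue: Atom): Update {
  const del: FlatSubject = { "@id": id };
  del[property] = oldValue; // the old value must be removed explicitly
  const ins: FlatSubject = { "@id": id };
  ins[property] = newValue;
  return { "@delete": del, "@insert": ins };
}

console.log(JSON.stringify(replaceValue("fred", "name", "Fred", "Fred Flintstone")));
// {"@delete":{"@id":"fred","name":"Fred"},"@insert":{"@id":"fred","name":"Fred Flintstone"}}
```

Even with such a helper, a concurrent change to the same property in another clone can still leave more than one value after convergence.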
The query language also supports @select statements, which are able to gather data values in arbitrarily complex ways from subjects in the domain. This requires the use of a @where clause and Variables, which are placeholders for subject keys, properties or values. For example:
{
"@select": "?nm",
"@where": { "@id": "fred", "name": "?nm" }
}
The return stream contains a single pseudo-subject with matching values for the variable:
{
"?nm": "Fred"
}
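To make the role of variables concrete, here is a toy matcher for a single-subject @where pattern. It is a sketch under strong assumptions: subjects are flat, a variable is any string starting with `?`, and the names `matchSubject` and `selectVariable` are hypothetical. Real engines support much richer json-rql patterns.

```typescript
// A toy matcher for a single-subject @where pattern with variables.
// Hypothetical names; not part of any m-ld API.
type Atom = string | number | boolean;
type Flat = { "@id": string; [property: string]: Atom };
type Binding = Record<string, Atom>;

const isVariable = (term: Atom): term is string =>
  typeof term === "string" && term.charAt(0) === "?";

// Try to match one subject against the pattern, collecting variable bindings.
function matchSubject(subject: Flat, where: Flat): Binding | undefined {
  const binding: Binding = {};
  for (const key of Object.keys(where)) {
    const wanted = where[key];
    const actual = subject[key];
    if (actual === undefined) return undefined;
    if (isVariable(wanted)) binding[wanted] = actual;
    else if (wanted !== actual) return undefined;
  }
  return binding;
}

// Return one pseudo-subject per match, holding only the selected variable.
function selectVariable(variable: string, where: Flat, subjects: Flat[]): Binding[] {
  const results: Binding[] = [];
  for (const subject of subjects) {
    const binding = matchSubject(subject, where);
    if (binding && binding[variable] !== undefined) {
      const row: Binding = {};
      row[variable] = binding[variable];
      results.push(row);
    }
  }
  return results;
}

const subjects: Flat[] = [
  { "@id": "fred", "name": "Fred" },
  { "@id": "wilma", "name": "Wilma" }
];
console.log(JSON.stringify(
  selectVariable("?nm", { "@id": "fred", "name": "?nm" }, subjects)
)); // [{"?nm":"Fred"}]
```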
🚧 Further documentation and examples coming soon. Please get in touch to tell us about your use-case!
Whenever data changes in a clone, an update event is notified to "followers" who have subscribed using the follow API. Data can change due to both local and remote transactions, so this API is essential for an app to maintain a current view on the domain data. Such a view may be used, for example, to drive a user interface or to maintain a downstream data representation such as a database.
Each update has a strict structure indicating the data that has been deleted and inserted, in both cases as arrays of Subjects. Note that each subject is partial: it contains only the properties that were affected by the transaction.
For example, given the following subject:
{
"@id": "fred",
"name": "Fred"
}
and the following transaction (either remotely or locally):
{
"@id": "fred",
"age": 40
}
The resultant update event will include:
{
"@delete": [],
"@insert": [{ "@id": "fred", "age": 40 }]
}
On receipt of this update the app may not need to know any more about the current state of the object, for example because it is already displayed in the user interface; and the update can be trivially applied. If this is not so, then the app can make a query to retrieve current state.
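As an illustration of applying an update trivially, the sketch below folds update events into an app-side view. The event shape follows the example above; the `View` layout and the `applyToView` function are hypothetical, and how events are delivered through the follow API is engine-specific and not shown.

```typescript
// A toy handler that applies update events to an app-side view.
// The event shape follows the example above; all other names are hypothetical.
type Atom = string | number | boolean;
type PartialSubject = { "@id": string; [property: string]: Atom };
type UpdateEvent = { "@delete": PartialSubject[]; "@insert": PartialSubject[] };
type View = Record<string, Record<string, Atom[]>>;

function applyToView(view: View, update: UpdateEvent): void {
  // Deletes are applied first so that a replaced value does not linger.
  for (const subject of update["@delete"]) {
    const props = view[subject["@id"]];
    if (!props) continue;
    for (const key of Object.keys(subject)) {
      if (key === "@id" || !props[key]) continue;
      props[key] = props[key].filter(v => v !== subject[key]);
      if (props[key].length === 0) delete props[key];
    }
  }
  for (const subject of update["@insert"]) {
    const props = view[subject["@id"]] || (view[subject["@id"]] = {});
    for (const key of Object.keys(subject)) {
      if (key === "@id") continue;
      const values = props[key] || (props[key] = []);
      if (values.indexOf(subject[key]) < 0) values.push(subject[key]);
    }
  }
}

const view: View = { fred: { name: ["Fred"] } };
applyToView(view, { "@delete": [], "@insert": [{ "@id": "fred", "age": 40 }] });
console.log(JSON.stringify(view)); // {"fred":{"name":["Fred"],"age":[40]}}
```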
Since data updates can arise at any time, to guarantee consistency in downstream data representations like a database, care may need to be taken to ensure that asynchronous queries do not receive data from more recent updates than intended. Update events and clone status include a field for local logical clock ticks, which can be used by the clone engine to identify a specific data snapshot. However, due to differences in engine data stores and language concurrency models, engines may vary in how this field is used. Check the engine documentation for the necessary details.
A clone engine's status can be obtained using the status property. This provides the current status description, an observable stream of changing status, and a way to await a particular status. This can be used to refine the app's behaviour depending on its requirements, for example:
Data in m-ld is structured, stored as a graph, and represented as JSON in the clone API.
This graph nature, along with the convergence model for concurrent updates, gives rise to the following set of semantic rules, awareness of which will help an app developer to correctly handle the data.
A top-level JSON object represents a Subject, that is, something interesting to talk about in the domain.
{
Every Subject may have an identity, given with the @id field. If an identity is not provided on first insertion, an identifier will be generated of the form .well-known/genid/GUID, which is visible when querying the Subject.
"@id": "fred",
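Properties of a Subject can be:
JSON values, such as strings and numbers:
"name": "Fred Flintstone",
nested Subjects:
"address": { "number": 55, "street": "Cobblestone Rd" },
References to other Subjects, given as a JSON object with an @id field:
"spouse": { "@id": "wilma" },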
Array properties have Set semantics by default, unlike normal JSON arrays, unless they are qualified with the @list keyword (see next). They do not contain duplicate members, and they are unordered. Insertion of duplicate values in a transaction results in only one of the values being stored.
"interests": ["bowling", "pool", "golf", "poker"],
A Subject having an @list property represents a list. The value of the @list key is the full, ordered content of the list (if an array), or a set of index-item pairs (if a hash). See Lists below for more details.
"episodes": {
"@list": ["The Flintstone Flyer", "Hot Lips Hannigan", "The Swimming Pool"]
},
In the absence of a single-valued constraint (see below), any property of a Subject except the @id property can become multi-valued (an array) in the data. This can happen by inserting a value without deleting the old one, or due to conflicting edits.
"height": [5, 6]
In the absence of a mandatory constraint (see below), any property of a Subject except the @id property can become empty (see next).
When accepting data in a transaction, the following JSON values are equivalent, and represent an empty property:
- an empty array ([])
- null
In particular, it is not possible to 'nullify' a value using an @insert clause, because passing a value of null actually tells the engine that the transaction has nothing to say about the value – as if it was not mentioned at all. To remove a value, it is necessary to use a @delete clause.
When providing data in response to a Read transaction, an engine will never emit null or an empty array ([]) – the property will be omitted.
}
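The Set semantics and empty-value rules from the annotated example can be sketched as a small normalising function. `toValueSet` is a hypothetical name, and the sketch covers only atomic values, not nested Subjects, References or lists.

```typescript
// Hypothetical normaliser showing how property values are interpreted on input.
type Atom = string | number | boolean;

// null and [] both mean "nothing to say"; arrays are treated as unordered sets,
// so duplicate members are stored only once.
function toValueSet(value: Atom | null | Atom[]): Atom[] {
  const items = Array.isArray(value) ? value : [value];
  const result: Atom[] = [];
  for (const item of items) {
    if (item !== null && result.indexOf(item) < 0) result.push(item);
  }
  return result;
}

console.log(JSON.stringify(toValueSet(null))); // []
console.log(JSON.stringify(toValueSet([]))); // []
console.log(JSON.stringify(toValueSet(["golf", "pool", "golf"]))); // ["golf","pool"]
```

The order of the returned array carries no meaning here; it is simply the order in which values were first seen.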
A 'constraint' is a semantic rule that describes invariants about the data. As part of m-ld's concurrency model, engines may provide a set of available constraints that can be declared in the engine initialisation.
🚧 Inclusion of declarative integrity constraints in m-ld is an experimental feature, and the subject of active research. The available constraints and the means by which they are declared for a domain is likely to change. Please do get in touch with your requirements.
Declarative constraints have two modes of operation:
The following is a list of candidate declarable constraints. See the engine documentation for supported constraints and syntax.
single-valued: A subject property must have a single atomic value.
Conflict Scenario: Any subject property in the domain can become multi-valued (an array) if concurrent inserts are made to the same subject property.
Resolution: Pick a 'winning' value using a rule. This could be based on the conflicting values (e.g. maximum or average), or based on another property value (e.g. a timestamp).
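The two kinds of rule mentioned in the resolution can be sketched as follows. The function names and the `Candidate` shape are hypothetical; this is not the constraint syntax of any engine.

```typescript
// Hypothetical sketch of a 'winning value' rule for a single-valued property.
type Candidate<T> = { value: T; timestamp: number };

// Rule 1: based on the conflicting values themselves (here, the maximum).
function maxValue(values: number[]): number | undefined {
  return values.length === 0 ? undefined : values.reduce((a, b) => (b > a ? b : a));
}

// Rule 2: based on another property (here, the most recent timestamp wins).
function latest<T>(candidates: Candidate<T>[]): T | undefined {
  if (candidates.length === 0) return undefined;
  return candidates.reduce((a, b) => (b.timestamp > a.timestamp ? b : a)).value;
}

console.log(maxValue([5, 6])); // 6
console.log(latest([
  { value: "Fred", timestamp: 1 },
  { value: "Frederick", timestamp: 2 }
])); // Frederick
```

For all clones to converge, a rule like this needs to give the same answer everywhere. In this sketch, equal timestamps are resolved by input order, so a real rule would need a deterministic tie-break.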
mandatory: A subject must have a value for a property.
Conflict Scenario: If one app instance removes a subject in its entirety at the same time as another app instance updates a property, then the updated property value remains in the converged domain – all other properties are now missing, even if mandatory. (Note that neither app instance violated the rule locally.)
Resolution: Treat a subject without a value for a mandatory field as an invalid subject. The subject is deleted (note that in the conflict scenario, this was the intention of one of the updates).
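A minimal sketch of this resolution, assuming a toy graph held as a plain object: `enforceMandatory` and the data layout are hypothetical, and a real engine would apply the rule as part of its convergence process rather than as a separate pass.

```typescript
// Hypothetical sketch: a subject missing a mandatory property is treated as
// invalid and removed from the converged data.
type Props = Record<string, Array<string | number | boolean>>;

function enforceMandatory(graph: Record<string, Props>, property: string): string[] {
  const removed: string[] = [];
  for (const id of Object.keys(graph)) {
    const values = graph[id][property];
    if (!values || values.length === 0) {
      delete graph[id];
      removed.push(id);
    }
  }
  return removed;
}

// After convergence, "fred" has only the concurrently updated property left.
const converged: Record<string, Props> = {
  fred: { age: [41] },
  wilma: { name: ["Wilma"], age: [38] }
};
console.log(JSON.stringify(enforceMandatory(converged, "name"))); // ["fred"]
console.log(JSON.stringify(Object.keys(converged))); // ["wilma"]
```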
unique: A set of subjects in the domain (e.g. of a specific type) must have unique values for a property (besides their identity).
Conflict Scenario: Concurrent updates to two different subjects could both update the property to the same value.
Resolution: Decide the Subject to receive the conflicting value. Delete the other subject's property. If the property is mandatory, revert the value to the previous (it must exist in the same transaction).
As noted above, plain JSON arrays as Subject property values are interpreted as unordered sets. However as in most programming languages, an ordered collection or list is also natively supported by m-ld, using additional syntax as follows.
A list in m-ld is a kind of Subject. It can therefore have an identity and properties. It differs from a normal Subject by the inclusion of the @list keyword.
{ "@id": "shopping", "@list": ["Bread", "Milk"] }
This syntax is a super-set of standard JSON-LD, which does not permit a list object to have other properties. JSON-LD list objects can be loaded into m-ld as anonymous Subjects, but the reverse is typically not possible without some pre-processing.
The value of the @list property represents the ordered collection of 'items', which can be any normal Subject property value type such as JSON values (except null), Subjects and References. Duplicate items are allowed, and will remain duplicated when retrieved.
When retrieving a list, the contents of the @list property will always be consistently ordered. In the example above, "Milk" will always follow "Bread" unless an update has been made to the list.
Updating and querying a list makes use of an alternate syntax for the @list key, using a JSON object to specify index positions.
{ "@insert": { "@id": "shopping", "@list": { "2": "Spam" } } }
This appends "Spam" to the shopping list at index position 2. After this update, the shopping list content will be ["Bread", "Milk", "Spam"]. If the given index position was "1" instead, the final content would be ["Bread", "Spam", "Milk"].
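The effect of an index-keyed insert can be sketched over a plain array. `insertAt` is a hypothetical helper that handles a single index-item pair; it models only the local outcome described above, not how concurrent list edits converge.

```typescript
// Hypothetical sketch of inserting one index-item pair into a list.
function insertAt<T>(list: T[], indexKey: string, item: T): T[] {
  // The key is a quoted, non-negative integer; anything else is an error.
  if (!/^\d+$/.test(indexKey)) throw new Error("Invalid list index: " + indexKey);
  const result = list.slice();
  // splice appends when the index is at or beyond the end of the list.
  result.splice(Number(indexKey), 0, item);
  return result;
}

const shopping = ["Bread", "Milk"];
console.log(JSON.stringify(insertAt(shopping, "2", "Spam"))); // ["Bread","Milk","Spam"]
console.log(JSON.stringify(insertAt(shopping, "1", "Spam"))); // ["Bread","Spam","Milk"]
```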
Each key in a @list value object must be a non-negative integer JSON number or a variable. As with all JSON keys, this must be surrounded by quotes. Any other key format will cause an error.
A variable can be used in the index or item position to query a list. The following query selects the index position of "Spam" in the shopping list.
{
"@select": "?spamIndex",
"@where": { "@id": "shopping", "@list": { "?spamIndex": "Spam" } }
}
The following query selects the item at position 1 in the shopping list.
{
"@select": "?item",
"@where": { "@id": "shopping", "@list": { "1": "?item" } }
}
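Over a plain array, the two queries above correspond to the following lookups. The helper names are hypothetical, and the sketch assumes the shopping list already contains "Spam" at the end, as after the earlier append.

```typescript
// Hypothetical sketch of the two list queries above, over a plain array.

// Variable in the index position: find every index holding the given item.
// Several indexes can be returned, because list items may be duplicated.
function indexesOf<T>(list: T[], item: T): number[] {
  const found: number[] = [];
  list.forEach((candidate, index) => {
    if (candidate === item) found.push(index);
  });
  return found;
}

// Variable in the item position: find the item at a given index, if any.
function itemAt<T>(list: T[], index: number): T | undefined {
  return list[index];
}

const shoppingList = ["Bread", "Milk", "Spam"];
console.log(JSON.stringify(indexesOf(shoppingList, "Spam"))); // [2]
console.log(itemAt(shoppingList, 1)); // Milk
```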
It is therefore possible to delete items from a list using this syntax; for example, the item at index 1 regardless of its value:
{ "@delete": { "@id": "shopping", "@list": { "1": "?" } } }
Moving an item in a list can require a little more syntax, depending on the meaning of the list and so how concurrent edits should be understood. Like all transactions in m-ld, a move comprises a delete and an insert. Since list items can have duplicates, the outcome of a concurrent move of an item to two different locations could be:
- the item appears at both locations, that is, it is duplicated; or
- the item appears at only one of the two locations.
In m-ld, the latter meaning is captured with the concept of list slots. In this case it's not the item that is moved but the slot – like a box containing the item. Slots can only appear once in a list, so the final position is chosen as one of the two user-specified positions.
Most of the time list slots are implicit in the interface, for simplicity. They can be made explicit using the keyword @item. Just like a list is identified by having a @list property, a slot is identified by having an @item property.
{ "@insert": { "@id": "shopping", "@list": { "2": "Spam" } } }
(implicit slot) is the same as:
{ "@insert": { "@id": "shopping", "@list": { "2": { "@item": "Spam" } } } }
(the explicit slot is { "@item": "Spam" }). A slot is a Subject, and has an @id, which is normally automatically generated for implicit slots.
Using slots, it is possible to move an item as follows:
{
"@delete": {
"@id": "shopping",
"@list": { "2": { "@id": "?slot", "@item": "Spam" } }
},
"@insert": {
"@id": "shopping",
"@list": { "0": { "@id": "?slot", "@item": "Spam" } }
}
}
This moves the slot containing Spam at index 2 to the head of the list.
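The local effect of this move can be modelled with slots as small objects. The `Slot` type, the slot identities and `moveSlot` are hypothetical; the sketch shows only that the slot, with its identity, is what changes position.

```typescript
// Hypothetical model: a list holds slots, and each slot has an identity and an item.
type Slot = { "@id": string; "@item": string };

// Move the slot found at index `from` to index `to`, keeping its identity.
function moveSlot(slots: Slot[], from: number, to: number): Slot[] {
  const result = slots.slice();
  const moved = result.splice(from, 1); // corresponds to the @delete clause
  result.splice(to, 0, ...moved);       // corresponds to the @insert clause
  return result;
}

const slots: Slot[] = [
  { "@id": "slot-1", "@item": "Bread" },
  { "@id": "slot-2", "@item": "Milk" },
  { "@id": "slot-3", "@item": "Spam" }
];
const reordered = moveSlot(slots, 2, 0);
console.log(reordered.map(slot => slot["@item"]).join(", ")); // Spam, Bread, Milk
console.log(reordered[0]["@id"]); // slot-3
```

Because the slot keeps its identity, two concurrent moves refer to the same slot, which is what allows an engine to settle on a single final position.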
🚧 Documentation coming soon. If you are interested in the protocol details, or in developing a clone engine, please do get in touch.
@m-ld/m-ld-spec - v0.7.1-edge.0. Source code licensed MIT.