5.1 KiB
IdSets and IdMaps
IdSet is a data structure (formerly DeleteSet) that allows us to efficiently
represent ranges of ids in Yjs (all content is identifyable by ids).
IdMap is a new data structure that allows us to efficiently map ids to
attributes. It can be efficiently encoded.
We can perform all usual set operations on IdMaps and IdSets: diff, merge,
intersect.
Attribution of content
In order to implement a Google Docs-like versioning feature, we want to be able to attribute content with additional information (who created the change, when was this change created, ..).
When we click on a version in Google Docs, we might get annotated changes like this:
# E.g. If Bob appends "world" to the previous version "hello "
[{ insert: 'hello' }, { insert: 'world', color: 'blue', creator: 'Bob', when: 'yesterday' }]
# E.g. If Bob deletes "world" from the previous version "hello world"
[{ insert: 'hello' }, { insert: 'world', backgroundColor: 'red', creator: 'Bob', when: 'yesterday' }]
In Yjs, we can now "attribute" changes with additional information. When we
render content using methods like toString() or getDelta(), Yjs will render
the unattributed content as-is, but it will render the attributed content with
the additional information. As all changes in Yjs are identifyable by Ids, we
can use IdMaps to map changes to "attributions". For example, we could
attribute deletions and insertions of a change and render them:
// We create some initial content "Hello World!". Then we create another
// document that will have a bunch of changes (make "Hell" italic, replace "World"
// with "Attribution").
const ydocVersion0 = new Y.Doc({ gc: false })
ydocVersion0.getText().insert(0, 'Hello World!')
const ydoc = new Y.Doc({ gc: false })
Y.applyUpdate(ydoc, Y.encodeStateAsUpdate(ydocVersion0))
const ytext = ydoc.getText()
ytext.applyDelta([{ retain: 4, attributes: { italic: true } }, { retain: 2 }, { delete: 5 }, { insert: 'attributions' }])
// this represents to all insertions of ydoc
const insertionSet = Y.createInsertionSetFromStructStore(ydoc.store)
const deleteSet = Y.createDeleteSetFromStructStore(ydoc.store)
// exclude the changes from `ydocVersion0`
const insertionSetDiff = Y.diffIdSet(insertionSet, Y.createInsertionSetFromStructStore(ydocVersion0.store))
const deleteSetDiff = Y.diffIdSet(deleteSet, Y.createDeleteSetFromStructStore(ydocVersion0.store))
// assign attributes to the diff
const attributedInsertions = createIdMapFromIdSet(insertionSetDiff, [new Y.Attribution('insert', 'Bob')])
const attributedDeletions = createIdMapFromIdSet(deleteSetDiff, [new Y.Attribution('delete', 'Bob')])
// now we can define an attribution manager that maps these changes to output. One of the
// implementations is the TwosetAttributionManager
const attributionManager = new TwosetAttributionManager(attributedInsertions, attributedDeletions)
// we render the attributed content with the attributionManager
let attributedContent = ytext.getContent(attributionManager)
console.log(JSON.stringify(attributedContent.toJSON().ops, null, 2))
let expectedContent = delta.create().insert('Hell', { italic: true }, { attributes: { italic: ['Bob'] } }).insert('o ').insert('World', {}, { delete: ['Bob'] }).insert('attributions', {}, { insert: ['Bob'] }).insert('!')
t.assert(attributedContent.equals(expectedContent))
// this is how the output would look like
const output = [
{
"insert": "Hell",
"attributes": {
"italic": true
},
"attribution": {
"attributes": {
"italic": [
"Bob"
]
}
}
},
{
"insert": "o "
},
{
"insert": "World",
"attribution": {
"delete": [
"Bob"
]
}
},
{
"insert": "attributions",
"attribution": {
"insert": [
"Bob"
]
}
},
{
"insert": "!"
}
]
We get a similar output to Google Docs: Insertions, Deletions, and changes to formatting (attributes) are clearly associated to users. It will be the job of the editor to render those changes with background-color etc..
Of course, we could associated changes also to multiple users like this:
const attributedDeletions = createIdMapFromIdSet(deleteSetDiff, [new Y.Attribution('insert', 'Bob'), new Y.Attribution('insert', 'OpenAI o3')])
You could use the same output to calculate a real diff as well (consisting of deletions and insertions only, without Attributions).
AttributionManager is an abstract class for mapping attributions. It is
possible to highlight arbitrary content with this approach.
The next steps are to:
- finish the implementation for Y.Map and Y.Xml* (which should be easy, compared to Y.Map).
- Implement an AttributionManager-CRDT for the backend that sits there and associates changes with users.
- use
getContent(attributionManager)instead oftoDeltain y-prosemirror. Would like to make the attribution part of y-prosemirror, however Nick can also use this approach to customly render the changes in ProseMirror.
The AttributionManager is encodes very efficiently. The ids are encoded using run-length encoding and the Attributes are de-duplicated and only encoded once. The above example encodes in 20 bytes.