UpEnd - A Conceptual Tutorial

UpEnd is not a traditional database - at its core, there are no tables, objects, or files. An entire UpEnd Vault is a flat list of entries.

An entry is a single "statement" that's true within the system. The core of an entry is an Entity/Attribute/Value triplet. For example:

| Entity | Attribute | Value |
|---|---|---|
| John | Age | 23 |
| Prague | Is In Country | Czech Republic |
| (hash of) track01.mp3 | Artist | Various Artists |
| https://upend.dev | Title | UpEnd |

Formally speaking:

  • Entity is the thing the statement is about. It can be one of the following:
    • Hash, typically of a file, but also possibly of another entry.
    • UUID, for arbitrary objects that exist solely within UpEnd (groups/tags, annotations, etc.)
    • URL, anything that exists on the web.
    • Attribute, for data that belong to UpEnd's attributes (such as their different names, etc.).
  • Attribute is the "kind" of a statement.
    • It can be any text string, but UpEnd reserves some attributes by default.
  • Value is the actual "fact" you're stating. It can be one of the following:
    • A text string
    • A number
    • An address of an entity.

(Each entry also has a timestamp, denoting when it was added, and a provenance, i.e. the origin of the entry: whether it was added by an automatic process or by a user. A full example of an entry would therefore be John / Age / 23 / 2023-05-01 19:20:00 / API IMPORT.)
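The entry structure described above can be sketched as a small data model. This is only an illustration, not UpEnd's actual implementation: the class names `Address` and `Entry`, the `kind` strings, and the placeholder UUID standing in for "John" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Union


@dataclass(frozen=True)
class Address:
    """Hypothetical entity address: a kind plus the identifying string."""
    kind: str   # one of "hash", "uuid", "url", "attribute"
    value: str


# A value is a text string, a number, or the address of another entity.
Value = Union[str, int, float, Address]


@dataclass(frozen=True)
class Entry:
    entity: Address
    attribute: str
    value: Value
    timestamp: datetime   # when the entry was added
    provenance: str       # where the entry came from


# Placeholder UUID standing in for "John" (an assumption for this sketch).
john = Address("uuid", "00000000-0000-0000-0000-000000000001")

entry = Entry(john, "Age", 23, datetime(2023, 5, 1, 19, 20, 0), "API IMPORT")

# A vault is then nothing more than a flat list of such entries.
vault = [entry]
```

Making the classes frozen keeps entries immutable and hashable, which fits the idea of an entry as a single statement rather than a mutable record.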

All other concepts within UpEnd arise as a consequence of combinations of entries.

Objects emerge as multiple entries with the same entity accrue. In other words, an object is a collection of entries pointing to the same entity. A file object may therefore look something like this:

| Entity | Attribute | Value |
|---|---|---|
| (hash of) photo01.jpg | Author | John Doe |
| (hash of) photo01.jpg | Label | photo01.jpg |
| (hash of) photo01.jpg | Label | Birthday 001.jpg |
| (hash of) photo01.jpg | Taken at | 2020-04-01 |
| (hash of) photo01.jpg | ... | ... |

(In the UI, the Entity part of entry listings is often left out, as it's redundant and implied by the object view.)
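To make the "object as a collection of entries" idea concrete, here is a minimal sketch that builds an object view by filtering a flat list of entries on their entity. The function name `object_view`, the tuple layout, and the `hash:` address strings are assumptions for illustration, and timestamps and provenance are left out for brevity.

```python
from collections import defaultdict

# Entries as (entity, attribute, value) tuples; the "hash:..." strings are
# stand-ins for real file hashes.
entries = [
    ("hash:photo01.jpg", "Author", "John Doe"),
    ("hash:photo01.jpg", "Label", "photo01.jpg"),
    ("hash:photo01.jpg", "Label", "Birthday 001.jpg"),
    ("hash:photo01.jpg", "Taken at", "2020-04-01"),
    ("hash:track01.mp3", "Artist", "Various Artists"),
]


def object_view(entries, entity):
    """Collect every attribute/value pair stated about one entity."""
    attributes = defaultdict(list)
    for entry_entity, attribute, value in entries:
        if entry_entity == entity:
            attributes[attribute].append(value)
    return dict(attributes)


photo = object_view(entries, "hash:photo01.jpg")
```

Note that an attribute maps to a list of values: nothing prevents the same entity from carrying two `Label` entries, as in the table above.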

However, while a file object has an obvious entity to point to, a tag or a folder has no inherent identity of its own, and therefore no hash. This is the purpose of UUIDs: a UUID is randomly generated for each such object as needed.

A Group is the equivalent of a folder or a tag. Its purpose is to serve as a collection of related items.

It is a "conventional" object - nothing about UpEnd requires Groups to exist, but since they provide a very useful abstraction, there is built-in functionality that works with Groups, as well as affordances in the UI. A Group looks like this:

| Entity | Attribute | Value |
|---|---|---|
| f9305ca5-eabd-4a97-9aa4-37036d2a6ca4 (UUID) | Label | Birthday Photos |
| f9305ca5-eabd-4a97-9aa4-37036d2a6ca4 (UUID) | Contains | (hash of) photo01.jpg |
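The difference between the two kinds of address can be sketched as follows: a file address is derived from the file's content, while a Group gets a freshly generated random UUID. The helper names, the `hash:`/`uuid:` prefixes, and the use of SHA-256 are assumptions for this example; the text above does not say which hash function UpEnd uses.

```python
import hashlib
import uuid


def file_address(content: bytes) -> str:
    """Content address: the same bytes always yield the same address.

    SHA-256 is an assumption made for this sketch.
    """
    return "hash:" + hashlib.sha256(content).hexdigest()


def new_object_address() -> str:
    """Random address for objects with no content of their own."""
    return "uuid:" + str(uuid.uuid4())


photo = file_address(b"example image bytes")
group = new_object_address()

# The Group is just two more entries in the flat list.
vault = [
    (group, "Label", "Birthday Photos"),
    (group, "Contains", photo),
]
```

Calling `file_address` twice on the same bytes returns the same string, whereas every call to `new_object_address` returns a different one.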

Issue CONTENT UNADDRESSABLE
This means that while the vaults of various users will refer to the same files by the same Entity addresses - because a file is uniquely identified by its hash - this does not apply to any other objects such as Groups, as they are identified by a UUID, which is random. If two vaults were therefore combined, entries referring to the same files would "add up" correctly, and your existing entries about given files would be complemented by the entries of the other vault, but any groups would potentially be duplicated.
This is an inherent problem, and cannot be easily solved; if everything were content-addressed, including groups, any single change (such as adding or removing a file from a group) would ripple throughout the entire system, as other related entries would have to update their entities or values to match this new address, which would change their content, and therefore their hash, and so on. Furthermore, this would also mean that no two folders (or groups) could ever share names, for example, as their content would at one point be identical, and therefore their identity as well. UUIDs provide a way for two otherwise identical objects to coexist.
Not all is lost, though - two vaults can still be combined in a meaningful way that allows mutual understanding - but this requires an explicit mechanism for resolving the semantics of combining UUID-referenced objects such as groups (in other words, a separate addressing scheme). For example, suppose it is desirable that the music group is always the same no matter which vault you happen to be in (so that two users categorizing their favorite articles in the music group can see each other's articles). A convention can then be established that all "universal" groups also receive an entry with a Universal Key attribute, which is used to tell which groups are supposed to be the same across different vaults, and which are, for example, just a music group someone happened to create to categorize their favorite songs.
Notably, this issue is completely moot unless you happen to compare different vaults. If all you're concerned with is a single vault on your computer, you don't need to worry at all about UUID objects.
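As a rough illustration of the Universal Key convention described above, the following sketch merges two vaults and unifies groups that carry the same key. It is not an UpEnd feature or API: the `merge` function, the tuple layout, the use of SHA-256, and the choice of the string `"music"` as the key value are all assumptions made for the example.

```python
import hashlib
import uuid


def file_address(content: bytes) -> str:
    # SHA-256 is an assumption; any content hash behaves the same way here.
    return "hash:" + hashlib.sha256(content).hexdigest()


def merge(vault_a, vault_b):
    """Merge two vaults, unifying entities that share a Universal Key."""
    # Identical entries (same entity, attribute, value) collapse into one.
    merged = list(dict.fromkeys(vault_a + vault_b))

    # The first entity seen for each key becomes the canonical one.
    canonical = {}
    alias = {}
    for entity, attribute, value in merged:
        if attribute == "Universal Key":
            alias[entity] = canonical.setdefault(value, entity)

    # Rewrite both entities and entity-valued values to the canonical address.
    rewritten = [(alias.get(e, e), a, alias.get(v, v)) for e, a, v in merged]
    return list(dict.fromkeys(rewritten))


article_1 = file_address(b"article one")
article_2 = file_address(b"article two")
music_a = "uuid:" + str(uuid.uuid4())   # universal group in vault A
music_b = "uuid:" + str(uuid.uuid4())   # universal group in vault B
private = "uuid:" + str(uuid.uuid4())   # unrelated group, also labeled "music"

vault_a = [
    (music_a, "Label", "music"),
    (music_a, "Universal Key", "music"),
    (music_a, "Contains", article_1),
]
vault_b = [
    (music_b, "Label", "music"),
    (music_b, "Universal Key", "music"),
    (music_b, "Contains", article_2),
    (private, "Label", "music"),
]

merged = merge(vault_a, vault_b)
contents = {v for e, a, v in merged if e == music_a and a == "Contains"}
```

After the merge, `contents` holds both articles under a single group, while the private group that merely shares the label stays a separate object because it has no Universal Key entry.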