docs: add conceptual tutorial

feat/type-attributes
Tomáš Mládek 2023-05-02 22:19:06 +02:00
parent db2c382933
commit 267531b4c1
1 changed files with 59 additions and 0 deletions

@ -0,0 +1,59 @@
# UpEnd - A Conceptual Tutorial
UpEnd is not a traditional database - at its core, there are no tables, objects, or files. An entire UpEnd **Vault** is a flat list of **entries**.
An **entry** is a single "statement" that's true within the system. The core of an entry is an _Entity/Attribute/Value triplet_. For example:
| Entity | Attribute | Value |
| ----------------------- | ------------- | --------------- |
| John | Age | 23 |
| Prague | Is In Country | Czech Republic |
| (hash of) `track01.mp3` | Artist | Various Artists |
| https://upend.dev | Title | UpEnd |
Formally speaking:
- **Entity** is the thing the statement is about. It can be one of the following:
  - **Hash**, typically of a file, but possibly also of another **entry**.
  - [**UUID**](https://en.wikipedia.org/wiki/Universally_unique_identifier), for arbitrary objects that exist solely within UpEnd (groups/tags, annotations, etc.).
  - **URL**, for anything that exists on the web.
  - **Attribute**, for data that belong to UpEnd's attributes (such as their different names, etc.).
- **Attribute** is the "kind" of a statement.
  - It can be any text string, but UpEnd reserves some attributes by default.
- **Value** is the actual "fact" you're stating. It can be one of the following:
  - A text string
  - A number
  - An address of an **entity**
(Each **entry** also has a _timestamp_, denoting when it was added, and _provenance_, i.e. the origin of this entry - whether it was added by an automatic process or a user. A full example of an **entry** therefore would be `John / Age / 23 / 2023-05-01 19:20:00 / API IMPORT`)
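
To make the shape of an **entry** concrete, here is a minimal Python sketch of the structure described above. It is not UpEnd's actual data model: the `Address` and `Entry` classes, the `kind` strings, and the choice of SHA-256 as the hash function are all assumptions made for illustration.

```python
import hashlib
import uuid
from dataclasses import dataclass
from datetime import datetime
from typing import Union


@dataclass(frozen=True)
class Address:
    """An entity address. `kind` is one of "hash", "uuid", "url", "attribute"."""
    kind: str
    value: str


def hash_address(content: bytes) -> Address:
    # Assumption: SHA-256. The tutorial only says "hash".
    return Address("hash", hashlib.sha256(content).hexdigest())


def uuid_address() -> Address:
    # Random identity for objects that exist solely within the vault.
    return Address("uuid", str(uuid.uuid4()))


# A value is a text string, a number, or the address of an entity.
Value = Union[str, float, Address]


@dataclass(frozen=True)
class Entry:
    entity: Address
    attribute: str
    value: Value
    timestamp: datetime  # when the entry was added
    provenance: str      # where the entry came from


track = hash_address(b"...bytes of track01.mp3...")
entry = Entry(
    entity=track,
    attribute="Artist",
    value="Various Artists",
    timestamp=datetime(2023, 5, 1, 19, 20, 0),
    provenance="API IMPORT",
)
```

Note that hashing the same bytes always yields the same address, while every call to `uuid_address()` yields a new one. That difference becomes important later, when two vaults are compared.
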
All other concepts within UpEnd arise as a consequence of combinations of **entries**.
**Objects** emerge as multiple **entries** with the same **entity** accrue. In other words, an **object** is a collection of **entries** pointing to the same **entity**. A file object may therefore look something like:
| Entity | Attribute | Value |
| ----------------------- | --------- | ---------------- |
| (hash of) `photo01.jpg` | Author | John Doe |
| (hash of) `photo01.jpg` | Label | photo01.jpg |
| (hash of) `photo01.jpg` | Label | Birthday 001.jpg |
| (hash of) `photo01.jpg` | Taken at | 2020-04-01 |
| (hash of) `photo01.jpg` | ... | ... |
(In the UI, the **Entity** part of entry listings is often left out, as it's redundant and implied by the object view.)
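
The idea that an **object** is nothing more than the entries sharing an **entity** can be sketched as a simple grouping step. This is only an illustration: entries are reduced to plain `(entity, attribute, value)` tuples, strings like `"hash:photo01"` stand in for real hashes, and `objects_from_entries` is a hypothetical helper, not part of UpEnd.

```python
from collections import defaultdict

# A flat "vault": nothing but a list of entries.
entries = [
    ("hash:photo01", "Author", "John Doe"),
    ("hash:photo01", "Label", "photo01.jpg"),
    ("hash:photo01", "Label", "Birthday 001.jpg"),
    ("hash:photo01", "Taken at", "2020-04-01"),
    ("hash:track01", "Artist", "Various Artists"),
]


def objects_from_entries(entries):
    """Group a flat entry list into {entity: {attribute: [values, ...]}}."""
    objects = defaultdict(lambda: defaultdict(list))
    for entity, attribute, value in entries:
        objects[entity][attribute].append(value)
    return {entity: dict(attrs) for entity, attrs in objects.items()}


objects = objects_from_entries(entries)
photo = objects["hash:photo01"]
print(photo["Label"])  # ['photo01.jpg', 'Birthday 001.jpg']
```

An attribute maps to a list of values here because, as in the table above, the same entity can carry several entries with the same attribute (two `Label` entries in this case).
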
However, while a file object has an obvious **entity** to point to, a _Tag_ or a folder has no inherent identity of its own, and therefore no hash. This is the purpose of [_UUIDs_](https://en.wikipedia.org/wiki/Universally_unique_identifier): for every such object, a _UUID_ is randomly generated as needed.
A **Group** is the equivalent of a folder or a tag. Its purpose is to serve as a collection of related items.
It is a "conventional" object - nothing about UpEnd requires **Groups** to exist, but since they provide a very useful abstraction, there is built-in functionality that works with **Groups**, as well as affordances in the UI. A **Group** looks like this:
| Entity | Attribute | Value |
| --------------------------------------------- | --------- | ----------------------- |
| `f9305ca5-eabd-4a97-9aa4-37036d2a6ca4` (UUID) | Label | Birthday Photos |
| `f9305ca5-eabd-4a97-9aa4-37036d2a6ca4` (UUID) | Contains | (hash of) `photo01.jpg` |
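
The table above can be reproduced with a few lines of Python. This is a sketch under assumptions: `make_group` and `members_of` are hypothetical helpers, and the attribute names `Label` and `Contains` are taken from the table as shown, which may differ from the names UpEnd uses internally.

```python
import uuid


def make_group(label, members):
    """Return (group_id, entries) for a new group holding the given entity addresses."""
    # The group has no content to hash, so its identity is a random UUID.
    group_id = str(uuid.uuid4())
    entries = [(group_id, "Label", label)]
    entries += [(group_id, "Contains", member) for member in members]
    return group_id, entries


def members_of(group_id, entries):
    """List the entities a group contains, in insertion order."""
    return [v for (e, a, v) in entries if e == group_id and a == "Contains"]


group_id, vault = make_group("Birthday Photos", ["hash:photo01", "hash:photo02"])
print(members_of(group_id, vault))  # ['hash:photo01', 'hash:photo02']
```

Calling `make_group("Birthday Photos", ...)` twice produces two distinct groups with the same label, since each call draws a fresh UUID.
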
>Issue `CONTENT UNADDRESSABLE`
>This means that while the **vaults** of various users will refer to the same files by the same **Entity** addresses - because a file is uniquely identified by its hash - this does not apply to any other objects such as **Groups**, as they are identified by a _UUID_, which is random. If two **vaults** were therefore combined, **entries** referring to the same files would "add up" correctly, and your existing **entries** about given files would be complemented by the **entries** of the other **vault**, but any **groups** would potentially be duplicated.
>This is an inherent problem, and cannot be easily solved; if everything were content-addressed, including **groups**, any single change (such as adding or removing a file from a **group**) would ripple throughout the entire system, as other related **entries** would have to update their **entities** or **values** to match this new address, which would change their content, and therefore their hash, and so on. Furthermore, this would also mean that no two folders (or **groups**) could ever share names, for example, as their content would at one point be identical, and therefore their identity as well. UUIDs provide a way for two otherwise identical **objects** to coexist.
>Not all is lost, though - two vaults can still be combined in a meaningful way that allows mutual understanding - but this requires an explicit mechanism for resolving the semantics of combining _UUID_-referenced objects such as **groups** (in other words, a separate addressing scheme). For example, if it's desirable that the `music` **group** is always the same no matter what vault you happen to be in (and thus two users categorizing their favorite articles in the `music` group can see each other's articles), a convention can be established that all "universal" **groups** also receive an entry with a `Universal Key` **attribute**, which is then used to tell which **groups** are supposed to be the same across different vaults - and which are, for example, just a `music` group someone happened to create to categorize their favorite songs.
>Notably, this issue is completely moot unless you happen to compare different **vaults**. If all you're concerned with is a single **vault** on your computer, you don't need to worry at all about *UUID* objects.
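
The merging behavior described in the note above can be sketched in Python. This illustrates the reasoning only and is not an UpEnd feature: `merge`, `universal_groups` and `unify` are hypothetical helpers, entries are plain `(entity, attribute, value)` tuples, and picking the smallest group id as the canonical one is an arbitrary assumption. Only the `Universal Key` attribute comes from the convention suggested in the text.

```python
def merge(vault_a, vault_b):
    """Naive merge: concatenate both entry lists and drop exact duplicates."""
    return list(dict.fromkeys(vault_a + vault_b))


def universal_groups(entries):
    """Map each `Universal Key` value to the set of entities claiming it."""
    by_key = {}
    for entity, attribute, value in entries:
        if attribute == "Universal Key":
            by_key.setdefault(value, set()).add(entity)
    return by_key


def unify(entries):
    """Rewrite entities sharing a Universal Key to one canonical id (assumption: the smallest)."""
    canonical = {}
    for group_ids in universal_groups(entries).values():
        target = min(group_ids)
        for group_id in group_ids:
            canonical[group_id] = target
    rewritten = [(canonical.get(e, e), a, canonical.get(v, v)) for e, a, v in entries]
    return list(dict.fromkeys(rewritten))


alice = [
    ("uuid:aaaa", "Label", "music"),
    ("uuid:aaaa", "Universal Key", "music"),
    ("uuid:aaaa", "Contains", "hash:track01"),
    ("hash:track01", "Artist", "Various Artists"),
]
bob = [
    ("uuid:bbbb", "Label", "music"),
    ("uuid:bbbb", "Universal Key", "music"),
    ("uuid:bbbb", "Contains", "hash:track02"),
    ("hash:track01", "Artist", "Various Artists"),
]

merged = merge(alice, bob)
# File entries "add up": the shared Artist entry appears once.
# The two `music` groups stay separate, because their UUIDs differ.
print(len({e for e, a, v in merged if a == "Label" and v == "music"}))  # 2

unified = unify(merged)
print(sorted(v for e, a, v in unified if a == "Contains"))  # ['hash:track01', 'hash:track02']
```

Without the `Universal Key` entries, `unify` leaves the merged vault unchanged, and the two `music` groups simply coexist as separate objects.
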