# UpEnd - A Conceptual Tutorial

UpEnd is not a traditional database - at its core, there are no tables, objects, or files. An entire UpEnd **Vault** is a flat list of **entries**.

An **entry** is a single "statement" that's true within the system. The core of an entry is an _Entity/Attribute/Value triplet_. For example:

| Entity                  | Attribute     | Value           |
| ----------------------- | ------------- | --------------- |
| John                    | Age           | 23              |
| Prague                  | Is In Country | Czech Republic  |
| (hash of) `track01.mp3` | Artist        | Various Artists |
| https://upend.dev       | Title         | UpEnd           |

Formally speaking:

- **Entity** is the thing the statement is about. It can be one of the following:
  - **Hash**, typically of a file, but also possibly of another **entry**.
  - [**UUID**](https://en.wikipedia.org/wiki/Universally_unique_identifier), for arbitrary objects that exist solely within UpEnd (groups/tags, annotations, etc.)
  - **URL**, for anything that exists on the web.
  - **Attribute**, for statements about UpEnd's attributes themselves (such as their alternative names).
- **Attribute** is the "kind" of a statement.
  - It can be any text string, but some attributes are "reserved" by UpEnd by default.
- **Value** is the actual "fact" you're stating. It can be one of the following:
  - A text string
  - A number
  - An address of an **entity**.

(Each **entry** also has a _timestamp_, denoting when it was added, and a _provenance_, i.e. the origin of the entry - whether it was added by an automatic process or by a user. A full example of an **entry** would therefore be `John / Age / 23 / 2023-05-01 19:20:00 / API IMPORT`.)
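
To make the shape of an **entry** concrete, here is a minimal sketch in Python. It is not UpEnd's actual implementation: the class names (`Hash`, `Uuid`, `Url`, `AttributeAddress`, `Entry`) are made up for illustration, and SHA-256 is only an assumed stand-in for whatever hash function UpEnd really uses.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime
from typing import Union


# Hypothetical address types, one per kind of entity listed above.
@dataclass(frozen=True)
class Hash:
    digest: str


@dataclass(frozen=True)
class Uuid:
    value: str


@dataclass(frozen=True)
class Url:
    value: str


@dataclass(frozen=True)
class AttributeAddress:
    name: str


Address = Union[Hash, Uuid, Url, AttributeAddress]
# A value is a text string, a number, or the address of an entity.
Value = Union[str, int, float, Address]


@dataclass(frozen=True)
class Entry:
    entity: Address
    attribute: str
    value: Value
    timestamp: datetime
    provenance: str


# Assumption: SHA-256 over placeholder bytes stands in for the hash of `track01.mp3`.
track = Hash(hashlib.sha256(b"placeholder audio bytes").hexdigest())
added = datetime(2023, 5, 1, 19, 20, 0)

# The vault itself is nothing more than a flat list of entries.
vault = [
    Entry(Url("https://upend.dev"), "Title", "UpEnd", added, "API IMPORT"),
    Entry(track, "Artist", "Various Artists", added, "API IMPORT"),
]
```

Note that nothing in this sketch resembles a table, a file, or an object; those only appear once entries are combined.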

All other concepts within UpEnd arise as a consequence of combinations of **entries**.

**Objects** emerge as multiple **entries** with the same **entity** accrue. In other words, an **object** is a collection of entries pointing to the same **entity**. A file object may therefore look something like:

| Entity                  | Attribute | Value            |
| ----------------------- | --------- | ---------------- |
| (hash of) `photo01.jpg` | Author    | John Doe         |
| (hash of) `photo01.jpg` | Label     | photo01.jpg      |
| (hash of) `photo01.jpg` | Label     | Birthday 001.jpg |
| (hash of) `photo01.jpg` | Taken at  | 2020-04-01       |
| (hash of) `photo01.jpg` | ...       | ...              |

(In the UI, the **Entity** part of entry listings is often left out, as it's redundant and implied by the object view.)
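
The idea that an **object** is just "all entries sharing an entity" can be sketched as a simple grouping step. This is an illustration only: entries are reduced to plain tuples, the `hash:photo01.jpg` string is a stand-in for a real hash address, and the `objects` helper is hypothetical.

```python
from collections import defaultdict

# Entries as plain (entity, attribute, value) tuples; the entity strings are
# stand-ins for real addresses.
entries = [
    ("hash:photo01.jpg", "Author", "John Doe"),
    ("hash:photo01.jpg", "Label", "photo01.jpg"),
    ("hash:photo01.jpg", "Label", "Birthday 001.jpg"),
    ("hash:photo01.jpg", "Taken at", "2020-04-01"),
    ("https://upend.dev", "Title", "UpEnd"),
]


def objects(entries):
    """Group a flat list of entries by entity; each group is one object."""
    grouped = defaultdict(list)
    for entity, attribute, value in entries:
        # The entity is dropped from each row, just as in the UI's object view.
        grouped[entity].append((attribute, value))
    return dict(grouped)


photo = objects(entries)["hash:photo01.jpg"]
# An attribute may occur more than once on the same object.
labels = [value for attribute, value in photo if attribute == "Label"]
```

Here `labels` holds both labels of the photo, which shows that an object is not a record with one slot per attribute but an open-ended set of statements.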

However, while a file object has an obvious **entity** to point to, a _Tag_ or a folder has no inherent identity of its own, and therefore no hash. This is the purpose of [_UUIDs_](https://en.wikipedia.org/wiki/Universally_unique_identifier). A _UUID_ is randomly generated for every object as needed.
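
The difference between the two kinds of address can be shown in a few lines. This is a sketch under assumptions: SHA-256 stands in for UpEnd's real hash function, and both helper names are hypothetical.

```python
import hashlib
import uuid


def file_address(content: bytes) -> str:
    # Assumption: SHA-256 stands in for whatever hash UpEnd actually uses.
    return hashlib.sha256(content).hexdigest()


def new_object_address() -> str:
    # A random (version 4) UUID; it carries no information about the object.
    return str(uuid.uuid4())


# The same bytes always yield the same address, on any machine.
file_a = file_address(b"same bytes")
file_b = file_address(b"same bytes")

# Two tags created independently get different addresses,
# even if they are later given identical labels.
tag_a = new_object_address()
tag_b = new_object_address()
```

In this sketch `file_a` and `file_b` are equal, while `tag_a` and `tag_b` differ.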

A **Group** is the equivalent of a folder or a tag. Its purpose is to serve as a collection of related items.

It is a "conventional" object - nothing about UpEnd requires **Groups** to exist, but since they provide a very useful abstraction, there is built-in functionality that works with **Groups**, as well as affordances in the UI. A **Group** looks like this:

| Entity                                        | Attribute | Value                   |
| --------------------------------------------- | --------- | ----------------------- |
| `f9305ca5-eabd-4a97-9aa4-37036d2a6ca4` (UUID) | Label     | Birthday Photos         |
| `f9305ca5-eabd-4a97-9aa4-37036d2a6ca4` (UUID) | Contains  | (hash of) `photo01.jpg` |
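
A sketch of how such a **Group** could be built and queried from plain entries follows. The helpers `make_group` and `members_of` are hypothetical, entries are simplified to tuples, and `hash:photo02.jpg` is an invented second member added only for illustration.

```python
import uuid


def make_group(label, members):
    """Create the entries describing a new group; returns (address, entries)."""
    group = str(uuid.uuid4())  # a group has no content to hash, so it gets a UUID
    entries = [(group, "Label", label)]
    entries += [(group, "Contains", member) for member in members]
    return group, entries


def members_of(entries, group):
    """List everything a group contains by scanning the flat entry list."""
    return [
        value
        for entity, attribute, value in entries
        if entity == group and attribute == "Contains"
    ]


# "hash:..." strings are stand-ins for real file hashes.
group, vault = make_group(
    "Birthday Photos", ["hash:photo01.jpg", "hash:photo02.jpg"]
)
contents = members_of(vault, group)
```

Nothing here is special to groups: `Label` and `Contains` are ordinary attributes, which is what makes a **Group** a convention rather than a built-in type.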

>Issue `CONTENT UNADDRESSABLE`  
>This means that while the **vaults** of various users will refer to the same files by the same **Entity** addresses - because a file is uniquely identified by its hash - this does not apply to any other objects such as **Groups**, as they are identified by a _UUID_, which is random. If two **vaults** were therefore combined, **entries** referring to the same files would "add up" correctly, and your existing **entries** about given files would be complemented by the **entries** of the other **vault**, but any **groups** would potentially be duplicated.  
>This is an inherent problem, and cannot be easily solved; if everything were content-addressed, including **groups**, any single change (such as adding or removing a file from a **group**) would ripple throughout the entire system, as other related **entries** would have to update their **entities** or **values** to match this new address, which would change their content, and therefore their hash, and so on. Furthermore, this would also mean that no two folders (or **groups**) could ever share names, for example, as their content would at one point be identical, and therefore their identity as well. UUIDs provide a way for two otherwise identical **objects** to coexist.  
>Not all is lost, though - two vaults can still be combined in a meaningful way that allows mutual understanding - but doing so requires an explicit mechanism for resolving the semantics of combining _UUID_-identified objects such as **groups** (in other words, a separate addressing scheme). For example, if it's desirable that the `music` **group** is always the same no matter what vault you happen to be in (so that two users categorizing their favorite articles in the `music` group can see each other's articles), a convention can be established that all "universal" **groups** also receive an entry with a `Universal Key` **attribute**. This attribute is then used to tell which **groups** are supposed to be the same across different vaults - and which are, for example, just a `music` group someone happened to create to categorize their favorite songs.  
>Notably, this issue is completely moot unless you happen to compare different **vaults**. If all you're concerned with is a single **vault** on your computer, you don't need to worry at all about *UUID* objects.
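
The merging behavior described above can be sketched as follows. This is an illustration of the idea, not an existing UpEnd mechanism: `make_vault`, `merge`, and `unify_groups` are hypothetical helpers, the `Note` attribute is invented, and picking the alphabetically first UUID as the canonical one is an arbitrary choice.

```python
import uuid


def make_vault(note):
    """A tiny vault: one file entry plus a universal `music` group."""
    group = str(uuid.uuid4())
    return {
        ("hash:track01.mp3", "Note", note),  # "Note" is an invented attribute
        (group, "Label", "music"),
        (group, "Contains", "hash:track01.mp3"),
        (group, "Universal Key", "music"),
    }


def merge(a, b):
    """Naive merge: the union of both flat entry sets."""
    return a | b


def unify_groups(entries):
    """Map all entities sharing a Universal Key onto one canonical address."""
    canonical = {}  # universal key -> chosen address
    rename = {}     # any address -> canonical address
    for entity, attribute, value in sorted(entries):
        if attribute == "Universal Key":
            rename[entity] = canonical.setdefault(value, entity)
    return {
        (rename.get(e, e), a, rename.get(v, v)) for e, a, v in entries
    }


def music_groups(entries):
    return {e for e, a, v in entries if a == "Label" and v == "music"}


merged = merge(make_vault("great track"), make_vault("overrated"))
unified = unify_groups(merged)

file_notes = [v for e, a, v in merged if e == "hash:track01.mp3"]
```

After the naive merge, both notes about `track01.mp3` sit on the same entity, but `music_groups(merged)` contains two distinct UUIDs. Only after `unify_groups` does a single `music` group remain.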