Grid Document
- Feature Name: Grid Document
- Start Date: 2022-08-01
- RFC PR:
- Grid Issue:
Summary
Grid Document allows for the sharing of arbitrary documents between parties on a Grid network. Documents are managed in a simple filesystem-like model with a single layer of uniquely named folders and the ability to upload uniquely named documents to those locations.
Motivation
Grid Document supports flexible prototyping of new data concepts using a document metaphor. Exchanging data using document records on a Grid network may be an attractive precursor to designing and implementing standards-based or new custom data models and workflows for specific record types. In addition, Grid Document enables use cases where document sharing is the end goal.
Guide-level explanation
Entities
Grid Document persists and allows access to local files as opaque objects, organized into folders.
A file is referenced using its name and its parent folder name. The contents of the file, encoded by the client, are stored in state as bytes.
A folder will have a list of names, referencing the files it contains. Folders also have a unique name property.
Transactions
To add or remove a file or folder in state, a transaction must be submitted. Grid Document defines four transactions, used to create a folder, delete a folder, create a file, or delete a file. Currently, no permissions are enforced when handling state objects. The Future Considerations section explains how Grid Document may be extended to enforce permissions.
Create a folder
The operation to create a folder will take in a unique folder name and add it to state. If the folder already exists, the transaction is invalid.
Delete a folder
The operation to delete a folder will take in the folder name and remove it from state. The folder must exist and be empty; otherwise, the transaction is invalid.
Create a file
The operation to create a file will take in a destination folder, file name, and file contents. The file will be stored in state in relation to the destination folder.
Delete a file
The operation to delete a file will take in the file name and the name of its parent folder. The file name will be removed from the parent folder and the file removed from state.
Reference-level explanation
State
File Representation
The primary object that may be stored in state is a “File”, which represents an opaque version of the local file with a name and contents. As file contents are stored and handled as bytes, Grid Document cannot see any file details beyond the designated name. This file is addressed in state in relation to a folder. The transaction responsible for creating a file must ensure the name of a new file is unique within the destination folder.
Grid Document stores objects in state using the protobuf message format. Therefore, the file’s content size is limited to the capacity of its associated data type. Based on the protocol buffer documentation, bytes have a maximum size of 2 GB. This information and more on protobuf encoding can be found in the [Protocol Buffer Developer’s Guide](https://developers.google.com/protocol-buffers/docs/encoding). The transaction responsible for creating a file must validate the content is not larger than 2 GB.
A “File” is represented as a protobuf message in state, as follows:
message File {
    string name = 1;
    bytes content = 2;
}
Folder Representation
The other object that may be stored in state is a “Folder”, which holds lists of files. In the Merkle-Radix state, a folder is represented by its unique name and a list of the names of the files it contains. The DocumentRoot object represents a list of all the folders in state. The transaction responsible for creating a folder must validate its name is unique in this list.
To ensure the Merkle-Radix trie does not become overloaded with data, which leads to more hash collisions and slower state operations, the overall number of folders and the number of files per folder will be limited. This keeps state operations at folder addresses manageable. Therefore, the transaction responsible for creating a folder must validate that state has not reached its maximum folder capacity, and the transaction responsible for creating a file must validate that the destination folder has not reached its maximum file capacity.
The protocol buffer representations of DocumentRoot and Folder are as follows:
message DocumentRoot {
    repeated Folder folders = 1;
}
message Folder {
    string name = 1;
    repeated string files = 2;
}
Transaction Payloads and Execution
DocumentPayload Transaction
DocumentPayload has an enum field, containing the only possible actions, and fields for each of the associated actions’ payloads. The enum field determines how the transaction payload will be handled. Only one Action variant can be defined for a payload. Therefore, only one action payload, correlating with the defined Action, is processed in a transaction.
message DocumentPayload {
    enum Action {
        UNSET_ACTION = 0;
        FOLDER_CREATE = 1;
        FOLDER_DELETE = 2;
        FILE_CREATE = 3;
        FILE_DELETE = 4;
    }
    Action action = 1;
    FolderCreateAction folder_create = 2;
    FolderDeleteAction folder_delete = 3;
    FileCreateAction file_create = 4;
    FileDeleteAction file_delete = 5;
}
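The dispatch rule described above can be sketched in Python. This is a minimal illustration rather than the Grid implementation: the dataclass stands in for the decoded protobuf message, the sub-payloads are plain dicts, and `InvalidTransaction`, `ACTION_FIELDS`, and `select_action_payload` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    UNSET_ACTION = 0
    FOLDER_CREATE = 1
    FOLDER_DELETE = 2
    FILE_CREATE = 3
    FILE_DELETE = 4


class InvalidTransaction(Exception):
    """Hypothetical error type for a rejected transaction."""


@dataclass
class DocumentPayload:
    # Stand-in for the decoded protobuf message; sub-payloads are plain dicts.
    action: Action = Action.UNSET_ACTION
    folder_create: Optional[dict] = None
    folder_delete: Optional[dict] = None
    file_create: Optional[dict] = None
    file_delete: Optional[dict] = None


# Each action is paired with the one payload field that carries its arguments.
ACTION_FIELDS = {
    Action.FOLDER_CREATE: "folder_create",
    Action.FOLDER_DELETE: "folder_delete",
    Action.FILE_CREATE: "file_create",
    Action.FILE_DELETE: "file_delete",
}


def select_action_payload(payload: DocumentPayload) -> dict:
    """Return the sub-payload matching the action; other fields are ignored."""
    field_name = ACTION_FIELDS.get(payload.action)
    if field_name is None:
        raise InvalidTransaction("action must be set")
    body = getattr(payload, field_name)
    if body is None:
        raise InvalidTransaction(f"{field_name} payload is missing")
    return body
```

Keeping the action-to-field mapping in one table means a payload with several fields populated is still handled by exactly one action.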
FolderCreateAction
FolderCreateAction creates a new unique folder in state.
message FolderCreateAction {
    string name = 1;
}
Validation requirements:
- Folder must not exist in state
- Name must be unique within the DocumentRoot
- Name must not contain any special characters
- Length of DocumentRoot’s folders list must not exceed the limit
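The folder-creation checks listed above could look roughly like the following Python sketch. Several details are assumptions made only for illustration: the `MAX_FOLDERS` value (the limit is still an open question), the character set treated as non-special, the dict used as a stand-in for DocumentRoot, and the `InvalidTransaction` and `validate_folder_create` names.

```python
import re
from typing import Dict, List

MAX_FOLDERS = 100  # hypothetical limit; the RFC leaves the value open
# Assumption: "special characters" means anything outside this set.
NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]+")


class InvalidTransaction(Exception):
    """Hypothetical error type for a rejected transaction."""


def validate_folder_create(document_root: Dict[str, List[str]], name: str) -> None:
    """Raise InvalidTransaction if a folder called `name` may not be created.

    `document_root` maps each existing folder name to its list of file names.
    """
    if NAME_PATTERN.fullmatch(name) is None:
        raise InvalidTransaction(f"folder name {name!r} contains special characters")
    if name in document_root:
        raise InvalidTransaction(f"folder {name!r} already exists")
    if len(document_root) >= MAX_FOLDERS:
        raise InvalidTransaction("maximum number of folders reached")
```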
FolderDeleteAction
FolderDeleteAction removes an empty folder from state.
message FolderDeleteAction {
    string name = 1;
}
Validation requirements:
- Length of the folder’s files list must be 0
- Folder must exist in state
FileCreateAction
FileCreateAction creates a new file in a specified folder in state, creating the folder if necessary.
message FileCreateAction {
    string folder = 1;
    string name = 2;
    bytes content = 3;
}
Validation requirements:
- Name must be unique within the folder
- Name must not contain any special characters
- Length of the folder’s files list must not exceed the limit
- Content size must not be over 2 GB
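A possible shape for the file-creation checks is sketched below in Python. The per-folder limit, the reading of “2 GB” as 2 GiB, the set of allowed name characters, and the `InvalidTransaction` and `validate_file_create` names are all assumptions for illustration; the size limit is a parameter so that it can be exercised with small inputs.

```python
import re
from typing import List

MAX_FILES_PER_FOLDER = 1000      # hypothetical limit; the RFC leaves the value open
MAX_CONTENT_BYTES = 2 * 1024**3  # assumption: "2 GB" read as 2 GiB
NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]+")  # assumed set of allowed characters


class InvalidTransaction(Exception):
    """Hypothetical error type for a rejected transaction."""


def validate_file_create(
    folder_files: List[str],
    name: str,
    content: bytes,
    max_content_bytes: int = MAX_CONTENT_BYTES,
) -> None:
    """Raise InvalidTransaction if the file may not be created.

    `folder_files` is the destination folder's current list of file names.
    """
    if NAME_PATTERN.fullmatch(name) is None:
        raise InvalidTransaction(f"file name {name!r} contains special characters")
    if name in folder_files:
        raise InvalidTransaction(f"file {name!r} already exists in the folder")
    if len(folder_files) >= MAX_FILES_PER_FOLDER:
        raise InvalidTransaction("maximum number of files in folder reached")
    if len(content) > max_content_bytes:
        raise InvalidTransaction("file content exceeds the size limit")
```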
FileDeleteAction
FileDeleteAction removes an existing file from state. This transaction is invalid if the folder or file does not exist in state.
message FileDeleteAction {
    string folder = 1;
    string name = 2;
}
Validation requirements:
- Folder must exist in state
- File must exist in state
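File deletion touches two pieces of state: the parent folder’s list of file names and the file entry itself. The Python sketch below illustrates this with in-memory dicts standing in for state; the dict layout and the `InvalidTransaction` and `delete_file` names are assumptions for this example only.

```python
from typing import Dict, List, Tuple


class InvalidTransaction(Exception):
    """Hypothetical error type for a rejected transaction."""


def delete_file(
    folders: Dict[str, List[str]],
    files: Dict[Tuple[str, str], bytes],
    folder: str,
    name: str,
) -> None:
    """Remove a file from in-memory stand-ins for folder and file state."""
    if folder not in folders:
        raise InvalidTransaction(f"folder {folder!r} does not exist")
    if name not in folders[folder] or (folder, name) not in files:
        raise InvalidTransaction(f"file {name!r} does not exist in {folder!r}")
    folders[folder].remove(name)  # drop the name from the parent folder's list
    del files[(folder, name)]     # drop the file entry itself
```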
Document Addressing in the Merkle-Radix State System
Grid Document defines a formula to compute the Merkle-Radix trie address for its state objects. Merkle-Radix addresses consist of 70 hex characters. All Grid addresses are prefixed with the 6-hex-character “621dee” namespace. Grid Document state is further namespaced with a “07” prefix, so all state entries with an address beginning with “621dee07” are Grid Document objects. This section of state is further namespaced for folders and files using a 2-hex-character string: the “00” namespace is reserved for folders and the “01” namespace is reserved for files.
Grid Document state addresses begin with the previously defined namespaces, leaving 60 characters to construct. The name of a file or folder will be hashed and a subset of the first characters of the resulting hash is used for the remaining address.
A folder’s address uses the first 10 hex characters of the SHA-512 hash of the folder name, followed by 50 trailing zeroes so that the address reaches the full 70 characters. The formula to construct a folder’s Merkle-Radix address follows:
“621dee” + “07” + “00” + Sha512(folder_name)[:10] + “00000000000000000000000000000000000000000000000000”
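As a rough illustration, the folder address could be computed in Python as follows. The sketch assumes that names are hashed as UTF-8 bytes and that the address is zero-padded up to the 70-character length stated above; the constant and function names are hypothetical.

```python
import hashlib

GRID_NAMESPACE = "621dee"   # prefix shared by all Grid addresses
DOCUMENT_NAMESPACE = "07"   # Grid Document
FOLDER_NAMESPACE = "00"     # folders
ADDRESS_LENGTH = 70         # hex characters in a Merkle-Radix address


def folder_address(folder_name: str) -> str:
    """Return the state address for a folder."""
    # Assumption: the folder name is hashed as UTF-8 bytes.
    name_hash = hashlib.sha512(folder_name.encode("utf-8")).hexdigest()
    prefix = GRID_NAMESPACE + DOCUMENT_NAMESPACE + FOLDER_NAMESPACE + name_hash[:10]
    # Pad with trailing zeroes up to the full address length.
    return prefix.ljust(ADDRESS_LENGTH, "0")
```

Calling `folder_address("invoices")` yields a 70-character string that starts with `621dee0700` and ends in a run of zeroes.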
Grid Document allows files to be stored in relation to a folder, similar to native environments. A file’s address includes a hash of both the unique folder name and the file name. As with folder addresses, a file address contains the first 10 hex characters of the hashed folder name. The remaining 50 characters of the address are taken from the hashed file name, so that the address reaches the full 70 characters. Therefore, the formula for creating a Grid Document file address follows:
“621dee” + “07” + “01” + Sha512(folder_name)[:10] + Sha512(file_name)[:50]
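A file address could be derived as in the Python sketch below. It assumes UTF-8 hashing of names and assumes the file-name hash contributes the 50 characters needed to reach the 70-character address length; the helper names are hypothetical. The sketch also exposes the 20-character prefix that all files in one folder share, which a client could use to scan a folder’s files.

```python
import hashlib

FILE_PREFIX = "621dee" + "07" + "01"  # Grid + Grid Document + file namespace


def _sha512_hex(value: str) -> str:
    # Assumption: names are hashed as UTF-8 bytes.
    return hashlib.sha512(value.encode("utf-8")).hexdigest()


def folder_file_prefix(folder_name: str) -> str:
    """20-character prefix shared by every file stored in the folder."""
    return FILE_PREFIX + _sha512_hex(folder_name)[:10]


def file_address(folder_name: str, file_name: str) -> str:
    """70-character address: folder prefix plus 50 characters of the file-name hash."""
    return folder_file_prefix(folder_name) + _sha512_hex(file_name)[:50]
```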
Client Designs
Command Line Interface
Commands will be added to support creating and accessing files and folders stored by Grid. Grid Document commands will begin with the grid doc subcommand.
grid-doc-cp [FLAGS] [OPTIONS] {SOURCE} {DESTINATION}
This command will copy the bytes of a file, decode it to a specified format, and download the resulting file to a destination folder.
When this command copies to a remote destination, a Sabre transaction is submitted to create a file.
The cp command will fail in either direction if the destination directory does not exist.
The command supports file globbing. When specifying multiple local files, normal shell globbing is used. When specifying multiple remote files, the CLI will perform globbing against the list of remote files.
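Remote globbing could be performed client-side along the lines of the Python sketch below, which matches a pattern against a folder’s file list using the standard `fnmatch` module. The `expand_remote_glob` name, the dict standing in for the remote listing, and the choice of case-sensitive matching are assumptions for illustration.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

REMOTE_PREFIX = "remote::"


def expand_remote_glob(source: str, remote_files: Dict[str, List[str]]) -> List[str]:
    """Expand a source such as remote::/documents/*.txt into matching remote paths.

    `remote_files` maps folder names to file names, e.g. as fetched from the
    REST API before the copy starts.
    """
    if not source.startswith(REMOTE_PREFIX):
        raise ValueError("source is not a remote path")
    path = source[len(REMOTE_PREFIX):].strip("/")
    folder, _, pattern = path.partition("/")
    # Case-sensitive match of each file name against the glob pattern.
    matches = [n for n in remote_files.get(folder, []) if fnmatchcase(n, pattern)]
    return [f"{REMOTE_PREFIX}/{folder}/{n}" for n in sorted(matches)]
```

Quoting the remote pattern on the command line, as in the examples that follow, keeps the local shell from expanding it before the CLI sees it.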
ARGS
SOURCE
- Specify the source file or folder being copied.
DESTINATION
- Specify a destination to save the copied contents.
EXAMPLES
$ grid doc cp \
remote::/invoices/invoice_01.inv \
new_invoice.inv
The above command will attempt to copy a remote file, /invoices/invoice_01.inv, to a local destination, new_invoice.inv.
This command may also be used to copy a local file to a remote location.
$ grid doc cp \
local_file_name.txt \
remote::/documents
The above command will attempt to copy the local file, local_file_name.txt, to a remote destination, /documents, in Grid Document state.
$ grid doc cp \
*.txt \
remote::/documents
The above command will copy all files in the current directory with the .txt extension to the remote documents folder.
$ grid doc cp \
* \
remote::/documents
The above command will copy all files in the current directory to the remote documents folder.
$ grid doc cp \
"remote::/documents/*.txt" \
local_folder/
The above command will copy all files with the .txt extension from the documents folder in Grid Document to the local folder local_folder/.
grid-doc-mkdir [FLAGS] [OPTIONS] {NAME}
This command will create a folder. The command submits a Sabre transaction, validated by the Grid Document smart contract, to create a new folder in Grid Document state.
ARGS
NAME
- Unique name of the folder to be created
EXAMPLES
$ grid doc mkdir invoices
The above command will attempt to create a folder named “invoices” in Grid Document state.
grid-doc-ls [FLAGS] [OPTIONS] {REMOTE_DIRECTORY}
This command will list contents in state. Listed contents are limited to a remote directory, if specified.
A couple of hidden aliases should exist for this command: “grid doc list” and “grid doc dir”.
ARGS
REMOTE_DIRECTORY
- Specify a remote directory to list contents.
EXAMPLES
$ grid doc ls invoices
The above command will list all files in the ‘invoices’ folder in Grid Document state. The next command will list all folders in Grid Document state.
$ grid doc ls
grid-doc-rm [FLAGS] [OPTIONS] {FILE}
This command will delete a file. The command submits a Sabre transaction to delete the file from Grid Document state.
A couple of aliases could be provided for this command: “grid doc delete” and “grid doc del”.
This command also supports file globbing.
ARGS
FILE
- Name of the file to be removed.
OPTIONS
-r
- Remove the folder recursively, first removing all files in the folder then the folder itself.
EXAMPLES
$ grid doc rm \
/invoices/invoice_01.inv
The above command will attempt to delete the remote file, invoice_01.inv, from the remote folder, /invoices. The following command will attempt to delete a folder, /invoices.
$ grid doc rm \
/invoices
Example 1:
$ grid doc rm \
a_real_folder/a_real_file
The above command deletes the file a_real_file from the a_real_folder folder.
$ grid doc rm \
"a_real_folder/*.txt"
The above command deletes all files with the .txt extension from the Grid Document folder a_real_folder.
$ grid doc rm \
"a_real_folder/*"
The above command will delete all files from the ‘a_real_folder’ folder. Unlike the -r option, it leaves the now-empty folder in place.
grid-doc-rmdir [FLAGS] [OPTIONS] {FOLDER}
This command will attempt to delete an empty directory. The command submits a Sabre transaction to delete the folder from Grid Document state.
ARGS
FOLDER
- Name of the empty folder to be removed.
EXAMPLES
The following command will attempt to delete the remote folder, /invoices.
$ grid doc rmdir \
invoices
REST API
Grid’s REST API will be extended to include endpoints to access Grid Document state and to submit transactions. The required endpoints are as follows:
GET /docs
This endpoint will list all folders. The response body will be formatted as a paginated JSON list.
GET /docs/{folder}
This endpoint will list all files in the specified folder. The response body will be formatted as a paginated JSON list.
POST /docs/{folder}
This endpoint will accept both JSON- and byte-encoded payloads to create a folder. The signed batch is then persisted in Grid’s database. Once the batch has been successfully stored, the endpoint will respond with a JSON-formatted list of the batch’s identifiers.
DELETE /docs/{folder}
This endpoint will accept both JSON- and byte-encoded payloads to delete a folder. The signed batch is then persisted in Grid’s database. Once the batch has been successfully stored, the endpoint will respond with a JSON-formatted list of the batch’s identifiers.
GET /docs/{folder}/{file}
This endpoint will download the contents of the specified file.
POST /docs/{folder}/{file}
This endpoint will accept both JSON- and byte-encoded payloads to create a file. The signed batch is persisted in Grid’s database. Once the batch has been successfully stored, the endpoint will respond with a JSON-formatted list of the batch’s identifiers.
DELETE /docs/{folder}/{file}
This endpoint will accept both JSON- and byte-encoded payloads to delete a file. The signed batch is then persisted in Grid’s database. Once the batch has been successfully stored, the endpoint will respond with a JSON-formatted list of the batch’s identifiers.
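A client might build the endpoint paths listed above as in the following Python sketch. Percent-encoding of folder and file names is an assumption of this example, not something the RFC specifies, and `docs_path` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import quote


def docs_path(folder: Optional[str] = None, file: Optional[str] = None) -> str:
    """Build /docs, /docs/{folder}, or /docs/{folder}/{file}."""
    if file is not None and folder is None:
        raise ValueError("a file path requires a folder")
    parts = ["docs"]
    if folder is not None:
        # Assumption: names are percent-encoded so they stay one path segment.
        parts.append(quote(folder, safe=""))
    if file is not None:
        parts.append(quote(file, safe=""))
    return "/" + "/".join(parts)
```

The same path serves several operations, so the HTTP method (GET, POST, or DELETE) selects between listing or downloading, creating, and deleting.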
Future Considerations
Grid Document is intentionally simple to use but may be extended in the future to support more unique and robust business scenarios. This may include allowing for a greater folder depth than one, allowing other smart contracts to access and edit Grid Document state, and incorporating permissions. This section explains how Grid Document may be improved in the future.
Folder Depth
Grid Document currently supports a single layer of folders. More complex folder organization may be implemented by augmenting the addressing formulas for files or folders. The state address of a folder is currently calculated using a pre-defined prefix and a hash of the folder’s name. More complex addressing techniques may include allowing relative paths in file name hashes, hashing a folder’s parent directories along with its unique name, or prefixing an address’s namespace to separate files and additional content. Adding these to the addressing formulas defined earlier would allow more flexibility to store additional folders, without rendering the predefined formulas useless.
Relative path in file names
One route to allowing more complex file organization in Grid Document state is to allow relative paths within a file name. Files could include a slash character to represent the successive levels of directories. Folders in state would remain in a single layer. This would also use the same addressing formulas defined earlier and does not add complexity to storing additional folder contents.
Relative path in folder names
Support for multi-level folder schemes could be incorporated by enforcing a format to indicate multiple folders in a path. Users are accustomed to folder and file paths joined by a slash character, “/”, which makes it simple for users to understand within Grid Document.
The folder state object will be extended to include a list of folder names to support folder depth greater than one. Grid Document would use the common slash character, “/”, to indicate nested folder references. The transaction to create a folder could take in a nested folder path, separated by “/”. If one of the folders in the path does not exist, the transaction would be invalid. If all folders in the path exist, the folder will be created within the destination folder. Furthermore, a file may also be saved within a nested folder. The folder argument in the FileCreateAction payload will also take a path separated by a slash character. If any of the folders in this path do not exist, the transaction is invalid.
The full path of a folder in Grid Document state, including parent folders up to DocumentRoot, may be used to calculate state addresses rather than just the unique folder name. However, allowing more folders in state may also necessitate increasing the number of characters used to refer to the folder in state addresses, to avoid hash collisions. This may also necessitate more safeguards to ensure state size does not grow exponentially.
Folder namespace prefixes
Additional folder layers may also be addressed in Grid Document by building on the already existing folder address. This address can be further prefixed using 2 hex characters, similar to the construction of the first part. The “00” prefix would be reserved for storing files. The formula for a file would be as follows:
“621dee” + “07” + “01” + Sha512(folder_name)[:10] + “00” + Sha512(file_name)[:48]
Grid Document may allow folders to be addressed under other folders using the following formula, where the first hash is of the parent folder’s name and the second is of the nested folder’s name:
“621dee” + “07” + “01” + Sha512(folder_name)[:10] + “01” + Sha512(folder_name)[:48]
The remaining namespaces may be defined in the future, depending on the needs of the smart contract or specific implementations.
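The two proposed formulas can be sketched together in Python. This example reads the slices as the first 10 and first 48 hex characters of each hash, which is an assumption chosen because it yields a 70-character address; UTF-8 hashing and all names in the code are likewise hypothetical.

```python
import hashlib

NESTED_PREFIX = "621dee" + "07" + "01"
FILE_MARKER = "00"    # entry is a file inside the folder
FOLDER_MARKER = "01"  # entry is a folder nested inside the folder


def _sha512_hex(value: str) -> str:
    # Assumption: names are hashed as UTF-8 bytes.
    return hashlib.sha512(value.encode("utf-8")).hexdigest()


def nested_entry_address(folder_name: str, entry_name: str, is_folder: bool) -> str:
    """Address of a file or nested folder stored under `folder_name`."""
    marker = FOLDER_MARKER if is_folder else FILE_MARKER
    return (
        NESTED_PREFIX
        + _sha512_hex(folder_name)[:10]
        + marker
        + _sha512_hex(entry_name)[:48]
    )
```

Because the marker sits directly after the parent folder’s hash, a file and a nested folder with the same name under one parent receive different addresses.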
Access to Grid Document state
Sharing documents is useful alongside other Grid smart contracts. Other smart contracts may read and/or write to Grid Document state by adding the namespace to transaction inputs and/or outputs. Addresses in the inputs of a transaction may be read, while addresses in the outputs may be written to by the transaction. This will be interesting in cases where users want to provide more context or corroborate inter-company communications. A simple example of this would be a supplier uploading an image of the product in a purchase order as it is shipped off.
Permissions
Grid Document does not initially enforce permissions, but may be extended to support both Grid Pike and Workflow. In both cases, Grid Document could validate permissions to access, create, or delete a file or folder. An organization would be indicated as the owner of a file or folder. The organization indicated as the owner defines if and how agents, within and outside of the organization, may interact with files and folders in state.
Incorporating Pike
Grid Document can validate permissions to create or delete a file or folder using Grid Pike. An agent is assigned one or more Pike roles, defined by the organization, which contain permissions to alter Grid state. Permissions are enforced by the Grid Pike smart contract when executing a Grid Document transaction. Files and folders will include an owner_id field, to indicate the owner’s Grid Pike organization ID. The owning organization is then able to assign roles for using Grid Document to agents within and outside of their organization.
If an organization maintains multiple folders with unique permissions, an additional field may be added to the folder state object. This field would be an organization-unique “group_id”, used in permissions to define an agent’s level of access to a folder. Permissions specific to these groups will be post-fixed with the unique “group_id” field. For example, permissions for a folder with the “group_id” field “01234” may include can-alter-01234, can-delete-01234, or can-create-01234.
State objects could be extended to support Grid Pike as follows:
message Folder {
    …
    string owner_id = 3;
    string group_id = 4; // optional
}
message File {
    …
    string owner_id = 3;
}
Transactions will validate that the submitting agent has been permitted to execute the transaction. The transaction messages and validation requirements will be extended as follows:
FolderCreateAction
To create a folder, an agent must be assigned the document::can-create-folder permission by the owning organization. If the agent does not have this permission, the transaction is invalid. If the transaction includes a group_id, the agent must have the document::can-create-<group_id> permission for the transaction to be valid. The transaction payload will be extended as follows:
message FolderCreateAction {
    …
    string owner_id = 2;
    string group_id = 3; // optional
}
FolderDeleteAction
In order to delete a folder, an agent must have been assigned the document::can-delete-folder permission by the owner organization. Otherwise, the transaction is invalid. If the specified folder contains a group_id, the agent must also have the document::can-delete-<group_id> permission for the transaction to be valid. The transaction payload will remain the same.
FileCreateAction
To create a file within a folder, an agent must have the document::can-create-file permission for the owner organization. If the destination folder contains a group_id, the agent must also have the document::can-alter-<group_id> permission. Otherwise, the transaction is invalid.
FileDeleteAction
An agent must be assigned the document::can-delete-file permission by the owner organization to delete a file. If the folder has a group_id, the agent must also have the document::can-alter-<group_id> permission. If the agent does not have the correct permissions, the transaction is invalid.
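The four permission rules above follow one pattern: a base permission per action plus an optional group-scoped permission. The Python sketch below captures that pattern in a lookup table. It does not call Grid Pike; the action keys and function names are hypothetical, and the agent’s permissions are assumed to be available as a plain set of strings.

```python
from typing import Iterable, Optional, Set

# action -> (base permission, verb used in the group-scoped permission)
REQUIRED = {
    "folder_create": ("can-create-folder", "can-create"),
    "folder_delete": ("can-delete-folder", "can-delete"),
    "file_create": ("can-create-file", "can-alter"),
    "file_delete": ("can-delete-file", "can-alter"),
}


def required_permissions(action: str, group_id: Optional[str] = None) -> Set[str]:
    """Permissions an agent needs for `action`, optionally scoped to a group."""
    base, verb = REQUIRED[action]
    required = {f"document::{base}"}
    if group_id:
        required.add(f"document::{verb}-{group_id}")
    return required


def is_permitted(
    agent_permissions: Iterable[str], action: str, group_id: Optional[str] = None
) -> bool:
    """True if the agent holds every permission the action requires."""
    return required_permissions(action, group_id) <= set(agent_permissions)
```

For example, creating a file in a folder with group_id “01234” requires both `document::can-create-file` and `document::can-alter-01234`.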
Workflow Permissions
Grid Workflow gives granular control over how agents interact with files and folders and the different states of those objects. The main workflow consists of sub-workflows, which represent unique workflow states and define how objects may move through these states. To enforce workflow permissions, files and folders will be extended to include workflow identifiers. Additionally, transactions will be extended to include workflow arguments, allowing Grid Document to validate transactions against the workflow permissions assigned to agents by their organization.
To support workflow permissions at the folder and file level, both state objects will have a workflow_state field, indicating the object’s position within the subworkflow. The folder object will hold a workflow identifier, allowing Grid Document access to the workflow’s state.
message Folder {
    …
    string workflow_state = 4;
    string workflow_id = 5;
}
message File {
    …
    string workflow_state = 4;
}
Furthermore, transactions will be extended to allow these state objects to move through workflow states. Workflow states allow for more complex permission configurations using constraints, which may be supported by adding fields to the transactions and/or state objects. Therefore, future validation requirements may vary from the ones described below to include more robust rules around workflow constraints. Grid Document transactions may appear as follows:
FolderCreateAction
The action to create a folder will include a workflow_id and workflow_state field. These fields must refer to an existing workflow and workflow state for the transaction to be valid. Additionally, the agent submitting the transaction must have the workflow permission document::can-create-folder within the specified workflow state.
message FolderCreateAction {
    …
    string workflow_id = 4;
    string workflow_state = 5;
}
FolderDeleteAction
The transaction payload to delete a folder does not need to be extended to support workflow permissions. An agent attempting to delete a folder must have the document::can-delete-folder permission for the current workflow state of the folder. If the agent does not have this permission within the folder’s workflow state, the transaction is invalid.
FileCreateAction
The transaction to create a file will be extended to include a workflow_state. This allows Grid Document to verify the submitting agent has the document::can-create-file permission within the indicated workflow_state. Furthermore, Grid Document will validate that the submitting agent has the document::can-access-folder permission for the folder in its current workflow state. If the agent lacks either of these permissions, the transaction is invalid. The transaction payload may appear as follows:
message FileCreateAction {
    …
    string workflow_state = 4;
}
FileDeleteAction
The transaction payload to delete a file will also remain unchanged when supporting workflow permissions. An agent attempting to delete a file must have the document::can-delete-file permission within the file’s current workflow state. Additionally, Grid Document will validate that the agent has the document::can-access-folder permission for the parent folder’s current workflow state. If the submitting agent lacks either of these permissions, the transaction is invalid.
Drawbacks
This change will increase the size of state for Grid instances. There is no way around this, as the files have to be saved somewhere. This necessitates the ability to groom state, i.e., deleting files and folders. Depending on Grid’s storage backend, this may cause issues with large amounts of historical data being persisted. Grid Document allows for deleting objects from state, and state pruning techniques are already employed to maintain state size.
Rationale and alternatives
Rationale
- Light-weight implementation
- Very familiar and simple use-case to ease users into Grid
Alternatives
- Users may use e-mail or online document sharing sites. This does not necessarily provide the same amount of security as Grid Document.
Prior art
Interplanetary File System + good key management does largely the same thing, in largely the same way. It uses the hash of a file as a content ID and the file gets split up into a bunch of smaller cacheable pieces that get stored on other nodes. This caching mechanism means files can be very persistent, but by default they are globally accessible and unencrypted.
Dropbox has DocSend, which is advertised as a way to “Securely send critical documents and get real-time analytics”. The biggest downsides are that it is closed source, paid, and not easily automatable.
Unresolved questions
- File count per directory, directory count. Should those be user settings or immutable system defaults?
- What should Grid Document’s default be for file count per directory and directory count?
- Should file count per directory and directory count be user settings in Grid Document? A separate transaction to create a folder would allow users to choose a maximum file count.
- Should files be allowed to be created without providing a destination folder?
- Should Grid Document provide a default file location?