Naming
🚧 Sprout is still in active development and evolving quickly, so the documentation and functionality may not work as described and could undergo substantial changes 🚧
This document describes a general naming scheme for Sprout that is guided by our style guide as well as by the Frictionless Framework and Data Package structure. This naming scheme only applies to data objects, functions, URLs, API endpoints, and command line interfaces that are exposed to or associated with user-facing content. It does not apply to internal content; see the style guide for details on naming internal (developer-facing) content.
We compose names based on the objects we and our users interact with as well as the actions taken on those objects. Following the Frictionless Data Package terminology, we simplify “data package” to “package” and “data resource” to “resource”. We use the words “property” or “properties” to describe package and resource metadata contained in the datapackage.json file that follows the Frictionless Data Package specification.
While we use the term “properties” to mean anything related to the metadata contained within the datapackage.json file, the Frictionless Data Package specification uses the term “descriptor” for both the structure of that metadata (the metadata as a whole) and the file itself. Since we mainly work with properties, not descriptors, and want to avoid introducing additional terms, we use “properties” to refer to the content of the datapackage.json file rather than its structure. We may also occasionally use “properties” to refer to the file itself.
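To make the distinction between package and resource properties concrete, the sketch below writes a minimal datapackage.json and reads it back using only Python's standard library. The field values (`example-package`, `measurements`, the file path) are invented for illustration, and the helper `split_properties` is hypothetical and not part of Sprout. The only thing taken from the Data Package specification is that a top-level `resources` array holds one entry per resource.

```python
import json
import tempfile
from pathlib import Path

# Illustrative content only: these names and paths are made up.
descriptor = {
    "name": "example-package",
    "title": "An example package",
    "resources": [
        {"name": "measurements", "path": "resources/measurements/data.parquet"},
    ],
}


def split_properties(properties: dict) -> tuple[dict, list[dict]]:
    """Separate package-level properties from per-resource properties.

    Hypothetical helper for illustration; not part of Sprout.
    """
    package_properties = {
        key: value for key, value in properties.items() if key != "resources"
    }
    resource_properties = properties.get("resources", [])
    return package_properties, resource_properties


# Write the properties to a temporary datapackage.json, then read them back.
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "datapackage.json"
    path.write_text(json.dumps(descriptor, indent=2), encoding="utf-8")
    properties = json.loads(path.read_text(encoding="utf-8"))

package_properties, resource_properties = split_properties(properties)
```

Here `package_properties` holds the metadata about the package as a whole, while each item in `resource_properties` holds the metadata for one resource.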
Objects
Object | Description |
---|---|
package | A data package that contains a collection of related data resources and properties. |
resource | A single data file within a package, along with its properties and associated batch data files. |
properties | Metadata about a package or resource. |
path | A file path listing the location of folders or files that Sprout interacts with. |
data | The data file(s) within a resource. Within Sprout, this consists of the Parquet data and batch files. |
identifier | A unique identifier for a package (only for server environments), denoted as id. |
observational unit | The unit of observation, meaning the data associated with a specific entity at a specific time, e.g. a person who had their weight measured on their 30th birthday. |
File or folder | Description |
---|---|
dir | A shortened name for folder or directory. |
packages | The folder that contains all packages. Only for server environments. |
resources | The folder that contains all resource files in a package. |
batch | The folder that contains the batch data files for a data resource, that is, the data files timestamped when they are first added to, or updated in, a resource. |
JSON | The main format we use to save and pass around information about a package’s or resource’s properties. |
Parquet | A columnar, efficient storage file format that we use to save data within a resource. Both the final resource data as well as resource batch files are stored as Parquet. |
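As a rough illustration of how the folder names above could combine into a path, the sketch below nests them in one plausible order. The nesting (`packages/<id>/resources/<resource name>/batch`), the function name `build_batch_path`, and its parameters are all assumptions made for this example. The tables above only name the folders and do not prescribe this layout.

```python
from pathlib import PurePosixPath


def build_batch_path(root: str, package_id: int, resource_name: str) -> PurePosixPath:
    """Build the path to a resource's batch folder.

    Hypothetical helper: the nesting order is an assumption, not Sprout's
    documented layout.
    """
    return (
        PurePosixPath(root)
        / "packages"  # folder that contains all packages (server environments)
        / str(package_id)  # the package identifier, denoted as id
        / "resources"  # folder that contains the package's resource files
        / resource_name
        / "batch"  # folder that contains the resource's batch data files
    )


batch_dir = build_batch_path("/data", 1, "measurements")
print(batch_dir)  # /data/packages/1/resources/measurements/batch
```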
Actions
Action | Description |
---|---|
write | Write an object to a file. |
read | Read an object from a file. |
delete | Permanently delete the file(s) associated with an object (e.g. delete an observational unit). |
extract | Extract specific information from an object, specifically the properties from a data file. |
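Since names are composed from an action and the object it acts on, a small sketch can show what that composition might look like. The ordering (action first, then objects), the underscore separator, and the function `compose_name` are assumptions for illustration only, not Sprout's actual convention or API. The vocabularies are copied from the tables above.

```python
# Vocabulary taken from the Objects and Actions tables.
ACTIONS = {"write", "read", "delete", "extract"}
OBJECTS = {"package", "resource", "properties", "path", "data", "identifier"}


def compose_name(action: str, *objects: str, separator: str = "_") -> str:
    """Compose a name from one action followed by one or more objects.

    Hypothetical helper: the action-then-object order is an assumption.
    """
    if action not in ACTIONS:
        raise ValueError(f"Unknown action: {action!r}")
    if not objects:
        raise ValueError("At least one object is required.")
    unknown = [name for name in objects if name not in OBJECTS]
    if unknown:
        raise ValueError(f"Unknown object(s): {unknown}")
    return separator.join([action, *objects])


print(compose_name("write", "resource", "properties"))  # write_resource_properties
print(compose_name("read", "package", separator="-"))  # read-package
```

Restricting names to a fixed vocabulary like this keeps functions, endpoints, and commands predictable, because a word outside the tables is rejected rather than silently accepted.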