Requirements

Warning

🚧 Sprout is still in active development and evolving quickly, so the documentation and functionality may not work as described and could undergo substantial changes 🚧

Sprout must:

Run on Windows, macOS, and Linux (likely on servers): Our potential users work on any of these systems, so we need to ensure compatibility across the most commonly used operating systems.
Integrate GDPR, privacy, and security compliance: Our target users work with health data, so compliance is vital to consider.
Run remotely on servers and locally on computers: The location where data are stored should be flexible based on the needs and restrictions of the user.
Handle a variety of data file sizes: While research data are rarely as large as the data found in industry, they can still become large enough to require special care and handling.
Store data in a format that is open source, integrates with many tools, and is storage efficient: Sprout is first and foremost a data engineering tool for research data storage and distribution (or at least, easier sharing).
Store, organize, and manage multiple distinct data sources per user or group of users (for example in a server setting): Researchers rarely collect and work on only one data source at any given time, so Sprout must be able to handle multiple distinct data sources.
Upload and update data: Data can be added to Sprout in batches or on a more frequent basis. We anticipate that batch uploads will be the most common.
Store, organize, and manage metadata connected to the data: Metadata are vital to understanding the data and their context; without metadata, data can be nearly useless. Sprout needs to make managing and organizing metadata fundamental to its functionality.
Track changes to the data in a changelog and versioning system: Data are not static and can change over time. Sprout must track these changes and provide a way to show, track, and manage versions of the data. This is also necessary for legal compliance for auditing and record-keeping.
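
To make the change-tracking requirement above more concrete, here is a minimal sketch of how a changelog could record a new version only when the uploaded contents actually differ. It does not describe Sprout's implementation: the function names (`file_checksum`, `record_change`), the entry fields, and the in-memory list are all hypothetical, and a real system would also need persistent, access-controlled storage.

```python
import hashlib
from datetime import datetime, timezone
from typing import Optional


def file_checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw file contents."""
    return hashlib.sha256(data).hexdigest()


def record_change(changelog: list, name: str, data: bytes, note: str) -> Optional[dict]:
    """Append a changelog entry if the contents differ from the latest version.

    Hypothetical helper: returns the new entry, or None when nothing changed.
    """
    checksum = file_checksum(data)
    previous = [entry for entry in changelog if entry["name"] == name]
    if previous and previous[-1]["checksum"] == checksum:
        return None  # identical contents, so no new version is recorded
    entry = {
        "name": name,
        "version": len(previous) + 1,
        "checksum": checksum,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "note": note,
    }
    changelog.append(entry)
    return entry


# Example: three uploads of the same (made-up) file, one of them unchanged.
changelog = []
record_change(changelog, "cohort.csv", b"id,age\n1,34\n", "Initial upload")
record_change(changelog, "cohort.csv", b"id,age\n1,34\n", "Re-upload, identical")
record_change(changelog, "cohort.csv", b"id,age\n1,34\n2,51\n", "Added participant 2")

for entry in changelog:
    print(entry["version"], entry["note"])
```

Comparing checksums rather than timestamps means an identical re-upload does not create a spurious version, which keeps the audit trail limited to real changes.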

Sprout will not:

Run any analytic computing or data science work: While some data processing and analysis will occur, it will be limited to running checks on the quality of the data and metadata.
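
As an illustration of the limited kind of processing described above, the sketch below checks a small CSV against column types declared in metadata. It is an assumption-laden example, not Sprout's design: `check_rows`, the `schema` mapping, and the sample data are all hypothetical.

```python
import csv
import io


def check_rows(csv_text: str, schema: dict) -> list:
    """Return a list of human-readable problems; an empty list means all checks passed.

    `schema` maps column names to Python types (a hypothetical metadata format).
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = set(schema) - set(reader.fieldnames or [])
    if missing:
        # Without the declared columns, per-value checks are meaningless.
        return [f"missing columns: {sorted(missing)}"]
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        for column, expected_type in schema.items():
            try:
                expected_type(row[column])
            except (TypeError, ValueError):
                problems.append(
                    f"line {line_number}: {column}={row[column]!r} "
                    f"is not {expected_type.__name__}"
                )
    return problems


schema = {"id": int, "age": int}
problems = check_rows("id,age\n1,34\n2,unknown\n", schema)
print(problems)
```

The check only reports whether the data match their metadata; it does not summarize, model, or otherwise analyze the values.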