CSV on the Web: Sidecars for Spreadsheets
If you share data on the web as delimiter-separated values – that is, as spreadsheets – there is a world of power-ups available to you.
The term “sidecar” is used for a functional addition. A motorcycle sidecar can carry things and people. A Kubernetes sidecar container has access to the namespace and storage volumes of its pod’s main container, and so supports auxiliary work. Unstructured documentation, e.g. a typical README file, is not a sidecar.
The W3C’s “CSV on the Web” (CSVW) working group published seven documents, including a note on 25 identified use cases and a primer on effective use of its recommendations in practice. In the simplest case, when you’re serving a CSV file like mydata.csv, you also serve a JSON sidecar by adding -metadata.json to the name (e.g. mydata.csv-metadata.json), and you use the CSVW vocabulary to provide extra information about your data.
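As a minimal sketch of that simplest case, the Python below derives the sidecar name and builds a small metadata document. The three columns (station, reading_date, temp_c) are hypothetical and only for illustration; the @context, url, tableSchema, columns, name, titles, and datatype keys come from the CSVW vocabulary.

```python
import json


def sidecar_name(csv_name: str) -> str:
    """Return the conventional sidecar name: the CSV name plus -metadata.json."""
    return f"{csv_name}-metadata.json"


def build_metadata(csv_name: str) -> dict:
    """Build a minimal CSVW metadata document for one table.

    The columns listed here are hypothetical; replace them with the
    columns of your own file.
    """
    return {
        "@context": "http://www.w3.org/ns/csvw",
        "url": csv_name,
        "tableSchema": {
            "columns": [
                {"name": "station", "titles": "Station", "datatype": "string"},
                {"name": "reading_date", "titles": "Date", "datatype": "date"},
                {"name": "temp_c", "titles": "Temp (C)", "datatype": "decimal"},
            ]
        },
    }


metadata = build_metadata("mydata.csv")
sidecar_text = json.dumps(metadata, indent=2)

print(sidecar_name("mydata.csv"))
print(sidecar_text)
```

Writing sidecar_text to a file with that name, served next to mydata.csv, is all the simplest case asks for.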
There are limits to the logic you can express using the CSVW vocabulary, and this is reasonable (cf. the “rule of least power”)! The Shapes Constraint Language (SHACL), through its JavaScript Extensions (SHACL-JS), extends expression of logic to JavaScript functions (Python functions seem doable…) – this could help to transform a spreadsheet’s cell values to conform to a desired schema.
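To make the idea of transforming cell values concrete, here is a rough sketch of what such functions might look like in plain Python. It does not use SHACL or any SHACL library; the column names (reading_date, temp_c), the day/month/year input format, and the helper names are all assumptions chosen for illustration.

```python
from datetime import datetime
from decimal import Decimal


def to_iso_date(cell: str) -> str:
    """Convert a day/month/year string such as '31/01/2024' to ISO 8601."""
    return datetime.strptime(cell.strip(), "%d/%m/%Y").date().isoformat()


def to_decimal(cell: str) -> Decimal:
    """Parse a number, dropping thousands separators such as '1,234.5'."""
    return Decimal(cell.strip().replace(",", ""))


# Hypothetical mapping from column name to the function that makes its
# cells conform to the desired schema.
TRANSFORMS = {"reading_date": to_iso_date, "temp_c": to_decimal}


def conform_row(row: dict) -> dict:
    """Apply the per-column transform; other columns are only trimmed."""
    return {name: TRANSFORMS.get(name, str.strip)(cell) for name, cell in row.items()}


row = {"station": " Oslo ", "reading_date": "31/01/2024", "temp_c": "1,234.5"}
print(conform_row(row))
```

In a SHACL-based setup, functions like these would be referenced from the shapes graph rather than from a Python dictionary; the dictionary here only stands in for that wiring.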