Just the Tip of Your Data Iceberg

I’ve heard the phrase “just the tip of the iceberg” used in a positive sense, to reveal value that an audience might not have been aware of.

just the tip of the iceberg: Only a small, often unrepresentative portion of something much larger or more complex that cannot yet be seen or understood.1

In the context of disseminating scientific data, this tip might be a publication. A reader sees paragraphs and figures that describe and show data. A Supplemental Information section might link to a much greater volume of data – the rest of the iceberg.

The iceberg tip might also be a website with interfaces to explore some data, with links to API documentation or database dumps for users to obtain the greater volume.

But is this great volume usable by your audience? Can they leverage your data for follow-up studies that result in citations? Count yourself, your lab, and your collaborators in this audience. Can you reuse this data six months from now? Six years?

Here’s Jim Gray’s take on data icebergs:2

…what I mean by data iceberg is that there is a lot of data that is collected but not curated or published in any systematic way.

You can be made aware of a great volume of data so that it’s now visible and perhaps understood in an aggregate sense. But it’s another thing to have that volume thawed: queryable and integrable.

  1. https://idioms.thefreedictionary.com/just+the+tip+of+the+iceberg ↩︎

  2. “Jim Gray on eScience: A Transformed Scientific Method”, in The Fourth Paradigm: Data-Intensive Scientific Discovery. Microsoft Research, 2009. ↩︎