Software User Facilities

There are user facilities for scientific research. They were born out of the cost of doing “big science” with hardware that grew increasingly resource-intensive in physical space, personnel time, and materials. Lawrence’s first cyclotron fit in the palm of his hand. Then he needed the hills behind Berkeley’s university campus. Today, to do effective and interesting particle physics research, you need a facility (CERN’s LHC) that straddles the border between France and Switzerland.

Principal investigators (PIs) still allocate portions of grant money to develop and procure small-footprint hardware in their labs. They still apply for larger grants as co-PIs for larger-footprint hardware. But much specialized hardware is maintained in user facilities. PIs apply to use these facilities and to get some allocation of their support staff to help operate their inventory of instruments.

Imagine hardware user facilities that don’t actually support and maintain the specialized instruments. Rather, they are more like “water and power” facilities that provide pipes, hookups, outlets, and floor space where researchers can build, operate, and maintain their own domain-specific instruments. This is typically the case with scientific-computing user facilities and with scientific-computing services within lab IT departments.

More and more “big science” is both compute- and data-intensive. It is done with “big data” and with what might be termed “big software”: domain-specific scientific instruments that are increasingly resource-intensive to develop, support, maintain, and extend. However, it’s hard to contend with software’s lack of concrete physicality. With software, cathedrals (or bazaars) are built in the mind.

User facilities are measured not by the publications or research they output directly, but by the research they enable: by citations of research products that used facility resources, typically to synthesize, transform, and characterize physical samples. In typical user facilities, hardware comes and goes; the sample is central, portable from instrument to instrument. I believe that analogous software user facilities must be data-centric: specific software instruments (applications) come and go, whereas data is the thing that is synthesized, transformed, and characterized by various domain-specific tools.

What has been your experience with emerging software user facilities, or their analogue in your domain? By that I mean resource environments pre-allocated by your organization so that individual PIs and small groups of PIs don’t each spend a portion of their study-grant allocations on functionally equivalent domain-specific software and data systems.

This post was adapted from a note sent to my email list on Scientific Data Unification.
I'd love for you to subscribe.