<?xml version="1.0" encoding="utf-8" standalone="yes"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:webfeeds="http://webfeeds.org/rss/1.0"><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><generator uri="https://gohugo.io">Hugo 0.110.0</generator><id>urn:uuid:390a272f-8fa2-425a-b44e-09b477223a39</id><link rel="self" type="application/atom+xml" href="https://donnywinston.com/feed.xml" hreflang="en-us"/><link rel="alternate" type="text/html" href="https://donnywinston.com/" hreflang="en-us"/><icon>https://files.polyneme.xyz/polyneme-logo-sq-AdH548JPkxW0qf3M5NVTVLt5qdVKFN28AKKvS35A2ndmDMQ0baH90H5APJvIITO2UkFht8rLzZGQTxob8DCqG3KqnsEOczShPKoT.png</icon><logo>https://files.polyneme.xyz/polyneme-logo-sq-AdH548JPkxW0qf3M5NVTVLt5qdVKFN28AKKvS35A2ndmDMQ0baH90H5APJvIITO2UkFht8rLzZGQTxob8DCqG3KqnsEOczShPKoT.png</logo><rights>Copyright © 2020-2025 Donny Winston. All posts licenced under &lt;https://creativecommons.org/licenses/by-nc-sa/4.0/>.</rights><subtitle>Made as simple as possible, but not simpler.</subtitle><title>Donny Winston</title><updated>2025-07-11T08:34:33+00:00</updated><webfeeds:icon>https://files.polyneme.xyz/polyneme-logo-sq-AdH548JPkxW0qf3M5NVTVLt5qdVKFN28AKKvS35A2ndmDMQ0baH90H5APJvIITO2UkFht8rLzZGQTxob8DCqG3KqnsEOczShPKoT.png</webfeeds:icon><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2025:/posts/relating-precondition-levels-for-validation-to-ontologies-and-epistemologies/</id><link rel="alternate" href="https://donnywinston.com/posts/relating-precondition-levels-for-validation-to-ontologies-and-epistemologies/"/><title>Relating Precondition Levels for Validation to Ontologies and Epistemologies</title><published>2025-07-11T10:00:11+02:00</published><updated>2025-07-11T10:00:11+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Can you even de-member and re-member (like with key-value pairs) the message?</p>
<p>Okay, now can you even handle (like with a handler function) the message?</p>
<p>Okay, now can you even commit (like with a database) to act on / respond to the message?</p>
<p>Validation levels<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>:</p>
<ul>
<li>Syntactic: Grammatical. About the structure. Subject verb object? Fine.</li>
<li>Semantic: Ontological. About the meaningfulness. Subject is a type of thing you know can verb? Object is a type of thing you know can be verbed? Fine.</li>
<li>Pragmatic: Epistemological. About the context. You know that you can commit to make subject verb object right now in this environment? You can (and ideally do!) distinguish &ldquo;Subject verb object&rdquo; as a justified belief and not merely <em>something given</em> (Latin meaning of <em>datum</em>)? Fine.</li>
</ul>
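<p>As a minimal sketch of how these three levels might stack as preconditions, assuming a toy subject-verb-object message: the vocabulary tables, the <code>Message</code> class, and the <code>stock</code> environment below are all hypothetical names for illustration and are not taken from the cited book.</p>

```python
from dataclasses import dataclass

# Hypothetical domain vocabulary: which verbs each subject type can perform,
# and which object types each verb can act on.
CAN_VERB = {"customer": {"orders"}}
CAN_BE_VERBED = {"orders": {"product"}}


@dataclass(frozen=True)
class Message:
    subject: str
    verb: str
    obj: str


def parse(raw: dict) -> Message:
    """Syntactic: does the message have subject-verb-object structure?"""
    # Raises KeyError when a required key is missing.
    return Message(raw["subject"], raw["verb"], raw["object"])


def is_meaningful(msg: Message) -> bool:
    """Semantic: can this subject type verb, and can this object type be verbed?"""
    return (msg.verb in CAN_VERB.get(msg.subject, set())
            and msg.obj in CAN_BE_VERBED.get(msg.verb, set()))


def can_commit(msg: Message, stock: dict) -> bool:
    """Pragmatic: can we commit to this right now, in this environment?"""
    return stock.get(msg.obj, 0) > 0
```

<p>Each check presumes the one before it: <code>is_meaningful</code> only makes sense for a message that <code>parse</code> accepted, and <code>can_commit</code> only for a message that is meaningful.</p>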
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>H. J. W. Percival and R. G. Gregory, &ldquo;Appendix E: Validation&rdquo;, in &ldquo;Architecture patterns with Python: enabling test-driven development, domain-driven design, and event-driven microservices,&rdquo; First edition. O’Reilly, 2020.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2025:/posts/a-set-of-patterns-as-a-set-of-directories/</id><link rel="alternate" href="https://donnywinston.com/posts/a-set-of-patterns-as-a-set-of-directories/"/><title>A Set of Patterns as a Set of Directories</title><published>2025-07-09T19:37:41+02:00</published><updated>2025-07-09T19:37:41+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>What does it look like to combine these patterns: domain-driven design, test-driven development, and event-driven microservices?</p>
<p>First two first. Design precedes development. Design and development of what? Services. So then, what does domain-driven design look like? It looks like commits within <code>domain/</code> before commits within <code>services/</code>. And then, what does test-driven development look like? It looks like commits within <code>tests/</code> before commits within <code>services/</code>. All together now: combining domain-driven design and test-driven development looks like commits first within <code>domain/</code>, followed by commits within <code>tests/</code>, followed by commits within <code>services/</code>.</p>
<p>And then what does it look like to fold in the pattern of event-driven microservices? It looks like commits within <code>domain/events[.py|/]</code> before commits within <code>services/</code>, and it looks like content (so maybe you don’t need new commits each time you add domain events) within <code>endpoints/</code> to ensure that all services are independently addressable and accessible outside of a service(s)-plus-endpoint(s) host process. That is, the “micro” part looks like external processes being able to invoke services à la carte.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2025:/posts/unit-integration-e2e-tests-and-data-information-knowledge-wisdom-objects/</id><link rel="alternate" href="https://donnywinston.com/posts/unit-integration-e2e-tests-and-data-information-knowledge-wisdom-objects/"/><title>{Unit,Integration,End-to-End} Tests and {Data,Information,Knowledge,Wisdom} Objects</title><published>2025-07-02T12:55:42+02:00</published><updated>2025-07-02T12:55:42+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Data are elements. A datum is a unit, subject to unit tests. Data may be combined in formations, i.e. as information.</p>
<p>Information objects, i.e. data composites, can be treated as elemental units as well, subject to unit tests. Information objects are also readily subject to integration tests; they are integrations of data units.</p>
<p>Knowledge objects are sets of relations among addressable data/information objects, i.e. among fact-(package-)units. Consider these units as nodes of a graph. Consider the relations among these units, i.e. the (also addressable) finite set of conceptual relations in use by a knowledge object, as edges connecting nodes. And there you have it: a knowledge graph.</p>
<p>Knowledge objects can be treated as composite units as well, subject to integration tests. Treated as knowledge graphs, they are also readily subject to end-to-end (e2e) tests in which use cases are expressed as (a sequence of) graph journeys/queries/traversals.</p>
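<p>A minimal sketch of that idea, assuming a knowledge object small enough to hold as a set of (subject, relation, object) triples. The identifiers and relation names here are hypothetical placeholders.</p>

```python
# Hypothetical addressable units (nodes) and conceptual relations (edges).
TRIPLES = {
    ("work:1", "creator", "author:a"),
    ("work:1", "creator", "author:b"),
    ("author:a", "memberOf", "inst:x"),
}


def objects(subject: str, predicate: str) -> set:
    """Follow one kind of edge out from one node."""
    return {o for (s, p, o) in TRIPLES if s == subject and p == predicate}


def journey(start: str, predicates: list) -> set:
    """Express a use case as a graph journey: follow a sequence of relations."""
    frontier = {start}
    for predicate in predicates:
        frontier = {o for node in frontier for o in objects(node, predicate)}
    return frontier
```

<p>An end-to-end test in this style would assert on the outcome of a whole journey, e.g. that <code>journey("work:1", ["creator", "memberOf"])</code> reaches <code>inst:x</code>, while an integration test would assert on the triples themselves.</p>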
<p>Some end-to-end scenarios are not well-represented as a user presenting a structured query over known situational elements (data/information objects) and their known conceptual relations. Rather, the user presumes and seeks operation over informal associations among informally defined composites/situations and their informally defined (and informally salient) elements/aspects. Digital objects can represent this sought-after wisdom, often as learned vector embeddings trained over temporally-qualified data/information objects, i.e. events. Such wisdom objects are subject only to end-to-end tests, not to integration or unit tests.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2024:/posts/2024.04.18.1/</id><link rel="alternate" href="https://donnywinston.com/posts/2024.04.18.1/"/><title>For each paper with an author from my institution, which of that paper's authors are from my institution?</title><published>2024-04-18T10:13:15-04:00</published><updated>2024-04-18T10:13:15-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Asked on the OpenAlex Community group:</p>
<blockquote>
<p>Is there a way to find out which authors on a paper are from my institution? I downloaded a list of DOI&rsquo;s from the website, and thought naively that I could look up the index of my institution (by &lsquo;author_institution_ids&rsquo; or by &lsquo;author_institution_names&rsquo;) and then match that index to the list of authors. But soon found out that those indices don&rsquo;t match because any author can have list multiple affiliations. Any ideas?</p>
<p>— <a href="https://groups.google.com/g/openalex-community/c/T0OjBFXSIUg">https://groups.google.com/g/openalex-community/c/T0OjBFXSIUg</a></p>
</blockquote>
<p>I recently learned about <a href="https://semopenalex.org">https://semopenalex.org</a>, which maps OpenAlex data to RDF and thereby facilitates graph-oriented queries using SPARQL.</p>
<p>Wanting to take the questioner&rsquo;s institution as my constraining example, I determined via a search of their name that &ldquo;Vrije Universiteit Amsterdam&rdquo; was their institution.</p>
<p>Via the <a href="https://semopenalex.org/resource/?uri=https%3A%2F%2Fsemopenalex.org%2Fontology%2F">SemOpenAlex ontology explorer</a>, I saw that <code>&lt;https://semopenalex.org/ontology/Institution&gt;</code> is the URI designating the class of institutions.</p>
<p>Via their <a href="https://semopenalex.org/sparql">SPARQL interface</a>, I asked for any institution, so that I could see how names were expressed.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sparql" data-lang="sparql"><span style="display:flex;"><span><span style="color:#66d9ef">PREFIX</span> soa: &lt;https://semopenalex.org/ontology/&gt;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">SELECT</span> ?inst <span style="color:#66d9ef">WHERE</span> {
</span></span><span style="display:flex;"><span>  ?inst <span style="color:#66d9ef">a</span> soa:<span style="color:#f92672">Institution</span> .
</span></span><span style="display:flex;"><span>} <span style="color:#66d9ef">LIMIT</span> <span style="color:#ae81ff">1</span>
</span></span></code></pre></div><p>Got one, <code>&lt;https://semopenalex.org/institution/I28290843&gt;</code>. What triples (<em><strong>s</strong></em>ubject, <em><strong>p</strong></em>redicate, <em><strong>o</strong></em>bject) is this the subject of?</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sparql" data-lang="sparql"><span style="display:flex;"><span><span style="color:#66d9ef">SELECT</span> ?p ?o <span style="color:#66d9ef">WHERE</span> {
</span></span><span style="display:flex;"><span>  &lt;https://semopenalex.org/institution/I28290843&gt; ?p ?o .
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Scrolling through a results table of 30 rows, I see &ldquo;University of Surrey&rdquo; as an object for the predicate <code>&lt;http://xmlns.com/foaf/0.1/name&gt;</code>, i.e. the <code>name</code> term from the friend-of-a-friend (FOAF) vocabulary.
Okay, so that&rsquo;s the predicate SemOpenAlex uses to connect an institution to a name.
Now, let&rsquo;s find &ldquo;Vrije Universiteit Amsterdam&rdquo;.
Because there may not be an exact match, I&rsquo;ll ask for institutions with names containing &ldquo;Amsterdam&rdquo;:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sparql" data-lang="sparql"><span style="display:flex;"><span><span style="color:#66d9ef">PREFIX</span> foaf: &lt;http://xmlns.com/foaf/0.1/&gt;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">PREFIX</span> soa: &lt;https://semopenalex.org/ontology/&gt;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">SELECT</span> ?inst ?instName <span style="color:#66d9ef">WHERE</span> {
</span></span><span style="display:flex;"><span>  ?inst <span style="color:#66d9ef">a</span> soa:<span style="color:#f92672">Institution</span> .
</span></span><span style="display:flex;"><span>  ?inst foaf:<span style="color:#f92672">name</span> ?instName .
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">FILTER</span>(<span style="color:#a6e22e">contains</span>(?instName, <span style="color:#e6db74">&#34;Amsterdam&#34;</span>))
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Okay, I see that <code>inst</code> <code>&lt;https://semopenalex.org/institution/I865915315&gt;</code> has <code>instName</code> &ldquo;Vrije Universiteit Amsterdam&rdquo;, and none of the other results appear to be a duplicate of this.
I&rsquo;ve found the institution&rsquo;s URI.
I&rsquo;ll confirm that I get a single result from the following:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sparql" data-lang="sparql"><span style="display:flex;"><span><span style="color:#66d9ef">PREFIX</span> foaf: &lt;http://xmlns.com/foaf/0.1/&gt;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">PREFIX</span> soa: &lt;https://semopenalex.org/ontology/&gt;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">SELECT</span> ?inst ?instName <span style="color:#66d9ef">WHERE</span> {
</span></span><span style="display:flex;"><span>  ?inst <span style="color:#66d9ef">a</span> soa:<span style="color:#f92672">Institution</span> .
</span></span><span style="display:flex;"><span>  ?inst foaf:<span style="color:#f92672">name</span> ?instName .
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">FILTER</span>(?instName <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;Vrije Universiteit Amsterdam&#34;</span>)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>I do. Great.
Going back to the <a href="https://semopenalex.org/resource/?uri=https%3A%2F%2Fsemopenalex.org%2Fontology%2F">ontology explorer</a>, I can see how the model connects institutions, authors, and works:</p>
<figure> <img src="/img/semopenalex-inst-auth-work.png" width="100%" alt="screenshot of model diagram" title="screenshot of model diagram"/>
<figcaption>screenshot of model diagram</figcaption>
</figure>
<p>I see that SemOpenAlex records a <code>Work</code> as having any number of <code>Author</code>s as creators (via <code>&lt;http://purl.org/dc/terms/creator&gt;</code>).
I also note that an <code>Author</code> is recorded as being a member of (<code>&lt;http://www.w3.org/ns/org#memberOf&gt;</code>) any number of <code>Institution</code>s.</p>
<p>So here&rsquo;s what I end up with:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sparql" data-lang="sparql"><span style="display:flex;"><span><span style="color:#66d9ef">PREFIX</span> foaf: &lt;http://xmlns.com/foaf/0.1/&gt;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">PREFIX</span> dcterms: &lt;http://purl.org/dc/terms/&gt;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">PREFIX</span> org: &lt;http://www.w3.org/ns/org#&gt;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">PREFIX</span> soa: &lt;https://semopenalex.org/ontology/&gt;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">SELECT</span> ?work (<span style="color:#a6e22e">GROUP_CONCAT</span>(?author) <span style="color:#66d9ef">as</span> ?authors) <span style="color:#66d9ef">WHERE</span> {
</span></span><span style="display:flex;"><span>  ?inst <span style="color:#66d9ef">a</span> soa:<span style="color:#f92672">Institution</span> .
</span></span><span style="display:flex;"><span>  ?inst foaf:<span style="color:#f92672">name</span> ?instName .
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">FILTER</span>(?instName <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;Vrije Universiteit Amsterdam&#34;</span>)
</span></span><span style="display:flex;"><span>  ?author org:<span style="color:#f92672">memberOf</span> ?inst .
</span></span><span style="display:flex;"><span>  ?work dcterms:<span style="color:#f92672">creator</span> ?author .
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">GROUP BY</span> ?work
</span></span></code></pre></div><p>This query retrieves works authored by at least one author that is also a member of the institution, and lists all member-of-the-institution authors for each work.
As of 2024-04-18, this is 182,498 works, and <a href="https://semopenalex.org/sparql?query=PREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0APREFIX+xsd%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema%23%3E%0APREFIX+dcterms%3A+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2F%3E%0APREFIX+org%3A+%3Chttp%3A%2F%2Fwww.w3.org%2Fns%2Forg%23%3E%0APREFIX+soa%3A+%3Chttps%3A%2F%2Fsemopenalex.org%2Fontology%2F%3E%0APREFIX+skos%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F02%2Fskos%2Fcore%23%3E%0A%0ASELECT+%3Fwork+%28GROUP_CONCAT%28%3Fauthor%29+as+%3Fauthors%29+WHERE+%7B%0A++%3Finst+a+soa%3AInstitution+.%0A++%3Finst+foaf%3Aname+%3FinstName+.%0A++FILTER%28%3FinstName+%3D+%22Vrije+Universiteit+Amsterdam%22%29%0A++%3Fauthor+org%3AmemberOf+%3Finst+.%0A++%3Fwork+dcterms%3Acreator+%3Fauthor+.%0A%7D%0AGROUP+BY+%3Fwork">the query</a> executes in under 5 s.</p>
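<p>To get from the query result back to a per-paper list of authors, here is a minimal sketch. It assumes the results were requested in the standard SPARQL 1.1 Query Results JSON Format, and it relies on <code>GROUP_CONCAT</code> using a single space as its default separator. The function name and the sample URIs are hypothetical placeholders, not real SemOpenAlex records.</p>

```python
def authors_by_work(results: dict) -> dict:
    """Map each work URI to the list of its institution-member author URIs."""
    return {
        # GROUP_CONCAT joined the author URIs with spaces; split them apart.
        row["work"]["value"]: row["authors"]["value"].split()
        for row in results["results"]["bindings"]
    }


# Hypothetical response fragment in SPARQL JSON results shape.
sample = {
    "head": {"vars": ["work", "authors"]},
    "results": {
        "bindings": [
            {
                "work": {"type": "uri", "value": "https://example.org/work/W1"},
                "authors": {
                    "type": "literal",
                    "value": "https://example.org/author/A1 https://example.org/author/A2",
                },
            }
        ]
    },
}

print(authors_by_work(sample))
```

<p>Each key can then be matched against the list of downloaded DOIs, provided the query is extended to also select whichever property SemOpenAlex uses to record a work&rsquo;s DOI.</p>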
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2024:/posts/feeding-the-scholarly-need/</id><link rel="alternate" href="https://donnywinston.com/posts/feeding-the-scholarly-need/"/><title>Feeding the Scholarly Need</title><published>2024-04-08T10:49:37-04:00</published><updated>2024-04-08T10:49:37-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>This post marks the re-introduction of <a href="/posts/tag-test/">a feed for each tag</a> on this blog.</p>
<p>I want this so that I can post without worrying about contributing to &ldquo;pollution&rdquo; of the scholarly record.
I can accomplish this by tagging posts as <code>#scholarly</code> when I want them to be e.g.
fetched by <a href="https://rogue-scholar.org/">The Rogue Scholar</a> for <a href="https://www.doi.org/">DOI</a> minting
and for subsequent linking to my <a href="https://orcid.org/">ORCID</a> profile.</p>
<p>This post should hopefully be my last act of such pollution. 🙂</p>
<p>For this <a href="https://gohugo.io">Hugo</a>-based blog, I accomplished this by creating a <code>layouts/tags/term.atom.xml</code> file
with this content:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-xml" data-lang="xml"><span style="display:flex;"><span>{{ $taxo := &#34;tags&#34; }}
</span></span><span style="display:flex;"><span>{{- $pages := .RegularPages  -}}
</span></span><span style="display:flex;"><span>{{- with .Site.Config.Services.RSS.Limit -}}
</span></span><span style="display:flex;"><span>  {{- if ge . 1 -}}
</span></span><span style="display:flex;"><span>    {{- $pages = $pages | first . -}}
</span></span><span style="display:flex;"><span>  {{- end -}}
</span></span><span style="display:flex;"><span>{{- end -}}
</span></span><span style="display:flex;"><span>{{ print &#34;<span style="color:#75715e">&lt;?xml version=\&#34;1.0\&#34; encoding=\&#34;utf-8\&#34; standalone=\&#34;yes\&#34;?&gt;</span>&#34; | safeHTML }}
</span></span><span style="display:flex;"><span><span style="color:#f92672">&lt;feed</span> <span style="color:#a6e22e">xmlns=</span><span style="color:#e6db74">&#34;http://www.w3.org/2005/Atom&#34;</span> <span style="color:#a6e22e">xmlns:webfeeds=</span><span style="color:#e6db74">&#34;http://webfeeds.org/rss/1.0&#34;</span><span style="color:#f92672">&gt;</span>
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&lt;author&gt;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&lt;name&gt;</span>{{ .Site.Author.name }}<span style="color:#f92672">&lt;/name&gt;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&lt;uri&gt;</span>{{ .Site.Author.orcid }}<span style="color:#f92672">&lt;/uri&gt;</span>
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&lt;/author&gt;</span>
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&lt;generator</span> <span style="color:#a6e22e">uri=</span><span style="color:#e6db74">&#34;https://gohugo.io&#34;</span><span style="color:#f92672">&gt;</span>Hugo {{ .Site.Hugo.Version }}<span style="color:#f92672">&lt;/generator&gt;</span>
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&lt;id&gt;</span>{{ if .Site.Params.feedUUID }}urn:uuid:{{.Site.Params.feedUUID }}{{ else }}{{ .Permalink }}{{ end }}<span style="color:#f92672">&lt;/id&gt;</span>
</span></span><span style="display:flex;"><span>  {{ with .OutputFormats.Get &#34;atom&#34; }}
</span></span><span style="display:flex;"><span>  {{ printf `<span style="color:#f92672">&lt;link</span> <span style="color:#a6e22e">rel=</span><span style="color:#e6db74">&#34;self&#34;</span> <span style="color:#a6e22e">type=</span><span style="color:#e6db74">&#34;%s&#34;</span> <span style="color:#a6e22e">href=</span><span style="color:#e6db74">&#34;%s&#34;</span> <span style="color:#a6e22e">hreflang=</span><span style="color:#e6db74">&#34;%s&#34;</span><span style="color:#f92672">/&gt;</span>` .MediaType.Type .Permalink $.Site.LanguageCode | safeHTML }}
</span></span><span style="display:flex;"><span>  {{ end }}
</span></span><span style="display:flex;"><span>  {{ range .AlternativeOutputFormats }}
</span></span><span style="display:flex;"><span>  {{ printf `<span style="color:#f92672">&lt;link</span> <span style="color:#a6e22e">rel=</span><span style="color:#e6db74">&#34;alternate&#34;</span> <span style="color:#a6e22e">type=</span><span style="color:#e6db74">&#34;%s&#34;</span> <span style="color:#a6e22e">href=</span><span style="color:#e6db74">&#34;%s&#34;</span> <span style="color:#a6e22e">hreflang=</span><span style="color:#e6db74">&#34;%s&#34;</span><span style="color:#f92672">/&gt;</span>` .MediaType.Type .Permalink $.Site.LanguageCode | safeHTML }}
</span></span><span style="display:flex;"><span>  {{ end }}
</span></span><span style="display:flex;"><span>  {{ with .Site.Params.icon }}<span style="color:#f92672">&lt;icon&gt;</span>{{ . | absURL }}<span style="color:#f92672">&lt;/icon&gt;</span>{{ end }}
</span></span><span style="display:flex;"><span>  {{ with .Site.Params.logo }}<span style="color:#f92672">&lt;logo&gt;</span>{{ . | absURL }}<span style="color:#f92672">&lt;/logo&gt;</span>{{ end }}
</span></span><span style="display:flex;"><span>  {{ with .Site.Copyright }}<span style="color:#f92672">&lt;rights&gt;</span>{{ replace . &#34;{year}&#34; now.Year }}<span style="color:#f92672">&lt;/rights&gt;</span>{{ end }}
</span></span><span style="display:flex;"><span>  {{ with .Site.Params.Description }}<span style="color:#f92672">&lt;subtitle&gt;</span>{{ .  }}<span style="color:#f92672">&lt;/subtitle&gt;</span>{{ end }}
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&lt;title&gt;</span>{{ .Site.Title }} - posts tagged &#34;{{ .Data.Term }}&#34; <span style="color:#f92672">&lt;/title&gt;</span>
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&lt;updated&gt;</span>{{ now.Format .Site.Params.dateFormatAtomFeed | safeHTML }}<span style="color:#f92672">&lt;/updated&gt;</span>
</span></span><span style="display:flex;"><span>  {{ with .Site.Params.icon96 }}<span style="color:#f92672">&lt;webfeeds:icon&gt;</span>{{ . | absURL }}<span style="color:#f92672">&lt;/webfeeds:icon&gt;</span>{{ end }}
</span></span><span style="display:flex;"><span>  {{ range $pages }}
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&lt;entry&gt;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&lt;author&gt;</span>
</span></span><span style="display:flex;"><span>    {{ if .Params.author }}
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&lt;name&gt;</span>{{ .Params.author.name }}<span style="color:#f92672">&lt;/name&gt;</span>
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&lt;uri&gt;</span>{{ .Params.author.orcid }}<span style="color:#f92672">&lt;/uri&gt;</span>
</span></span><span style="display:flex;"><span>    {{ else }}
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&lt;name&gt;</span>{{ .Site.Author.name }}<span style="color:#f92672">&lt;/name&gt;</span>
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&lt;uri&gt;</span>{{ .Site.Author.orcid }}<span style="color:#f92672">&lt;/uri&gt;</span>
</span></span><span style="display:flex;"><span>    {{ end }}
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&lt;/author&gt;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&lt;id&gt;</span>tag:{{ $u := urls.Parse .Permalink }}{{ $u.Hostname }},{{ .Date.Format .Site.Params.dateFormatTag }}:{{ replace $u.Path &#34;#&#34; &#34;_&#34; }}<span style="color:#f92672">&lt;/id&gt;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&lt;link</span> <span style="color:#a6e22e">rel=</span><span style="color:#e6db74">&#34;alternate&#34;</span> <span style="color:#a6e22e">href=</span><span style="color:#e6db74">&#34;{{ .Permalink }}&#34;</span><span style="color:#f92672">/&gt;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&lt;title&gt;</span>{{ .Title }}<span style="color:#f92672">&lt;/title&gt;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&lt;published&gt;</span>{{ .Date.Format .Site.Params.dateFormatAtomFeed | safeHTML }}<span style="color:#f92672">&lt;/published&gt;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&lt;updated&gt;</span>{{ .Lastmod.Format .Site.Params.dateFormatAtomFeed | safeHTML }}<span style="color:#f92672">&lt;/updated&gt;</span>
</span></span><span style="display:flex;"><span>    {{ with .Description }}<span style="color:#f92672">&lt;summary</span> <span style="color:#a6e22e">type=</span><span style="color:#e6db74">&#34;text&#34;</span><span style="color:#f92672">&gt;</span>{{ . }}<span style="color:#f92672">&lt;/summary&gt;</span>{{ end }}
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&lt;content</span> <span style="color:#a6e22e">type=</span><span style="color:#e6db74">&#34;html&#34;</span> <span style="color:#a6e22e">xml:base=</span><span style="color:#e6db74">&#34;{{ .Site.BaseURL }}&#34;</span> <span style="color:#a6e22e">xml:lang=</span><span style="color:#e6db74">&#34;en&#34;</span><span style="color:#f92672">&gt;</span>
</span></span><span style="display:flex;"><span>      {{ printf &#34;<span style="color:#75715e">&lt;![CDATA[%s]]&gt;</span>&#34; .Content | safeHTML }}
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&lt;/content&gt;</span>
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&lt;/entry&gt;</span>
</span></span><span style="display:flex;"><span>  {{ end }}
</span></span><span style="display:flex;"><span><span style="color:#f92672">&lt;/feed&gt;</span>
</span></span></code></pre></div><p>And here is my revised <code>hugo.toml</code> site configuration:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-toml" data-lang="toml"><span style="display:flex;"><span><span style="color:#a6e22e">baseURL</span> = <span style="color:#e6db74">&#34;&#34;</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">copyright</span> = <span style="color:#e6db74">&#34;Copyright © 2020-{year} Donny Winston. All posts licenced under &lt;https://creativecommons.org/licenses/by-nc-sa/4.0/&gt;.&#34;</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">languageCode</span> = <span style="color:#e6db74">&#34;en-us&#34;</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">rssLimit</span> = <span style="color:#ae81ff">100</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">title</span> = <span style="color:#e6db74">&#34;Donny Winston&#34;</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">timeZone</span> = <span style="color:#e6db74">&#34;America/New_York&#34;</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">theme</span> = <span style="color:#e6db74">&#34;noteworthy&#34;</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">enableRobotsTXT</span> = <span style="color:#66d9ef">true</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">paginate</span> = <span style="color:#ae81ff">4</span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">summaryLength</span> = <span style="color:#ae81ff">10</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>[<span style="color:#a6e22e">taxonomies</span>]
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">tag</span> = <span style="color:#e6db74">&#34;tags&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>[<span style="color:#a6e22e">author</span>]
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">name</span> = <span style="color:#e6db74">&#34;Donny Winston&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">orcid</span> = <span style="color:#e6db74">&#34;https://orcid.org/0000-0002-8424-0604&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Set to false to disallow raw HTML in markdown files</span>
</span></span><span style="display:flex;"><span>[<span style="color:#a6e22e">markup</span>.<span style="color:#a6e22e">goldmark</span>.<span style="color:#a6e22e">renderer</span>]
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">unsafe</span> = <span style="color:#66d9ef">true</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>[<span style="color:#a6e22e">mediaTypes</span>]
</span></span><span style="display:flex;"><span>  [<span style="color:#a6e22e">mediaTypes</span>.<span style="color:#e6db74">&#34;application/atom+xml&#34;</span>] <span style="color:#75715e"># Thank you &lt;https://c20d.blog/posts/2023/04/atom-feeds-with-hugo/&gt; !</span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">suffixes</span> = [<span style="color:#e6db74">&#34;xml&#34;</span>]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Menu links along the sidebar navigation.</span>
</span></span><span style="display:flex;"><span>[[<span style="color:#a6e22e">menu</span>.<span style="color:#a6e22e">main</span>]]
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">identifier</span> = <span style="color:#e6db74">&#34;about&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">name</span> = <span style="color:#e6db74">&#34;About&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">url</span> = <span style="color:#e6db74">&#34;/about/&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">weight</span> = <span style="color:#ae81ff">1</span> <span style="color:#75715e"># Weight is an integer used to sort the menu items. The sorting goes from smallest to largest numbers. If weight is not defined for each menu entry, Hugo will sort the entries alphabetically.</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>[[<span style="color:#a6e22e">menu</span>.<span style="color:#a6e22e">main</span>]]
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">identifier</span> = <span style="color:#e6db74">&#34;consulting&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">name</span> = <span style="color:#e6db74">&#34;Consulting&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">url</span> = <span style="color:#e6db74">&#34;/consulting/&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">weight</span> = <span style="color:#ae81ff">2</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">#[[menu.main]]</span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">#	identifier = &#34;tags&#34;</span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">#	name = &#34;Tags&#34;</span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">#	url = &#34;/tags/&#34;</span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">#	weight = 3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>[[<span style="color:#a6e22e">menu</span>.<span style="color:#a6e22e">main</span>]]
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">name</span> = <span style="color:#e6db74">&#34;Archives&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">identifier</span> = <span style="color:#e6db74">&#34;archives&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">url</span> = <span style="color:#e6db74">&#34;/archives/&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">weight</span> = <span style="color:#ae81ff">4</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>[[<span style="color:#a6e22e">menu</span>.<span style="color:#a6e22e">main</span>]]
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">identifier</span> = <span style="color:#e6db74">&#34;feed&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">name</span> = <span style="color:#e6db74">&#34;Feed&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">url</span> = <span style="color:#e6db74">&#34;/feed.xml&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">weight</span> = <span style="color:#ae81ff">5</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>[<span style="color:#a6e22e">outputFormats</span>]
</span></span><span style="display:flex;"><span>  [<span style="color:#a6e22e">outputFormats</span>.<span style="color:#a6e22e">ATOM</span>]
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">mediaType</span> = <span style="color:#e6db74">&#34;application/atom+xml&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">baseName</span>  = <span style="color:#e6db74">&#34;feed&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>[<span style="color:#a6e22e">outputs</span>]
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">home</span> = [<span style="color:#e6db74">&#34;ATOM&#34;</span>, <span style="color:#e6db74">&#34;HTML&#34;</span>]
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">page</span> = [<span style="color:#e6db74">&#34;HTML&#34;</span>]
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">section</span> = [<span style="color:#e6db74">&#34;HTML&#34;</span>]
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">taxonomy</span> = [<span style="color:#e6db74">&#34;HTML&#34;</span>]
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">term</span> = [<span style="color:#e6db74">&#34;ATOM&#34;</span>, <span style="color:#e6db74">&#34;HTML&#34;</span>]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>[<span style="color:#a6e22e">params</span>]
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">favicon</span> = <span style="color:#e6db74">&#34;https://files.polyneme.xyz/polyneme-logo-sq-AdH548JPkxW0qf3M5NVTVLt5qdVKFN28AKKvS35A2ndmDMQ0baH90H5APJvIITO2UkFht8rLzZGQTxob8DCqG3KqnsEOczShPKoT.png&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">math</span> = <span style="color:#66d9ef">true</span>
</span></span><span style="display:flex;"><span>	<span style="color:#75715e"># Blog description at the top of the homepage. Supports markdown.</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">description</span> = <span style="color:#e6db74">&#34;Made as simple as possible, but not simpler.&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Set enableKofi to true to enable the Ko-fi support button. Add your Ko-fi ID to link to your account.</span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">enableKofi</span> = <span style="color:#66d9ef">false</span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">kofi</span> = <span style="color:#e6db74">&#34;&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>	<span style="color:#75715e"># Add links to your accounts. Remove the ones you don&#39;t want to include.</span>
</span></span><span style="display:flex;"><span>	<span style="color:#75715e"># Main</span>
</span></span><span style="display:flex;"><span>	<span style="color:#75715e"># email = &#34;mailto:donny@donnywinston.com&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#a6e22e">linkedin</span> = <span style="color:#e6db74">&#34;https://www.linkedin.com/in/donnywinston/&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>	<span style="color:#75715e"># Programming</span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">github</span> = <span style="color:#e6db74">&#34;https://github.com/dwinston/&#34;</span>
</span></span><span style="display:flex;"><span>	<span style="color:#75715e"># stackoverflow = &#34;#&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Academic</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># googlescholar = &#34;#&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">orcid</span> = <span style="color:#e6db74">&#34;https://orcid.org/0000-0002-8424-0604&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">mastodon</span> = <span style="color:#e6db74">&#34;https://fairpoints.social/@donny&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">dateFormatAtomFeed</span> = <span style="color:#e6db74">&#34;2006-01-02T15:04:05-07:00&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">dateFormatTag</span> = <span style="color:#e6db74">&#34;2006&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">feedUUID</span> = <span style="color:#e6db74">&#34;390a272f-8fa2-425a-b44e-09b477223a39&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">icon</span> = <span style="color:#e6db74">&#34;https://files.polyneme.xyz/polyneme-logo-sq-AdH548JPkxW0qf3M5NVTVLt5qdVKFN28AKKvS35A2ndmDMQ0baH90H5APJvIITO2UkFht8rLzZGQTxob8DCqG3KqnsEOczShPKoT.png&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">icon96</span> = <span style="color:#e6db74">&#34;https://files.polyneme.xyz/polyneme-logo-sq-AdH548JPkxW0qf3M5NVTVLt5qdVKFN28AKKvS35A2ndmDMQ0baH90H5APJvIITO2UkFht8rLzZGQTxob8DCqG3KqnsEOczShPKoT.png&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">logo</span> = <span style="color:#e6db74">&#34;https://files.polyneme.xyz/polyneme-logo-sq-AdH548JPkxW0qf3M5NVTVLt5qdVKFN28AKKvS35A2ndmDMQ0baH90H5APJvIITO2UkFht8rLzZGQTxob8DCqG3KqnsEOczShPKoT.png&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">mainSections</span> = [<span style="color:#e6db74">&#34;posts&#34;</span>]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Privacy configurations: https://gohugo.io/about/hugo-and-gdpr/</span>
</span></span><span style="display:flex;"><span>[<span style="color:#a6e22e">privacy</span>]
</span></span><span style="display:flex;"><span>  [<span style="color:#a6e22e">privacy</span>.<span style="color:#a6e22e">disqus</span>]
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">disable</span> = <span style="color:#66d9ef">true</span>
</span></span><span style="display:flex;"><span>  [<span style="color:#a6e22e">privacy</span>.<span style="color:#a6e22e">googleAnalytics</span>]
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">disable</span> = <span style="color:#66d9ef">true</span>
</span></span><span style="display:flex;"><span>  [<span style="color:#a6e22e">privacy</span>.<span style="color:#a6e22e">instagram</span>]
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">disable</span> = <span style="color:#66d9ef">true</span>
</span></span><span style="display:flex;"><span>  [<span style="color:#a6e22e">privacy</span>.<span style="color:#a6e22e">twitter</span>]
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">disable</span> = <span style="color:#66d9ef">true</span>
</span></span><span style="display:flex;"><span>  [<span style="color:#a6e22e">privacy</span>.<span style="color:#a6e22e">vimeo</span>]
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">disable</span> = <span style="color:#66d9ef">true</span>
</span></span><span style="display:flex;"><span>  [<span style="color:#a6e22e">privacy</span>.<span style="color:#a6e22e">youtube</span>]
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">disable</span> = <span style="color:#66d9ef">true</span>
</span></span></code></pre></div><p>Furthermore, I am testing references introspection <sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> with this post.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Fenner, Martin. “Starting to Include References in DOI Metadata for Blog Posts.” Front Matter, June 16, 2023. <a href="https://doi.org/10.53731/6mkrk-dzh02">https://doi.org/10.53731/6mkrk-dzh02</a>.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2024:/posts/community-vis-à-vis-forum/</id><link rel="alternate" href="https://donnywinston.com/posts/community-vis-%C3%A0-vis-forum/"/><title>Community vis-à-vis Forum</title><published>2024-03-29T10:43:02-04:00</published><updated>2024-03-29T10:43:02-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>I think of a community as a <em><strong>state</strong></em> (-ity) of <em><strong>having a purpose in mind</strong></em> (mmun-&gt;mean) <em><strong>together</strong></em> (co-), not as an endurable space.
I think of a forum as an endurable space, as a doored (from the Latin <em>fores</em>, i.e. door) space of focus (from the French <em>foyer</em>).</p>
<p>How many makes a community?
I don&rsquo;t know.
I won&rsquo;t pretend the Hebrew <em>minyan</em> actually shares etymology with community, but it does helpfully suggest a quorum of ten.
How many is too many?
Because a community is a purpose-coherent social state, Dunbar&rsquo;s number suggests a &ldquo;knee of the curve&rdquo; of roughly 150.</p>
<p>It seems hard for a set of people to sustain a state of having a purpose in mind together.
If the purpose can be fulfilled, and it is, that set of people can attempt to herd themselves toward an adjacent or follow-on purpose, and thereby &ldquo;evolve&rdquo; &ldquo;the&rdquo; community into a different community, a different state of having a purpose in mind together.</p>
<p>A forum can serve a community.
If that community evolves, i.e. shifts coherence of purpose, the forum may be sustained if both the pre-shift and post-shift purposes benefit from similar-enough kinds of focus.</p>
<p>It seems that a forum is sometimes made to endure despite the dissolution of the community that motivated its origination as a shelter for focus.
I can&rsquo;t help but think of the chartered corporation as typically being such a forum.
The first such charters were granted to incorporate a set of people to sustain, via a shelter for focus, a state of having a purpose in mind together of constructing a particular railroad.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2023:/posts/2023.04.28.1/</id><link rel="alternate" href="https://donnywinston.com/posts/2023.04.28.1/"/><title>Don't archive your assets — frontier them</title><published>2023-04-28T09:07:50-04:00</published><updated>2023-04-28T09:07:50-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Don&rsquo;t archive your assets<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> — frontier them.
Research is a living process.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>
Even when a research project is &ldquo;finished&rdquo;, is it really?</p>
<p>ResearchEquals<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> clearly articulates the idea of asset-story<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> continuations<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>.
Choose your own adventure &ndash; what downstream assets<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup> may use the current &ldquo;leaf node&rdquo;<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup>?</p>
<p>Your digital research assets are outputs in some stories and inputs in others.
Some of these stories are yet to be told — even stories where an asset is an output.
Imagine the serendipity of a future research project producing a digital asset with the same SHA256 hash<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup> as a previously registered asset.</p>
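<p>A minimal sketch of how such a hash match could be noticed, assuming a hypothetical in-memory <code>registry</code> keyed by SHA256 digest. The names <code>sha256_hex</code>, <code>register</code>, and the asset identifiers are illustrative only and do not refer to any real registry service.</p>

```python
import hashlib

# Hypothetical registry: maps a SHA256 hex digest to the first asset ID registered with it.
registry = {}


def sha256_hex(data: bytes) -> str:
    """Return the SHA256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def register(asset_id: str, data: bytes) -> str:
    """Register an asset; return the ID of the asset that first claimed this content."""
    digest = sha256_hex(data)
    if digest not in registry:
        registry[digest] = asset_id
    return registry[digest]


# Two projects independently produce byte-identical assets.
first = register("asset-a", b"identical bytes")
second = register("asset-b", b"identical bytes")
```

<p>Here <code>second</code> resolves to the earlier registration, which is the point at which a new story could be linked to an existing asset.</p>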
<p>With FAIR<sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup> stewardship, every digital research object is a frontier asset, with outbound PID-graph<sup id="fnref:10"><a href="#fn:10" class="footnote-ref" role="doc-noteref">10</a></sup> edges waiting to be claimed<sup id="fnref:11"><a href="#fn:11" class="footnote-ref" role="doc-noteref">11</a></sup>.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p><a href="https://en.wikipedia.org/wiki/Digital_asset_management">https://en.wikipedia.org/wiki/Digital_asset_management</a>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p><a href="https://www.nist.gov/programs-projects/research-data-framework-rdaf">https://www.nist.gov/programs-projects/research-data-framework-rdaf</a>&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p><a href="https://www.researchequals.com/">https://www.researchequals.com/</a>&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p><a href="https://corise.com/course/data-storytelling">https://corise.com/course/data-storytelling</a>&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p><a href="https://en.wikipedia.org/wiki/Continuation">https://en.wikipedia.org/wiki/Continuation</a>&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p><a href="https://docs.dagster.io/concepts/assets/software-defined-assets">https://docs.dagster.io/concepts/assets/software-defined-assets</a>&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p><a href="https://en.wikipedia.org/wiki/Tree_(data_structure)">https://en.wikipedia.org/wiki/Tree_(data_structure)</a>&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p><a href="https://en.wikipedia.org/wiki/SHA-2">https://en.wikipedia.org/wiki/SHA-2</a>&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p><a href="https://doi.org/10.1038/sdata.2016.18">https://doi.org/10.1038/sdata.2016.18</a>&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:10">
<p><a href="https://doi.org/10.1016/j.patter.2020.100180">https://doi.org/10.1016/j.patter.2020.100180</a>&#160;<a href="#fnref:10" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:11">
<p><a href="https://www.w3.org/TR/rdf11-concepts/#h3_entailment">https://www.w3.org/TR/rdf11-concepts/#h3_entailment</a>&#160;<a href="#fnref:11" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2023:/posts/gemd-model/</id><link rel="alternate" href="https://donnywinston.com/posts/gemd-model/"/><title>Modeling a Graphical Expression of Materials Data (GEMD)</title><published>2023-02-02T10:45:00-05:00</published><updated>2023-02-02T10:45:00-05:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<h1 id="model-memo">model-memo</h1>
<p>The things of concern<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> are materials, processes, measurements, and ingredients.
Materials are output by processes and are subject to measurements.
A process may take materials as ingredients.</p>
<p>Things are recorded in three ways: as templates, as specs, and as runs.
You can record a template for a thing &ndash; what might be the case.
You can also record a spec for a thing &ndash; what is intended.
Finally, you can record a run for a thing &ndash; what is, or was.</p>
<p>Each thing can have attributes from three categories: properties, parameters, and conditions.
A property is something measured or calculated.
A parameter is something set.
A condition describes an aspect of the thing&rsquo;s environment.</p>
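<p>As a rough illustration only, the memo above could be sketched with Python dataclasses. The class and field names below (<code>RecordKind</code>, <code>source</code>, <code>measurements</code>, and so on) are assumptions made for this sketch, not part of GEMD itself, and the example values are invented.</p>

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class RecordKind(Enum):
    TEMPLATE = "template"  # what might be the case
    SPEC = "spec"          # what is intended
    RUN = "run"            # what is, or was


class AttributeKind(Enum):
    PROPERTY = "property"    # something measured or calculated
    PARAMETER = "parameter"  # something set
    CONDITION = "condition"  # an aspect of the thing's environment


@dataclass
class Attribute:
    kind: AttributeKind
    name: str
    value: object


@dataclass
class Thing:
    name: str
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class Measurement(Thing):
    pass


@dataclass
class Material(Thing):
    # Materials are subject to measurements.
    measurements: List[Measurement] = field(default_factory=list)


@dataclass
class Ingredient(Thing):
    # The material this ingredient is taken from.
    source: Optional[Material] = None


@dataclass
class Process(Thing):
    # A process may take materials as ingredients and outputs a material.
    ingredients: List[Ingredient] = field(default_factory=list)
    output: Optional[Material] = None


@dataclass
class Record:
    kind: RecordKind
    about: Thing


# Invented example: a run record of a mixing process.
flour = Material(name="flour")
dough = Material(name="dough")
dough.measurements.append(
    Measurement(name="weigh dough",
                attributes=[Attribute(AttributeKind.PROPERTY, "mass_g", 500)])
)
mixing = Process(
    name="mixing",
    attributes=[Attribute(AttributeKind.PARAMETER, "speed_rpm", 60)],
    ingredients=[Ingredient(name="flour for dough", source=flour)],
    output=dough,
)
run_record = Record(kind=RecordKind.RUN, about=mixing)
```
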
<h1 id="model-diagram">model-diagram</h1>
<div class="mermaid">erDiagram
  Process ||--|| Material : outputs
  Material ||--o{ Measurement : subjectTo
  Ingredient }o--|| Material : from
  Ingredient }o--|| Process : inputTo
  Record }o--|| RecordType : has
  RecordType }o--o| Template : mayBe
  RecordType }o--o| Spec : mayBe
  RecordType }o--o| Run : mayBe
  Record }o--|| Thing : about
  Thing }o--o| Process : mayBe
  Thing }o--o| Material : mayBe
  Thing }o--o| Measurement : mayBe
  Thing }o--o| Ingredient : mayBe
  Thing }o--o{ Attribute : has
  Attribute }o--o| Property : mayBe
  Attribute }o--o| Parameter : mayBe
  Attribute }o--o| Condition : mayBe
</div>

<h1 id="model-formalism">model-formalism</h1>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-json" data-lang="json"><span style="display:flex;"><span>[
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@base&#34;</span>: <span style="color:#e6db74">&#34;terminusdb:///data/&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@schema&#34;</span>: <span style="color:#e6db74">&#34;terminusdb:///schema#&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;@context&#34;</span>
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@id&#34;</span>: <span style="color:#e6db74">&#34;Thing&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Class&#34;</span>,  
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@abstract&#34;</span>: [],
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;has&#34;</span>: {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Set&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;@class&#34;</span>: <span style="color:#e6db74">&#34;Attribute&#34;</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@id&#34;</span>: <span style="color:#e6db74">&#34;Process&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Class&#34;</span>,  
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@inherits&#34;</span>: [<span style="color:#e6db74">&#34;Thing&#34;</span>],
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;outputs&#34;</span>: <span style="color:#e6db74">&#34;Material&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;hasInput&#34;</span>: {
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Set&#34;</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">&#34;@class&#34;</span>: <span style="color:#e6db74">&#34;Ingredient&#34;</span>
</span></span><span style="display:flex;"><span>    }  
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@id&#34;</span>: <span style="color:#e6db74">&#34;Material&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Class&#34;</span>,  
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@inherits&#34;</span>: [<span style="color:#e6db74">&#34;Thing&#34;</span>],
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;subjectTo&#34;</span>: {
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Set&#34;</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">&#34;@class&#34;</span>:  <span style="color:#e6db74">&#34;Measurement&#34;</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@id&#34;</span>: <span style="color:#e6db74">&#34;Measurement&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Class&#34;</span>,  
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@inherits&#34;</span>: [<span style="color:#e6db74">&#34;Thing&#34;</span>]
</span></span><span style="display:flex;"><span>  },    
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@id&#34;</span>: <span style="color:#e6db74">&#34;Ingredient&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Class&#34;</span>,  
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@inherits&#34;</span>: [<span style="color:#e6db74">&#34;Thing&#34;</span>],
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;Material&#34;</span>  
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@id&#34;</span>: <span style="color:#e6db74">&#34;Attribute&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Class&#34;</span>,  
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@abstract&#34;</span>: []
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@id&#34;</span>: <span style="color:#e6db74">&#34;Property&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Class&#34;</span>,  
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@inherits&#34;</span>: [<span style="color:#e6db74">&#34;Attribute&#34;</span>]
</span></span><span style="display:flex;"><span>  }, 
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@id&#34;</span>: <span style="color:#e6db74">&#34;Parameter&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Class&#34;</span>,  
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@inherits&#34;</span>: [<span style="color:#e6db74">&#34;Attribute&#34;</span>]
</span></span><span style="display:flex;"><span>  }, 
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@id&#34;</span>: <span style="color:#e6db74">&#34;Condition&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Class&#34;</span>,  
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@inherits&#34;</span>: [<span style="color:#e6db74">&#34;Attribute&#34;</span>]
</span></span><span style="display:flex;"><span>  },     
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@id&#34;</span>: <span style="color:#e6db74">&#34;Record&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Class&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@abstract&#34;</span>: [],
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;about&#34;</span>: <span style="color:#e6db74">&#34;Thing&#34;</span>  
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@id&#34;</span>: <span style="color:#e6db74">&#34;Template&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Class&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@inherits&#34;</span>: [<span style="color:#e6db74">&#34;Record&#34;</span>] 
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@id&#34;</span>: <span style="color:#e6db74">&#34;Spec&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Class&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@inherits&#34;</span>: [<span style="color:#e6db74">&#34;Record&#34;</span>]
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@id&#34;</span>: <span style="color:#e6db74">&#34;Run&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@type&#34;</span>: <span style="color:#e6db74">&#34;Class&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;@inherits&#34;</span>: [<span style="color:#e6db74">&#34;Record&#34;</span>]
</span></span><span style="display:flex;"><span>  }        
</span></span><span style="display:flex;"><span>]
</span></span></code></pre></div><div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p><a href="https://citrineinformatics.github.io/gemd-docs/">https://citrineinformatics.github.io/gemd-docs/</a>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2023:/posts/model-expression-workflow-connected-content/</id><link rel="alternate" href="https://donnywinston.com/posts/model-expression-workflow-connected-content/"/><title>A Model-Expression Workflow for Connected Content</title><published>2023-01-11T05:54:28-05:00</published><updated>2023-01-11T05:54:28-05:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>For each layer in the structured-content stack,<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> from least to most volatile (i.e. domain modeling
⟶ content design ⟶ interface design), draft successive model expressions,<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> from most to least
ambiguous (as many expressions as needed to move confidently to the next stack layer).</p>
<p>Structured-content-stack breakdown:</p>
<ol>
<li>Domain model (object types and relationships)</li>
<li>Content
<ul>
<li>content model (content types and attributes)</li>
<li>content spec (labels and data types)</li>
<li>content population</li>
</ul>
</li>
<li>Representation
<ul>
<li>content-type resource templates (incl. resource transclusions)</li>
<li>index templates</li>
<li>collection resource templates</li>
</ul>
</li>
<li>Navigation
<ul>
<li>global navigation</li>
<li>contextual navigation</li>
</ul>
</li>
</ol>
<p>Model expressions:</p>
<ol>
<li>model-memo (functional and non-functional requirements <sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>)</li>
<li>model-diagram</li>
<li>model-formalism</li>
<li>model-implementation</li>
</ol>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. Atherton and C. Hane, Designing connected content: plan and model digital products for today and tomorrow. San Francisco, CA: New Riders, 2018.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>J. M. Żytkow and A. Lewenstam, &ldquo;Analytical chemistry; the science of many models,&rdquo; Fresenius J Anal Chem, vol. 338, no. 3, pp. 225–233, Jan. 1990, doi: 10.1007/BF00323013.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p><a href="https://en.m.wikipedia.org/wiki/Non-functional_requirement">https://en.m.wikipedia.org/wiki/Non-functional_requirement</a>&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/key-technical-foundations-for-fairifying-data/</id><link rel="alternate" href="https://donnywinston.com/posts/key-technical-foundations-for-fairifying-data/"/><title>Key Technical Foundations for FAIRifying Data</title><published>2022-12-22T09:25:52-05:00</published><updated>2022-12-22T09:25:52-05:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>The key technical foundations for FAIRifying data are (1) ubiquitous persistent identifiers; (2) rich controlled metadata; and (3) granular programmatic access. These foundations provide a basis for FAIR data infrastructure.</p>
<p>This note is inspired by <a href="https://anchor.fm/fairdatapodcast/episodes/Sharif-Islam-e1siagt">Rory Macneil’s recent interview with Sharif Islam on the FAIR Data Podcast</a>, published on 2022-12-21. In particular, I expand on the Q&amp;A segment starting at PT14M10S.</p>
<h1 id="ubiquitous-persistent-identifiers-pids">ubiquitous persistent identifiers (PIDs)</h1>
<p>Identifiers must be persistent.
Persistence is a matter of service, which needs organizational support.
Furthermore, you are playing on hard mode here if you don’t ensure global uniqueness via HTTPS URIs.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>
Crucially, PIDs must be ubiquitous across data holdings.
A single PID that addresses all study-publication data elements as an aggregate, e.g. “one DOI for the primary article’s supplemental dataset”, is insufficient.</p>
<h1 id="rich-controlled-metadata">rich controlled metadata</h1>
<p>Metadata makes PIDs findable. Catalogs and search portals use metadata to help you find PID-associated content. Metadata elements must be controlled; that is, so-called controlled vocabularies must be used to boost (a) leverage in tagging and (b) precision and recall in retrieval, which is critical with “big” data-item collections. Furthermore, the controlled metadata needs to be rich: tracking only “minimal” required metadata elements is insufficient. Finally, you are playing on hard mode if your control mechanism does not use PIDs for knowledge organization. A system of least power here is the W3C Simple Knowledge Organization System (<a href="https://www.w3.org/TR/skos-primer/">SKOS</a>).</p>
<h1 id="granular-programmatic-access">granular programmatic access</h1>
<p>Programmatic access must be supported. A well-documented, open-standards-based protocol
facilitates machine-to-machine interactions to glue things together
in a way that is distinct from affordances possible with human-centered interfaces (including bespoke APIs) and portals. This programmatic access must be granular — egress costs scale with data volume delivered, so let users sub-select slices of data. You are again playing on hard mode if programmatic access, and communicating granularity of such access, does not use PIDs. The <a href="https://www.rfc-editor.org/rfc/rfc9110.html">HTTP</a> protocol and <a href="https://www.rfc-editor.org/rfc/rfc3986.html">URI</a> scheme were designed for this, as were the W3C Resource Description Framework (<a href="https://www.w3.org/TR/rdf11-primer/">RDF</a>) Recommendations.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>[update 2022-12-25]: Global uniqueness for HTTPS URIs in practice is ensured either by (1) securing an HTTP URI <code>authority</code> component via the Domain Name System (DNS) or (2) securing a DNS-authority-delegated URI <code>path</code> prefix such as through w3id.org, the ARK alliance, or a DONA handle system (e.g. DOI) agent.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/implementing-the-fair-principles-through-fair-enabling-artifacts-and-services/</id><link rel="alternate" href="https://donnywinston.com/posts/implementing-the-fair-principles-through-fair-enabling-artifacts-and-services/"/><title>Implementing the FAIR Principles Through FAIR-Enabling Artifacts and Services</title><published>2022-10-21T13:24:46-04:00</published><updated>2022-10-21T13:24:46-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>How does a <a href="https://researchsoftware.org/">Research Software Engineer (RSE)</a> — often
responsible for developing infrastructure to manage and share digital research objects (data,
models, code, notebooks, workflows, etc.) — get from &ldquo;Yes, FAIR sounds great, but how?&rdquo; to &ldquo;I better
understand what the FAIR principles really mean and how I can put them into practice.&rdquo;? I hope the
diagram below can help.</p>
<figure>
<a href="/fair-enabling-resources-services-principles-diagram.pdf">
<img src="/img/fair-enabling-resources-services-principles-diagram-nocaption.svg" width="100%">
</a>
<figcaption>
Relating
<a href="https://w3id.org/fair/fip/terms/FAIR-Enabling-Resource">FAIR-Enabling Resource</a>
<strong>artifacts</strong>, from the 
<a href="https://w3id.org/fair/fip/terms/FIP-Ontology">FAIR Implementation Profile (FIP) ontology</a>,
to <strong>services</strong>.
These services are what you deploy to implement each of the 15
<a href="https://w3id.org/fair/principles/terms/FAIR">FAIR Principles</a>
(from Box 2 of the <a href="https://doi.org/10.1038/sdata.2016.18">seminal publication</a>)
for any actual given digital research object.
</figcaption>
</figure>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/architecture-patterns-for-fair-enabling-services/</id><link rel="alternate" href="https://donnywinston.com/posts/architecture-patterns-for-fair-enabling-services/"/><title>Architecture Patterns for FAIR-Enabling Services</title><published>2022-10-17T10:26:37-04:00</published><updated>2022-10-17T10:26:37-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>I&rsquo;ve been trying to grok architecture patterns as presented by Percival and Gregory<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> to
support domain-driven design and event-driven microservices with Python. I hope you find the diagram below useful.</p>
<figure><img src="/img/architecture-patterns-with-python-for-fair-enabling-services.png" width="100%"/><figcaption>
            <h4>Relating domain-driven design, event-driven microservices, command-query responsibility segregation (CQRS) &#43; views, and validation (of syntax, semantics, and pragmatics)</h4>
        </figcaption>
</figure>

<p>A microservices approach seems apt for FAIR-enabling services that need to be composed, flexibly,
for any given research artifact&rsquo;s digital lifecycle. Consider these services:</p>
<ul>
<li>minter<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> (F)<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></li>
<li>binder<sup id="fnref1:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> (F)<sup id="fnref1:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></li>
<li>resolver<sup id="fnref2:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> (F)<sup id="fnref2:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></li>
<li>index<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> (F)<sup id="fnref3:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></li>
<li>object store<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup> (A)<sup id="fnref4:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></li>
<li>transactor<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup> (I)<sup id="fnref5:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></li>
<li>harmonizer<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup> (I)<sup id="fnref6:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></li>
<li>tracker<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup> (R)<sup id="fnref7:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></li>
</ul>
<p>Consider how you may want to swap one technology choice for a given FAIR-enabling service with
another choice, at any time, as part of evolving FAIR infrastructure to which you connect in order
to collaborate on and publish / share research artifacts.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>H. J. W. Percival and R. G. Gregory, <em>Architecture patterns with Python: enabling test-driven
development, domain-driven design, and event-driven microservices</em>, First edition. O’Reilly, 2020.
(<a href="https://www.cosmicpython.com/book/preface.html">available online</a>).&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Example: &ldquo;Arklet - A basic ARK resolver.&rdquo; Internet Archive, Oct. 14, 2022. Accessed: Oct. 17, 2022. [Online]. Available: <a href="https://github.com/internetarchive/arklet">https://github.com/internetarchive/arklet</a>&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref1:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref2:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>FAIR principle addressed: <em>F</em> for Findable, <em>A</em> for Accessible, <em>I</em> for Interoperable, <em>R</em> for Reusable.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref1:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref2:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref3:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref4:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref5:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref6:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref7:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Example: &ldquo;Elasticsearch&rdquo;. <a href="https://www.elastic.co/elasticsearch/">https://www.elastic.co/elasticsearch/</a>&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Example: &ldquo;Amazon Simple Storage Service (Amazon S3)&rdquo;. <a href="https://aws.amazon.com/s3/">https://aws.amazon.com/s3/</a>&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>Example: &ldquo;Transactor | Datomic.&rdquo; <a href="https://docs.datomic.com/on-prem/overview/transactor.html">https://docs.datomic.com/on-prem/overview/transactor.html</a> (accessed Oct. 17, 2022).&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>Example: &ldquo;DataHarmonizer.&rdquo; Centre for Infectious Disease and One Health, Aug. 08, 2022. Accessed: Oct. 17, 2022. [Online]. Available: <a href="https://github.com/cidgoh/DataHarmonizer">https://github.com/cidgoh/DataHarmonizer</a>&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>Example: &ldquo;git - the stupid content tracker.&rdquo; <a href="https://git-scm.com/docs/git">https://git-scm.com/docs/git</a> (accessed Oct. 17, 2022).&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/from-platforms-to-microservices-for-fair-data-and-analysis/</id><link rel="alternate" href="https://donnywinston.com/posts/from-platforms-to-microservices-for-fair-data-and-analysis/"/><title>From Platforms to Microservices for FAIR Data and Analysis</title><published>2022-10-17T09:34:35-04:00</published><updated>2022-10-17T09:34:35-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>The &ldquo;one platform<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> to rule them all&rdquo; is unlikely to be realized for scientific research in
any domain. Rather, instead of small and numerous on-premises silos for data + code + compute, we
are on track to achieve large and somewhat less numerous cloud-based silos.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<p>What&rsquo;s the alternative? A focus on microservices &ndash; so-called to emphasize that they generally do
not stand alone, but rather are components of larger workflows/services &ndash; such as data-slicing and
data-summary-layer services that allow you to bring big data to code+compute by effectively
subsetting/streaming it.<sup id="fnref1:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<p>But how? One approach is to pursue domain-driven design that is devoid of architecture/orchestration
concerns but that yields domain events, wrapped by event-driven microservices that deal with
specific technology choices, wrapped finally by entrypoint interfaces driven by user/user-agent
personas and their use cases.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<figure><img src="/img/entrypoints-services-domain.png" width="90%"/><figcaption>
            <h4>Entrypoints wrap services (orchestration, infrastructure, glue code) that wrap domain conceptualizations.</h4>
        </figcaption>
</figure>

<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>aka gateway, aka portal, aka virtual research environment, aka&hellip;&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>N. C. Sheffield et al., &ldquo;From biomedical cloud platforms to microservices: next steps in FAIR
data and analysis,&rdquo; Sci Data, vol. 9, no. 1, Art. no. 1, Sep. 2022,
<a href="https://doi.org/10.1038/s41597-022-01619-5">doi:10.1038/s41597-022-01619-5</a>.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref1:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>H. J. W. Percival and R. G. Gregory, <em>Architecture patterns with Python: enabling test-driven
development, domain-driven design, and event-driven microservices</em>, First edition. O’Reilly, 2020.
(<a href="https://www.cosmicpython.com/book/preface.html">available online</a>).&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fair-enabling-services-redux/</id><link rel="alternate" href="https://donnywinston.com/posts/fair-enabling-services-redux/"/><title>FAIR-Enabling Services Redux</title><published>2022-10-03T11:18:58-04:00</published><updated>2022-10-03T11:18:58-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>I have sought to identify and enumerate core <a href="https://donnywinston.com/posts/fair-enabling-services/">FAIR-enabling
services</a>. I attempted a <a href="https://donnywinston.com/posts/a-five-week-experiment-to-elaborate-on-fair-enabling-services/">five-week
experiment</a>
to expand on my tentative list, but I did not complete it. The list wasn&rsquo;t compelling for me.</p>
<p>I have been brewing an updated list of core FAIR-enabling services, which I hope to be less
bombastic about. Nevertheless, I want to share with you this list and my thoughts about concretizing
them through pedagogically minded demo implementations tied together by a running example that I
intend to refine and deploy for a real project.</p>
<p>FAIR-Enabling services:</p>
<ol>
<li>an identifier <em>minter</em> (e.g. <a href="https://github.com/jkunze/docker-arknoid">arknoid</a>)</li>
<li>a metadata <em>tracker</em> (e.g. <a href="https://github.com/terminusdb/terminusdb">terminusdb</a>)</li>
<li>a metadata <em>transactor</em> (<a href="https://github.com/terminusdb/terminusdb">terminusdb</a> schema)</li>
<li>a metadata <em>indexer</em>, to feed a search engine (e.g. <a href="https://github.com/elastic/elasticsearch">elasticsearch</a>)</li>
<li>an identifier metadata <em>resolver</em> (e.g. <a href="https://github.com/tiangolo/fastapi">fastapi</a> with conneg)</li>
<li>a data object <em>retriever</em> (e.g. <a href="https://github.com/minio/minio">minio</a> s3 presigned urls)</li>
<li>a metadata <em>harmonizer</em> (e.g. <a href="https://www.inkandswitch.com/cambria/">cambria</a>-esque json patch graph)</li>
</ol>
<p>I&rsquo;d like to demo a service stack via <code>docker-compose</code>. Some resources I am thinking to consult or
leverage directly here are <a href="https://github.com/jkunze/docker-arknoid">arknoid</a>,
<a href="https://github.com/terminusdb/terminusdb">TerminusDB</a>,
<a href="https://github.com/elastic/elasticsearch">Elasticsearch</a>,
<a href="https://github.com/tiangolo/fastapi">FastAPI</a>, <a href="https://github.com/minio/minio">MinIO</a>, and
<a href="https://www.inkandswitch.com/cambria/">Project Cambria</a>.</p>
<p>My intended running example is the incorporation of heliophysics concepts into the <a href="https://astrothesaurus.org/">Unified
Astronomy Thesaurus (UAT)</a> and harmonizing that with the
<a href="https://openalex.org/">OpenAlex</a> concept scheme so that one may evaluate semantics-fueled
improvements to query understanding and thus search-result relevance via the OpenAlex dataset. The
OpenAlex dataset has value as a testbed for improvements to in-production search engines such as the
<a href="https://ui.adsabs.harvard.edu/">SAO/NASA Astrophysics Data System</a>.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fes-4-1-translating-identifiers/</id><link rel="alternate" href="https://donnywinston.com/posts/fes-4-1-translating-identifiers/"/><title>Translating Identifiers</title><published>2022-09-12T15:59:25+02:00</published><updated>2022-09-12T15:59:25+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Don&rsquo;t. Identifiers should be opaque.</p>
<p>If you&rsquo;re given an <a title="http://www.w3.org/2002/07/owl#sameAs" href="https://prefix.zazuko.com/owl:sameAs">owl:sameAs</a> assertion from a party you trust, use that.</p>
<p>If you need to mint surrogates because what you&rsquo;re given aren&rsquo;t Globally Unique, Persistent and
Resolvable Identifiers (GUPRIs)<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, either house your
inheritance as local parts/suffixes in your global namespace, assert datatype properties to record
the historical correspondence, or both.</p>
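<p>A minimal sketch of doing both, assuming a namespace you control. The <code>https://example.org/id/</code> prefix, the function names, and the property names <code>legacyIdentifier</code> and <code>legacySource</code> are hypothetical placeholders, not terms from any published vocabulary:</p>
<div class="highlight"><pre tabindex="0"><code class="language-python" data-lang="python">from urllib.parse import quote

# Hypothetical namespace; replace with one whose authority you control.
NAMESPACE = "https://example.org/id/"


def mint_surrogate(legacy_id):
    """House a legacy identifier as the local part of a global HTTPS URI."""
    # Percent-encode everything so the legacy string cannot alter the URI path.
    return NAMESPACE + quote(legacy_id, safe="")


def correspondence(legacy_id, source_label):
    """Record the historical correspondence as datatype properties."""
    return {
        "@id": mint_surrogate(legacy_id),
        "legacyIdentifier": legacy_id,  # hypothetical property name
        "legacySource": source_label,  # hypothetical property name
    }


record = correspondence("sample 42/A", "lab-notebook")
print(record["@id"])
</code></pre></div>
<p>Consumers should still treat the resulting URI as opaque; the embedded suffix is a convenience for whoever mints it, and the recorded properties are what others can rely on.</p>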
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>E. Schultes et al., &ldquo;FAIR Digital Twins for Data-Intensive Research,&rdquo; Front. Big Data,
vol. 5, p. 883341, May 2022, doi: 10.3389/fdata.2022.883341.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fes-3-4-indexing-translators-and-traces/</id><link rel="alternate" href="https://donnywinston.com/posts/fes-3-4-indexing-translators-and-traces/"/><title>Indexing Translators and Traces</title><published>2022-09-12T14:02:22+02:00</published><updated>2022-09-12T14:02:22+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Is a metadata record &ldquo;almost&rdquo; expressed in the same language you used for your filter criteria?</p>
<p>If only the machine knew that what you supplied as &ldquo;depth&rdquo; in meters, expressed as a
double-precision float, was convertible to the target record’s “d_cm” field, expressed in cm as a
string value (to preserve significant digits).</p>
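<p>A minimal sketch of such a translator, assuming a lookup table keyed by the query field and the record field. The names <code>TRANSLATORS</code>, <code>translate</code>, and <code>meters_to_cm_string</code> are hypothetical; only <code>depth</code> and <code>d_cm</code> come from the example above:</p>
<div class="highlight"><pre tabindex="0"><code class="language-python" data-lang="python">from decimal import Decimal


def meters_to_cm_string(depth_m):
    """Convert a depth in meters (float) to a centimeter string."""
    # Going through str() avoids binary-float noise in the decimal digits;
    # scaleb(2) shifts the decimal exponent by two without adding digits.
    return str(Decimal(str(depth_m)).scaleb(2))


# Hypothetical index of known translators: (query field, record field).
TRANSLATORS = {("depth", "d_cm"): meters_to_cm_string}


def translate(query_field, record_field, value):
    """Return the translated value, or None if no translator is indexed."""
    translator = TRANSLATORS.get((query_field, record_field))
    if translator is None:
        return None
    return translator(value)


print(translate("depth", "d_cm", 1.25))
</code></pre></div>
<p>A float cannot carry trailing zeros, so a real translator would need the significant-digit count from somewhere else, such as the source schema.</p>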
<p>It may be impractical for a user agent to negotiate a bridging of query and content schema in
real-time, considering the multitude of candidate paths for attribute and entity alignment.</p>
<p>Perhaps, though, certain families of translators can be indexed for opportunistic recognition given
acceptable-compute budgets.</p>
<p>Related to this, certain meandering paths of provenance may be routinely important in selecting
resources for reuse.</p>
<p>Rather than repeated union-finding of indexed intersections along these paths, it may be worth
indexing whole paths or segments thereof.</p>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fes-3-2-indexing-validators/</id><link rel="alternate" href="https://donnywinston.com/posts/fes-3-2-indexing-validators/"/><title>Indexing Validators</title><published>2022-09-07T10:26:18+02:00</published><updated>2022-09-07T10:26:18+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Why would one consider indexing validators? Reuse.</p>
<p>The value of reuse seems obvious for structural and semantic <em>specification</em>, i.e. schemas and
controlled vocabularies &ndash; there is opportunity to perceive two datasets as aligned. But, this
alignment is only <em>indicated</em>, not necessarily <em>validated</em>.</p>
<p>Two datasets, A and B, are stated to both conform to schema S. If you wish to verify this, what do
you do? You apply a validator V to both. Therefore, it seems that if the same validator V is already
stated to have been successfully applied to both datasets A and B in order to verify conformance to
S, you will have higher confidence in proceeding to analysis without applying validation yourself,
or at least without insisting on comprehensive, compute-intensive validation by default.</p>
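<p>As a rough sketch, an index of validator applications could be as simple as a mapping from (validator, schema) to the datasets reported as passing. The function names here are hypothetical, and the identifiers reuse the placeholders A, B, S, and V from above:</p>
<div class="highlight"><pre tabindex="0"><code class="language-python" data-lang="python">from collections import defaultdict

# (validator id, schema id) maps to the dataset ids stated to have passed.
applications = defaultdict(set)


def record_validation(validator_id, schema_id, dataset_id):
    """Index a statement that a validator verified a dataset against a schema."""
    applications[(validator_id, schema_id)].add(dataset_id)


def validated_together(validator_id, schema_id, *dataset_ids):
    """True if every dataset is stated to have passed the same validator."""
    passed = applications.get((validator_id, schema_id), set())
    return set(dataset_ids).issubset(passed)


record_validation("V", "S", "A")
record_validation("V", "S", "B")
print(validated_together("V", "S", "A", "B"))
</code></pre></div>
<p>In practice the keys would be persistent identifiers, and each statement would carry provenance so that you can decide whether to trust it.</p>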
<p>A given schema-specification validator may also be relatively sophisticated and transform an input
dataset to conform more tightly to the specification, as per <a href="https://en.wikipedia.org/wiki/Robustness_principle">Postel&rsquo;s
Law</a>, making it even more valuable to reuse
unambiguously identified validators as part of data-integration workflows.</p>
<p>Validators may be composed, e.g. conjunctively as <a href="https://docs.datomic.com/on-prem/schema/schema.html#entity-predicates">attribute/predicate specs are in
Datomic</a>, encouraging
granular reuse. However, one could not naively employ conformers-as-validators in such a scheme
unless they formed a commutative semigroup (mutually rectifying robustness &ndash; Postel would
approve!).</p>
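<p>To illustrate the difference, here is a small sketch with made-up predicates and conformers (none of them taken from Datomic). Pure predicates compose conjunctively in any order, whereas the two conformers below give different results depending on which runs first:</p>
<div class="highlight"><pre tabindex="0"><code class="language-python" data-lang="python">def all_of(*validators):
    """Conjunctive composition: pass only if every validator passes."""
    return lambda value: all(check(value) for check in validators)


def is_string(value):
    return isinstance(value, str)


def is_nonblank(value):
    return is_string(value) and value.strip() != ""


# Order does not matter for side-effect-free predicates.
label_ok = all_of(is_string, is_nonblank)
label_ok_reversed = all_of(is_nonblank, is_string)


# Conformers transform their input, so order can matter.
def strip_whitespace(value):
    return value.strip()


def truncate_to_five(value):
    return value[:5]


raw = "  abcdefg"
one_way = strip_whitespace(truncate_to_five(raw))
other_way = truncate_to_five(strip_whitespace(raw))
print(one_way, other_way)
</code></pre></div>
<p>Because <code>one_way</code> and <code>other_way</code> differ, these two conformers do not commute and could not be dropped into an order-agnostic composition scheme.</p>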
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fes-3-1-indexing-identifiers/</id><link rel="alternate" href="https://donnywinston.com/posts/fes-3-1-indexing-identifiers/"/><title>Indexing Identifiers</title><published>2022-09-06T14:09:02+02:00</published><updated>2022-09-06T14:09:02+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Indexing identifiers is key to disambiguating entities.</p>
<p>Wikipedia has <a href="https://en.wikipedia.org/wiki/Category:Disambiguation_pages">disambiguation pages</a>.
For example, there are various concepts in mathematics and computing, various computing products,
and various companies that identify with the term
&ldquo;<a href="https://en.wikipedia.org/wiki/Precision">Precision</a>&rdquo;. I made <a href="https://web.archive.org/web/20220906121536/https://legacy.materialsproject.org/Fe2O3/">disambiguation pages for
same-chemical-formula inorganic crystal
structures</a>
for the Materials Project.</p>
<p>Indexing identifiers is also key to unifying entities. It&rsquo;s an <a href="http://dbpedia.org/resource/Open-world_assumption">open
world</a> after all,<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> with a concomitant
<a href="https://chempedia.info/info/nonunique_naming_assumption/">non-unique naming assumption</a>. OpenAlex
<a href="https://docs.openalex.org/about-the-data/work#ids">indexes various ID types for a work</a>. For
example,
<a href="http://api.openalex.org/works/https://doi.org/10.7717/peerj.4375">http://api.openalex.org/works/https://doi.org/10.7717/peerj.4375</a>
will funnel you to the payload for <code>https://openalex.org/W2741809807</code>, which has an <code>ids</code> field with
<code>openalex</code>, <code>doi</code>, <code>mag</code> (Microsoft Academic Graph), <code>pmid</code> (Pubmed), and <code>pmcid</code> (Pubmed Central)
IDs.</p>
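<p>A simplified sketch of this kind of unification, assuming each record carries a dictionary of identifiers shaped loosely like the <code>ids</code> field. The <code>build_index</code> function is hypothetical, and only the two identifiers quoted above are used:</p>
<div class="highlight"><pre tabindex="0"><code class="language-python" data-lang="python"># Each entry lists identifiers that name the same work.
records = [
    {
        "openalex": "https://openalex.org/W2741809807",
        "doi": "https://doi.org/10.7717/peerj.4375",
    },
]


def build_index(records, canonical_key="openalex"):
    """Map every known identifier to the canonical identifier of its entity."""
    index = {}
    for ids in records:
        canonical = ids[canonical_key]
        for value in ids.values():
            index[value] = canonical
    return index


index = build_index(records)
print(index["https://doi.org/10.7717/peerj.4375"])
</code></pre></div>
<p>Looking up any indexed identifier then funnels to the same canonical one, which is the behavior described for the OpenAlex endpoint.</p>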
<p>Finally, indexing identifiers is key to registering and resolving metadata, i.e. relationships
between identifiers. Registries include <a href="https://lov.linkeddata.es/dataset/lov">Linked Open Vocabularies
(LOV)</a>, the <a href="https://www.ebi.ac.uk/ols/index">Ontology Lookup Service
(OLS)</a>, the <a href="https://prefix.zazuko.com/">Zazuko Prefix Server</a>, and
the <a href="https://obofoundry.org/">OBO Foundry</a>. Resolvers include
<a href="https://identifiers.org/">Identifiers.org</a> and <a href="https://n2t.net/">Name-To-Thing (n2t)</a>. There is
even at least one <em>meta</em>registry, <a href="https://bioregistry.io/">Bioregistry.io</a>.</p>
<p>Any time you encounter a web service using a &ldquo;remote data access&rdquo; style, i.e. exposing a query
language via a single access point &ndash; SQL, SPARQL, GraphQL, MongoDB, etc. &ndash; it&rsquo;s highly likely that
all entity identifiers are indexed to support efficient retrieval and combination/joining.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Unless you can bask in glorious isolation in a siloed domain/organization.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fes-2-5-validating-traces/</id><link rel="alternate" href="https://donnywinston.com/posts/fes-2-5-validating-traces/"/><title>Validating Traces — Syntactically, Semantically, and Situationally</title><published>2022-09-04T13:28:00+02:00</published><updated>2022-09-04T13:28:00+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>How do you validate a reified trace of digital-object provenance?</p>
<p>Is it even possible? This is syntactic validation. Values that should be strings are strings, dates
are dates, lists are lists, you know the drill&hellip;</p>
<p>Is it plausible? This is semantic validation. This date should be earlier (i.e., &ldquo;less than&rdquo; in
ISO 8601 format) than that date, this number should be an integer multiple of that number, this
field&rsquo;s values are unique across the collection, this field is a reference to an object of that
type, etc. Also known as structural validation.</p>
<p>Is it probable? This is situational validation, i.e. a matter of
<a href="https://donnywinston.com/posts/validation-syntax-semantics-pragmatics/">pragmatics</a>. This mode of
&ldquo;validation&rdquo; logs not <em>errors</em> per se, but rather <em>warnings</em>. This is the world of statistical
process control, of setting thresholds for anomaly detection and tuning your go-ahead logic to align
with risk tolerance and chosen strategies for mitigation of failures.</p>
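<p>The three levels can be sketched as a single function that returns errors and warnings separately. This assumes a trace is a plain dictionary; the field names <code>started</code>, <code>steps</code>, and <code>batch_size</code> and the <code>TYPICAL_STEPS</code> threshold are invented for illustration:</p>
<div class="highlight"><pre tabindex="0"><code class="language-python" data-lang="python">from datetime import date

# Assumed threshold for anomaly detection; tune to your risk tolerance.
TYPICAL_STEPS = range(1, 101)


def validate_trace(trace):
    """Return (errors, warnings) for a trace given as a dict."""
    errors, warnings = [], []

    # Syntactic: is it even possible? Check types and formats.
    try:
        date.fromisoformat(trace["started"])
    except (KeyError, TypeError, ValueError):
        errors.append("started must be an ISO 8601 date string")
    for field in ("steps", "batch_size"):
        if not isinstance(trace.get(field), int):
            errors.append(field + " must be an integer")
    if errors:
        return errors, warnings

    # Semantic: is it plausible? Check relationships between values.
    if trace["batch_size"] == 0 or trace["steps"] % trace["batch_size"] != 0:
        errors.append("steps must be an integer multiple of batch_size")

    # Situational: is it probable? Log a warning, not an error.
    if trace["steps"] not in TYPICAL_STEPS:
        warnings.append("steps is outside the typical range")

    return errors, warnings


print(validate_trace({"started": "2022-09-04", "steps": 400, "batch_size": 4}))
</code></pre></div>
<p>Keeping warnings out of the error list lets the caller decide whether an improbable trace should block a workflow or merely be flagged for review.</p>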
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/disconnect-between-fair-infra-devs-and-product-devs/</id><link rel="alternate" href="https://donnywinston.com/posts/disconnect-between-fair-infra-devs-and-product-devs/"/><title>A Disconnect Between FAIR Infrastructure Devs and Product Devs</title><published>2022-09-02T18:18:22+02:00</published><updated>2022-09-02T18:18:22+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Rory Macneil nails it:<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<blockquote>
<p>This seems to me to be a really important problem. In my experience a lot of the discussion about
things like PIDs and controlled vocabularies seems to assume that these things just exist. But when
you actually go to try to use them and make them usable in the context of tools that people use in
research, that presents a whole additional series of challenges; I think oftentimes those
challenges are overlooked or ignored or not even thought about by people who are doing work in PIDs
and controlled vocabularies.</p>
</blockquote>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Rory Macneil and Nick Garabedian, FAIR Data Podcast, August 31, 2022. <a href="https://anchor.fm/fairdatapodcast/episodes/Nick-Garabedian-e1n6vid">https://anchor.fm/fairdatapodcast/episodes/Nick-Garabedian-e1n6vid</a>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fes-2-4-validating-translation/</id><link rel="alternate" href="https://donnywinston.com/posts/fes-2-4-validating-translation/"/><title>Validating Translation</title><published>2022-09-01T10:20:34+02:00</published><updated>2022-09-01T10:20:34+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Given a <a href="https://www.w3.org/TR/dx-prof-conneg/#dfn-representation">representation</a> of (meta)data
that <a href="http://purl.org/dc/terms/conformsTo">dcterms:conformsTo</a> some <a href="https://www.w3.org/TR/dx-prof-conneg/#dfn-data-profile">data
profile</a>, you may wish to translate it to
another data profile.</p>
<p>If a resource is accessible from an HTTP server, then as a client you may <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Content_negotiation">negotiate the content
representation</a> in a standard
way. Traditional content negotiation (aka &ldquo;conneg&rdquo;) is limited to file formats, aka syntax rather
than semantics, but <a href="https://www.w3.org/TR/dx-prof-conneg/">content negotiation by profile</a> (aka &ldquo;connegp&rdquo;)
can facilitate translation.</p>
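<p>As a rough sketch of the difference, the request headers for the two kinds of negotiation could be
built as below. The <code>Accept-Profile</code> header name and its angle-bracketed URI syntax follow
my reading of the connegp draft and should be checked against the current text; the helper name, the
q-value scheme, and the <code>example.org</code> profile URIs are hypothetical.</p>

```python
def profile_request_headers(media_type, profile_uris):
    """Build headers asking for a media type (syntax) and data profiles (semantics).

    Profiles listed earlier are given higher preference via descending q-values.
    """
    weighted = []
    for rank, uri in enumerate(profile_uris):
        q = max(1.0 - 0.1 * rank, 0.1)
        weighted.append("<{}>;q={:.1f}".format(uri, q))
    return {
        "Accept": media_type,                    # traditional conneg: file format
        "Accept-Profile": ",".join(weighted),    # connegp: data profile
    }


headers = profile_request_headers(
    "text/turtle",
    ["https://example.org/profile/a", "https://example.org/profile/b"],
)
```

<p>Nothing is sent over the network here. The point is only that syntax and profile are requested
on separate axes.</p>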
<p>There may be many ways to <a href="https://www.w3.org/TR/dx-prof-conneg/#dfn-functional-specification">functionally
specify</a> data profile
negotiation, i.e. translation. Ultimately, one <a href="https://www.w3.org/TR/dx-prof-conneg/#dfn-functional-profile">functional
profile</a> is employed for a given
instance of translation.</p>
<p>Thus, it seems that one way to &ldquo;validate translation&rdquo; would be to identify the functional profile
employed and trace process outputs for conformance.</p>
<p>I&rsquo;m really getting into the weeds here, aren&rsquo;t I? My insistence on exploring the cross product of
{identifying,validating,indexing,translating,tracing} \(\times\)
{identifiers,validators,indexers,translators,tracers}  to elucidate FAIR-enabling services is a bit
dizzying. I shall cautiously continue.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fes-2-3-indexing-validations/</id><link rel="alternate" href="https://donnywinston.com/posts/fes-2-3-indexing-validations/"/><title>Semantic Stars Upon Thars</title><published>2022-08-31T23:18:16+02:00</published><updated>2022-08-31T23:18:16+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>To validate is to compute, so indexing metadata for past validation events and caching any
detailed payloads can save time and effort.</p>
<p>Why index? To search. Why search? To find relevant (&ldquo;likely valid&rdquo;), ranked (&ldquo;more likely valid&rdquo;)
results.</p>
<p>One may think of validation events as akin to GitHub stars, but with semantics: semantic stars, in
that one can filter by validation types and/or by validation agents that one trusts, by recency if
applicable, etc.</p>
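<p>A minimal sketch of such filtering over an index of validation events follows. The
<code>ValidationEvent</code> fields and the <code>semantic_stars</code> function are hypothetical
names chosen for illustration; a real index would likely live in a database or search engine rather
than a Python list.</p>

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ValidationEvent:
    target: str   # identifier of the validated object
    kind: str     # e.g. "syntactic", "semantic", "situational"
    agent: str    # who or what performed the validation
    on: date      # when the validation happened
    passed: bool  # outcome


def semantic_stars(events, target, kinds, trusted_agents, since):
    """Return passing validation events for `target`, newest first.

    Only events of the given kinds, by trusted agents, on or after `since` count.
    """
    hits = [
        e for e in events
        if e.target == target
        and e.passed
        and e.kind in kinds
        and e.agent in trusted_agents
        and e.on >= since
    ]
    return sorted(hits, key=lambda e: e.on, reverse=True)
```

<p>The count of results plays the role of a star count, but one qualified by whom you trust and by
what kind of validation you care about.</p>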
<p>Crucially, the qualified nature (i.e., <a href="https://w3id.org/fair/principles/terms/I3">fair:I3</a>) of starring may help a community of practice mind the
cautionary tale of The Sneetches.
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fes-2-2-who-validates-the-validators/</id><link rel="alternate" href="https://donnywinston.com/posts/fes-2-2-who-validates-the-validators/"/><title>Who Validates the Validators?</title><published>2022-08-30T15:49:05+02:00</published><updated>2022-08-30T15:49:05+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Given a <a href="https://w3id.org/fair/fip/terms/Metadata-schema">fip:Metadata-schema</a> and a validator for it,
such as a <a title="http://www.w3.org/ns/shacl#Validator" href="https://prefix.zazuko.com/sh:Validator">sh:Validator</a> or a <a href="https://json-schema.org/">JSON Schema</a>, how do you determine that the
validator is&hellip;valid? That it speaks the desired <a href="https://w3id.org/fair/fip/terms/Knowledge-representation-language">fip:Knowledge-representation-language</a>,
that it knows all the terms in a desired <a href="https://w3id.org/fair/fip/terms/Structured-vocabulary">fip:Structured-vocabulary</a> and checks their usage against a desired <a href="https://w3id.org/fair/fip/terms/Semantic-model">fip:Semantic-model</a>?
In other words, that it adheres to a <a title="http://usefulinc.com/ns/doap#Specification" href="https://prefix.zazuko.com/doap:Specification">doap:Specification</a>?</p>
<p>I do not know. However, I suspect that it is more important <a href="https://en.wikipedia.org/wiki/Robustness_principle">to check output rather than input</a>.</p>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
<div style="margin-top: 2em;">
<small>

    This page attempts to be a <a href="https://www.fairpoints.org/">FAIR Point</a>:
    "view page source" in your browser to see its
    <a href="http://schema.org/LearningResource">schema:LearningResource</a> JSON-LD.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fes-2-1-identifying-validation/</id><link rel="alternate" href="https://donnywinston.com/posts/fes-2-1-identifying-validation/"/><title>Identifying Validation</title><published>2022-08-29T21:59:10+02:00</published><updated>2022-08-29T21:59:10+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>What conveys that data has been validated or is yet to be validated?</p>
<p>How do you identify the nature and process of validation for a given digital object?</p>
<p>Who is involved? What auxiliary resources are involved? Is the process:</p>
<ol>
<li>
<p>Do-it-yourself, with (implicit or explicit) references to validation assets?</p>
</li>
<li>
<p>Do-it-with-you, with references to validation services?</p>
</li>
<li>
<p>Do-it-for-you, with references to validation results and/or signoffs?</p>
</li>
</ol>
<p>An example of explicit reference amenable to do-it-yourself validation is the <code>schemaURL</code> field in an OpenLineage RunEvent JSON document, which links to its JSON Schema definition:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-json" data-lang="json"><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;eventType&#34;</span>: <span style="color:#e6db74">&#34;START&#34;</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;eventTime&#34;</span>: <span style="color:#e6db74">&#34;2020-12-09T23:37:31.081Z&#34;</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;run&#34;</span>: {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;runId&#34;</span>: <span style="color:#e6db74">&#34;3b452093-782c-4ef2-9c0c-aafe2aa6f34d&#34;</span>
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;job&#34;</span>: {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;namespace&#34;</span>: <span style="color:#e6db74">&#34;my-scheduler-namespace&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;myjob.mytask&#34;</span>
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;inputs&#34;</span>: [
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;namespace&#34;</span>: <span style="color:#e6db74">&#34;my-datasource-namespace&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;instance.schema.table&#34;</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>  ],
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;outputs&#34;</span>: [
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;namespace&#34;</span>: <span style="color:#e6db74">&#34;my-datasource-namespace&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;instance.schema.output_table&#34;</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>  ],
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;producer&#34;</span>: <span style="color:#e6db74">&#34;https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client&#34;</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;schemaURL&#34;</span>: <span style="color:#e6db74">&#34;https://openlineage.io/spec/1-0-0/OpenLineage.json#/definitions/RunEvent&#34;</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Flavors of do-it-with-you validation include checksums and content hashing. You give a service some
input with a checksum so that the service can verify that your input is plausible. A service gives
you a content hash so that you can verify that its output is plausible. But how do you identify what
is being done, and to which field (<a href="https://github.com/multiformats/cid">perhaps it&rsquo;s done to the object identifier
itself</a>)? One useful standard is the HTTP
<a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Digest"><code>Digest</code></a> header.</p>
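<p>As a minimal sketch of the content-hashing flavor, the value of a <code>Digest</code> header can be
computed and checked with the standard library. This assumes the <code>sha-256=BASE64</code> value
form and handles only that one algorithm; the function names are hypothetical.</p>

```python
import base64
import hashlib


def digest_header_value(body: bytes) -> str:
    """Return a Digest header value of the form 'sha-256=BASE64' for the body."""
    raw = hashlib.sha256(body).digest()
    return "sha-256=" + base64.b64encode(raw).decode("ascii")


def body_matches_digest(body: bytes, header_value: str) -> bool:
    """Check a received body against a received Digest header value."""
    # Split at the first '=' only, since base64 padding may also contain '='.
    algorithm, _, expected = header_value.partition("=")
    if algorithm.lower() != "sha-256":
        return False  # this sketch handles only one algorithm
    return digest_header_value(body).split("=", 1)[1] == expected
```

<p>Either party can run the same check: the service on your input, or you on its output. The header
states both what was done (the algorithm) and to what (the message body).</p>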
<p>Do-it-for-you signoffs may involve digital signatures (and there is a standards-track HTTP
<a href="https://wicg.github.io/webpackage/draft-yasskin-http-origin-signed-responses.html#name-the-signature-header"><code>Signature</code></a>
Header).</p>
<p>It&rsquo;s clear that cryptography must play a big role here:</p>
<blockquote>
<p>We should accept the premise that people will not run their own servers by designing systems that
can distribute trust without having to distribute infrastructure. This means architecture that
anticipates and accepts the inevitable outcome of relatively centralized client/server
relationships, but uses cryptography (rather than infrastructure) to distribute trust.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
</blockquote>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. Marlinspike, &ldquo;My first impressions of web3,&rdquo; Moxie Marlinspike, Jan. 07, 2022. <a href="https://moxie.org/2022/01/07/web3-first-impressions.html">https://moxie.org/2022/01/07/web3-first-impressions.html</a> (accessed Aug. 29, 2022).&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/podcast-martynas-jusevičius/</id><link rel="alternate" href="https://donnywinston.com/posts/podcast-martynas-jusevi%C4%8Dius/"/><title>Interview with Martynas Jusevičius</title><published>2022-08-29T15:42:22+02:00</published><updated>2022-08-29T15:42:22+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>This week on Machine-Centric Science, I interviewed Martynas Jusevičius, currently at AtomGraph and
based in Copenhagen, Denmark.</p>
<p>Topics we spoke about included: cell-based UI as computational canvases (e.g. Jupyter), personal
knowledge graph tools and block UI protocols, the Solid Pod spec and ecosystem for decentralized
data ownership and visiting, and the roles both of researchers and those who develop software for
them in realizing FAIR-principled implementations.</p>
<p><a href="https://share.transistor.fm/s/59aa8998">HAVE A LISTEN »</a></p>
<h1 id="quotable-quotes">Quotable Quotes</h1>
<p>&ldquo;The RDF graph data model&hellip;seems like the only realistic implementation at this point for the FAIR
principles.&rdquo;</p>
<p>&ldquo;To me, FAIR data is more or less equal to Linked Data.&rdquo;</p>
<p>&ldquo;The software has to be built around these principles. And that&rsquo;s maybe quite a radical idea because
for a long time, data was just like an add-on to software, right? But essentially now it&rsquo;s the
inverse. It&rsquo;s the data that is at the center &ndash; that&rsquo;s the data-centric paradigm.&rdquo;</p>
<p>&ldquo;&hellip;there has to be some kind of paradigm shift, both in how researchers see this, but also for
those who develop software for researchers, that what scientific publishing produces is not just
PDFs&hellip;Through fair data, we can look at scientific publishing as this huge network of research
artifacts that can be navigated, explored &ndash; as a knowledge graph naturally &ndash; but also recombined,
reused and repurposed in different things.&rdquo;</p>
<h1 id="sharing-is-caring">Sharing is caring!</h1>
<p>If you enjoyed this episode, please consider sharing it with a few friends who might find it useful.
Thanks!</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/tracing-identifiers/</id><link rel="alternate" href="https://donnywinston.com/posts/tracing-identifiers/"/><title>Tracing Identifiers</title><published>2022-08-27T22:42:02+02:00</published><updated>2022-08-27T22:42:02+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>At a base level, an identifier is simple to trace &ndash;
it is the sequence (modulo concurrency) of assertions of which it is a part.</p>
<p>In fact, this can be the basis for tracing the representation of a &ldquo;thing&rdquo;
as the flock of relationships between identifiers, i.e. metadata,
that waxes and wanes in association with &ldquo;the&rdquo; identifier of the thing.</p>
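<p>A minimal sketch of this base-level trace, assuming an append-only log of assertions stored as
<code>(sequence number, subject, predicate, object)</code> tuples. The log layout and the
<code>ex:</code> identifiers are hypothetical.</p>

```python
# Append-only log of (sequence number, subject, predicate, object) assertions.
LOG = [
    (1, "ex:thing1", "dcterms:title", "Draft"),
    (2, "ex:thing2", "dcterms:references", "ex:thing1"),
    (3, "ex:thing3", "dcterms:title", "Unrelated"),
    (4, "ex:thing1", "dcterms:title", "Final"),
]


def trace(identifier, log):
    """Return the assertions, in log order, that mention the identifier."""
    # The identifier may appear as subject, predicate, or object.
    return [entry for entry in log if identifier in entry[1:]]


history_of_thing1 = trace("ex:thing1", LOG)
```

<p>Here the trace of <code>ex:thing1</code> includes an assertion made about it by way of another
subject, which is the sense in which the surrounding metadata belongs to the trace too.</p>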
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/translating-identifiers/</id><link rel="alternate" href="https://donnywinston.com/posts/translating-identifiers/"/><title>Translating Identifiers</title><published>2022-08-26T21:46:02+02:00</published><updated>2022-08-26T21:46:02+02:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Good identifiers are opaque, so translation is by association &ndash;
<a href="http://www.w3.org/2002/07/owl#sameAs">owl:sameAs</a>,
<a href="http://www.w3.org/2004/02/skos/core#exactMatch">skos:exactMatch</a>,
or some other relationship.
Translation doesn&rsquo;t follow from reading a sign, but from retrieving a sense.</p>
<p>If metadata is relationships between identifiers,<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> then metadata is the medium of conceptual convergence.</p>
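<p>Translation by association can be sketched as a walk over equivalence assertions. This is a
minimal illustration that treats <code>owl:sameAs</code> and <code>skos:exactMatch</code> alike as
symmetric and transitive links, which glosses over real differences between them; the function name
and the prefixed identifiers are hypothetical.</p>

```python
from collections import deque

EQUIVALENCE = {"owl:sameAs", "skos:exactMatch"}


def translate(identifier, target_prefix, assertions):
    """Follow equivalence assertions to find identifiers under `target_prefix`."""
    neighbors = {}
    for subject, predicate, obj in assertions:
        if predicate in EQUIVALENCE:
            # Index each edge in both directions, since the links are symmetric.
            neighbors.setdefault(subject, set()).add(obj)
            neighbors.setdefault(obj, set()).add(subject)

    # Breadth-first walk over everything reachable from the starting identifier.
    seen, queue = {identifier}, deque([identifier])
    while queue:
        current = queue.popleft()
        for other in neighbors.get(current, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return sorted(i for i in seen if i.startswith(target_prefix) and i != identifier)
```

<p>Nothing in the identifier strings themselves is interpreted beyond the namespace prefix used to
pick the target scheme. The translation comes entirely from the metadata.</p>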
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. Bide, &ldquo;Standard Identifiers: an overview of the current landscape,&rdquo; presented at the USPTO Open Meeting: Facilitating the Development of the Online Licensing Environment for Copyrighted Works, Apr. 01, 2015. [<a href="http://www.linkedcontentcoalition.org/phocadownload/150401%20BIDE%20Standard%20Identifiers%20Overview%20with%20embedded%20slides.pdf">Online</a>]&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/indexing-identifier-services/</id><link rel="alternate" href="https://donnywinston.com/posts/indexing-identifier-services/"/><title>Indexing Identifier Services</title><published>2022-08-24T22:55:26-04:00</published><updated>2022-08-24T22:55:26-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Where do you look for identifiers?</p>
<p>If you&rsquo;re looking for a URI, the IANA has a <a href="https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml">registry of
schemes</a>, like <code>https</code>, <code>mailto</code>,
and <code>tel</code>.</p>
<p>These days, to resolve an identifier, you generally use the <code>https</code> scheme, which has an <code>authority</code>
component in its URI format. You can go with content addressing like <a href="https://ipld.io/glossary/#cid">IPLD
CIDs</a>, but that doesn&rsquo;t solve where to look &ndash; it solves knowing that
you found the thing (or that you already have the thing).</p>
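<p>A small sketch of pulling the scheme and authority out of a URI with Python&rsquo;s standard
library. The helper name is hypothetical, and the <code>mailto</code> address is a made-up
example.</p>

```python
from urllib.parse import urlsplit


def scheme_and_authority(uri):
    """Return the (scheme, authority) pair of a URI; authority may be empty."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc


# An https URI names an authority that tells you where to look.
web = scheme_and_authority("https://w3id.org/fair/principles/terms/I3")

# A mailto URI has a scheme but no authority component.
mail = scheme_and_authority("mailto:someone@example.org")
```

<p>The authority component is what makes an <code>https</code> identifier resolvable, and it is also
the part that someone has to keep running.</p>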
<p>Authority is hard to persist. So people and organizations pool efforts towards generic authority
under the <code>https</code> URI scheme, like <code>hdl.handle.net</code>, <code>doi.org</code>, <code>n2t.net</code>, <code>identifiers.org</code>,
<code>purl.org</code>, <code>w3id.org</code>, <code>wikidata.org</code>, etc. Or they pool towards authority with narrower scope,
like <code>orcid.org</code>, <code>ror.org</code>, <code>igsn.org</code>, etc. Or they just pursue lasting authority with a new
<code>.org</code> or through a trusted <code>.gov</code>, etc.</p>
<p>How do you index identifiers from various sources? There are efforts like
<a href="https://lov.linkeddata.es/dataset/lov/">LOV</a> for vocabularies, and
<a href="https://www.crossref.org/">Crossref</a> and <a href="https://datacite.org/">DataCite</a> for DOIs.</p>
<p>I think <a href="http://openalex.org/">OpenAlex</a> is setting a nice example of collecting identifiers from
various systems and connecting them, along with descriptive metadata.</p>
<p>What about collecting identifier services? Is this of interest? Is it a fool&rsquo;s errand due to the
rise and fall of authority? Or is tracking and using the rising and falling of reputation and
reliability, Google-PageRank-style, a way to shepherd researchers to robust persistence of
identifiers?</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/validating-identifier-services/</id><link rel="alternate" href="https://donnywinston.com/posts/validating-identifier-services/"/><title>Validating an Identifier Service</title><published>2022-08-23T16:23:41-04:00</published><updated>2022-08-23T16:23:41-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>How do you validate that an <a href="https://w3id.org/fair/fip/terms/Identifier-service">identifier service</a>
provides global uniqueness of minted keys, persistence of bindings, and resolution of keys to
descriptive metadata?</p>
<blockquote>
<p>The key problem with testing is that a test (of any kind) that uses one particular set of inputs
tells you <em>nothing at all</em> about the behaviour of the system or component when it is given a
different set of inputs.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
</blockquote>
<p>If you know that a given ID provided by a service is unique, that tells you <em>nothing at all</em> about
the uniqueness of another ID provided by that service. You need to understand &ndash; be able to reason
about &ndash; whatever algorithm the service uses to guarantee uniqueness, and trust that the service
implements that algorithm.</p>
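<p>To illustrate what reasoning about the algorithm means, here is a minimal sketch of a minter whose
uniqueness follows from its construction rather than from sampling its output. The
<code>CounterMinter</code> class and the <code>ex:</code> shoulder are hypothetical, and the argument
holds only within a single process whose shoulder is not shared with any other minter.</p>

```python
import itertools


class CounterMinter:
    """Mint IDs as shoulder plus counter; unique because the counter never repeats."""

    def __init__(self, shoulder):
        self._shoulder = shoulder
        self._counter = itertools.count(1)  # strictly increasing: 1, 2, 3, ...

    def mint(self):
        """Return the next ID; no two calls can return the same value."""
        return "{}{}".format(self._shoulder, next(self._counter))


minter = CounterMinter("ex:")
first, second = minter.mint(), minter.mint()
```

<p>Seeing that <code>first</code> and <code>second</code> differ proves little. Knowing that the
counter is strictly increasing is what justifies the claim, along with trusting that the service
actually runs this code.</p>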
<blockquote>
<p>The key problem is that a test (of any kind) on a system or component that is in one particular
<em>state</em> tells you <em>nothing at all</em> about the behaviour of that system or component when it happens
to be in another state.<sup id="fnref1:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
</blockquote>
<p>If you know that an ID provided by a service is bound to a particular digital object and/or to
particular descriptive metadata, that tells you <em>nothing at all</em> about what the service will bind
that ID to tomorrow. You need to understand and trust any policy provided by that service regarding
persistence of bindings.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup><sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<blockquote>
<p>Running a test in the presence of concurrency with a known initial state and set of inputs tells
you nothing at all about what will happen the next time you run that very same test with the very
same inputs and the very same starting state&hellip;and things can’t really get any worse than that.<sup id="fnref2:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
</blockquote>
<p>If you know that a service resolves an ID request to metadata describing a digital object and its
location, that tells you <em>nothing at all</em> about how the server will respond to an identical
follow-up request. You need to understand and trust the access protocols provided by the service.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>B. Moseley and P. Marks, &ldquo;Out of the Tar Pit.&rdquo; Feb. 06, 2006.
[<a href="https://github.com/papers-we-love/papers-we-love/blob/master/design/out-of-the-tar-pit.pdf">online</a>]&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref1:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref2:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>J. Kunze, S. Calvert, J. DeBarry, M. Hanlon, G. Janée, and S. Sweat, &ldquo;Persistence statements: describing digital stickiness,&rdquo; Nov. 2016, Accessed: Aug. 23, 2022. [Online]. Available: <a href="https://escholarship.org/uc/item/2zm9x47c">https://escholarship.org/uc/item/2zm9x47c</a>&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>&ldquo;Permanence Levels and the Archives for NLM’s Permanent Web Documents. NLM Technical Bulletin. 2005 Mar-Apr.&rdquo; <a href="https://www.nlm.nih.gov/pubs/techbull/ma05/ma05_archive.html">https://www.nlm.nih.gov/pubs/techbull/ma05/ma05_archive.html</a> (accessed Aug. 23, 2022).&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fes-identifying-identifying/</id><link rel="alternate" href="https://donnywinston.com/posts/fes-identifying-identifying/"/><title>Identifying Identifying</title><published>2022-08-22T19:27:28-04:00</published><updated>2022-08-22T19:27:28-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Day 1 of my <a href="https://donnywinston.com/posts/a-five-week-experiment-to-elaborate-on-fair-enabling-services/">five-week experiment to elaborate on FAIR-enabling
services</a>,
and I already find myself fallen flat on my face.</p>
<p>I had wanted to go through the motions of brainstorming concepts related to the service of identifying,
partition them into concepts, attributes, and relationships in the sense of Sequeda and
Lassila&rsquo;s<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> &ldquo;Knowledge Report&rdquo; intermediate representation &ndash; for each, draft a table to name it,
provide an alternative name or two, a definition, an identifier for the thing, an identifier
template for instances of the things culled from sources, and a query to get instances from sources
&ndash; or at least a nod to how one might proceed with these, particularly for the last two items.</p>
<p>I instead found myself in Philadelphia for longer than anticipated, for reasons I may or may not
divulge over a beer, and so here&rsquo;s what I came up with in the limited timebox I gave myself to push
<em>something</em> out today:</p>
<figure><img src="/img/fes-identifying-concepts.png" width="100%"/><figcaption>
            <h4>Identifying some concepts (attributes? relationships?) about identifying</h4>
        </figcaption>
</figure>

<p>An identifying service provides guarantees wrt protocol, policy, and algorithms to make good on the
guarantees. These guarantees revolve around the nature of requests and responses. Requests wrt
identifying are about minting new IDs, binding information to minted IDs, or resolving supplied IDs
to bound information. Responses are either the thing identified, information about the thing, or
where to get the thing.</p>
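<p>The three request types and three response types can be sketched as a toy in-memory service. This
is an illustration only: the class and method names are hypothetical, the choice of random UUIDs for
minting is an assumption of the sketch, and there is no persistence policy at all.</p>

```python
import uuid


class IdentifierService:
    """In-memory sketch of the three request types: mint, bind, resolve."""

    def __init__(self):
        self._bound = {}  # identifier -> bound information (None until bound)

    def mint(self):
        """Mint a new ID; uniqueness here leans on random UUIDs."""
        identifier = "urn:uuid:{}".format(uuid.uuid4())
        self._bound[identifier] = None
        return identifier

    def bind(self, identifier, thing=None, about=None, location=None):
        """Bind information to an ID that this service minted."""
        if identifier not in self._bound:
            raise KeyError("not minted here: " + identifier)
        self._bound[identifier] = {"thing": thing, "about": about, "location": location}

    def resolve(self, identifier):
        """Respond with whichever of thing, about, or location was bound."""
        info = self._bound.get(identifier)
        if info is None:
            raise LookupError("nothing bound to: " + identifier)
        return {key: value for key, value in info.items() if value is not None}
```

<p>A response from <code>resolve</code> carries the thing itself, information about the thing, where
to get the thing, or some combination, depending on what was bound.</p>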
<p>Okay, timebox is over. Yes, leaves in the diagram above have gone unexplained. Thankfully, there are
more days for thinking, i.e. writing.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>J. Sequeda and O. Lassila, Designing and building enterprise knowledge graphs. San Rafael:
Morgan &amp; Claypool Publishers, 2021.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/a-five-week-experiment-to-elaborate-on-fair-enabling-services/</id><link rel="alternate" href="https://donnywinston.com/posts/a-five-week-experiment-to-elaborate-on-fair-enabling-services/"/><title>A Five-Week Experiment to Elaborate on FAIR-Enabling Services</title><published>2022-08-19T09:48:20-04:00</published><updated>2022-08-19T09:48:20-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p><a href="https://donnywinston.com/posts/fair-enabling-services/">Yesterday</a>, I proposed that a strategy for
implementing the <a href="https://w3id.org/fair/principles/terms/FAIR">FAIR principles</a> for research data
management can focus on ensuring five FAIR-enabling <em>services</em>, which in turn will prompt tactical
choices of <a href="https://w3id.org/fair/fip/terms/FAIR-Enabling-Resource">FAIR-enabling resources</a> that
may satisfactorily address each <a href="https://w3id.org/fair/fip/terms/FIP-Question">question</a> and thereby
produce a comprehensive <a href="https://w3id.org/fair/fip/terms/FIP-Ontology">implementation profile</a>. The
purpose of such care in design is to de-risk one&rsquo;s investment in &ldquo;going FAIR&rdquo;, as the cost of
systems implementation and maintenance can easily exceed the cost of a design phase by an order of
magnitude.</p>
<p>These FAIR-enabling services are, again:</p>
<ol>
<li>Identifying</li>
<li>Validating</li>
<li>Indexing</li>
<li>Translating</li>
<li>Tracing</li>
</ol>
<p>Other than cursory remarks, I am yet to elaborate in any detail the behavior I expect from these
services. I would like to remedy this over the next five weeks, one week per service, in the order
given above.</p>
<p>And there is a constraint I would like to impose on myself: each week will be a five-day progression
of notes that reflects the service sequence above. For example, during the first week (on
Identifying), I will:</p>
<ul>
<li>
<p>On Day 1: Identify the concepts, attributes, and relationships at play in Identifying.</p>
</li>
<li>
<p>On Day 2: Assert and validate a set of statements, using elements that I identified the day
before, that should hold for Identifying.</p>
</li>
<li>
<p>On Day 3: Demonstrate a process of Indexing the above in order to efficiently retrieve
assertions.</p>
</li>
<li>
<p>On Day 4: Assert relationships among schemes for Identifying, and attempt Translating from one to
another.</p>
</li>
<li>
<p>On Day 5: Demonstrate a process of Tracing revisions made to the metadata that Identifying
yields (i.e., what is returned when an identifier is resolved).</p>
</li>
</ul>
<p>Is this five-week planned experiment ambitious? Yes. Is it <em>too</em> ambitious? Almost certainly yes.
Will I attempt it anyway? Yes.</p>
<p>Would I appreciate your day-to-day feedback for course correction? Yes. And I would enthusiastically
acknowledge your contribution when collecting and clarifying the sum of each week&rsquo;s notes.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fair-enabling-services/</id><link rel="alternate" href="https://donnywinston.com/posts/fair-enabling-services/"/><title>FAIR-Enabling Services</title><published>2022-08-18T22:14:09-04:00</published><updated>2022-08-18T22:14:09-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p><small>(The following is a transcript of my recent <a href="https://podcast.polyneme.xyz/episodes/fair-enabling-services">podcast
episode</a> on this topic.)</small></p>
<p>There is a <a href="https://w3id.org/fair/fip/terms/FIP-Ontology">FAIR Implementation Profile ontology</a>, and
it talks about <a href="https://w3id.org/fair/fip/terms/FAIR-Enabling-Resource">FAIR-enabling resources</a>. So
these correspond to <a href="https://w3id.org/fair/fip/terms/FIP-Question">questions</a>. For each of
the fifteen <a href="https://w3id.org/fair/principles/terms/FAIR">FAIR principles</a>, the idea of a FAIR-enabling
resource is that you&rsquo;ve identified a
<a href="https://w3id.org/fair/fip/terms/FIP-No-Choice-Declaration">challenge</a> or you&rsquo;ve made a
<a href="https://w3id.org/fair/fip/terms/FIP-Declaration">choice</a> about some resource that&rsquo;s going to help
you fulfill that &ndash; either that resource is
<a href="https://w3id.org/fair/fip/terms/declares-current-use-of">available</a>, or it&rsquo;s <a href="https://w3id.org/fair/fip/terms/declares-planned-use-of">planned</a>, or it&rsquo;s
<a href="https://w3id.org/fair/fip/terms/declares-planned-development-of">proposed</a>, or you&rsquo;re going to
<a href="https://w3id.org/fair/fip/terms/declares-planned-replacement-of">phase it out</a>, that sort of thing.</p>
<p>Twelve FAIR-enabling resources have been identified as broad categories that help address each of
the challenges with FAIR principles. One is an <a href="https://w3id.org/fair/fip/terms/Identifier-service">identifier
service</a>. This is a service that provides, for
any digital object, (1) algorithms guaranteeing global uniqueness; (2) a policy document that
guarantees persistence; and (3) resolution of the identifier to machine-actionable metadata
describing the object and its location.</p>
<p>This all falls under Findable. Another FAIR-enabling resource is <a href="https://w3id.org/fair/fip/terms/Metadata-schema">metadata
schema</a>. So this would be a specification, a
schema, that specifies metadata fields describing attributes of data or other digital objects.
Another FAIR-enabling resource would be a <a href="https://w3id.org/fair/fip/terms/Metadata-data-linking-schema">metadata-data linking
schema</a>. So this would be,
specifically, a specification &ndash; schema &ndash; that provides a unique, persistent, ideally
bi-directional machine actionable link between metadata and the data they describe. And the final
FAIR-enabling resource for the Findable principles is a
<a href="https://w3id.org/fair/fip/terms/Registry">registry</a>, which is a service that indexes metadata and
data and provides a search over that index.</p>
<p>For Accessible, there are three identified FAIR-enabling resources. <a href="https://w3id.org/fair/fip/terms/Communication-protocol">Communication
protocol</a>: so this is a specification for
how messages are structured and exchanged. There&rsquo;s <a href="https://w3id.org/fair/fip/terms/Authentication-and-authorization-service">authentication and authorization
service</a>. So this is a
service that mediates access to digital objects according to specified conditions.</p>
<p>And another FAIR-enabling resource is a <a href="https://w3id.org/fair/fip/terms/Metadata-preservation-policy">metadata preservation
policy</a>. So this would be a document
that describes the conditions under which metadata are to be provisioned in the future, maybe part
of a data management plan.</p>
<p>Okay. Five more FAIR-enabling resources identified. We&rsquo;re going to Interoperability now. One is a
<a href="https://w3id.org/fair/fip/terms/Knowledge-representation-language">knowledge representation language</a>: a language specification whereby
knowledge can be made processable by machines. Another FAIR-enabling resource is <a href="https://w3id.org/fair/fip/terms/Structured-vocabulary">structured
vocabulary</a>: a controlled list of uniquely
identified and unambiguous concepts with their definitions represented preferably using web
standards.</p>
<p>Finally, in Interoperable, a FAIR-enabling resource would be a <a href="https://w3id.org/fair/fip/terms/Semantic-model">semantic
model</a>, a specification that defines qualified
relations between entities describing data or other digital objects using structured vocabularies.</p>
<p>The two remaining FAIR-enabling resources, under Reusable, are (1) <a href="https://w3id.org/fair/fip/terms/Data-usage-license">data usage
license</a> &ndash; so that&rsquo;s a document that describes
the conditions under which a digital object can be legally used. And finally, a <a href="https://w3id.org/fair/fip/terms/Provenance-model">provenance
model</a>; a specification &ndash; schema &ndash; that
specifies metadata fields describing the origin and lineage of data or other digital objects.</p>
<p>So these are a bunch of FAIR-enabling resources. I was thinking about this a bit, and I wanted to
distinguish between things that actually have to be running in order for data to be alive and for
you to actually find it, access it, interoperate with it, reuse it, versus things that are resources
that those services will need that are more &ldquo;one-time&rdquo; things.</p>
<p>For example, a metadata schema isn&rsquo;t really a service, so to speak. It&rsquo;s something that you can do
and be done with. You might need to make revisions of it, so maybe there&rsquo;s some change management
procedure. But in terms of the actual service, it isn&rsquo;t quite like an identifier service, where you
want to be given an identifier and be able to know where to go, and resolve that identifier, and
determine if you have the right identifier and then get the data, get the metadata. So that&rsquo;s an
actual service that needs to be run. If not continuously, whenever you decide you need
identification, you can spin up that service and do that, but it&rsquo;s an actual service that needs to
be run in order for you to have living findability, accessibility, interoperability, reusability.</p>
<p>So of these 12 FAIR-enabling resources, I&rsquo;ve thought about how to condense them into FAIR-enabling
<em>services</em>. What are the actual <em>services</em> that are really important across these that someone needs
to worry about if they want a FAIR data ecosystem in their lab and their lifecycle for research,
that sort of thing.</p>
<p>I&rsquo;ve identified these as (1) An identification service, an identifier service. You need to be able
to identify things. And this is identifying metadata, datasets, as well as vocabulary things. So
this spans, say, F1, the A principles, as well as I2 in terms of making a vocabulary FAIR, being
able to unambiguously identify vocabulary terms. So identification is a big service that&rsquo;s needed.</p>
<p>The second service is validation. So, you could be given a metadata schema and given statements and
assertions, but how do you know they actually conform to the schema? Are you going to look those up
by hand? Are you going to kind of cross check with a sheet of paper that you have in front of you
that says the schema? No, you really want a validation service that will validate statements
according to a schema that you&rsquo;re imposing.</p>
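<p>A minimal sketch of what such a validation service might check, assuming a hypothetical flat
schema that maps field names to expected Python types. The names <code>DATASET_SCHEMA</code> and
<code>validate</code>, and the sample records, are illustrative assumptions and not part of the FIP
ontology or any real tool; a production service would more likely validate against something like
JSON Schema or SHACL.</p>
<pre><code class="language-python"># Hypothetical schema: required field names mapped to expected Python types.
DATASET_SCHEMA = {"identifier": str, "title": str, "keywords": list}


def validate(record: dict, schema: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field, expected_type in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    return problems


good = {"identifier": "ex:1", "title": "Soil cores", "keywords": ["soil"]}
bad = {"identifier": "ex:2", "keywords": "soil"}
print(validate(good, DATASET_SCHEMA))  # no problems reported
print(validate(bad, DATASET_SCHEMA))   # missing title, wrong type for keywords
</code></pre>
<p>The point of the sketch is only that the check is mechanical and repeatable, so nobody has to
cross-check statements against the schema by hand.</p>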
<p>The third service is indexing. So this is related to the registry. You need something that, given a
bunch of statements that have identifiers that resolve, a bunch of statements that are valid
according to the schema &ndash; so you&rsquo;ve identified, you validated. You then need to collect them and be
able to find what you need. And so that involves indexing. So that&rsquo;s an actual service where you can
search the index. An index is the basis of search. Otherwise you&rsquo;re just doing a full scan of all
your statements. You won&rsquo;t get any leverage. You won&rsquo;t be able to winnow down with any efficiency at
all. So this index thing, this ongoing indexing, where you have an index and you maintain an index
and when you identify new concepts or data, assign them identifiers, validate your statements about
them, you want to throw them in the index and you want your registry to re-index things. So that
indexing needs to be a service.</p>
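<p>To make the contrast between an index and a full scan concrete, here is a small sketch that
assumes statements are simple (subject, property, value) triples held in memory. The identifiers,
the <code>build_index</code> and <code>full_scan</code> helpers, and the sample data are all
hypothetical; a real registry would sit on a search engine or database.</p>
<pre><code class="language-python">from collections import defaultdict

# Hypothetical statements: (subject identifier, property, value) triples.
statements = [
    ("ex:ds1", "keyword", "soil"),
    ("ex:ds2", "keyword", "ocean"),
    ("ex:ds3", "keyword", "soil"),
]


def build_index(triples):
    """Map each (property, value) pair to the set of subjects asserting it."""
    index = defaultdict(set)
    for subject, prop, value in triples:
        index[(prop, value)].add(subject)
    return index


def full_scan(triples, prop, value):
    """Baseline without an index: inspect every statement on every query."""
    return {s for s, p, v in triples if (p, v) == (prop, value)}


index = build_index(statements)
# A newly identified and validated statement is added incrementally,
# without rebuilding the whole index.
index[("keyword", "ocean")].add("ex:ds4")

print(sorted(index[("keyword", "soil")]))
</code></pre>
<p>Both approaches return the same subjects for a query; the index just answers with one lookup
instead of touching every statement, which is where the leverage comes from as the collection
grows.</p>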
<p>The fourth FAIR-enabling service to me is translation. So, this is the essence of interoperation.
This is the point of having a knowledge representation language and a semantic model where you&rsquo;re
defining qualified relations. The idea being, you have a bunch of metadata and you want to use it
for something else. So you need some service to actually translate it. If you have data in some
format, you want to be able to translate it. You want these qualified links to know that, if you
have metadata of this format, say of a schema.org Dataset and you want a DCAT Dataset, you know the
corresponding mapping and you can perform that translation. So that would be a FAIR-enabling service
that would leverage resources like semantic model, structured vocabulary, language &ndash; ultimately, it
would leverage your index as well. So, translation would also be dependent on an index, just like
search is.</p>
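<p>As a rough illustration of translating between a schema.org Dataset and a DCAT Dataset, the
sketch below applies a small property mapping to a flat record. The three correspondences shown are
assumptions chosen for illustration (check them against the alignment published with DCAT before
relying on them), and <code>translate</code> and the sample record are hypothetical.</p>
<pre><code class="language-python"># Illustrative property correspondences (an assumption for this sketch).
SCHEMA_ORG_TO_DCAT = {
    "schema:name": "dct:title",
    "schema:description": "dct:description",
    "schema:keywords": "dcat:keyword",
}


def translate(record: dict, mapping: dict) -> tuple[dict, dict]:
    """Split a record into translated fields and fields with no known mapping."""
    translated, untranslated = {}, {}
    for key, value in record.items():
        if key in mapping:
            translated[mapping[key]] = value
        else:
            untranslated[key] = value
    return translated, untranslated


source = {
    "schema:name": "Soil cores 2021",
    "schema:keywords": ["soil"],
    "schema:creator": "ex:lab42",
}
dcat_view, leftover = translate(source, SCHEMA_ORG_TO_DCAT)
print(dcat_view)
print(leftover)
</code></pre>
<p>Returning the unmapped fields separately, rather than dropping them, keeps it visible where the
semantic model still lacks a qualified relation.</p>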
<p>And the final service that I think is important here is tracing. So, given something, you want to
trace &ldquo;where did it come from?&rdquo; and how you can use it. So this connects directly to your policies,
static or not, about usage rights and data usage, and to your provenance model. And this is how you can
actually trace where things are, to determine if you can reuse them. So this is something active that
you want. And again, this would ultimately leverage an index as well. So you&rsquo;d
have a bunch of data objects and metadata objects and vocabulary terms, all of which would need to
be identified, so that&rsquo;d be an identifying service. All of your statements about things, about
provenance, mapping for translation, indexing, all of that would have to be validated. So you have
the validation service, and then finally you have the indexing. And that puts everything in. And the
indexing is the basis of support for search, which I don&rsquo;t think needs to be a separate
FAIR-enabling service &ndash; there are various ways of searching over an index, given it. But it also
enables this translation based on the semantics that you have in your model and your resources, and
tracing to determine, can I use this? What&rsquo;s the provenance of this? Was it based on things that
fell under this certain license? And so, depending on the license of my transformations, this is
what I can use it for.</p>
<p>So again, these FAIR-enabling services are: Identifying, Validating, Indexing, Translating, and
Tracing. And I hope to go into more detail about how these relate to the FAIR principles and the
resources, and sort of elaborate on them individually over the coming weeks.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fers-identifier-services/</id><link rel="alternate" href="https://donnywinston.com/posts/fers-identifier-services/"/><title>FAIR-Enabling Resources - Identifier Services</title><published>2022-08-15T22:27:00-04:00</published><updated>2022-08-15T22:27:00-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Here are some <a href="https://w3id.org/fair/fip/terms/Identifier-service">identifier services</a> listed as
such by <a href="https://fip-wizard.ds-wizard.org">FIP Wizard</a>, a free-to-signup online tool to guide a
user in creating and publishing a machine-actionable FAIR Implementation Profile (FIP):</p>
<ul>
<li>
<p><a href="http://purl.org/np/RAVZO6G2woBMry9K9sl-4YurlGM4x6GREjnWLhef-5TTk#IGSN">Old IGSN</a></p>
<p>International Generic Sample Number before integration with DataCite</p>
</li>
<li>
<p><a href="http://purl.org/np/RAk5umcWPZegtZTBJACkE9CDap7KNwSgR8umS3SnuvW2A#SDN_CDI_PID">SDN CDI PID | SeaDataNet CDI PID</a></p>
<p>SeaDataNet Common Data Persistent Identifier</p>
</li>
<li>
<p><a href="http://purl.org/np/RAyIQx7rn1CDYgTqRlSbsZY0yQz46VxRzSgfMp138QInM#OSTI-DOI">U.S. Department of Energy Office of Scientific and Technical Information (OSTI) Data ID Service</a></p>
<p>Through the DOE Data ID Service, OSTI assigns persistent identifiers, known as Digital Object Identifiers (DOIs), to datasets submitted by DOE and its contractor and grantee researchers and registers the DOIs with DataCite to aid in citation, discovery, retrieval, and reuse.  OSTI assigns and registers DOIs for datasets for DOE researchers as a free service to enhance the Department&rsquo;s management of this important resource.</p>
</li>
<li>
<p><a href="http://purl.org/np/RA5-OsT0-sjRbcoFEGfOzkrcFtExipMRmoLErzg5QWL7c#URI">URI | Uniform Resource Identifier</a></p>
<p>URI is a string that provides a unique address (either on the Internet or on another private network, such as a computer filesystem or an Intranet) representing a resource, and implicitly describes where a resource can be found. A resource identification need not suggest the retrieval of resource representations over the Internet, nor need they imply network-based resources at all.</p>
</li>
</ul>
<p>There are currently four resources. I authored the entry for OSTI DOIs. &ldquo;URI&rdquo; doesn&rsquo;t seem like a
service. We have a long ways to go.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/schema-translation-infrastructure/</id><link rel="alternate" href="https://donnywinston.com/posts/schema-translation-infrastructure/"/><title>Schema Translation Infrastructure</title><published>2022-08-11T11:13:35-04:00</published><updated>2022-08-11T11:13:35-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Repurposing data is hard sometimes. Given a current application&rsquo;s data-worldview &ndash; i.e., its schema
&ndash; one cannot in general pull in historical data collected for different applications because those
applications had different worldviews &ndash; i.e., they used different data schemas.</p>
<p>One may perform one-off or ongoing transformations &ndash; e.g. ETL jobs &ndash; as part of a hub-and-spoke
strategy to bring data from past worlds into the &ldquo;present&rdquo; world so that all the data can be queried
in a uniform way, in the language of the present-application schema.</p>
<p>Unfortunately, the &ldquo;present&rdquo; world is a moving target. And &ldquo;past&rdquo; worlds may be merely dormant &ndash;
they may become &ldquo;present&rdquo; again if a given application is revisited.</p>
<p>Rather than hub-and-spoke schema convergence and single-timeline data migration, what if schema
translation infrastructure sought to reconcile queries across multiple worlds? That is, what if
application-X-centric questions could travel to and collect partial information from
other-application-centric worlds using the languages (schema) of those worlds?</p>
<p>Building off Ink &amp; Switch&rsquo;s ideas on edit lenses for schema evolution<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> and off Radul and
Sussman&rsquo;s ideas on propagation networks for computation<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, as well as off the observed
salubrious hourglassing of the Internet&rsquo;s layered-architecture design<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>, I&rsquo;m thinking about
how to facilitate effective &ldquo;schema networking&rdquo; that acknowledges and embraces the never-ending
schema evolution characteristic of data collection efforts by research-producing organizations.</p>
<h2 id="initial-scribbles">Initial Scribbles</h2>
<p>First, I offer a simplified recapitulation of the layered architecture of the Internet:</p>
<figure><img src="/img/internet-layers-written.png" width="80%"/><figcaption>
            <h4>Layered Architecture for the Internet</h4>
        </figcaption>
</figure>

<p>Next, I offer a mapping of the above to an analogous six-layered architecture for schema
translation:</p>
<figure><img src="/img/schema-translation-infra-layers-written.png" width="80%"/><figcaption>
            <h4>Layered Architecture for Schema Translation</h4>
        </figcaption>
</figure>

<p>The <em>physical protocol</em> layer is concerned with data (de)serialization/marshalling and storage. The
<em>data-link protocol</em> layer is where ETL happens &ndash; how bytes are de-isolated and made accessible to
the network. The <em>network protocol</em> layer is where propagation among worldviews/schemas &ldquo;runs&rdquo;, with
each &ldquo;cell&rdquo; (in the parlance of the propagation network literature) a join-semilattice (or is it a
meet-semilattice?) world that accumulates partial information via edit-lens functional propagators.
The <em>transport protocol</em> layer is RDF over HTTP (<a href="/posts/a-fair-digital-object/">FAIR Digital
Objects</a>?), the <em>application protocol</em> layer is RDF query (SPARQL, a
Datalog, etc.), and the <em>application</em> layer is where specific-worldview-conforming data (i.e.,
things you plot, perform exploratory data analysis (EDA) on, select/engineer features from to feed
to ML-model training, etc.) materialize.</p>
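<p>One way to read the network-protocol layer, sketched under heavy assumptions: each world is a
cell holding a set of facts, and merging is set union, which is a join in the subset ordering (so
merging is idempotent, commutative, and associative). The <code>Cell</code> class, the
<code>propagate</code> helper, and the <code>a_to_b</code> lens are hypothetical, and the lens here
runs in one direction only, whereas edit lenses proper are bidirectional.</p>
<pre><code class="language-python">class Cell:
    """Accumulates partial information; merging is set union (a join)."""

    def __init__(self):
        self.facts = frozenset()

    def add(self, new_facts):
        """Merge new facts in; return True if the cell's content grew."""
        merged = self.facts | frozenset(new_facts)
        changed = merged != self.facts
        self.facts = merged
        return changed


def propagate(source, target, lens):
    """Push every fact in source through lens into target (one direction)."""
    return target.add(lens(fact) for fact in source.facts)


# Hypothetical lens: world A says "title", world B says "name".
def a_to_b(fact):
    key, value = fact
    return ("name", value) if key == "title" else fact


world_a, world_b = Cell(), Cell()
world_a.add({("title", "Soil cores"), ("year", 2021)})
world_b.add({("license", "CC-BY")})

propagate(world_a, world_b, a_to_b)
print(sorted(world_b.facts, key=str))
</code></pre>
<p>Because the merge never discards information, running the same propagator again changes
nothing, which is the property that lets a network of such cells settle.</p>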
<p>Finally, I offer a rough diagram of how various layer activities and dataflow within/between them
may be visualized:</p>
<figure><img src="/img/schema-translation-infra-layers-diagram.png" width="80%"/><figcaption>
            <h4>Schema Translation Infrastructure in Action</h4>
        </figcaption>
</figure>

<p>I want to close by noting that the problem of schema reconnection comes up not only with research
laboratory datasets that were collected independently by different teams, but also with datasets
collected over a long period of time by a single team as project/application requirements evolve and
place adaptation pressure on the &ldquo;working&rdquo; schema to undergo several revisions, thus necessitating
reconnection among schema versions (i.e. migrations, but not necessarily unidirectional if, say, a
sub-team is still using an &ldquo;old&rdquo; schema and wants to contribute &ldquo;new&rdquo; data).</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>&ldquo;Project Cambria: Translate your data with lenses,&rdquo; Oct. 06, 2020.
<a href="https://www.inkandswitch.com/cambria/">https://www.inkandswitch.com/cambria/</a> (accessed Aug. 01, 2022).&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>A. Radul, &ldquo;Propagation networks: a flexible and expressive substrate for computation,&rdquo;
Thesis, Massachusetts Institute of Technology, 2009. Accessed: Aug. 11, 2022. [Online]. Available:
<a href="https://dspace.mit.edu/handle/1721.1/54635">https://dspace.mit.edu/handle/1721.1/54635</a>&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Beck, M. (2019). On the hourglass model. Communications of the ACM, 62(7), 48–57.
<a href="https://doi.org/10/gj3fnj">https://doi.org/10/gj3fnj</a>&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/identifier-resolution-delays-binding/</id><link rel="alternate" href="https://donnywinston.com/posts/identifier-resolution-delays-binding/"/><title>A Perlisism for Identifiers: Delay Binding</title><published>2022-08-08T22:42:20-04:00</published><updated>2022-08-08T22:42:20-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Inference based on semantic retrieval is more robust than inference based on syntactic parsing.</p>
<blockquote>
<p>identifiers should be as dumb as possible &ndash; in other words, should include as little metadata as
possible about the thing being identified, leaving all information to be retrieved from metadata
repositories rather than inferred from the identifier itself. People always want to infer meaning,
and will often try to teach machines to do the same. The problem is that apparent meaning in the
structure of an identifier is all too often misleading&hellip;<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
</blockquote>
<p>In order to be authoritative, identifiers should be assigned as early as practicable in the creation
process, but minting is not binding.</p>
<blockquote>
<p>Functions delay binding; data structures induce binding. Moral: Structure data late in the
programming process.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
</blockquote>
<p>Identifier resolution delays binding; identifier structures induce binding. Moral: Structure
identifiers late (or never) in the minting process.</p>
<p>Also, structure identifier resolution (i.e. retrieved-metadata structure) late. Metadata is about claims;
there may be many and different claims about the same thing. &ldquo;Multiple resolution&rdquo;, i.e. making
different metadata sources/profiles/formats accessible depending on what a client is trying to
retrieve, is akin to functional polymorphism and hence even later binding.</p>
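<p>A small sketch of the idea, assuming an in-memory registry: the identifier is minted as an
opaque UUID-based URN that says nothing about its referent, claims are bound later, and resolution
dispatches on a requested profile. The names <code>REGISTRY</code>, <code>mint</code>,
<code>bind</code>, and <code>resolve</code>, and the profile labels, are hypothetical and do not
describe any particular identifier service.</p>
<pre><code class="language-python">import uuid

# Hypothetical in-memory metadata store: identifier to {profile: claims}.
REGISTRY = {}


def mint() -> str:
    """Mint an opaque identifier that encodes nothing about its referent."""
    return f"urn:uuid:{uuid.uuid4()}"


def bind(identifier: str, profile: str, claims: dict) -> None:
    """Attach claims under a profile; this can happen long after minting."""
    REGISTRY.setdefault(identifier, {})[profile] = claims


def resolve(identifier: str, profile: str) -> dict:
    """Return the claims for the requested profile, or {} if none are bound."""
    return REGISTRY.get(identifier, {}).get(profile, {})


sample_id = mint()
bind(sample_id, "citation", {"title": "Soil cores 2021"})
bind(sample_id, "provenance", {"derived_from": "ex:field-campaign-7"})

print(resolve(sample_id, "citation"))
print(resolve(sample_id, "provenance"))
</code></pre>
<p>Nothing about the sample can be parsed out of <code>sample_id</code> itself; everything a client
learns comes from resolution, and different clients can ask for different claims about the same
thing.</p>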
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. Bide, &ldquo;Standard Identifiers: an overview of the current landscape,&rdquo; presented at the
USPTO Open Meeting: Facilitating the Development of the Online Licensing Environment for Copyrighted
Works, Apr. 01, 2015. [Online]. Available:
<a href="http://www.linkedcontentcoalition.org/phocadownload/150401%20BIDE%20Standard%20Identifiers%20Overview%20with%20embedded%20slides.pdf">pdf</a>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>A. J. Perlis, &ldquo;Special Feature: Epigrams on programming,&rdquo; SIGPLAN Not., vol. 17, no. 9,
pp. 7–13, Sep. 1982, doi: 10.1145/947955.1083808. Online at
<a href="http://www.cs.yale.edu/homes/perlis-alan/quotes.html">http://www.cs.yale.edu/homes/perlis-alan/quotes.html</a>.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/when-do-developers-not-have-to-talk-to-stakeholders/</id><link rel="alternate" href="https://donnywinston.com/posts/when-do-developers-not-have-to-talk-to-stakeholders/"/><title>When Do Developers Not Have to Talk to Stakeholders?</title><published>2022-08-03T09:56:25-04:00</published><updated>2022-08-03T09:56:25-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>An ontologist can bridge<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> domain expertise and software development via production of</p>
<ol>
<li>
<p>a semi-informal so-called <em>intermediate representation</em><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> that can be understood by
domain experts, and</p>
</li>
<li>
<p>a formal ontology / knowledge graph that represents the domain in a machine-actionable way.</p>
</li>
</ol>
<blockquote>
<p>When you do software development, you want to take the human out of the loop as much as possible
&ndash; really automate it, least amount of manual effort, just minimize that&hellip;When I explain to
[developers] that an ontology or knowledge graph [is] basically an ontologist talking to
stakeholders and making sure that all the implicit knowledge they have is expressed in an explicit
structural format that systems can also read&hellip;a light bulb [is] lit in their head, like &ldquo;Oh&hellip;so it
means we do not have to talk to stakeholders?&rdquo;&hellip;Yes!&hellip;You can basically have the ontologist talk
to the stakeholders and put it into a format that you just query.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
</blockquote>
<p>And if you switch-hit as both a domain expert and a developer, &ldquo;a little semantics goes a long way&rdquo;
&ndash; developing competence in and discipline towards producing intermediate representations can
increase your capacity to effectively collaborate/delegate and thus increase professional impact.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>J. Sequeda and O. Lassila, Designing and building enterprise knowledge graphs.
San Rafael: Morgan &amp; Claypool Publishers, 2021.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>M. Fernández-López, A. Gómez-Pérez, and N. Juristo, &ldquo;METHONTOLOGY: From Ontological Art
Towards Ontological Engineering,&rdquo; Stanford University, EEUU, Mar. 1997. Accessed: Aug. 03, 2022.
[Online]. Available: <a href="https://oa.upm.es/5484/">https://oa.upm.es/5484/</a>&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>A. Faith and K. Kari, &ldquo;Data Therapy &amp; Using Ontologies To Translate Business
Rules For Devs,&rdquo; (Jul. 28, 2022). Accessed: Aug. 03, 2022. [Online Video]. Available:
<a href="https://www.youtube.com/watch?v=lRUYY1pVVqI&amp;t=238s">https://www.youtube.com/watch?v=lRUYY1pVVqI&amp;t=238s</a>&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/principles-of-robust-interoperability/</id><link rel="alternate" href="https://donnywinston.com/posts/principles-of-robust-interoperability/"/><title>Principles for Robustly Interoperable Digital Objects</title><published>2022-08-02T11:03:23-04:00</published><updated>2022-08-02T11:03:23-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>I have been ruminating on core values in service of stewardship of evolving scientific knowledge.</p>
<p>Specifically, what principles can I lean upon to guide me in the design of robustly interoperable
digital objects?</p>
<p>Here is what has jelled so far for me:</p>
<ol>
<li>
<p><a href="https://en.wikipedia.org/wiki/Interpreter_(computing)">Machine Interpretation</a></p>
<ul>
<li>
<p>rectifies and amplifies formal modeling</p>
</li>
<li>
<p>facilitates machine action</p>
</li>
<li>
<p>requires that both bits and semantics are accessible</p>
</li>
</ul>
</li>
<li>
<p><a href="https://www.w3.org/2001/tag/doc/leastPower-2006-01-23.html">Least Power</a></p>
<ul>
<li>
<p>constrained interpretation promotes interoperability</p>
</li>
<li>
<p>lower barrier to support diverse and unanticipated use cases</p>
</li>
<li>
<p>behavior is more likely to be understood and predicted with high confidence</p>
</li>
</ul>
</li>
<li>
<p><a href="https://en.wikipedia.org/wiki/Stationary-action_principle">Stationary Action</a></p>
<ul>
<li>
<p>small deviations from intended sequence-of-processes (i.e. task path) do not require large compensating efforts to reach intended outcome.</p>
</li>
<li>
<p>emphasizes monotonicity, smooth steering, and suitable granularity of progress.</p>
</li>
<li>
<p>emphasizes resilience of desired budget over desired schedule rather than minimization of initial budget over initial schedule.</p>
</li>
</ul>
</li>
<li>
<p><a href="https://doi.org/10.1145/359131.359136">Logic + Control</a></p>
<ul>
<li>
<p>facilitates declarative programming</p>
</li>
<li>
<p>facilitates flexibility in choices of performance tradeoffs</p>
</li>
<li>
<p>promotes reuse</p>
</li>
</ul>
</li>
<li>
<p><a href="https://en.wikipedia.org/wiki/Delta_encoding">Delta Encoding</a></p>
<ul>
<li>
<p>facilitates provenance</p>
</li>
<li>
<p>facilitates revision control</p>
</li>
<li>
<p>facilitates pub/sub for selective search and retrieval</p>
</li>
</ul>
</li>
</ol>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/shotgun-semantics/</id><link rel="alternate" href="https://donnywinston.com/posts/shotgun-semantics/"/><title>Shotgun Semantics</title><published>2022-08-01T16:28:46-04:00</published><updated>2022-08-01T16:28:46-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Developers often resort to <em>shotgun parsing</em>: scattering data checks and fallback values in various
places throughout the system’s main logic.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>The habit of scattering parser-like behaviour throughout an application’s code and the resulting
inconsistencies in data handling can often lead not just to annoying complications and bugs, but
also security vulnerabilities.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
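<p>For contrast, a minimal sketch of the usual alternative when reading data: check the input once,
at the boundary, and hand the main logic a value it can trust. The <code>Reading</code> type, the
field names, and both functions are hypothetical examples, not taken from either cited source.</p>
<pre><code class="language-python">from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    sensor_id: str
    celsius: float


def parse_reading(raw: dict) -> Reading:
    """Check the input once, at the boundary, and fail loudly if it is malformed."""
    sensor_id = raw.get("sensor_id")
    if not isinstance(sensor_id, str) or not sensor_id:
        raise ValueError("sensor_id must be a non-empty string")
    try:
        celsius = float(raw["celsius"])
    except (KeyError, TypeError, ValueError) as exc:
        raise ValueError("celsius must be a number") from exc
    return Reading(sensor_id, celsius)


def to_fahrenheit(reading: Reading) -> float:
    """Main logic trusts the Reading type: no scattered checks or fallbacks."""
    return reading.celsius * 9 / 5 + 32


print(to_fahrenheit(parse_reading({"sensor_id": "s1", "celsius": "20"})))
</code></pre>
<p>Malformed input is rejected in exactly one place, so downstream code never has to guess whether
a field is present or what type it has.</p>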
<p>This is about reading data. What about when writing data, when setting the foundations for how it
will ultimately &ldquo;behave&rdquo; and be interpreted? Are you firing shotshells, or are you <a href="https://www.wikidata.org/wiki/Q79037">slinging
webs</a>?</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>&ldquo;Project Cambria: Translate your data with lenses,&rdquo; Oct. 06, 2020.
<a href="https://www.inkandswitch.com/cambria/">https://www.inkandswitch.com/cambria/</a> (accessed Aug. 01, 2022).&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>S. Bratus and M. L. Patterson, &ldquo;Shotgun parsers in the cross-hairs,&rdquo; presented at BruCON 2012. Slides: <a href="http://langsec.org/brucon/ShotgunParsersBruCON.pdf">http://langsec.org/brucon/ShotgunParsersBruCON.pdf</a> (accessed Aug. 01, 2022).&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/podcast-shreyas-cholia/</id><link rel="alternate" href="https://donnywinston.com/posts/podcast-shreyas-cholia/"/><title>Interview with Shreyas Cholia</title><published>2022-07-29T10:23:10-04:00</published><updated>2022-07-29T10:23:10-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>This week on Machine-Centric Science, I interviewed Shreyas Cholia, currently at the Lawrence
Berkeley National Laboratory in Berkeley, California.</p>
<p>Topics we spoke about included: data lifecycles, edge computing for data firehoses, provenance,
standards, broad versus detailed domain vocabularies, scope for common APIs, and identifier
leveling.</p>
<p><a href="https://share.transistor.fm/s/40860a6a">HAVE A LISTEN »</a></p>
<h1 id="quotable-quotes">Quotable Quotes</h1>
<p>&ldquo;Maybe what that really means is that this publication step so to speak just needs to be pushed
further upstream&rdquo;</p>
<p>&ldquo;Maybe it&rsquo;s just conceptualizing the data lifecycle as being not so much a linear thing as much as
it is just a bunch of different steps that could be applied to the data at different stages, and
really any of those steps could happen at any time.&rdquo;</p>
<p>&ldquo;There&rsquo;s a little bit of a disconnect right now&hellip;each domain tends to have a lot of detail that
gets obscured by these high-level specifications&hellip;we&rsquo;re seeing some interesting friction&hellip;things
that evolved from different spaces, it&rsquo;s interesting to see how they&rsquo;re trying to come together
now.&rdquo;</p>
<p>&ldquo;The holy grail is&hellip;everyone can look at everything and everyone can talk to each other&hellip;in this
dataset, that&rsquo;s what this column means and that&rsquo;s what this field means and that&rsquo;s how I can compare
these two things.&rdquo;</p>
<p>&ldquo;There&rsquo;s a lot more to harmonization than just making sure things are in the same unit.&rdquo;</p>
<p>&ldquo;The driving force here is more about machine readability and machine interpretability of the data.&rdquo;</p>
<p>&ldquo;That one&rsquo;s tricky&hellip;it&rsquo;s a little bit of a moving target in terms of where you see scientific value
occurring.&rdquo;</p>
<p>&ldquo;So much of what matters is at the metadata level&hellip;If that&rsquo;s different for different domains, which
it will be, having the &lsquo;one API to rule them all&rsquo; doesn&rsquo;t really make a lot of sense.&rdquo;</p>
<p>&ldquo;At the highest level, DOIs are great&hellip;there are, though, a lot of identifiers that are kind of not
&lsquo;DOI-level&rsquo; identifiers&hellip;more low-level for tracking and provenance&hellip;down to the level of the
individual datum&hellip;a row in a spreadsheet, or a single JSON object.&rdquo;</p>
<p>&ldquo;It&rsquo;s never too late to start thinking about coming together and trying to standardize your
data&hellip;Please also spend a lot of time seeing what&rsquo;s out there and trying to work with existing
standards and trying to be a part of the broader ecosystem rather than doing your own thing.&rdquo;</p>
<h1 id="sharing-is-caring">Sharing is caring!</h1>
<p>If you enjoyed this episode, please consider sharing it with a few friends who might find it useful.
Thanks!</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/method-and-structure/</id><link rel="alternate" href="https://donnywinston.com/posts/method-and-structure/"/><title>Method and Structure</title><published>2022-07-27T10:18:46-04:00</published><updated>2022-07-27T10:18:46-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Arrangements of bits have structure just like arrangements of atoms have structure. Interoperability
is about aligning structure. Processing, properties, performance &ndash; if their characterization can be
repeated, they have information structure.</p>
<center>
<figure>
<img width="60%" src="/img/materials_science_tetrahedron;structure,_processing,_performance,_and_proprerties.svg" alt="The materials science tetrahedron, which illustrates how a material's properties, processing, performance, and structure are interrelated. Source: <https://commons.wikimedia.org/w/index.php?curid=4198116>.">
<figcaption>
The materials science tetrahedron (<a href="https://commons.wikimedia.org/w/index.php?curid=4198116">source</a>).
</figcaption>
</figure>
</center>
<blockquote>
<p>All structure is created by a process. Any process that can be repeated is a method. Every
method—indeed, every process—<em>itself</em> has a structure.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
</blockquote>
<center>
<figure> <img src="/img/method-structure-close.jpg" width="50%" alt="Method & Structure Möbius strip" title="Method & Structure Möbius strip"/>
<figcaption>Method-and-Structure Möbius strip (<a href="https://methodandstructure.com/">source</a>)</figcaption>
</figure>
</center>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p><a href="https://methodandstructure.com/">https://methodandstructure.com/</a>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/podcast-patrick-huck/</id><link rel="alternate" href="https://donnywinston.com/posts/podcast-patrick-huck/"/><title>Interview with Patrick Huck, on implementing FAIR for computed materials data</title><published>2022-07-21T11:01:58-04:00</published><updated>2022-07-21T11:01:58-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>This week on Machine-Centric Science, I interviewed Patrick Huck, currently staff on the Materials
Project at the Lawrence Berkeley National Laboratory in Berkeley, California. We talk about choices
and considerations in implementing FAIR.</p>
<p><a href="https://share.transistor.fm/s/dc5be09e">LISTEN NOW »</a></p>
<p>There are show notes at the link above. Also, I tried to summarize our discussion as a <a href="https://np.petapico.org/RARgDGc4UYSdRmNq-BtZ4_Gd0ZvE08yms-Ew__tGwbolE">draft FAIR
Implementation Profile
(FIP)</a>.</p>
<h1 id="talking-points">Talking Points</h1>
<p>Career paths for people that are scientists AND software engineers.</p>
<p>The U.S. Department of Energy Office of Scientific and Technical Information (OSTI) DOI Service.</p>
<p>What gets a DOI? Granularity of resources.</p>
<p>Partnering with the Novel Materials Discovery (NOMAD) Laboratory for accessing raw data.</p>
<p>Modeling: with Python classes and with OpenAPI.</p>
<p>API Gateway design for authentication and authorization.</p>
<p>Provenance: for calculation workflows and for structure sourcing (credit to submitters!).</p>
<h1 id="quotable-quotes">Quotable Quotes</h1>
<p>&ldquo;I think that&rsquo;s a big topic in science generally. What are the career paths for people that are
software engineers that are also scientists or maybe scientists first and software engineers second,
and have gone that route? It&rsquo;s not like there&rsquo;s H indexes for people like me in terms of
publications.&rdquo;</p>
<p>&ldquo;[OSTI] provides the infrastructure for minting those DOIs and making sure that those links are
always live. We&rsquo;ve become over the years with now, I think 147,000 DOIs, their biggest data client.&rdquo;</p>
<p>&ldquo;We use what&rsquo;s called robocrystallographer, which gets descriptions based on machine learning that
we get based on the information that we calculate about that structure. And then we can take that
description auto generated from our database entries and send it as metadata for the DOIs.&rdquo;</p>
<p>&ldquo;It&rsquo;s kind of transparent without even knowing that there&rsquo;s an API behind it. To the extent that
sometimes people talk about the API and they actually mean the client. I think that&rsquo;s a good thing.
People in our space expect those things to be pretty transparent.&rdquo;</p>
<p>&ldquo;I don&rsquo;t think that guarantees longevity on the scale of glacial times.&rdquo;</p>
<p>&ldquo;There&rsquo;s a lot going on in terms of making data FAIR. It&rsquo;s a little easier for making documents
FAIR, like having PDFs findable. On the data level, it becomes a little bit more complicated. And I
think that we should strive to get as close as possible to get to FAIR, but it might not be
feasible for every domain.&rdquo;</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/my-data-model-is-json/</id><link rel="alternate" href="https://donnywinston.com/posts/my-data-model-is-json/"/><title>"My Data Model Is JSON"</title><published>2022-07-20T20:07:34-04:00</published><updated>2022-07-20T20:07:34-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>&ldquo;My data model is JSON&rdquo;. JSON is not a data model. JSON has no semantics in the context of
information systems; JSON defines neither how data &ldquo;behaves&rdquo; nor how machines can compute with it.</p>
<p>&ldquo;My data is just JSON&rdquo;. Your data is never just JSON; you always impose external semantics.</p>
<p>&ldquo;JSON is easy to understand&rdquo;. What does the field <code>&quot;harrastukset&quot;</code> mean? In an example JSON
document, its value is <code>[&quot;valokuvaus&quot;, &quot;pienoismallit&quot;]</code>. Oh, you don&rsquo;t know Finnish?<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
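<p>A minimal sketch of the point, in Python: parsing gives you structure, not meaning. The
<code>CONTEXT</code> mapping and its <code>https://example.org/vocab/</code> IRI are hypothetical
stand-ins for whatever external semantics you impose.</p>

```python
import json

# The example document from the text: syntactically valid JSON.
doc = json.loads('{"harrastukset": ["valokuvaus", "pienoismallit"]}')

# Parsing yields only structure: a dict whose single key maps to a list of strings.
# Nothing in the JSON itself says what the key or the values denote.
print(type(doc).__name__, list(doc))

# Hypothetical external semantics: map each local key to an IRI that a person
# or a machine could look up. This IRI is made up for illustration.
CONTEXT = {"harrastukset": "https://example.org/vocab/hobby"}


def expand_keys(document, context):
    """Replace known local keys with IRIs; leave unknown keys untouched."""
    return {context.get(key, key): value for key, value in document.items()}


expanded = expand_keys(doc, CONTEXT)
print(expanded)
```

<p>The values stay opaque strings here; giving them meaning would take the same kind of explicit,
external mapping.</p>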
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Ora Lassila, &ldquo;Will knowledge graphs save us from the mess of modern data practice?,&rdquo; Knowledge
Graphs Conference, New York, NY, USA (2022). [Online]. Available:
<a href="https://www.lassila.org/publications/2022/KGC2022-Lassila-keynote.pdf">https://www.lassila.org/publications/2022/KGC2022-Lassila-keynote.pdf</a>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/a-fair-digital-object/</id><link rel="alternate" href="https://donnywinston.com/posts/a-fair-digital-object/"/><title>A FAIR Digital Object - Inching up the Hourglass</title><published>2022-07-19T11:15:09-04:00</published><updated>2022-07-19T11:15:09-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Whether deliberate<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> or inevitable<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, the hourglass architecture of the Internet supports a
great diversity of applications implemented using a great diversity of supporting services:</p>
<figure>
<img src="/img/internet-hourglass.png" width="100%"
     alt="An (incomplete) illustration of the hourglass Internet architecture showing the six layers, from top to bottom: specific applications, application protocols, transport protocols, network protocols, data-link protocols, and physical-layer protocols. A FAIR Digital Object (FDO) protocol could extend the HTTP application protocol."
     title="An (incomplete) illustration of the hourglass Internet architecture showing the six layers, from top to bottom: specific applications, application protocols, transport protocols, network protocols, data-link protocols, and physical-layer protocols. A FAIR Digital Object (FDO) protocol could extend the HTTP application protocol."/>
<figcaption>An (incomplete) illustration of the hourglass Internet architecture showing the six layers, from top to bottom: specific applications, application protocols, transport protocols, network protocols, data-link protocols, and physical-layer protocols. A FAIR Digital Object (FDO) protocol could extend the HTTP application protocol.</figcaption>
</figure>
<p>Could there be a minimal &ldquo;spanning layer&rdquo; protocol for FAIR-principled<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> applications and
services? The <a href="https://fairdo.org/">FAIR Digital Object (FDO)</a> has emerged as a conceptual nexus for
consideration of such a protocol.</p>
<p>There is a working draft online for an FDO framework.<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> In it, an identifier resolves to a
digital object (byte sequence) by default, but one may also request a so-called identifier record.
This record would certainly support &ndash; via a simple qualified reference &ndash; the operation of
accessing the identified object&rsquo;s value-obvious situational information, i.e. the raw byte sequence.
Crucially, the identifier record would also support &ndash; again, via simple qualified references &ndash;
operations to access methodological (still value-obvious to certain consumers) and more
philosophical (epistemic, ontological, axiological &ndash; value typically not obvious) information:</p>
<figure>
<img src="/img/FDOF-IR_resolution.png" width="100%"
     alt="A FAIR Digital Object (FDO) framework - the identifier record, identifier resolution behavior, typing, and metadata schemas and records."
     title="A FAIR Digital Object (FDO) framework - the identifier record, identifier resolution behavior, typing, and metadata schemas and records."/>
<figcaption>A FAIR Digital Object (FDO) framework - the identifier record, identifier resolution behavior, typing, and metadata schemas and records (<a href="https://fairdigitalobjectframework.org/">source</a>). </figcaption>
</figure>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. Beck, &ldquo;On the hourglass model,&rdquo; Commun. ACM, vol. 62, no. 7, pp. 48–57, Jun. 2019, doi:
10/gj3fnj.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>S. Akhshabi and C. Dovrolis, &ldquo;The evolution of layered protocol stacks leads to an
hourglass-shaped architecture,&rdquo; SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. 206–217, Oct.
2011, doi: 10.1145/2043164.2018460.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>M. D. Wilkinson et al., &ldquo;The FAIR Guiding Principles for scientific data management and
stewardship,&rdquo; Sci Data, vol. 3, no. 1, p. 160018, Mar. 2016, doi: 10/bdd4.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>L. O. Bonino da Silva Santos, &ldquo;FAIR Digital Object Framework Documentation,&rdquo; Nov. 03, 2021.
<a href="https://fairdigitalobjectframework.org/">https://fairdigitalobjectframework.org/</a> (accessed Jul. 19, 2022).&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/validation-syntax-semantics-pragmatics/</id><link rel="alternate" href="https://donnywinston.com/posts/validation-syntax-semantics-pragmatics/"/><title>Validation: Syntax, Semantics, and Pragmatics</title><published>2022-07-18T10:24:18-04:00</published><updated>2022-07-18T10:24:18-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Validation is about preconditions for operation. It may be useful to separate preconditions into
three subtypes: syntax, semantics, and pragmatics.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p><em>Syntax</em>: Rules about what&rsquo;s grammatically well-formed. Example: A <code>CalculateAqueousStability</code>
command may have a set of atomic-composition pairs and a set of ion-concentration pairs. An
atomic-composition pair is a string paired with a number between 0 and 1. An ion-concentration pair
is a string paired with a number.</p>
<p><em>Semantics</em>: Rules about what may be syntactically valid but is nonetheless nonsense. Example: A
<code>CalculateAqueousStability</code> command may be syntactically valid, but its compositions don&rsquo;t add up
to 1, the ion concentrations are physically implausible, etc.</p>
<p><em>Pragmatics</em>: Rules about contextual appropriateness for processing a syntactically and semantically
valid message. Example: an online system can&rsquo;t efficiently calculate stability for a system of more
than 4 atomic elements on-the-fly, so this kind of command is rejected.</p>
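<p>Here is a rough sketch of the three levels as separate checks, run in order. The class fields,
function names, the tolerance, and the &ldquo;no negative concentrations&rdquo; rule are my assumptions
for illustration, not the actual Materials Project implementation.</p>

```python
from dataclasses import dataclass

MAX_ELEMENTS_ONLINE = 4  # assumed limit, taken from the pragmatics example


@dataclass
class CalculateAqueousStability:
    composition: dict  # element symbol -> atomic fraction
    ion_concentrations: dict  # ion label -> concentration


def _is_number(value):
    # bool is a subclass of int in Python, so rule it out explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_syntax(cmd):
    """Is the message well-formed? Strings paired with numbers, fractions in [0, 1]."""
    errors = []
    for element, fraction in cmd.composition.items():
        if not (isinstance(element, str) and _is_number(fraction)):
            errors.append(f"composition pair is not (string, number): {element!r}")
        elif not 0 <= fraction <= 1:
            errors.append(f"fraction for {element!r} is outside [0, 1]")
    for ion, concentration in cmd.ion_concentrations.items():
        if not (isinstance(ion, str) and _is_number(concentration)):
            errors.append(f"ion pair is not (string, number): {ion!r}")
    return errors


def check_semantics(cmd):
    """Is a well-formed message nonetheless nonsense?"""
    errors = []
    total = sum(cmd.composition.values())
    if abs(total - 1.0) > 1e-6:
        errors.append(f"composition sums to {total}, not 1")
    if any(c < 0 for c in cmd.ion_concentrations.values()):
        errors.append("negative ion concentration")
    return errors


def check_pragmatics(cmd):
    """Is it appropriate for this system to process a valid message?"""
    if len(cmd.composition) > MAX_ELEMENTS_ONLINE:
        return [f"more than {MAX_ELEMENTS_ONLINE} elements; not computed on-the-fly"]
    return []


def validate(cmd):
    """Run the levels in order and report the first level that fails."""
    levels = (
        ("syntax", check_syntax),
        ("semantics", check_semantics),
        ("pragmatics", check_pragmatics),
    )
    for level, check in levels:
        errors = check(cmd)
        if errors:
            return level, errors
    return None, []


print(validate(CalculateAqueousStability({"Fe": 0.5, "O": 0.4}, {"Fe": 1e-6})))
```

<p>Ordering matters: the semantic check can safely sum the fractions only because the syntactic
check has already established that they are numbers.</p>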
<figure> <img src="/img/pourbaix-app-command-example.png" width="100%" alt="Calculate Aqueous Stability" title="Calculate Aqueous Stability"/>
<figcaption>Calculate Aqueous Stability <a href="https://materialsproject.org/pourbaix">on materialsproject.org</a>.</figcaption>
</figure>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>H. J. W. Percival and R. G. Gregory, <em>Architecture patterns with Python: enabling test-driven
development, domain-driven design, and event-driven microservices</em>, First edition, pp. 255–264.
O’Reilly, 2020.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/high-precision-content-classification-using-hierarchy/</id><link rel="alternate" href="https://donnywinston.com/posts/high-precision-content-classification-using-hierarchy/"/><title>High-Precision Content Classification Using Hierarchy</title><published>2022-07-15T08:58:09-04:00</published><updated>2022-07-15T08:58:09-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Content classification is the most fundamental form of holistic content understanding. It helps make
your resources findable (<a href="https://w3id.org/fair/principles/terms/F2">F2</a>) and connects them to other
resources (<a href="https://w3id.org/fair/principles/terms/I3">I3</a>).</p>
<p>Content understanding represents each piece of content in the index. Relevance of content is a
function of query and content understanding. Query understanding represents each search query as a
search intent.</p>
<p>Classification maps a document to one or more predefined categories. We can do so using hand-tuned
rules or machine learning.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> The categories can be a flat list, or they can be arranged in a
hierarchical (single-hierarchy or faceted) taxonomy<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>.</p>
<p>If the categories are hierarchical and broadly applicable
(<a href="https://w3id.org/fair/principles/terms/I1">I1</a>), then a classifier might take advantage of the
hierarchy and more confidently map content to a non-leaf category (e.g., mapping a material to
“Semiconductor” rather than “High-Gap Semiconductor” or “III-V Semiconductor”). In general, it’s
best to map value objects and entities to leaf categories.</p>
<p>Reducing the number of labels substantially improves the precision of a classifier. But filtering
out infrequent labels decreases coverage, and it’s not clear that out-of-scope examples will be
recognized in production (<a href="https://w3id.org/fair/principles/terms/F4">F4</a>). A more robust approach is
to leverage the hierarchical nature of a taxonomy and roll up infrequently used labels to their
parent or other ancestor categories.</p>
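<p>One way the roll-up could look, as a sketch: count labeled examples per category including
descendants, then map each label to its nearest ancestor-or-self that has enough examples. The
<code>PARENT</code> taxonomy, the <code>min_count</code> threshold, and the function names are
hypothetical.</p>

```python
from collections import Counter

# Hypothetical taxonomy as child -> parent; the root has parent None.
PARENT = {
    "Material": None,
    "Semiconductor": "Material",
    "High-Gap Semiconductor": "Semiconductor",
    "III-V Semiconductor": "Semiconductor",
    "Metal": "Material",
}


def ancestors_or_self(label, parent):
    """Yield the label, then its parent, grandparent, and so on up to the root."""
    while label is not None:
        yield label
        label = parent[label]


def roll_up(labels, parent, min_count):
    """Map each observed label to its nearest ancestor-or-self with enough examples."""
    # Each example counts toward its own category and every ancestor category.
    subtree_counts = Counter()
    for label in labels:
        subtree_counts.update(ancestors_or_self(label, parent))

    mapping = {}
    for label in set(labels):
        chain = list(ancestors_or_self(label, parent))
        mapping[label] = next(
            (node for node in chain if subtree_counts[node] >= min_count),
            chain[-1],  # fall back to the root if nothing qualifies
        )
    return mapping


training_labels = (
    ["High-Gap Semiconductor"] * 5 + ["III-V Semiconductor"] * 1 + ["Metal"] * 3
)
mapping = roll_up(training_labels, PARENT, min_count=3)
print(mapping)
```

<p>No example is discarded: the infrequent label is relabeled with a broader category instead of
being filtered out, so coverage is kept while the label set shrinks.</p>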
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>G. Ingersoll and D. Tunkelang, “Course Notes for ‘Search with Machine Learning.’” Corise
Education, Jun. 20, 2022. [Online]. Available:
<a href="https://corise.com/course/search-with-machine-learning/">https://corise.com/course/search-with-machine-learning/</a>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>D. Tunkelang, “Taxonomies and Ontologies,” Medium, Aug. 30, 2017.
<a href="https://queryunderstanding.com/taxonomies-and-ontologies-8e4812a79cb2">https://queryunderstanding.com/taxonomies-and-ontologies-8e4812a79cb2</a> (accessed Jul. 15, 2022).&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/taxonomy-pruning-for-query-classification/</id><link rel="alternate" href="https://donnywinston.com/posts/taxonomy-pruning-for-query-classification/"/><title>Taxonomy Pruning for Query Classification</title><published>2022-07-14T13:39:05-04:00</published><updated>2022-07-14T13:39:05-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>When providing a search interface (<a href="https://w3id.org/fair/principles/terms/F4">F4</a>), you can improve
<a href="https://en.wikipedia.org/wiki/Precision_and_recall#Precision">precision</a> significantly by
classifying a user&rsquo;s query, assuming you are able to classify your content.</p>
<p>If you have a category taxonomy and labeled queries, you can train a classifier in order to
dynamically assign a category to a query. A benefit of taxonomic hierarchy is that, while a labeled
query may be labeled with a leaf node of the taxonomy, you can prune, i.e. &ldquo;roll up&rdquo;, the taxonomy
to ensure sufficient signal for training. This helps to maintain
<a href="https://en.wikipedia.org/wiki/Precision_and_recall#Recall">recall</a> when filtering query results by
the query&rsquo;s classification.</p>
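<p>To make the recall point concrete, here is a small sketch of filtering by a query&rsquo;s
(possibly rolled-up) category: a result is kept if its own category is that category or any
descendant of it. The taxonomy, the document labels, and the function names are all hypothetical.</p>

```python
# Hypothetical taxonomy as child -> parent; the root has parent None.
PARENT = {
    "Material": None,
    "Semiconductor": "Material",
    "High-Gap Semiconductor": "Semiconductor",
    "III-V Semiconductor": "Semiconductor",
    "Metal": "Material",
}

# Hypothetical index: document id -> the (leaf) category assigned to that document.
DOC_CATEGORY = {
    "SiC": "High-Gap Semiconductor",
    "GaAs": "III-V Semiconductor",
    "Cu": "Metal",
}


def is_within(category, ancestor, parent):
    """True if `category` equals `ancestor` or descends from it."""
    while category is not None:
        if category == ancestor:
            return True
        category = parent[category]
    return False


def filter_by_query_category(doc_ids, query_category, doc_category, parent):
    """Keep the documents whose category falls under the query's category."""
    return [
        doc_id
        for doc_id in doc_ids
        if is_within(doc_category[doc_id], query_category, parent)
    ]


results = ["SiC", "GaAs", "Cu"]
print(filter_by_query_category(results, "Semiconductor", DOC_CATEGORY, PARENT))
print(filter_by_query_category(results, "III-V Semiconductor", DOC_CATEGORY, PARENT))
```

<p>A query classified only as far as the rolled-up category still matches content labeled at the
leaves beneath it, whereas an over-specific (and possibly wrong) leaf prediction would filter that
content out.</p>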
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/an-objective-function-for-refactoring/</id><link rel="alternate" href="https://donnywinston.com/posts/an-objective-function-for-refactoring/"/><title>An Objective Function for Code Refactoring</title><published>2022-07-08T08:29:31-04:00</published><updated>2022-07-08T08:29:31-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Have you ever set an objective function for code refactoring, where, for every proposed total change
(e.g. reviewable pull request), you seek to maximize the change in this function? An example:<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>\[ \log_2(pct_{LOC\_tested}) * pct_{importables\_documented} * pct_{LOC\_nostate} \over n_{LOC} \]</p>
<p>Good (numerator stuff):</p>
<ul>
<li>Percent Lines of Code (LOC) covered by a test. Sublinear growth here, i.e. diminishing returns on &ldquo;getting to 100%&rdquo;. An off-the-shelf tool like <a href="https://coverage.readthedocs.io/">Coverage.py</a> will be fine here.</li>
<li>Percent &ldquo;public&rdquo; units, i.e. non-underscored module importables - functions, classes, constants, variables/objects, covered by <a href="https://diataxis.fr/">tutorial/how-to-guide/explanation (so excluding reference) documentation</a>. Maximizes code consumers&rsquo; ability to understand functionality without having to dive into the codebase. Prior art on measuring this <a href="https://simonwillison.net/2018/Jul/28/documentation-unit-tests/">here</a>.</li>
<li>Percent LOC unaffected by state (i.e. avoiding getting values from or calling methods on long-lived references). Pure-functional code is easier to reason about (e.g. via a simple <a href="https://mitpress.mit.edu/sites/default/files/sicp/full-text/sicp/book/node10.html">substitution model of execution</a>) and thus more maintainable. My strategy for measuring this would be to designate certain (sub)packages/modules as purely functional.</li>
</ul>
<p>Bad (denominator stuff):</p>
<ul>
<li>total LOC</li>
</ul>
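<p>As a sketch, the function is a one-liner to compute once the four inputs are measured. I am
assuming here that the three percentages are expressed on a 0&ndash;100 scale (so the logarithm is
positive above 1%); the function and parameter names are illustrative, and the sample numbers are
made up.</p>

```python
import math


def refactoring_objective(
    pct_loc_tested, pct_importables_documented, pct_loc_nostate, n_loc
):
    """Evaluate the objective: log2(tested) * documented * stateless / total LOC.

    Percentages are assumed to be on a 0-100 scale; n_loc is a positive line count.
    """
    if pct_loc_tested <= 0:
        raise ValueError("pct_loc_tested must be positive for the logarithm")
    if n_loc <= 0:
        raise ValueError("n_loc must be positive")
    numerator = (
        math.log2(pct_loc_tested) * pct_importables_documented * pct_loc_nostate
    )
    return numerator / n_loc


# Made-up measurements before and after a proposed pull request.
before = refactoring_objective(50, 20, 30, 10_000)
after = refactoring_objective(60, 25, 30, 9_500)
print(f"change in objective: {after - before:+.4f}")
```

<p>Comparing <code>after - before</code> across candidate pull requests is the &ldquo;maximize the
change&rdquo; step; the logarithm is what gives test coverage its diminishing returns.</p>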
<p>What metrics correlate with code-refactoring success in your experience? These? Others?</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>If the equation doesn&rsquo;t render for you: <img title="refactoring-objective-function" alt="refactoring-objective-function.png" width="80%" src="/img/refactoring-objective-function.png"/>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/complexity-is-carbon/</id><link rel="alternate" href="https://donnywinston.com/posts/complexity-is-carbon/"/><title>Complexity Is Carbon</title><published>2022-07-06T11:37:32-04:00</published><updated>2022-07-06T11:37:32-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Some energy infrastructure emits carbon. Some data infrastructure emits complexity.</p>
<p>There is essential carbon emission, like humans exhaling CO<sub>2</sub>. And there is incidental,
non-essential carbon emission, like humans burning fossil fuels.</p>
<p>There is essential complexity in data (and software code), like that pertaining to modeling your
subject matter and your application domain. And there is incidental complexity &ndash; &ldquo;incidental is
Latin for <em>your fault</em>.&rdquo;<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>How can we eliminate incidental carbon emissions from energy infrastructure? Electrify everything.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<p>How can we eliminate incidental complexity emissions from data infrastructure? Triplify everything.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Rich Hickey, &ldquo;Simple Made Easy&rdquo;, Strange Loop conference (2011). (<a href="https://github.com/matthiasn/talk-transcripts/blob/9f33e07ac392106bccc6206d5d69efe3380c306a/Hickey_Rich/SimpleMadeEasy.md">transcript</a>).&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>S. Griffith, <em>Electrify: an optimist’s playbook for our clean energy future</em>. Cambridge, Massachusetts: The MIT Press, 2021.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>G. Schreiber and Y. Raimond, &ldquo;RDF 1.1 Primer.&rdquo; World Wide Web Consortium (W3C), Jun. 24, 2014. [Online]. Available: <a href="http://www.w3.org/TR/rdf11-primer/">http://www.w3.org/TR/rdf11-primer/</a>&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/these-are-all-just-persistent-urls-no/</id><link rel="alternate" href="https://donnywinston.com/posts/these-are-all-just-persistent-urls-no/"/><title>These Are All Just Persistent URLs, No?</title><published>2022-07-05T09:06:03-04:00</published><updated>2022-07-05T09:06:03-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<!-- KaTeX block: \\[ x \rightarrow y \\] -->
<!-- KaTeX inline: \\( x \rightarrow y \\) -->
<p>I am beginning to walk through each <a href="https://w3id.org/fair/fip/terms/FIP-Question">question</a> of the
<a href="https://w3id.org/fair/fip/terms/FIP-Ontology">FAIR Implementation Profile (FIP) Ontology</a>. My goal
is to construct and share a populated model of people&rsquo;s articulations &ndash; aka
<a href="https://w3id.org/fair/fip/terms/FIP-Declaration">declarations</a> &ndash; of choices they&rsquo;ve made or
<a href="https://w3id.org/fair/fip/terms/FIP-No-Choice-Declaration">challenges</a> they face with regard to
addressing each question, as well as the
<a href="https://w3id.org/fair/fip/terms/considerations">considerations</a> they associate with any such choice
or challenge.</p>
<p>The first question for which <a href="https://donnywinston.com/posts/fip-question-f1/">I&rsquo;m seeking
declarations</a> is
<a href="https://w3id.org/fair/fip/terms/FIP-Question-F1-D">F1-D</a>:</p>
<blockquote>
<p>What globally unique, persistent, resolvable identifiers do you use for datasets?</p>
</blockquote>
<p>I&rsquo;ve gotten some great responses so far, mostly
about people choosing to use the <a href="https://donnywinston.com/posts/the-handle-system-of-persistent-identifiers/">Handle (incl.
DOI)</a> or
<a href="https://donnywinston.com/posts/the-ark-system-of-pids/">ARK</a> systems.</p>
<p>I got a great question from my former group-mate <a href="https://www.linkedin.com/in/shyam-dwaraknath/">Shyam
Dwaraknath</a>:</p>
<blockquote>
<p>In the end these are all just persistent URLs no?</p>
</blockquote>
<p>For all intents and purposes, <em>yes</em>. Practically, if you don&rsquo;t give someone a resolving<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> HTTP(S)
URL, such that they can Locate and retrieve the Resource given a Uniform Identifier (i.e.,
URI\(\implies\)URL), they should be able to straightforwardly construct one.</p>
<p>Handles and ARKs use their compact forms to communicate</p>
<ol>
<li>
<p>an intention of persistence, and (related to this)</p>
</li>
<li>
<p>a URL-construction protocol in case they are</p>
<p>(a) not communicated as URLs, or</p>
<p>(b) communicated as URLs that don&rsquo;t resolve.</p>
</li>
</ol>
<p>If you see e.g. <code>10.1038/sdata.2016.18</code> somewhere, the hope is you will grok that
the <code>\d+(\.\d+)+/.+</code> pattern (period-delimited numbers, then a <code>/</code>, then stuff) is likely a Handle,
so you will try putting <a href="https://doi.org/">https://doi.org/</a> or <a href="https://hdl.handle.net/">https://hdl.handle.net/</a> before it. Either there need to
be well-known public Handle HTTP proxy servers, or you need to search around for &ldquo;Handle proxy server&rdquo;.
You&rsquo;ll also see <code>doi:10.1038/sdata.2016.18</code> sometimes. Same principle. The hope is you know how to
URLify it trivially.</p>
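<p>A minimal sketch of that URLification, assuming the two public proxies named above. The regular
expression is only a loose reading of &ldquo;period-delimited numbers, then a slash, then
stuff&rdquo;, not the Handle System&rsquo;s formal syntax, and <code>urlify</code> is a hypothetical
helper name.</p>

```python
import re

# Loose heuristic: period-delimited numbers, a slash, then anything.
HANDLE_LIKE = re.compile(r"^\d+(\.\d+)+/.+$")

# The well-known public proxies named in the text.
DOI_PROXY = "https://doi.org/"
HANDLE_PROXY = "https://hdl.handle.net/"


def urlify(identifier):
    """Return a candidate resolving URL for a DOI- or Handle-like string, else None."""
    text = identifier.strip()
    lowered = text.lower()
    if lowered.startswith(("http://", "https://")):
        return text  # already communicated as a URL
    if lowered.startswith("doi:"):
        return DOI_PROXY + text[len("doi:"):]
    if HANDLE_LIKE.match(text):
        # DOIs are Handles whose prefix starts with "10."; use the generic proxy otherwise.
        proxy = DOI_PROXY if text.startswith("10.") else HANDLE_PROXY
        return proxy + text
    return None


for candidate in ["10.1038/sdata.2016.18", "doi:10.1038/sdata.2016.18", "hello"]:
    print(candidate, "->", urlify(candidate))
```

<p>A string that matches the pattern is only a guess at a Handle; the real test is whether the
constructed URL resolves.</p>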
<p>The form of an ARK is similar in intent. The hope is that if you see e.g. <code>ark:57802/dw0/agu/6045</code>
somewhere (for ARKs, the <code>ark:</code> prefix is part of the ID form, even in URL paths), you&rsquo;ll think
&ldquo;this ID is intended to be persistent &ndash; an archival resource key&rdquo; and &ldquo;I hope some name mapping
authority (NMA) is publicly resolving <code>ark:57802</code> IDs&rdquo;. The well-known public ARK HTTP Proxy is
<a href="https://n2t.net">https://n2t.net</a>, and e.g. <a href="https://n2t.net/ark:57802/dw0/agu/6045">https://n2t.net/ark:57802/dw0/agu/6045</a> passes through to
<a href="https://ns.polyneme.xyz/ark:57802/dw0/agu/6045">https://ns.polyneme.xyz/ark:57802/dw0/agu/6045</a> because <code>https://ns.polyneme.xyz</code> is registered
there as the NMA for the name assigning authority (NAA) <code>ark:57802</code>.</p>
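<p>Because the <code>ark:</code> label travels with the ID even inside URL paths, an ARK can be
recovered from a URL whose host has stopped resolving and retried at N2T. A rough sketch, with a
deliberately loose pattern and hypothetical helper names:</p>

```python
import re

# "ark:", an optional "/" (the older form), an authority number, a slash, then the name.
# This is a loose illustrative pattern, not the full ARK syntax.
ARK = re.compile(r"ark:/?([^/\s]+)/(\S+)")

N2T = "https://n2t.net/"


def find_ark(text):
    """Return (authority_number, name) for the first ARK found in `text`, or None."""
    match = ARK.search(text)
    return match.groups() if match else None


def n2t_url(text):
    """Build an N2T URL for an ARK found in a bare ID or embedded in another URL."""
    parts = find_ark(text)
    if parts is None:
        return None
    authority_number, name = parts
    return f"{N2T}ark:{authority_number}/{name}"


print(n2t_url("ark:57802/dw0/agu/6045"))
print(n2t_url("https://ns.polyneme.xyz/ark:57802/dw0/agu/6045"))
```

<p>Both calls yield the same N2T URL, which is the point: the resolver host is swappable, while
the <code>ark:</code> part is the identifier.</p>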
<p>Other persistent ID systems that imply/offer HTTP URLs have tighter coupling to the DNS domain
responsible for resolving the IDs. Some of these systems are intended for general use, such as
<a href="https://purl.org/">https://purl.org/</a> and <a href="https://w3id.org/">https://w3id.org/</a>.</p>
<p>In these systems, prefixes are not <em>allocated</em> like with Handles or ARKs, and there is no emphasis
on prefixes being semantically opaque so as to increase the likelihood of continued commitment to
persistence if/when stewarding organizations change names. Rather, prefixes are <em>claimed</em>, like
<a href="http://purl.org/dc">http://purl.org/dc</a> (serving e.g. <a href="http://purl.org/dc/terms">http://purl.org/dc/terms</a>) and <a href="https://purl.org/dw">https://purl.org/dw</a> (serving
e.g. <a href="https://purl.org/dw/squirrel">https://purl.org/dw/squirrel</a>), or <a href="https://w3id.org/nmdc">https://w3id.org/nmdc</a> (where currently, all path
extensions, e.g. <a href="https://w3id.org/nmdc/nmdc-schema">https://w3id.org/nmdc/nmdc-schema</a>, resolve to the same page).</p>
<p>Other DNS-coupled systems are socially positioned as providing specific types of persistent
identifiers. Such systems include the World Wide Web Consortium (W3C) <a href="https://w3.org/">https://w3.org/</a> namespace
for standards (e.g. <a href="https://w3.org/ns/dcat">https://w3.org/ns/dcat</a>), the Open Researcher and Contributor ID (ORCID)
<a href="https://orcid.org/">https://orcid.org/</a> (e.g. <a href="https://orcid.org/0000-0002-8424-0604">https://orcid.org/0000-0002-8424-0604</a>), the International Generic
Sample Number (IGSN) <a href="https://igsn.org">https://igsn.org</a> (e.g. <a href="https://igsn.org/IEWFS0001">https://igsn.org/IEWFS0001</a>), and the Research
Organization Registry (ROR) <a href="https://ror.org/">https://ror.org/</a> (e.g. <a href="https://ror.org/02jbv0t02">https://ror.org/02jbv0t02</a>).</p>
<p>If/when any such special-purpose, domain-name-tied system cannot fulfill persistence, it is hoped
that there will be (a) an adopter organization and (b) sufficient signage (e.g. minimal maintenance
of the old domain as a static notice) to enable programmatic workarounds, like the case of the
Global Researcher Identifier Database (GRID) <a href="https://grid.ac/">https://grid.ac/</a> being passed to ROR for stewardship.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Any HTTP URL is technically resolv<em>able</em>. Whether it <em>actually</em> resolves in response to an
HTTP request is a matter of service.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/the-ark-system-of-pids/</id><link rel="alternate" href="https://donnywinston.com/posts/the-ark-system-of-pids/"/><title>The ARK System of Persistent Identifiers (PIDs)</title><published>2022-07-01T16:00:01-04:00</published><updated>2022-07-01T16:00:01-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>The <a href="https://arks.org/">Archival Resource Key (ARK) system</a> is an alternative to the <a href="https://donnywinston.com/posts/the-handle-system-of-persistent-identifiers/">Handle
system</a> to satisfy
<a href="https://w3id.org/fair/principles/terms/F1">FAIR&rsquo;s F1 Principle</a>.</p>
<p>Similar to the Handle system, naming authority for ARKs is distributed by allotting prefixes.
However, there is no &ldquo;pre-prefix&rdquo; administration via a small number of credentialed multi-primary
administrators, and there is currently no fee per allotted prefix, called a Name Assigning Authority
Number (NAAN).</p>
<p>Another difference between the Handle and ARK systems is in distinguishing between a name assigning
authority (NAA), i.e. identifier minting, and a name mapping authority (NMA), i.e. identifier
resolution. With the Handle system, NAA and NMA functions are administered by the same organization.
With the ARK system, an NAA may be its own NMA, may migrate from one NMA to another, or may have
multiple NMA service providers.</p>
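<p>One way to picture the NAA/NMA split is as a small data structure: one minting authority and a replaceable list of resolvers. This is only a sketch. The class and field names are made up for illustration, and the NAAN and resolver URLs are borrowed from examples elsewhere on this site.</p>

```python
from dataclasses import dataclass, field


@dataclass
class NameAssigningAuthority:
    """Hypothetical model: an NAA (identified by its NAAN) and its current NMAs."""

    naan: str
    nma_base_urls: list = field(default_factory=list)

    def candidate_urls(self, name):
        """Return one resolvable URL per NMA for the given ARK name."""
        return [
            f"{base.rstrip('/')}/ark:{self.naan}/{name}"
            for base in self.nma_base_urls
        ]


naa = NameAssigningAuthority("57802", ["https://n2t.net", "https://ns.polyneme.xyz"])
print(naa.candidate_urls("dw0/agu/6045"))

# Migrating to a different NMA changes where IDs resolve, not the IDs themselves.
naa.nma_base_urls = ["https://n2t.net"]
print(naa.candidate_urls("dw0/agu/6045"))
```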
<p>For more on ARKs, see my post on <a href="https://donnywinston.com/posts/object-persistence-a-matter-of-service/">Object Persistence: A Matter of
Service</a>, the most recent
<a href="https://datatracker.ietf.org/doc/draft-kunze-ark/34/">specification</a>, and the <a href="https://arks.org/">ARK
Alliance</a> website.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/the-handle-system-of-persistent-identifiers/</id><link rel="alternate" href="https://donnywinston.com/posts/the-handle-system-of-persistent-identifiers/"/><title>The Handle System of Persistent Identifiers</title><published>2022-06-30T08:52:40-04:00</published><updated>2022-06-30T08:52:40-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>The Handle system is a popular choice for the assignment and resolution of globally unique,
persistent identifiers. Governance is centralized with the <a href="https://www.dona.net/">DONA Foundation</a>,
and administration is distributed among so-called <a href="https://www.dona.net/mpas">Credentialed Multi-Primary Administrators
(MPAs)</a>, of which there are currently nine. You&rsquo;ve likely heard of at
least one MPA: the <a href="https://www.doi.org/">International DOI Foundation</a>.</p>
<p>Each MPA is assigned a number. The DOI Foundation has <code>10</code>. This is why all DOIs begin with <code>10.</code>.
Each MPA can in turn give a &ldquo;complete&rdquo; prefix (everything before the <code>/</code>) to a so-called &ldquo;naming
authority&rdquo;. The DOI Foundation<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> gave the <a href="https://www.nature.com/">Nature Publishing Group</a> (now
<a href="https://www.springernature.com/">Springer Nature</a>) <code>10.1038</code>, for example, who in turn can create
as many local names as they&rsquo;d like, such as <code>10.1038/sdata.2016.18</code>.</p>
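<p>To make the anatomy concrete, here is a small Python sketch that splits a handle at its first <code>/</code>. The function name <code>parse_handle</code> and the dictionary keys are hypothetical, chosen only for this illustration.</p>

```python
def parse_handle(handle):
    """Split a handle into its prefix, the prefix's leading number, and the local name."""
    prefix, sep, local_name = handle.partition("/")
    if not sep or not prefix or not local_name:
        raise ValueError("expected PREFIX/LOCAL-NAME: " + handle)
    return {
        "prefix": prefix,                        # everything before the first "/"
        "leading_number": prefix.split(".")[0],  # e.g. "10" for DOIs
        "local_name": local_name,                # chosen by the naming authority
        "is_doi": prefix.startswith("10."),
    }


print(parse_handle("10.1038/sdata.2016.18"))
```

<p>Only the first <code>/</code> is structural. Anything after it, including further slashes, belongs to the local name.</p>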
<p>How do handles get resolved? Each handle prefix may have its own administrator, and administration
of handles is distributed, similar to the Domain Name System (DNS). The Handle system is compatible
with DNS, but does not require it. In practice, there are known public HTTP proxy servers such as
<a href="https://hdl.handle.net/">https://hdl.handle.net/</a> and <a href="https://doi.org/">https://doi.org/</a> that allow resolution of handles as URLs. Hence,
<a href="https://doi.org/10.1038/sdata.2016.18">https://doi.org/10.1038/sdata.2016.18</a> is resolvable.</p>
<p>Another big MPA is the <a href="https://cnri.net/">Corporation for National Research Initiatives (CNRI)</a>.
CNRI governed the Handle system before passing it off to the formed-for-this-purpose DONA Foundation
in 2015. Before this, CNRI assigned MPA-esque numbers to a bunch of organizations, and these
continue to be administered by the CNRI-as-MPA, even though its assigned number is <code>20</code> now. For
example, CNRI assigned <code>1721.1</code> to <a href="https://web.mit.edu/">MIT</a>, which is used for its
<a href="https://dspace.mit.edu/">DSpace</a> repository. My PhD thesis was assigned <code>1721.1/71495</code>. So,
<a href="https://hdl.handle.net/1721.1/71495">https://hdl.handle.net/1721.1/71495</a> and  <a href="https://doi.org/1721.1/71495">https://doi.org/1721.1/71495</a><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> (and
<a href="https://dspace.mit.edu/handle/1721.1/71495">https://dspace.mit.edu/handle/1721.1/71495</a>) all get you to it.</p>
<p>You can inspect Handle prefix records, which are analogous to DNS records, via
<a href="https://hdl.handle.net/">https://hdl.handle.net/</a>. For example, <a href="https://hdl.handle.net/1721.1">https://hdl.handle.net/1721.1</a> lets you know that this
prefix is administered by MIT DSpace via the CNRI MPA (see the <code>/20.ADMIN</code>-containing <code>HS_ADMIN</code>
entry).</p>
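<p>For programmatic inspection, the hdl.handle.net proxy also exposes a JSON REST interface under <code>/api/handles/</code>. The sketch below rests on two assumptions that are not stated in this post: that interface's response layout (a <code>values</code> list whose items carry <code>index</code> and <code>type</code>), and the convention that prefix records live under <code>0.NA/</code>. The function names are hypothetical, and nothing is fetched unless you call <code>fetch_handle_values</code> yourself.</p>

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed REST endpoint of the public Handle proxy.
API_BASE = "https://hdl.handle.net/api/handles/"


def handle_api_url(handle):
    """Build the JSON API URL for a handle such as '1721.1/71495'."""
    return API_BASE + quote(handle, safe="/")


def fetch_handle_values(handle, timeout=10):
    """Fetch a handle record and return (index, type) pairs. Needs network access."""
    with urlopen(handle_api_url(handle), timeout=timeout) as response:
        record = json.load(response)
    return [(value.get("index"), value.get("type")) for value in record.get("values", [])]


print(handle_api_url("1721.1/71495"))
# prints https://hdl.handle.net/api/handles/1721.1/71495
print(handle_api_url("0.NA/1721.1"))  # assumed location of the prefix record
# prints https://hdl.handle.net/api/handles/0.NA/1721.1

# Example call (not run here because it needs the network):
# fetch_handle_values("0.NA/1721.1")
```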
<p>So how do you start minting and resolving Handles?</p>
<p>Become a credentialed MPA? I don&rsquo;t know, that seems hard for an individual researcher. There are
only nine <a href="https://www.dona.net/mpas">credentialed by DONA</a>.</p>
<p>Request a complete prefix from an existing MPA, e.g. something that matches <code>10.\d+</code> from the DOI
Foundation? Yes, you can do that. MPAs typically charge registration and annual service fees per
allotted prefix (i.e., the whole <code>.</code>-delimited number before the <code>/</code> in a handle). In the case of
the DOI Foundation, they delegate to e.g. <a href="https://www.crossref.org/">Crossref</a> to assign <code>10.</code>
prefixes. In this case, for additional <a href="https://www.crossref.org/fees/">fees</a>, Crossref will resolve
identifiers for you (beyond assigning you a prefix to mint as many as you&rsquo;d like).</p>
<p>A final method is to find a service provider that has a complete prefix and will let you mint
handles under their prefix, or will mint them for you. This is the most typical route for
researchers. For example, <a href="https://zenodo.org">Zenodo</a> got <code>10.5281</code> from
<a href="https://datacite.org/">DataCite</a> (another <code>10.\d+</code> service provider the DOI Foundation delegates
to), and they&rsquo;ll give you a full handle when you upload stuff to <a href="https://zenodo.org">https://zenodo.org</a>.
<a href="https://www.researchequals.com">ResearchEquals</a> got <code>10.53962</code> from Crossref, and they&rsquo;ll give you
one for anything you put on <a href="https://www.researchequals.com/">https://www.researchequals.com/</a>. And of course, journal publishers
typically give you one when you publish an article with them.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Actually, one of its <a href="https://www.doi.org/registration_agencies.html">registration agencies
(RAs)</a>, <a href="https://www.crossref.org/">Crossref</a>. The
DOI Foundation doesn&rsquo;t give out prefixes directly. Individuals request prefixes from RAs, not from
the DOI Foundation. Thank you <a href="https://twitter.com/epentz">Ed Pentz</a> for clarifying this. [footnote
added 2022-07-01]&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Wait, what? It&rsquo;s a DOI? Nope. DOIs are Handles that start with <code>10.</code>. <a href="https://doi.org/">https://doi.org/</a> is
(currently) a public HTTP proxy server that resolves all Handles, regardless of prefix.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fip-question-f1/</id><link rel="alternate" href="https://donnywinston.com/posts/fip-question-f1/"/><title>What globally unique, persistent, resolvable identifiers do you use for datasets?</title><published>2022-06-29T10:25:41-04:00</published><updated>2022-06-29T10:25:41-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>What globally unique, persistent, resolvable identifiers do you use for datasets?
I want to know about either (a) a challenge you&rsquo;re facing, and what you&rsquo;ve tried; or (b) a choice you recently made, and how it&rsquo;s going.</p>
<p>Context: For each question of the <a href="https://w3id.org/fair/fip/terms/FIP-Ontology">FAIR Implementation Profile (FIP) Ontology</a>, I want to collect and discuss folks&rsquo; choices and challenges on my podcast, <a href="https://podcast.polyneme.xyz/">Machine-Centric Science</a>.</p>
<p>Please email me at <a href="mailto:podcast@polyneme.xyz">podcast@polyneme.xyz</a> with:</p>
<ul>
<li>an email <em>subject</em> of either (a) &ldquo;FIP F1-D challenge&rdquo; or (b) &ldquo;FIP F1-D choice&rdquo;,</li>
<li>(preferred) an email <em>attachment</em> of a one-minute-max audio recording so that you can asynchronously mini-guest on the podcast 🙂,</li>
<li>an email <em>body</em> that at minimum says either (a) &ldquo;<a href="https://creativecommons.org/share-your-work/public-domain/cc0/">CC0</a>&rdquo; or (b) &ldquo;<a href="https://creativecommons.org/licenses/by/4.0/">CC-BY</a> &hellip;&rdquo;, where &ldquo;&hellip;&rdquo; is how you wish to be attributed (e.g. your name, your name and location, your name and affiliation, etc.).</li>
</ul>
<p>You may also write out your challenge or choice in the email body, in which case I will read it
aloud. If you choose a CC0 license, I will by default keep you anonymous unless you give me attribution info.</p>
<p>Extra credit:</p>
<ul>
<li>Use an <a href="https://orcid.org/">Open Researcher and Contributor ID (ORCiD)</a> for your attribution info. Obtaining one is free.</li>
<li>If applicable, give me an <a href="https://vocabs.ardc.edu.au/viewById/316#browse-tree">ANZSRC-2020-FoR code</a> (the browsable tree takes a second or two to load) for your typical field of research, whether two-digit (most broad), four-digit (narrower), or six-digit (narrowest in this scheme).</li>
</ul>
<p>If you really want to send me multiple choices and/or challenges, please send them as separate emails. Also, feel free to spread this message far and wide. Thanks!</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/known-item-vs-exploratory-search/</id><link rel="alternate" href="https://donnywinston.com/posts/known-item-vs-exploratory-search/"/><title>Findability → Known-Item Search, Discoverability → Exploratory Search?</title><published>2022-06-28T10:15:40-04:00</published><updated>2022-06-28T10:15:40-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>I keep confusing findability and discoverability. It seems that findability is often equated to
<a href="https://en.wikipedia.org/wiki/Known-item_search">known-item search</a>, and discoverability to
<a href="https://en.wikipedia.org/wiki/Exploratory_search">exploratory search</a>.</p>
<p>Known-item search is compatible with &ldquo;instant search&rdquo;, aka search-as-you-type interfaces.
Exploratory search is compatible with &ldquo;autocomplete&rdquo; (incl. re-spelling, infix matching, synonym
substitution, etc.) interfaces.</p>
<p>Recommendation can be a part of exploratory search, i.e. bundled in response to a user&rsquo;s pulling for
relevant information. It can also be pushed independently of deliberately registered search intent
&ndash; via notifications, email digests, etc.</p>
<p>Does the latter activity &ndash; the pushy one &ndash;  &ldquo;count&rdquo; for discoverability? I can imagine such
activity being framed as periodic re-running of an exploratory-search query on behalf of a user,
with query-independent factors for retrieval and ranking being varied over time.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/ontology-vs-data-model/</id><link rel="alternate" href="https://donnywinston.com/posts/ontology-vs-data-model/"/><title>Is an Ontology 'better' than a Relational Data Model?</title><published>2022-06-27T10:40:08-04:00</published><updated>2022-06-27T10:40:08-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Is an ontology &ldquo;better&rdquo; than a relational data model?
&ldquo;More expressive power&rdquo; doesn&rsquo;t always mean &ldquo;better&rdquo;.
However, ontologies allow you to ratchet up power while <a href="https://en.wikipedia.org/wiki/Rule_of_least_power">keeping logic in data structures</a>.</p>
<p>By &ldquo;relational data model&rdquo;, folks typically mean &ldquo;SQL model&rdquo;.
In the RDF world, this is roughly on par with a SHACL model, i.e. a model that expresses constraints on the shapes of entities and on the so-called &ldquo;primitive&rdquo; types of their properties/attributes/columns/fields (string, boolean, integer, etc.).
Both SHACL and SQL can set the ranges of properties to be &ldquo;reference&rdquo; types, which is indirect in SQL through primitive-typed (usually an integer or string) foreign keys.</p>
<p>An ontology language allows for more expressive data modeling than shape and attribute validation, while staying at the level of declarative data description.
In the RDF world, OWL lets you express notions of commonality and variability familiar from object-oriented programming such as classes, subclasses, and properties &ndash; you don&rsquo;t need a software-defined object-relational mapping (ORM) layer.
You can also express certain constraints for and between classes, entities (individuals), and properties.</p>
<p>There&rsquo;s nothing you can express using ontologies that you cannot also express using a SQL data model plus a general programming language, or just a programming language.
So why declaratively model data at all? Why SQL then and not just CSV files if you&rsquo;re going to load the data into Python et al. anyway?
The answer is the rule of least power (<a href="https://en.wikipedia.org/wiki/Rule_of_least_power">https://en.wikipedia.org/wiki/Rule_of_least_power</a>).
Ontology languages give you more expressive power than shape-constraint languages while reducing the risk of non-reusability of your modeling logic for unforeseen applications.</p>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/leave-beacons/</id><link rel="alternate" href="https://donnywinston.com/posts/leave-beacons/"/><title>Leave Beacons in Code</title><published>2022-06-24T11:51:48-04:00</published><updated>2022-06-24T11:51:48-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Leave beacons in your code. I would have avoided a silly error if a variable named <code>xgb_train_data</code>
had been named, for example, <code>xgb_train_data_filepath</code> instead.</p>
<p>When you can&rsquo;t leave globally unique, persistent, resolvable identifiers (GUPRIs), mind your beacons.</p>
<p>References:</p>
<ul>
<li>F. Hermans, <em>The Programmer’s Brain: What Every Programmer Needs to Know About Cognition</em>, pp. 28-30. Shelter Island, NY: Manning, 2021.</li>
<li>M. Crosby, J. Scholtz, and S. Wiedenbeck, &ldquo;The roles beacons play in comprehension for novice and expert programmers,&rdquo; Jul. 2002, [Online]. Available: <a href="https://www.researchgate.net/publication/228592285_The_roles_beacons_play_in_comprehension_for_novice_and_expert_programmers">https://www.researchgate.net/publication/228592285_The_roles_beacons_play_in_comprehension_for_novice_and_expert_programmers</a></li>
</ul>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/citation-file-format/</id><link rel="alternate" href="https://donnywinston.com/posts/citation-file-format/"/><title>CFF for Machine-Actionable Software Citations</title><published>2022-06-23T10:43:21-04:00</published><updated>2022-06-23T10:43:21-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Add a <code>CITATION.cff</code> file to your git repository. The <a href="https://citation-file-format.github.io/">Citation File
Format</a> is automatically rendered on GitHub and usable by
Zenodo and Zotero.</p>
<p>Already have a DOI? <a href="https://github.com/citation-file-format/citation-file-format#tools-to-work-with-citationcff-files-wrench">Let&rsquo;s see about a DOI-to-CFF
tool</a>.
Looks like there&rsquo;s <a href="https://github.com/citation-file-format/doi2cff">doi2cff</a>, but it&rsquo;s currently
restricted to DOIs on Zenodo that are tagged as software releases.</p>
<p>So, this could work:</p>
<pre tabindex="0"><code>pip install git+https://github.com/citation-file-format/doi2cff
doi2cff init 10.5281/zenodo.6591863
</code></pre><pre tabindex="0"><code>CITATION.cff file has been written
</code></pre><p>Cool. But what if you want to cite something else? Let&rsquo;s go from DOI to BibTeX, and BibTeX to CFF.</p>
<p>Let&rsquo;s get BibTeX via <a href="https://citation.crosscite.org/docs.html#sec-4">content negotiation</a>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>curl -LH <span style="color:#e6db74">&#34;Accept: application/x-bibtex&#34;</span> <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span>    https://doi.org/10.5281/zenodo.5570279 <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span>    &gt;&gt; refs.bib
</span></span></code></pre></div><pre tabindex="0"><code># cat refs.bib
@article{https://doi.org/10.5281/zenodo.5570279,
  doi = {10.5281/ZENODO.5570279},
  url = {https://zenodo.org/record/5570279},
  author = {Canon, Shane and Christianson, Danielle and Duncan, William and Eloe-Fadrosh, Emiley and Fagnan, Kjiersten and Hays, David and Huntemann, Marcel and Lebedeva, Sofya and Miller, Kayd and Miller, Mark and Mouncey, Nigel and Mungall, Chris and Reddy, Tbk and Rudolph, Marisa and Sarrafan, Setareh and Sundaramurthi, Jagadish Chandrabose and Unni, Deepak and Vangay, Pajau and Wood-Charlson, Elisha and Ahmed, Faiza and Baumes, Jeffrey and Davis, Brandon and Anubhav, Fnu and Borkhum, Mark and Bramer, Lisa and Corilo, Yuri and Lipton, Mary and Mans, Douglas and McCue, Lee Ann and Millard, David and Piehowski, Paul and Prymolenna, Anastasiya and Purvine, Samuel and Richardson, Rachel and Smith, Montana and Stratton, Kelly and Babinski, Michal and Chain, Patrick and Davenport, Karen and Flynn, Mark and Hu, Bin and Kelliher, Julia and Li, Po-E and Lo, Chien-Chi and Jackson, Elais Player and Shakya, Migun and Xu, Yan and Drake, Meghan and Martin, Stanton and Wilson, Bruce and Winston, Donny},
  keywords = {microbiome, data science, data infrastructure, science gateway},
  title = {The National Microbiome Data Collaborative: a data science ecosystem for microbiome research},
  publisher = {Zenodo},
  year = {2021},
  copyright = {Creative Commons Attribution 4.0 International}
}
</code></pre><p>Sweet. There is a command-line tool for <a href="https://pypi.org/project/cffconvert/">converting CFF to other common
formats</a>, but that&rsquo;s not what we want here. Ah, here we go:
<a href="https://github.com/monperrus/bibtexbrowser/">bibtex-to-cff</a>.</p>
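<p>If you would rather stay in Python for the content-negotiation step, the <code>curl</code> call above can be approximated with the standard library. This is a sketch, not a tested replacement: the function names are hypothetical, and it relies on <code>urlopen</code> following redirects by default, much like <code>curl -L</code>.</p>

```python
from urllib.request import Request, urlopen


def bibtex_request(doi):
    """Build a request asking doi.org for BibTeX via the Accept header."""
    return Request(
        "https://doi.org/" + doi,
        headers={"Accept": "application/x-bibtex"},
    )


def fetch_bibtex(doi, timeout=30):
    """Return the BibTeX text for a DOI. Needs network access."""
    with urlopen(bibtex_request(doi), timeout=timeout) as response:
        return response.read().decode("utf-8")


request = bibtex_request("10.5281/zenodo.5570279")
print(request.full_url)
# prints https://doi.org/10.5281/zenodo.5570279
print(request.get_header("Accept"))
# prints application/x-bibtex

# Example call (not run here because it needs the network):
# with open("refs.bib", "a") as f:
#     f.write(fetch_bibtex("10.5281/zenodo.5570279"))
```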
<p>To save you a bit of hassle, I&rsquo;ve packaged this PHP tool as a Docker image,
<code>polyneme/bibtex-to-cff</code>. Also, it can&rsquo;t (currently) handle URLs as BibTeX entry ids, so for this
case I changed <code>@article{https://doi.org/10.5281/zenodo.5570279,</code> in the above example to
<code>@article{10.5281.zenodo.5570279,</code> (it seems fine with periods).</p>
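<p>That manual edit of the entry id can be scripted. Below is a minimal Python sketch, assuming entry ids of the form <code>https://doi.org/...</code> as in the example above. The regular expression and function names are mine, not part of bibtex-to-cff.</p>

```python
import re

# Matches e.g. "@article{https://doi.org/10.5281/zenodo.5570279," at line start.
ENTRY_KEY = re.compile(r"^(@\w+\{)https?://doi\.org/([^,\s]+),", re.MULTILINE)


def dotted_key(match):
    """Drop the URL prefix from the key and replace slashes with periods."""
    return match.group(1) + match.group(2).replace("/", ".") + ","


def rewrite_doi_url_keys(bibtex):
    """Rewrite every DOI-URL entry id in a BibTeX string to a dotted id."""
    return ENTRY_KEY.sub(dotted_key, bibtex)


sample = "@article{https://doi.org/10.5281/zenodo.5570279,\n  year = {2021}\n}\n"
print(rewrite_doi_url_keys(sample).splitlines()[0])
# prints @article{10.5281.zenodo.5570279,
```

<p>Fields inside the entry, such as <code>url</code> and <code>doi</code>, are left untouched because the pattern only matches at the start of an entry.</p>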
<p>Here goes:</p>
<pre tabindex="0"><code>docker run --rm \
    -v $(pwd):/usr/src/app/scratch \
    polyneme/bibtex-to-cff \
    scratch/refs.bib --id 10.5281.zenodo.5570279 \
    &gt; CITATION.cff
</code></pre><p>Done. Add this to the root of your git repo, and congratulate yourself for including
machine-actionable citation metadata with your software.</p>
<p>You can build the image yourself by cloning the <code>monperrus/bibtexbrowser</code> GitHub repo, adding the
following to a new <code>Dockerfile</code> in the repo directory:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-Dockerfile" data-lang="Dockerfile"><span style="display:flex;"><span><span style="color:#66d9ef">FROM</span><span style="color:#e6db74"> php:7.4-cli</span><span style="color:#960050;background-color:#1e0010">
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span><span style="color:#66d9ef">COPY</span> . /usr/src/app<span style="color:#960050;background-color:#1e0010">
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span><span style="color:#66d9ef">WORKDIR</span><span style="color:#e6db74"> /usr/src/app</span><span style="color:#960050;background-color:#1e0010">
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span><span style="color:#66d9ef">ENTRYPOINT</span> [ <span style="color:#e6db74">&#34;php&#34;</span>, <span style="color:#e6db74">&#34;bibtex-to-cff.php&#34;</span> ]<span style="color:#960050;background-color:#1e0010">
</span></span></span></code></pre></div><p>Then run <code>docker build -t bibtex-to-cff .</code> in that directory.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/pagerank-of-linked-open-vocabularies-lov/</id><link rel="alternate" href="https://donnywinston.com/posts/pagerank-of-linked-open-vocabularies-lov/"/><title>PageRank of Linked Open Vocabularies (LOV)</title><published>2022-06-15T13:28:17-04:00</published><updated>2022-06-15T13:28:17-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Datasets are easier to reuse if they use standards that are well-established, particularly in a given domain.</p>
<p>A first approach is to ask around &ndash; ask people with whom you coauthor, people you trust in your field, etc.</p>
<p>A follow-on approach is to examine the &ldquo;graph reputation&rdquo; of relevant standards, particularly if they may be represented as resources with outbound links. We can use the PageRank algorithm, which Google introduced to rank documents on the web.</p>
<p>As an example, here I outline an initial approach to find the &ldquo;most reputable&rdquo; of <a href="https://lov.linkeddata.es/">Linked Open Vocabularies&rsquo;</a> 778 vocabularies.</p>
<p>My starting point is having the API responses for each vocabulary so that <code>lov</code> is a list of <code>dict</code>s, each with keys <code>url: str</code> and <code>api_response: dict</code>.</p>
<ol>
<li>Collect all outbound links:</li>
</ol>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">for</span> entry <span style="color:#f92672">in</span> lov:
</span></span><span style="display:flex;"><span>    entry[<span style="color:#e6db74">&#34;outbound_links&#34;</span>] <span style="color:#f92672">=</span> entry<span style="color:#f92672">.</span>get(<span style="color:#e6db74">&#34;outbound_links&#34;</span>, set())
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> version <span style="color:#f92672">in</span> entry[<span style="color:#e6db74">&#34;api_response&#34;</span>]<span style="color:#f92672">.</span>get(<span style="color:#e6db74">&#34;versions&#34;</span>, {}):
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">for</span> field, value <span style="color:#f92672">in</span> version<span style="color:#f92672">.</span>items():
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">if</span> field<span style="color:#f92672">.</span>startswith(<span style="color:#e6db74">&#34;rel&#34;</span>) <span style="color:#f92672">and</span> isinstance(value, list):
</span></span><span style="display:flex;"><span>                entry[<span style="color:#e6db74">&#34;outbound_links&#34;</span>] <span style="color:#f92672">|=</span> {v <span style="color:#66d9ef">for</span> v <span style="color:#f92672">in</span> value}
</span></span></code></pre></div><ol start="2">
<li>Prepare a stream of self_link, outbound_link pairs:</li>
</ol>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">with</span> open(<span style="color:#e6db74">&#34;lov-outlinks.csv&#34;</span>,<span style="color:#e6db74">&#39;w&#39;</span>) <span style="color:#66d9ef">as</span> f:
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> entry <span style="color:#f92672">in</span> lov:
</span></span><span style="display:flex;"><span>        url <span style="color:#f92672">=</span> entry[<span style="color:#e6db74">&#34;url&#34;</span>]
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">for</span> link_url <span style="color:#f92672">in</span> entry[<span style="color:#e6db74">&#34;outbound_links&#34;</span>]:
</span></span><span style="display:flex;"><span>            f<span style="color:#f92672">.</span>write(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#34;</span><span style="color:#e6db74">{</span>url<span style="color:#e6db74">}</span><span style="color:#e6db74">,</span><span style="color:#e6db74">{</span>link_url<span style="color:#e6db74">}</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">&#34;</span>)
</span></span></code></pre></div><ol start="3">
<li>In a file, e.g. <code>lov_pagerank.py</code>:<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></li>
</ol>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">if</span> __name__ <span style="color:#f92672">==</span> <span style="color:#e6db74">&#34;__main__&#34;</span>: <span style="color:#75715e"># for `spark-submit`</span>
</span></span><span style="display:flex;"><span>    sc <span style="color:#f92672">=</span> SparkContext(appName<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;LovRankings&#34;</span>)
</span></span><span style="display:flex;"><span>    match_data <span style="color:#f92672">=</span> sc<span style="color:#f92672">.</span>textFile(<span style="color:#e6db74">&#34;lov-outlinks.csv&#34;</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    xs <span style="color:#f92672">=</span> match_data<span style="color:#f92672">.</span>map(get_linking)<span style="color:#f92672">.</span>groupByKey()<span style="color:#f92672">.</span>mapValues(initialize_for_voting)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> i <span style="color:#f92672">in</span> range(<span style="color:#ae81ff">20</span>):
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> i <span style="color:#f92672">&gt;</span> <span style="color:#ae81ff">0</span>:
</span></span><span style="display:flex;"><span>            xs <span style="color:#f92672">=</span> sc<span style="color:#f92672">.</span>parallelize(zs<span style="color:#f92672">.</span>items())
</span></span><span style="display:flex;"><span>        acc <span style="color:#f92672">=</span> dict(xs<span style="color:#f92672">.</span>mapValues(empty_ratings)<span style="color:#f92672">.</span>collect())
</span></span><span style="display:flex;"><span>        zs <span style="color:#f92672">=</span> xs<span style="color:#f92672">.</span>aggregate(acc, allocate_points, combine_ratings)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    ratings <span style="color:#f92672">=</span> [(k, v[<span style="color:#e6db74">&#34;rating&#34;</span>]) <span style="color:#66d9ef">for</span> k, v <span style="color:#f92672">in</span> zs<span style="color:#f92672">.</span>items()]
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> i, (vocab, rating) <span style="color:#f92672">in</span> enumerate(
</span></span><span style="display:flex;"><span>        sorted(ratings, key<span style="color:#f92672">=</span><span style="color:#66d9ef">lambda</span> x: x[<span style="color:#ae81ff">1</span>], reverse<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)[:<span style="color:#ae81ff">100</span>]
</span></span><span style="display:flex;"><span>    ):
</span></span><span style="display:flex;"><span>        print(<span style="color:#e6db74">&#34;</span><span style="color:#e6db74">{:3}</span><span style="color:#ae81ff">\t</span><span style="color:#e6db74">{:6}</span><span style="color:#ae81ff">\t</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">&#34;</span><span style="color:#f92672">.</span>format(i <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>, round(log2(rating <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>), <span style="color:#ae81ff">1</span>), vocab))
</span></span></code></pre></div><p>where, above it:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">from</span> math <span style="color:#f92672">import</span> log2
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pyspark <span style="color:#f92672">import</span> SparkContext
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> toolz <span style="color:#f92672">import</span> assoc
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">get_linking</span>(line):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> line<span style="color:#f92672">.</span>split(<span style="color:#e6db74">&#34;,&#34;</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">initialize_for_voting</span>(outlinks):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> {<span style="color:#e6db74">&#34;outlinks&#34;</span>: outlinks, <span style="color:#e6db74">&#34;n_outlinks&#34;</span>: len(outlinks), <span style="color:#e6db74">&#34;rating&#34;</span>: <span style="color:#ae81ff">100</span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">empty_ratings</span>(d):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> assoc(d, <span style="color:#e6db74">&#34;rating&#34;</span>, <span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">allocate_points</span>(acc, new):
</span></span><span style="display:flex;"><span>    _, v <span style="color:#f92672">=</span> new
</span></span><span style="display:flex;"><span>    boost <span style="color:#f92672">=</span> v[<span style="color:#e6db74">&#34;rating&#34;</span>] <span style="color:#f92672">/</span> (v[<span style="color:#e6db74">&#34;n_outlinks&#34;</span>] <span style="color:#f92672">+</span> <span style="color:#ae81ff">0.01</span>)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> link <span style="color:#f92672">in</span> v[<span style="color:#e6db74">&#34;outlinks&#34;</span>]:
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> link <span style="color:#f92672">not</span> <span style="color:#f92672">in</span> acc<span style="color:#f92672">.</span>keys():
</span></span><span style="display:flex;"><span>            acc[link] <span style="color:#f92672">=</span> {<span style="color:#e6db74">&#34;outlinks&#34;</span>: [], <span style="color:#e6db74">&#34;n_outlinks&#34;</span>: <span style="color:#ae81ff">0</span>}
</span></span><span style="display:flex;"><span>        link_rating <span style="color:#f92672">=</span> acc<span style="color:#f92672">.</span>get(link, {})<span style="color:#f92672">.</span>get(<span style="color:#e6db74">&#34;rating&#34;</span>, <span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>        acc[link][<span style="color:#e6db74">&#34;rating&#34;</span>] <span style="color:#f92672">=</span> link_rating <span style="color:#f92672">+</span> boost
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> acc
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">combine_ratings</span>(a, b):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> k, v <span style="color:#f92672">in</span> b<span style="color:#f92672">.</span>items():
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">try</span>:
</span></span><span style="display:flex;"><span>            a[k][<span style="color:#e6db74">&#34;rating&#34;</span>] <span style="color:#f92672">=</span> a[k][<span style="color:#e6db74">&#34;rating&#34;</span>] <span style="color:#f92672">+</span> b[k][<span style="color:#e6db74">&#34;rating&#34;</span>]
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">except</span> <span style="color:#a6e22e">KeyError</span>:
</span></span><span style="display:flex;"><span>            a[k] <span style="color:#f92672">=</span> v
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> a
</span></span></code></pre></div><p>And here is the output of <code>spark-submit lov_pagerank.py</code>:</p>
<pre tabindex="0"><code>  1       10.6  http://purl.org/dc/elements/1.1/
  2       10.3  http://www.w3.org/2000/01/rdf-schema#
  3       10.3  http://www.w3.org/1999/02/22-rdf-syntax-ns#
  4        9.0  http://www.w3.org/2004/02/skos/core#
  5        8.9  http://purl.org/dc/terms/
  6        6.3  http://xmlns.com/foaf/0.1/
  7        6.3  http://www.w3.org/2002/07/owl#
  8        6.3  http://purl.org/dc/dcmitype/
...
</code></pre><p>We can see at a glance the &ldquo;most reputable&rdquo; vocabularies, and they don&rsquo;t surprise me. What may be more helpful is to collect candidate vocabularies for your domain and focus on their relative scores in order to gauge whether any are &ldquo;well-established&rdquo; in a sense. Even more helpful may be to include multiple &ldquo;types&rdquo; of resources &ndash; with standards linking to and being linked from various databases and policies. <a href="https://fairsharing.org/">FAIRsharing</a> seems like it could eventually support open investigation of the latter kind.</p>
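<p>As a rough sketch of comparing relative scores among candidate vocabularies, the snippet below normalizes the ratings of a chosen candidate set so that they sum to one. The helper name <code>relative_scores</code>, the example rating values, and the <code>example.org</code> URI are my own illustrative assumptions and are not part of the script above. With the script&rsquo;s <code>ratings</code> list of pairs, <code>dict(ratings)</code> would give a mapping of the expected shape.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">def relative_scores(ratings, candidates):
    """Return (vocab, share) pairs for the candidates, highest share first.

    `ratings` maps a vocabulary URI to its raw rating. A candidate that is
    absent from `ratings` is treated as having a rating of zero.
    """
    found = {vocab: ratings.get(vocab, 0.0) for vocab in candidates}
    total = sum(found.values())
    if total == 0:
        return [(vocab, 0.0) for vocab in candidates]
    shares = [(vocab, rating / total) for vocab, rating in found.items()]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)


# Hypothetical raw ratings, for illustration only.
ratings = {
    "http://purl.org/dc/terms/": 300.0,
    "http://xmlns.com/foaf/0.1/": 100.0,
    "http://www.w3.org/2004/02/skos/core#": 50.0,
}
candidates = [
    "http://xmlns.com/foaf/0.1/",
    "http://purl.org/dc/terms/",
    "http://example.org/missing#",  # hypothetical vocabulary with no inlinks
]
scores = relative_scores(ratings, candidates)
for vocab, share in scores:
    print("{:.2f}\t{}".format(share, vocab))
</code></pre></div>
<p>Normalizing within the candidate set is only one possible choice; comparing log-scaled ratings, as in the listing above, may suit other cases better.</p>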
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Adapted from J. T. Wolohan, <em>Mastering large datasets with Python: parallelize and distribute your Python code</em>. Shelter Island, NY: Manning Publications Co, 2019.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/lean-web/</id><link rel="alternate" href="https://donnywinston.com/posts/lean-web/"/><title>Lean Web - Principles of Lean Thinking applied to Web Development</title><published>2022-06-09T12:28:09-04:00</published><updated>2022-06-09T12:28:09-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p><a href="https://en.wikipedia.org/wiki/Lean_manufacturing">Lean manufacturing</a> aims to reduce waste in
production processes and to reduce response times to consumers from producers.</p>
<p>Womack and Jones<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> authored five key principles for lean thinking in the context of manufacturing:</p>
<ol>
<li><em>Value</em>: Identify the value of a product to a consumer.</li>
<li><em>Value Stream</em>: Identify the minimal process (steps, time, information, material) to produce the value.</li>
<li><em>Flow</em>: Make production flow through the steps.</li>
<li><em>Pull</em>: Pull between the steps (rather than pushing intermediate &ldquo;inventory&rdquo; that may not be used).</li>
<li><em>Perfection</em>: Reduce the number of steps and the amount of time, information, and material needed for production.</li>
</ol>
<p><a href="https://en.wikipedia.org/wiki/Lean_software_development">Lean software development</a> aims to adapt
lean thinking to software development.</p>
<p>The Poppendiecks<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> authored seven principles that don&rsquo;t directly provide qualified references to
Womack and Jones&rsquo;s principles. Here, I attempt to align their principles of software development with
the framework and terminology of Womack and Jones&rsquo;s lean thinking principles:</p>
<ol>
<li>
<p><em>Evaluate Late</em>: Decide on the end-value of a product to a consumer as late as possible. There is
one value stream option per end-value option.</p>
</li>
<li>
<p><em>Mind Value Stream Multiplicity and Looping</em>: With one value stream per end-value hypothesis, can
value streams share structure to eliminate waste? Value streams may have loops (iterations) that
must be particularly lean to support a high learning rate.</p>
</li>
<li>
<p><em>Flow</em>: Make production flow for fast delivery and thus for rapid learning given the presence of
loops in a value stream.</p>
</li>
<li>
<p><em>Pull</em>: Pulling between steps empowers the team.</p>
</li>
<li>
<p><em>Perfection</em>: Continuous refactoring facilitates ensuring integrity and optimizing the whole.</p>
</li>
</ol>
<p>Now, I can couch a conceptualization of lean principles for web development, i.e. Lean Web
principles, with clear lineage to the lean thinking principles for manufacturing and through lean
principles for software development:</p>
<ol>
<li>
<p><em>Evaluate Resources Late</em>: Deal in data for as long as possible. Apply transformation logic later
&ndash; there are many applications. Apply presentation logic even later &ndash; there are many modes of
consumption for an application. See also: Perlis&rsquo; epigram<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>: &ldquo;Functions delay binding;
data structures induce binding. Moral: Structure data late in the programming process.&rdquo;</p>
</li>
<li>
<p><em>Mind Value Stream Multiplicity and Looping</em>: Eliminate waste in process steps, time, information
(configuration / manual signaling), and material (code, data, storage/compute infrastructure).
Can web dev processes share logic? Pay particular attention to waste in value stream loops
(iterations).</p>
</li>
<li>
<p><em>Flow</em>: Choose continuous integration (CI) and continuous deployment (CD).</p>
</li>
<li>
<p><em>Pull</em>: Choose distributed version control for code, data, and storage/compute
infrastructure (as code).</p>
</li>
<li>
<p><em>Perfection</em>: Can it all fit in your head, to facilitate conceptual integrity and strategic
refactoring?</p>
</li>
</ol>
<p>Finally, I am well aware of <a href="https://gomakethings.com/about/">Chris Ferdinandi</a> and his excellent
exposition on <a href="https://leanweb.dev/">Lean Web thinking</a> and associated <a href="https://leanweb.dev/ebook/lean-web-principles/">three
principles</a>. Here&rsquo;s how I think his principles may
map to those above:</p>
<ol>
<li>
<p><em>Embrace the Platform</em>: This relates to evaluating resources late. Can you exchange data as RDF
(e.g. serialized as JSON-LD) over HTTP? Can you exchange logic for inference and validation as RDF
data as well, via the RDFS/OWL and SHACL standards of the Web platform? Can you exchange logic for
presentation as HTML (templates) and CSS? If your front-end requires operational processes, can that
be done using vanilla JavaScript?</p>
</li>
<li>
<p><em>Small and Modular</em>: This relates to minding value stream multiplicity and looping. There is a
lot of opportunity to eliminate waste and reuse functionality (especially functionality provided by
the platform!).</p>
</li>
<li>
<p><em>The Web is for Everyone</em>: This relates to evaluating resources late (why prematurely optimize
for applications and consumption use cases and thus exclude potential stakeholders?) and pulling
(empower people by encouraging them to pull rather than telling them to pick up whatever is pushed).</p>
</li>
</ol>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>J. P. Womack and D. T. Jones, <em>Lean thinking: banish waste and create wealth in your corporation</em>. New York, NY: Simon &amp; Schuster, 1996.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>M. Poppendieck and T. Poppendieck, <em>Lean software development: an agile toolkit</em>. Boston: Addison-Wesley, 2003.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>A. J. Perlis, “Special Feature: Epigrams on programming,” SIGPLAN Not., vol. 17, no. 9,
pp. 7–13, Sep. 1982, doi: 10.1145/947955.1083808. Online at
<a href="http://www.cs.yale.edu/homes/perlis-alan/quotes.html">http://www.cs.yale.edu/homes/perlis-alan/quotes.html</a>.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/hallucinating-datasets-across-epochal-time/</id><link rel="alternate" href="https://donnywinston.com/posts/hallucinating-datasets-across-epochal-time/"/><title>Hallucinating Datasets Across Epochal Time</title><published>2022-06-09T10:46:13-04:00</published><updated>2022-06-09T10:46:13-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>&ldquo;Dataset&rdquo; is a derived notion, a psychological construct, where &ldquo;versions&rdquo; of the dataset are a succession of values that we perceive to be causally related. &ldquo;Dataset&rdquo; is a side effect.</p>
<p>Consider Rich Hickey&rsquo;s epochal time model, which I have <a href="https://donnywinston.com/posts/the-materials-paradigm-and-epochal-time/">written about previously</a>:</p>
<p><img
src="https://donnywinston.com/img/hickey_are-we-there-yet_epochal-time-model.jpg"
alt="Epochal time model, from Rich Hickey's 'Are We There Yet' talk"
title="Epochal time model, from Rich Hickey's 'Are We There Yet' talk"
width="70%"></p>
<p>Identity is a derived notion, a collecting of values and calling each value a “state”. A state is just a labeling of a value for an identity at a point in &ldquo;time&rdquo;. The succession of states is the identity. Identity is a side effect of choosing a timeline of value succession.</p>
<p>Consider drawing a dotted line on the figure above that encompasses all of the immutable values (boxes) and all of the ovals (pure functions). This may be considered the <a href="https://donnywinston.com/elements_of_clojure/#functional">functional</a> <a href="https://donnywinston.com/elements_of_clojure/#phase">phase</a> of a <a href="https://donnywinston.com/elements_of_clojure/#process">process</a>, where data is <a href="https://donnywinston.com/elements_of_clojure/#transform">transformed</a> (<a href="https://donnywinston.com/elements_of_clojure/#accrete">accreted</a>, <a href="https://donnywinston.com/elements_of_clojure/#reduce">reduced</a>, or <a href="https://donnywinston.com/elements_of_clojure/#reshape">reshaped</a>), separate from the <a href="https://donnywinston.com/elements_of_clojure/#operational">operational</a> (i.e., <a href="https://donnywinston.com/elements_of_clojure/#pull">pull</a> or <a href="https://donnywinston.com/elements_of_clojure/#push">push</a>) phases of that process.</p>
<p>With this perspective, a process&rsquo;s functional phase also suggests labeling its succession of values. Each value may be called “data”. That is, a value becomes data when it is an input or output for a process. Depending on where the value is in the topology of the process, it may be considered “raw data” or “derived data” with respect to that process.</p>
<p>What, then, may we call the succession of data values for the timeline of a process &ndash; what is the &ldquo;identity&rdquo; here, where successive values are &ldquo;states&rdquo;? In the <a href="https://github.com/OpenLineage/OpenLineage/blob/3808d4ab7dc0229c0f7997eda49fc00ab3947f26/spec/OpenLineage.md#openlineage-spec">OpenLineage specification</a>, the name for this identity is &ldquo;dataset&rdquo;. In the <a href="https://github.com/MarquezProject/marquez/blob/c53f66c0a3f748742eea69db5f3c287c63c929a9/docs/index.md#data-model">Marquez reference implementation</a> of OpenLineage, a &ldquo;dataset version&rdquo; is a read-only, immutable version of a dataset, i.e. an immutable value in the sense of the epochal time model.</p>
<p>Thus, &ldquo;dataset versions&rdquo; are the states, and &ldquo;datasets&rdquo; the identities, for the succession of values associated with the functional phase of a process.  To the extent that an immutable digital object &ndash; a value &ndash; is useful in the functional phase of one or more processes, it is useful to identify it as a &ldquo;dataset version&rdquo;. To the extent that a succession of such values, which we perceive to be causally related via a process, is useful in whole or in part to various timelines of various processes, it is useful to identify this succession as a &ldquo;dataset&rdquo;.</p>
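<p>To make the distinction concrete, here is a minimal Python sketch. The class names, the <code>succeed</code> method, the <code>clean</code> function, and the example content are my own illustrative assumptions; they are not taken from the OpenLineage specification or from Marquez. A dataset version is modeled as an immutable value, and a dataset as a name plus the succession of values we choose to relate.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">from dataclasses import dataclass
from hashlib import sha256


@dataclass(frozen=True)
class DatasetVersion:
    """An immutable value: one state of a dataset at a point in "time"."""

    content: bytes

    @property
    def digest(self):
        # A content hash identifies the value itself, independent of any name.
        return sha256(self.content).hexdigest()


@dataclass(frozen=True)
class Dataset:
    """An identity: a name plus a chosen succession of values."""

    name: str
    versions: tuple = ()

    def succeed(self, version):
        # Return a new Dataset with one more state; nothing is mutated.
        return Dataset(self.name, self.versions + (version,))


def clean(version):
    # A pure function in the functional phase: value in, value out.
    return DatasetVersion(version.content.strip().lower())


raw = DatasetVersion(b"  Hello, World  ")
ds = Dataset("greetings").succeed(raw)
ds = ds.succeed(clean(ds.versions[-1]))
for version in ds.versions:
    print(version.digest[:12], version.content)
</code></pre></div>
<p>In this sketch, the only thing tying the two versions together is the decision to append them to the same timeline under one name, which is the sense in which &ldquo;dataset&rdquo; is a side effect.</p>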
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/consistent-valid-accurate/</id><link rel="alternate" href="https://donnywinston.com/posts/consistent-valid-accurate/"/><title>¬ consistent ⇒ ¬ valid ⇒ ¬ accurate</title><published>2022-06-03T09:17:13-04:00</published><updated>2022-06-03T09:17:13-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>If it’s not consistent, it can’t be valid.</p>
<p>If it’s not valid, it can’t be accurate.</p>
<p>If it’s not accurate, who cares if it’s timely?</p>
<blockquote>
<p>No amount of tooling, people, or process will transform invalid, inconsistent, inaccurate, untimely data into powerful insights, products, and applications</p>
<p>&ndash; <a href="https://twitter.com/sarahcat21/status/1532077087250452480">https://twitter.com/sarahcat21/status/1532077087250452480</a></p>
</blockquote>
<p></p>
<blockquote>
<p>Data tools, people, and titles are ibuprofen so that the stakeholders don’t feel the pain of difficult data. But the pain remains if the source data isn’t addressed.</p>
<p>&ndash; <a href="https://twitter.com/cavorax/status/1532054828192608257">https://twitter.com/cavorax/status/1532054828192608257</a></p>
</blockquote>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/w3c-data-recommendations-there-are-many/</id><link rel="alternate" href="https://donnywinston.com/posts/w3c-data-recommendations-there-are-many/"/><title>W3C data recommendations -- there are many!</title><published>2022-06-01T09:17:13-04:00</published><updated>2022-06-01T09:17:13-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>The World Wide Web Consortium (W3C) publishes a range of specifications and guidelines which help move web standards forward.</p>
<p>However, even when restricting scope to the Latest version of specifications with the status Recommendation and with the tag Data, there are currently 77 of them: <a href="https://www.w3.org/TR/?tag=data&amp;status=REC&amp;version=latest">https://www.w3.org/TR/?tag=data&amp;status=REC&amp;version=latest</a>!</p>
<p>I read through the listing, and here I try to categorize and present a subset of the specifications that I think are most relevant to scientific data management:</p>
<ul>
<li>
<p>description representations, i.e. formal ways to define and communicate data, metadata, and queries:</p>
<ul>
<li>Resource Description Framework (RDF)</li>
<li>SPARQL Protocol and RDF Query Language (SPARQL)</li>
</ul>
</li>
<li>
<p>description metamodels, i.e. formal ways to define and communicate models:</p>
<ul>
<li>Shapes Constraint Language (SHACL)</li>
<li>Relational Database to RDF Mapping Language (R2RML)</li>
<li>RDF Schema</li>
<li>Web Ontology Language (OWL)</li>
<li>Rule Interchange Format (RIF)</li>
</ul>
</li>
<li>
<p>description models, i.e. models that may be applied directly or may serve as umbrellas for more specialized models:</p>
<ul>
<li>Data Catalog Vocabulary (DCAT)</li>
<li>Provenance Data Model (PROV)</li>
<li>Simple Knowledge Organization System (SKOS)</li>
<li>CSV for the Web (CSVW)</li>
<li>RDF Data Cube Vocabulary</li>
<li>Organization Ontology</li>
<li>Open Digital Rights Language (ODRL)</li>
</ul>
</li>
</ul>
<p>I have left out specifications for serialization, i.e. the text-based appearance of things when viewing/editing them and their formats as files on disk.</p>
<p>Still, 14 specifications is a lot! I&rsquo;ve tried to list them out in each category in order of roughly decreasing &ldquo;bang for your learning buck&rdquo; for typical use cases I&rsquo;ve encountered.</p>
<p>I&rsquo;d love to hear from you which, if any, of the specifications above you&rsquo;ve found useful and/or which you would like to know better (or at all!).</p>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/data-stacks-for-fair/</id><link rel="alternate" href="https://donnywinston.com/posts/data-stacks-for-fair/"/><title>Data Stacks for FAIR</title><published>2022-05-30T14:39:34-04:00</published><updated>2022-05-30T14:39:34-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>I noticed a pattern at the top of each case study listed by <a href="https://www.stemma.ai">Stemma.ai</a>,
which provides data catalog software as a service based on the open-source
<a href="https://www.amundsen.io/">Amundsen</a> code. Each case study&rsquo;s so-called &ldquo;Data Stack&rdquo; comprises up to
four distinct categories of functionality &ndash; Data Catalog, Data Warehouse, ETL, and Business
Intelligence.</p>
<p>The &ldquo;Data Stack&rdquo; for each case study:
<br></p>
<table>
<thead>
<tr>
<th>Case</th>
<th>Data Catalog</th>
<th>Data Warehouse</th>
<th>ETL</th>
<th>Business Intelligence</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="https://www.stemma.ai/amundsen-case-studies/lyft">Lyft</a></td>
<td>Amundsen</td>
<td>Presto</td>
<td>Apache Airflow</td>
<td>Mode, Apache Superset</td>
</tr>
<tr>
<td><a href="https://www.stemma.ai/stemma-case-studies/convoy">Convoy</a></td>
<td>Stemma</td>
<td>Snowflake</td>
<td>dbt, Apache Airflow</td>
<td>Tableau, Metabase</td>
</tr>
<tr>
<td><a href="https://www.stemma.ai/stemma-case-studies/irobot">iRobot</a></td>
<td>Stemma</td>
<td>Amazon Athena</td>
<td>(blank)</td>
<td>Mode</td>
</tr>
<tr>
<td><a href="https://www.stemma.ai/amundsen-case-studies/ing">ING</a></td>
<td>Amundsen</td>
<td>Trino (formerly, Presto SQL)</td>
<td>(blank)</td>
<td>Apache Superset</td>
</tr>
</tbody>
</table>
<br>
<p>These categories struck me in relation to the FAIR Principles<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>:</p>
<ul>
<li>A Data Catalog is about making data <em>Findable</em>.</li>
<li>A Data Warehouse is about making data <em>Accessible</em>.</li>
<li>An ETL platform, aka a Data Orchestration<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> platform, is about making data <em>Interoperable</em>.</li>
<li>A Business Intelligence (BI) tool is about making data <em>Reusable</em>, aka Repurposeable.</li>
</ul>
<p>It&rsquo;s encouraging to see high-level alignment between the FAIR Principles and a conceptualization of useful enterprise data systems in the corporate world.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. D. Wilkinson et al., &ldquo;The FAIR Guiding Principles for scientific data management and stewardship,&rdquo; Sci Data, vol. 3, no. 1, p. 160018, Mar. 2016, doi: 10/bdd4.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Although a term I think may be more apt here than Data Orchestration, which has an imperative tone, is <em>Data Reconciliation</em>, which has a declarative tone &ndash; see e.g. S. Ryza, &ldquo;Introducing Software-Defined Assets&rdquo;, Dagster Blog, Mar. 2022. <a href="https://dagster.io/blog/software-defined-assets">https://dagster.io/blog/software-defined-assets</a> (accessed May 31, 2022).&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/a-sign-helps-you-use-it-as-though-it-were-an-x/</id><link rel="alternate" href="https://donnywinston.com/posts/a-sign-helps-you-use-it-as-though-it-were-an-x/"/><title>A Sign Helps You Use It as Though It Were an X</title><published>2022-05-29T15:58:15-04:00</published><updated>2022-05-29T15:58:15-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<blockquote>
<p>Suppose an alien architect has invented a radically new way to go from one room to another&hellip;We would never recognize it as a door&hellip;All its physical details are wrong. No matter: just superimpose on its exterior some&hellip;sign that can remind us of its use. Clothe it in a rectangular shape, or add to it a push-plate lettered <em>EXIT</em> in red and white, and every visitor from the planet Earth will know&hellip;just what that pseudoportal&rsquo;s purpose is.
<br><br>
&hellip;There are no doors inside our minds, only connections among our signs.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
</blockquote>
<p>For evolvable data exchange, you need to be able to continually add qualified references galore so that participants can reason by analogy &ndash; i.e., each new thing resembles something known before.</p>
<p>This is <a href="https://w3id.org/fair/principles/terms/I3">FAIR principle I3</a>, which depends on <a href="https://w3id.org/fair/principles/terms/I1">I1</a> and <a href="https://w3id.org/fair/principles/terms/I2">I2</a> for robustness.</p>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. Minsky, <em>The Society of Mind</em>. New York: Simon and Schuster, 1986, p. 57.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/fair-principle-r11-metadata-are-released-with-a/</id><link rel="alternate" href="https://donnywinston.com/posts/fair-principle-r11-metadata-are-released-with-a/"/><title>FAIR Principle R1.1: Meta(data) are released with a clear and accessible data usage license</title><published>2022-05-29T15:33:35-04:00</published><updated>2022-05-29T15:33:35-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>I&rsquo;ve been recording introductions to each of the 15 FAIR Principles and releasing them as episodes of my Machine-Centric Science podcast (<a href="https://podcast.polyneme.xyz/">https://podcast.polyneme.xyz/</a>).</p>
<p>I just released the 13th one, featuring an overview of various data and code licenses. <a href="https://share.transistor.fm/s/ad7b9725">Listen here</a>.</p>
<p>Full transcript below (but also linked to via the episode landing page):</p>
<p>======</p>
<p>Hello, and welcome to Machine-Centric Science. My name is Donny Winston, and I&rsquo;m here to talk about the FAIR principles in practice for scientists who want to compound their impacts, not their errors. Today, we&rsquo;re going to be talking about the 13th of the 15 FAIR principles, R1.1: metadata and data are released with a clear and accessible data usage license.</p>
<p>The license may be different for a data resource and the metadata that describes it. This has implications for indexing of the metadata and findability as well as ultimately using the data. It highlights the need to separate and permalink the data and the metadata.</p>
<p>By default, resources cannot be legally used without clarity in licensing. And furthermore, a license that cannot be found by an agent, a computational agent, is effectively the same as no license at all in a world of automated search and discovery.</p>
<p>There are lots of options in the world of licensing. I will go over the Creative Commons suite of data licenses, and I&rsquo;ll also go over some code licenses, and relations between them.</p>
<p>Starting from most open, with Creative Commons, there&rsquo;s CC0, no rights reserved.</p>
<p>After that, we have CC BY &ndash; by Attribution. This license lets others distribute, remix, adapt, and build upon your work, even commercially, as long as they credit you for the original creation. It&rsquo;s the most accommodating of licenses offered that still require attribution.</p>
<p>Beyond that, there&rsquo;s the CC BY SA &ndash; attribution and share-alike. This license lets others use your work, even for commercial purposes, as long as they credit you. And also they need to license their new creations under identical terms. So all new works based on the work will carry the same license. The attribution share-alike is the license used by Wikipedia.</p>
<p>Closing up things a bit, we have attribution-no-derivatives, CC BY ND. This license lets others reuse the work for any purpose, including commercially. However, it cannot be shared with others in adapted form, so you can&rsquo;t make any changes.</p>
<p>Closing things up a bit more there&rsquo;s attribution, but noncommercial, so you can use the stuff, but non-commercially. You can provide derivative works, but the derivative work that&rsquo;s distributed has to be non-commercial.</p>
<p>Further down the line there&rsquo;s attribution non-commercial share-alike. This lets others remix, adapt, and build upon your work non-commercially, as long as they credit you and license their new creations under identical terms.</p>
<p>And finally, attribution, non-commercial, no-derivatives just allows people to download the work, share them with others as long as they credit you, but they can&rsquo;t change the work in any way or use it commercially.</p>
<p>So, those are the Creative Commons licenses typically used for data. Then there are the code licenses, and these aren&rsquo;t quite along the same spectrum of open to closed. Rather, the spectrum is more going from maximizing user freedom to maximizing redistributor freedom.</p>
<p>The most user-free license that I&rsquo;ve encountered recently that&rsquo;s in popular use is the Server Side Public License that&rsquo;s in use by, say, MongoDB. And this is akin to the Creative Commons attribution share-alike license, but with additional sharing. So if someone&rsquo;s offering this software as a managed service, they have to supply the source code and they also have to supply the source code for all of the service tooling that&rsquo;s helping them to provide that service, like managed backups, et cetera. So, it goes even beyond the actual code. So it really makes sure that whoever is using the software really can reproduce the use. And so that maximizes that.</p>
<p>A little bit less than that, keeping to the domain of the actual code itself is the Affero GPL, the AGPL. That&rsquo;s more akin to CC attribution share-alike, where even if you&rsquo;re distributing the code through a service online, through a managed service, and you&rsquo;re not actually distributing the source code directly, you still have to supply the source code to the users.</p>
<p>Okay, going down, there&rsquo;s then the GPL, the GNU General Public License. Here, if you&rsquo;re offering the software as a service, you don&rsquo;t have to include source modifications &ndash; it&rsquo;s only if you&rsquo;re actually distributing the software itself.</p>
<p>Then, we have some licenses that are more compartmentalized to the actual portions of the code that&rsquo;s being reused. So there&rsquo;s the Lesser GPL and the Mozilla Public License, which are, again, like Creative Commons with attribution and share-alike, but if those licensed components are combined with other software, the user does not have the right to have the source code for all the other components that are necessary to use the system. They only have the right to modifications made to your component that is under the Mozilla Public License or Lesser GPL. But there may be other parts of the software system, as distributed, that are protected, are proprietary.</p>
<p>So this kind of gives a bit more freedom to the redistributor if they have proprietary code that they want to mix in, or pair with, rather, the open software. So it kind of makes the user have a little bit less insight into the total software of the system, but they still have insight into your component that you&rsquo;ve released under LGPL or Mozilla Public License, MPL.</p>
<p>There is the Business Source License. This is kind of akin to a Creative Commons by attribution, noncommercial license that typically reverts to a by-attribution, commercial-okay license. So, Business Source License, like Sentry&rsquo;s monitoring service, has an Additional Use Grant, which says, you can use this however you want as long as you do not offer commercially a managed service. So only we, the company that makes this software, can offer a managed service where we offer this to third parties. But as a user, you can do whatever you want internally. You just can&rsquo;t also be a company that sells this software as a service to other companies.</p>
<p>So again, this offers a lot of user freedom, but has a bit more emphasis on redistributor freedom to clamp down on use. And Elasticsearch&rsquo;s Elastic License is similar in the sense that a user can sort of do whatever they want with it, but they&rsquo;re restricted from redistributing it commercially as a service.</p>
<p>Then, going down more towards freedom of the redistributor, we have things like the Apache 2.0 license, which is more like a Creative Commons with attribution, and also adds in some share-alike for contributions. So by default, anyone contributing to an Apache 2 codebase also grants their contributions to be distributed under the same license.</p>
<p>And then, the most so-called permissive licenses are things like the BSD license, Berkeley Software Distribution, and the MIT license, and those are more akin to just Creative Commons attribution. So there&rsquo;s really no other restrictions or conditions on use, about things that are analogous to share-alike or non-commercial or non-derivative or you can&rsquo;t offer this as a managed service, or if you do, you need to offer all of the code for your associated tooling for the service. It kind of places the most freedom on someone who has the software and is wanting to redistribute it or repurpose it in some way. So those are generally going to be the most compatible options with everything else.</p>
<p>One other license I wanted to bring up a bit just in the context of this podcast and science &ndash; there&rsquo;s a fun license called the CRAPL. And it&rsquo;s a quote unquote academic-strength open-source license. I wouldn&rsquo;t necessarily recommend it, but I want to bring it in here just so that I can compare it to some of the other licenses that I&rsquo;ve mentioned.</p>
<p>In the software world, I would say that the CRAPL is similar to a Business Source License with no Change License &ndash; a business source license, normally after a certain period, like two or three years, will turn into a much more permissive license, so it will become actually Open in the sense of the Open Source Initiative. And the CRAPL has the Additional Use Grant, in the context of a Business Source License, that it is only for validating published claims and validating pre-publication claims. And furthermore, the use grant allows one to publish those claims on the conditions that (1) the original author is notified of the use and claims prior to submission, and (2) that those modifications are released under the CRAPL when the supporting claims are publicly released, say, in a publication.</p>
<p>So, this is also kind of like the Creative Commons by attribution, non-commercial, share-alike, with the additional condition that a good faith attempt is made to notify the original author prior to public claims and distribution of your modifications.</p>
<p>So in summary, there are a bunch of widely used licenses. Popular licenses in the data world are the Creative Commons suite. Popular licenses in the code world are things like Apache 2.0, BSD, MIT, Mozilla Public License, the GPL &ndash; Lesser, Affero &ndash; and also newer ones that extend the others to noncommercial restrictions, like the Business Source License and the Server Side Public License, or, rather, conditions on commercial use.</p>
<p>If your data and metadata are not covered clearly and accessibly by one of these licenses, or if there are additional restrictions, if those aren&rsquo;t clearly and accessibly provided, then reuse of your data is going to be jeopardized.</p>
<p>That&rsquo;ll be it for today. I&rsquo;m Donny Winston and I hope you join me again next time for Machine-Centric Science.</p>
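<p>As a postscript to the transcript, here is a minimal Python sketch of what a &ldquo;clear and accessible&rdquo; license could look like to a computational agent. It encodes the Creative Commons suite from the episode as flags and refuses reuse when no known license can be found. The 4.0 version numbers, the <code>license</code> key on the record, and the function names are assumptions made for illustration; they are not part of the episode or of any standard API.</p>
<pre><code class="language-python">from dataclasses import dataclass


@dataclass(frozen=True)
class CCLicense:
    spdx_id: str          # SPDX short identifier (4.0 versions assumed here)
    attribution: bool     # BY: credit is required
    share_alike: bool     # SA: adaptations must carry the same license
    no_derivatives: bool  # ND: adapted forms may not be shared
    non_commercial: bool  # NC: commercial use is not permitted


# Same order as the episode walks through the suite.
CC_SUITE = [
    CCLicense("CC0-1.0", False, False, False, False),
    CCLicense("CC-BY-4.0", True, False, False, False),
    CCLicense("CC-BY-SA-4.0", True, True, False, False),
    CCLicense("CC-BY-ND-4.0", True, False, True, False),
    CCLicense("CC-BY-NC-4.0", True, False, False, True),
    CCLicense("CC-BY-NC-SA-4.0", True, True, False, True),
    CCLicense("CC-BY-NC-ND-4.0", True, False, True, True),
]
BY_ID = {lic.spdx_id: lic for lic in CC_SUITE}


def find_license(record):
    """Return the known license named by a metadata record, or None."""
    # "license" is a hypothetical key chosen for this sketch.
    return BY_ID.get(record.get("license", ""))


def may_reuse(record, commercial, adapted):
    """Decide whether an intended reuse is covered by the record's license."""
    lic = find_license(record)
    if lic is None:
        return False  # a license that cannot be found is treated as no license
    if commercial and lic.non_commercial:
        return False
    if adapted and lic.no_derivatives:
        return False
    return True


dataset = {"name": "example-dataset", "license": "CC-BY-NC-4.0"}
unlicensed = {"name": "mystery-dataset"}

print(may_reuse(dataset, commercial=False, adapted=True))
print(may_reuse(dataset, commercial=True, adapted=False))
print(may_reuse(unlicensed, commercial=False, adapted=False))
</code></pre>
<p>The sketch only answers whether a reuse is permitted; obligations such as giving credit or licensing adaptations under identical terms would still need to be handled by the reuser.</p>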
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/sending-signal-signs/</id><link rel="alternate" href="https://donnywinston.com/posts/sending-signal-signs/"/><title>Sending Signal-Signs</title><published>2022-05-29T15:33:35-04:00</published><updated>2022-05-29T15:33:35-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Sending signal-signs<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup><br>
to steer engines of compute,<br>
the wheel does no work.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. Minsky, <em>The Society of Mind</em>. New York: Simon and Schuster, 1986, p. 56.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/the-persistence-of-identity/</id><link rel="alternate" href="https://donnywinston.com/posts/the-persistence-of-identity/"/><title>The Persistence of Identity</title><published>2022-05-24T14:07:33-04:00</published><updated>2022-05-24T14:07:33-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>What is that strange possession that stays the same throughout its life?<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>Can we recollect how things appeared to us before we learned to link new meanings to those things?</p>
<p>What is this body of changelessness in spite of change?</p>
<p>Perhaps the purview of a thing&rsquo;s persistence is its predictable pathways of provenance:<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<ul>
<li>the typical effects of its typical activities,<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></li>
<li>its body of influenc(ed/ing) entities<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> whose meanings change only slowly, and</li>
<li>whichever of its agents<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup> change the least as its life proceeds.</li>
</ul>
<p>Data does not have intrinsic meaning:</p>
<blockquote>
<p>The semantics of our data are defined by the effects it produces when passed into our functions. These effects should be predictable whenever possible, but data cannot prevent itself from being interpreted in surprising ways.<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup></p>
</blockquote>
<p>An identifier is an association between a string of data and an object.<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup> The semantics of our identifiers are then defined by the effects produced by interpreters that believe records bearing witness to these associations.</p>
<p>A layer of indirection separates <em>what</em> something does from <em>how</em> it does it. Similarly, an identifier separates <em>what</em> something <em>is</em> from <em>how</em> it is.</p>
<p>What are some tools for predictability in indirection?</p>
<ul>
<li><em>referential transparency</em>: The semantics of purely functional code will remain the same if we replace every expression with its result, e.g. &ldquo;1 + 1&rdquo; with &ldquo;2&rdquo; (for typical senses of &ldquo;1&rdquo;, &ldquo;+&rdquo;, and &ldquo;2&rdquo;!).</li>
<li><em>invariant relations</em>: The semantics of a data structure&rsquo;s interface &ndash; its abstract representation, its exposed behavior &ndash; will remain the same if we decide to change its concrete representation &ndash; its internal model &ndash; as long as we enforce appropriate invariant relations on that concrete representation.<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup></li>
</ul>
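<p>Both tools can be sketched in a few lines of Python. The first half swaps an expression for its result; the second half gives one small set interface two concrete representations, each enforcing its own invariant. The class and function names are hypothetical, chosen only for this illustration.</p>
<pre><code class="language-python">def add(a, b):
    # Pure function: the result depends only on the arguments.
    return a + b


# Referential transparency: replacing add(1, 1) with 2 changes nothing.
with_expression = [add(1, 1) * n for n in range(3)]
with_result = [2 * n for n in range(3)]
print(with_expression == with_result)


class ListSet:
    """Set of integers stored as a list. Invariant: no duplicates."""

    def __init__(self):
        self._items = []

    def insert(self, x):
        if x not in self._items:  # enforce the invariant on every write
            self._items.append(x)

    def has(self, x):
        return x in self._items

    def size(self):
        return len(self._items)


class SortedTupleSet:
    """Same interface, stored as a tuple. Invariant: sorted, no duplicates."""

    def __init__(self):
        self._items = ()

    def insert(self, x):
        # Rebuild the tuple so the invariant holds after every write.
        self._items = tuple(sorted(set(self._items) | {x}))

    def has(self, x):
        return x in self._items

    def size(self):
        return len(self._items)


def observe(s):
    """Exercise only the interface and report what it exposes."""
    for x in [3, 1, 3, 2, 1]:
        s.insert(x)
    return (s.size(), s.has(2), s.has(5))


print(observe(ListSet()) == observe(SortedTupleSet()))
</code></pre>
<p>Code that only calls <code>insert</code>, <code>has</code>, and <code>size</code> cannot tell the two representations apart, which is the sense in which the interface&rsquo;s semantics stay the same.</p>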
<p>Effects are the currency of meaning, yet their causes and conditions are ever fleeting:</p>
<blockquote>
<p>&hellip;everything in the world is the result of a vast concurrence of causes and conditions, and everything disappears as these causes and conditions change and pass away.</p>
<p>&ndash; Buddha<sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup></p>
</blockquote>
<p>&ldquo;All models are wrong, but some are useful.&rdquo; An identity is ultimately a model, an abstract description that hides certain details while illuminating others, that can yield useful predictions when it provides adequate explanations relating primitive phenomena to one another and to more complex phenomena. Go forth and identify.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. Minsky, <em>The Society of Mind</em>. New York: Simon and Schuster, 1986, p. 54.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p><a href="https://www.w3.org/TR/prov-o/">https://www.w3.org/TR/prov-o/</a>&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p><a href="https://www.w3.org/TR/prov-o/#Activity">https://www.w3.org/TR/prov-o/#Activity</a>&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p><a href="https://www.w3.org/TR/prov-o/#Entity">https://www.w3.org/TR/prov-o/#Entity</a>&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p><a href="https://www.w3.org/TR/prov-o/#Agent">https://www.w3.org/TR/prov-o/#Agent</a>&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>Z. Tellman, Elements of Clojure. Monee, IL: Lulu.com, 2019.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>J. A. Kunze, “The ARK Identifier Scheme (v.34).” Internet Engineering Task Force (IETF), Jan. 2022. [Online]. Available: <a href="https://datatracker.ietf.org/doc/draft-kunze-ark/">https://datatracker.ietf.org/doc/draft-kunze-ark/</a>&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>C. A. R. Hoare, “Proof of correctness of data representations,” Acta Informatica, vol. 1, no. 4, pp. 271–281, 1972, doi: 10.1007/BF00289507.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p>B. D. Kyōkai, The teachings of Buddha, 1. ed. New Delhi: Sterling Publishers, 2004.&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/why-is-there-not-just-one-metadata-format-for-all/</id><link rel="alternate" href="https://donnywinston.com/posts/why-is-there-not-just-one-metadata-format-for-all/"/><title>Why Is There Not Just One Metadata Format for All Kinds of Research Data?</title><published>2022-05-13T09:21:10-07:00</published><updated>2022-05-13T09:21:10-07:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<blockquote>
<p>Why is there not just one metadata format for all kinds of research / data?</p>
<p>&ndash; asked on <a href="https://fairdataforum.org/t/fair-aware-tool-faq/479/3">fairdataforum.org</a></p>
</blockquote>
<p>Metadata modeling and formatting are separate concerns. It is reasonable that different scientific domains and studies within domains may have widely varying modeling concerns. Controlled vocabulary terms, validity constraints, and other metadata elements will surely vary and evolve over time.</p>
<p>What’s not as obvious is why different scientific domains and studies within domains would have different formatting concerns. Different software applications and tools may have their preferred metadata formats for operational convenience. Thus, as some software gains prominence in a specific domain, its preferred format may be adopted by other tools in the ecosystem for ease of exchange and integration.</p>
<p>For there to be a single metadata format that is universally adopted for metadata exchange — that is, a format that a given software tool may convert to a preferred internal format for convenience of use by the tool — that format would need to be able to communicate the model being used as well. Thus, the format would need to host a language for defining models.</p>
<p>There have been some efforts at this. One effort that has gained some recognition in the FAIR data community is that of the Semantic Web set of standards. Specifically, the Resource Description Framework (RDF) base model, exchanged using a handful of standardized plain-text formats such as JSON-LD, and using RDF-expressed modeling languages such as RDFS (RDF Schema), OWL (Web Ontology Language), and SHACL (Shapes Constraint Language), is one effort towards a universal “meta-model” for defining and exchanging metadata models along with the metadata itself, in plain-text formats that both humans and machines can interpret unambiguously, if only to convert metadata to preferred internal modeling languages and formats.</p>
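<p>To make the idea of a format that carries its model along with it a little more concrete, here is a minimal Python sketch using only the standard library. The record is shaped like JSON-LD, with a <code>@context</code> that maps short keys to schema.org IRIs. The <code>toy_expand</code> function is a hypothetical, drastically simplified stand-in for the JSON-LD expansion algorithm, and the dataset IRI is a made-up example.</p>
<pre><code class="language-python">import json

# A metadata record that carries part of its model with it: the context
# says which globally defined property each short key stands for.
record = {
    "@context": {
        "name": "http://schema.org/name",
        "license": "http://schema.org/license",
    },
    "@id": "https://example.org/datasets/1",  # made-up identifier
    "name": "Example dataset",
    "license": "https://creativecommons.org/licenses/by/4.0/",
}


def toy_expand(doc):
    """Replace short keys with the IRIs that the context assigns to them.

    This is NOT the JSON-LD expansion algorithm, only a sketch of the idea.
    """
    context = doc["@context"]
    return {
        context.get(key, key): value
        for key, value in doc.items()
        if key != "@context"
    }


expanded = toy_expand(record)
print(json.dumps(expanded, indent=2))
</code></pre>
<p>A tool that prefers its own internal format would read the expanded keys, which no longer depend on the sender&rsquo;s choice of short names, and convert from there.</p>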
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/a-phase-space-of-reproducibility/</id><link rel="alternate" href="https://donnywinston.com/posts/a-phase-space-of-reproducibility/"/><title>A Phase Space of Reproducibility</title><published>2022-05-09T12:29:56-04:00</published><updated>2022-05-09T12:29:56-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p><img src="https://files.polyneme.xyz/dropshare/reproducibility-phase-space-wSnbNlvjVR.png" alt="Phase Space of Reproducibility"></p>
<p>What does &ldquo;reproducible science&rdquo; encompass?</p>
<h2 id="a-tug-of-war">A tug-of-war</h2>
<p>Here is one decomposition into &ldquo;repeatability&rdquo;, &ldquo;reproducibility&rdquo;, and &ldquo;replicability&rdquo;:<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<blockquote>
<p>&ldquo;Repeatability (Same team, same experimental setup)&hellip;Reproducibility (Different team, same experimental setup)&hellip;Replicability (Different team, different experimental setup)&hellip;&rdquo;</p>
</blockquote>
<p>Here is a conflicting account of the relationship between &ldquo;replicability&rdquo; and &ldquo;reproducibility&rdquo;:<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<blockquote>
<p>&ldquo;&hellip;even if we can replicate the results of a paper, slightly altering the experimental setup could have dramatically different results. For these reasons, we don’t want to consider the authors code, as this could be a source of bias. We want to focus on the question of reproducibility, without wading into the murky waters of replication.&rdquo;</p>
</blockquote>
<p>It seems that ACM&rsquo;s &ldquo;replicability&rdquo; is Edward Raff&rsquo;s &ldquo;reproducibility&rdquo;, and vice versa. And the colloquial phrase &ldquo;repeat after me&rdquo; is at odds with ACM&rsquo;s &ldquo;repeatability&rdquo;.</p>
<h2 id="a-trip-to-the-dictionary">A trip to the dictionary</h2>
<p>Merriam-Webster<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> defines reproduction as</p>
<blockquote>
<p>the process by which plants and animals give rise to offspring and which fundamentally consists of the segregation of a portion of the parental body by a sexual or an asexual process and its subsequent growth and differentiation into a new individual.</p>
</blockquote>
<p>and reports the first known use circa 1640, in the above sense.</p>
<p>So, it seems to me that reproduction subsumes replication — replication is a sub-type of reproduction where, subjective to an observer, the new individual is a replica — an indistinguishable image — of the parental body.</p>
<p>Repetition seems like the kind of reproduction where no material portion of a parental body is involved. “Repeat after me” is about a second agent observing the first agent and reproducing the output of the first agent without material reuse of a portion of the first agent’s material output.</p>
<p>Repetition is about following the same method but without using a seed from the original performance.</p>
<p>Reproduction of result, i.e. growth of a new individual, can occur / be attempted with or without repetition of method. With or without repeated method, the growth of a new individual from the seed may or may not be successful.</p>
<h2 id="repetition-of-methods-replication-of-results">Repetition of methods, replication of results</h2>
<p>Repetition seems method-focused (same activities) whereas replication seems result-focused (same outcomes).</p>
<p>A reproduction can be perceived as more or less a repetition of the original production activities, and can orthogonally be perceived as more or less a replication of the original production outcomes.</p>
<h2 id="reproduction--representation--repetition--replication">Reproduction = Representation + Repetition + Replication</h2>
<p>In terms of the W3C Provenance ontology&rsquo;s<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> core types of Agent, Activity, and Entity (diagram at top reproduced here):</p>
<p><img src="https://files.polyneme.xyz/dropshare/reproducibility-phase-space-wSnbNlvjVR.png" alt="Phase Space of Reproducibility"></p>
<p>Reproducibility is the space. One axis is repeatability, i.e. “activity-dependent reproducibility”. Another axis is replicability, i.e. “entity-dependent reproducibility”. The final axis is “agent-dependent reproducibility”. For scientific reproducibility, we really (really) would like to ignore the Agent axis — sure, agent/representative identity is correlated with entity resourcing and activity resourcing capacity and skill in the real world, but we’d rather not consider it an independent axis.</p>
<p>Thus, we consider a scientific process reproducible in part by its repeatability (reproduction of activities) and its replicability (reproduction of entities - artifacts, results, outcomes). This seems subjective, but “up and to the right” in the figure above is what we seem to seek.</p>
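<p>One way to picture a point in this space is to score a reproduction attempt on each of the two axes. The sketch below is only an illustration under strong assumptions: it treats activities and entities as sets of labels and uses Jaccard overlap as the score, which is a choice made here and not something the figure prescribes. All names and example labels are hypothetical.</p>
<pre><code class="language-python">def overlap(original, attempt):
    """Jaccard overlap of two label sets: 1.0 is identical, 0.0 is disjoint."""
    original, attempt = set(original), set(attempt)
    union = original | attempt
    if not union:
        return 1.0  # two empty sets are trivially identical
    return len(original.intersection(attempt)) / len(union)


def phase_point(original, attempt):
    """Return (repeatability, replicability) for a reproduction attempt."""
    repeatability = overlap(original["activities"], attempt["activities"])
    replicability = overlap(original["entities"], attempt["entities"])
    return (repeatability, replicability)


original = {
    "activities": ["clean", "fit", "plot"],
    "entities": ["table-1", "figure-2"],
}
attempt = {
    "activities": ["clean", "fit"],       # one activity was not repeated
    "entities": ["table-1", "figure-2"],  # all outcomes were replicated
}

point = phase_point(original, attempt)
print(point)
</code></pre>
<p>In this made-up case the attempt sits fully to one side on replicability but short of the corner on repeatability: the outcomes match even though not every activity was repeated.</p>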
<h2 id="how-repeatable-is-reproduction-how-replicable-is-it">How repeatable is reproduction? How replicable is it?</h2>
<p>Engineering emphasizes modeling for prediction, whereas science emphasizes modeling for explanation. Thus, while repeatability (same activities) may not be valued for outcome-focused reproducibility in engineering, repeatability is valued for activity-focused reproducibility in science, that is, for explainability in terms of causes and conditions.</p>
<h2 id="on-independent-reproduction">On independent reproduction</h2>
<p>While we can consciously seek repeatability and replicability in reproduction attempts, we also typically value so-called “independent” reproduction, where it seems the investigating agent(s) were either not aware of original activities or of original starter/intermediate/output entities, or both, and yet reproduced them anyhow.</p>
<p>The significance of independent reproduction is not so much for validity of reproduction as it is for assigning credit to discovery, but still, independent reproduction is often perceived as more valid due to a perceived reduction in risk of dishonest reporting.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>“Artifact Review and Badging - Current.” Association for Computing Machinery. <a href="https://www.acm.org/publications/policies/artifact-review-and-badging-current">https://www.acm.org/publications/policies/artifact-review-and-badging-current</a> (accessed May 09, 2022).&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>E. Raff, &ldquo;Quantifying independently reproducible machine learning,&rdquo; The Gradient, Feb. 2020, [Online]. Available: <a href="https://thegradient.pub/independently-reproducible-machine-learning/">https://thegradient.pub/independently-reproducible-machine-learning/</a>&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>“Definition of REPRODUCTION.” <a href="https://www.merriam-webster.com/dictionary/reproduction">https://www.merriam-webster.com/dictionary/reproduction</a> (accessed May 09, 2022).&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>“PROV-DM: The PROV Data Model.” W3C Recommendation, April 30, 2013. <a href="https://www.w3.org/TR/prov-dm/">https://www.w3.org/TR/prov-dm/</a> (accessed May 09, 2022).&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/let-traits-accrete/</id><link rel="alternate" href="https://donnywinston.com/posts/let-traits-accrete/"/><title>Let Traits Accrete</title><published>2022-05-06T07:20:02-04:00</published><updated>2022-05-06T07:20:02-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>How can it be that complex, dynamic objects can be described by short and simple strings and words? We often seek:<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<ul>
<li>
<p><em>Selectivity</em> &ndash; Our images are often falsely clear. We may think of an object&rsquo;s &ldquo;personality&rdquo; in terms of that which we can easily describe. We may set aside the rest for now as though it simply weren&rsquo;t there.</p>
</li>
<li>
<p><em>Style</em> &ndash; To avoid making decisions we consider unimportant for now, we may develop policies that become systematic traits.</p>
</li>
<li>
<p><em>Predictability</em> &ndash; It&rsquo;s hard to maintain fruitful exchange without trust, so we may try to conform to expectations. To the extent we frame our images of producer/consumer systems in terms of traits, we teach our data to behave in accordance with those same traits.</p>
</li>
<li>
<p><em>Self-Reliance</em> &ndash; Imagined traits can, over time, make themselves actual because we must be able to predict outcomes of the use of our own data. This prediction becomes easier the more we simplify our models.</p>
</li>
</ul>
<p>We need to be able to trust our own data, logic, and presentation resources. One way to accomplish this is to think of these resources in terms of traits, and then proceed to train those dynamic resources to behave according to those immediate images.</p>
<p>Still, just as a personality is merely the surface of a person, a schema is merely the surface of a dynamic digital object. What we call traits, properties, etc. are only the regularities we manage to perceive and deem worthy of systematizing at present.</p>
<p>We may not be able to &ldquo;pin down&rdquo; the traits of our digital resources because there are many processes and policies that don&rsquo;t yet show themselves directly in elicited behavior but that work behind the scenes and that may only become important to name and systematize later.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. Minsky, <em>The Society of Mind</em>. New York: Simon and Schuster, 1986, p. 53.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/keying-into-fashion-and-style-for-knowledge/</id><link rel="alternate" href="https://donnywinston.com/posts/keying-into-fashion-and-style-for-knowledge/"/><title>Keying Into Fashion and Style for Knowledge Arrangement</title><published>2022-05-05T09:29:55-04:00</published><updated>2022-05-05T09:29:55-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>We often have sound practical reasons for making choices that have no reasons by themselves but have effects on larger scales.</p>
<p>Familiar styles make it easier for us to recognize and classify the things we see. For example, we may choose furniture according to systematic styles or fashions.</p>
<p>We protect ourselves from distractions by adopting uniform styles. For example, if every object in a room were interesting in itself, our furniture might occupy our minds too much.</p>
<p>Societies need rules that make no sense for individuals. For example, it makes no difference whether a single car drives on the left or on the right, but it makes all the difference when there are many cars!</p>
<p>It can save a lot of mental work if one makes each arbitrary choice the way one did before. The more difficult the decision, the more this policy can save. And there&rsquo;s a paradox:<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<blockquote>
<p>The more equally attractive two alternatives seem, the harder it can be to choose between them &ndash; no matter that, to the same degree, the choice can only matter less.</p>
</blockquote>
<p>Thus, it is helpful to take recourse in rules of style when we&rsquo;re fairly sure that further thought will just waste time. We should not abandon reasoning recklessly, but often, ordinary reasons cancel out, so it makes sense to use forms that lie beneath the surface of our thoughts &ndash; style, fashion, art &ndash; our &ldquo;taste&rdquo;.</p>
<p>When it comes to exchanging knowledge representations, you&rsquo;re inviting your recipients into rooms you&rsquo;ve curated &ndash; consider whether your arrangement of &ldquo;furniture&rdquo; there is adding value or distracting.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. Minsky, <em>The Society of Mind</em>. New York: Simon and Schuster, 1986, p. 52.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/identity-and-concurrency/</id><link rel="alternate" href="https://donnywinston.com/posts/identity-and-concurrency/"/><title>Identity and Concurrency</title><published>2022-05-04T15:13:15-04:00</published><updated>2022-05-04T15:13:15-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Regarding a resource &ndash; dataset, model, tool, standard, agent, etc. &ndash; as a single thing can be helpful: in allocating physical space, in dealing with privacy and responsibility, in de-confusing mental activity.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>Are human mental processes actually clean &ldquo;streams of consciousness&rdquo;, or is narrative a tool used to “straighten things out”, to simplify the representation of what happened? Is there really a single pipeline of ideas that flow through a mind?</p>
<p>And is a straightened-out story faithful to &ldquo;raw&rdquo; observation, or was a schema employed, a design pattern, an archetype, such as The Hero’s Journey?</p>
<p>In computing, modeling a workflow as a single process, a sequence of actions, is easiest for us mentally, as opposed to modeling it as multiple concurrent processes.</p>
<p>And yet it is often operationally helpful to support concurrency via resource properties like immutability, idempotence, etc., even if we later explain “what happened” &ndash; if we later communicate activity provenance to fellow human beings &ndash; as a single process, a story, a linear narrative.</p>
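<p>As a minimal sketch of how immutability and idempotence support concurrency, assume a hypothetical in-memory store keyed by content hash (the names <code>store</code>, <code>put</code>, and <code>chunks</code> are illustrative only). Because values are immutable bytes and a repeated write changes nothing, workers can run in any order and the result is the same; the returned list of keys can still be read afterward as a linear story.</p>

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical store: content hash -> immutable bytes.
store = {}

def put(content: bytes) -> str:
    """Idempotent write: storing the same content again leaves the store unchanged."""
    key = hashlib.sha256(content).hexdigest()
    store.setdefault(key, content)
    return key

# Made-up workload that contains a duplicate.
chunks = [b"sample A", b"sample B", b"sample A"]

# Workers may interleave arbitrarily; the final store does not depend on the order.
with ThreadPoolExecutor(max_workers=3) as pool:
    keys = list(pool.map(put, chunks))  # results come back in input order

print(len(store))          # number of distinct values stored
print(keys[0] == keys[2])  # duplicates resolve to the same key
```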
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. Minsky, <em>The Society of Mind</em>. New York: Simon and Schuster, 1986, p. 51.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/we-need-to-dockerize-and-distribute-robert/</id><link rel="alternate" href="https://donnywinston.com/posts/we-need-to-dockerize-and-distribute-robert/"/><title>"We Need to Dockerize and Distribute Robert"</title><published>2022-04-29T10:51:48-04:00</published><updated>2022-04-29T10:51:48-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<blockquote>
<p>It does not help for you to think that inside yourself lies someone else who does your work. This notion of &ldquo;homunculus&rdquo; &ndash; a little person inside each self &ndash; leads only to a paradox since, then, <em>that inner Self requires yet another movie screen inside itself, on which to project what <strong>it</strong> has seen!</em> <sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
</blockquote>
<center>
<figure>
<img alt="the remote-control self" src="https://files.polyneme.xyz/dropshare/society-of-mind-p50-remote-control-self.sm-fivgWRt30g.jpg" />
<figcaption>"The remote-control self" [1]. </figcaption>
</figure>
</center>
<blockquote>
<p>A thing with no parts provides nothing that we can use as pieces of explanation&hellip;Why are we tempted to embrace the strange idea that what we do is done by Someone Else &ndash; that is, our Self? Because so much of what our minds do is hidden from the parts of us that are involved with verbal consciousness.  <sup id="fnref1:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
</blockquote>
<center>
<figure>
<img alt="then a miracle occurs" src="https://files.polyneme.xyz/dropshare/Sidney_Harris_math07-Kxs4JFaYqr.gif" />
<figcaption>A <a href="http://www.sciencecartoonsplus.com/gallery/math/index.php#">Sidney Harris</a> classic. </figcaption>
</figure>
</center>
<p>Design is taking things apart in order to be able to put them back together. <sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> You must <em>design</em> the digital resources you archive and disseminate, so that you don&rsquo;t &ldquo;need to dockerize and distribute Robert&rdquo; (overheard in a Slack room), which, of course, you can&rsquo;t.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. Minsky, <em>The Society of Mind</em>. New York: Simon and Schuster, 1986, p. 50.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref1:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>R. Hickey, “Design, Composition and Performance,” presented at QCon, San Francisco, Nov. 2013. <a href="https://github.com/matthiasn/talk-transcripts/blob/master/Hickey_Rich/DesignCompositionPerformance.md">Transcript</a>.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/ensure-that-provenance-bottoms-out/</id><link rel="alternate" href="https://donnywinston.com/posts/ensure-that-provenance-bottoms-out/"/><title>Ensure That Provenance Bottoms Out</title><published>2022-04-28T11:33:01-04:00</published><updated>2022-04-28T11:33:01-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Some questions may be pursued circularly, where for example you cannot find a final cause &ndash; you must ask, <em>What caused that cause?</em> Or you cannot find an ultimate goal &ndash; <em>Then what purpose does <strong>that</strong> serve?</em> Such loops can waste our time.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>It is a form of self-control to establish ways to bottom out, to employ base cases to stop recursion. When a child repeatedly asks <em>Why?</em>, adults may employ <em>Just because!</em></p>
<p>Cultures establish ways to bottom out, such as branding a question with shame or taboo, cloaking it in awe or mystery, or settling it by consensus. Cultures evolve institutions that adopt specific answers to circular questions and establish authority-schemes to enforce these beliefs.</p>
<p>One could complain that such establishments substitute dogma for reason and truth. But in exchange, they spare whole populations from wasting time in fruitless reason loops. Rather, minds can more productively work on problems that can be solved.</p>
<p>When annotating a digital research object with lifecycle provenance metadata, including conceptual provenance relating to hypotheses and study design, it is reasonable to &ldquo;bottom out&rdquo; to a current consensus view, a milestone along the path of Kuhnian<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> paradigm shifts stretching to the past and to the unknown future.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. Minsky, <em>The Society of Mind</em>. New York: Simon and Schuster, 1986, p. 49.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>T. S. Kuhn, <em>The structure of scientific revolutions</em>. 1962.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/straightening-out-circular-causality/</id><link rel="alternate" href="https://donnywinston.com/posts/straightening-out-circular-causality/"/><title>"Straightening Out" Circular Causality</title><published>2022-04-27T11:31:34-04:00</published><updated>2022-04-27T11:31:34-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>We often seek to &ldquo;straighten out&rdquo; a maze-like, loop-containing situation. We try to find a &ldquo;path&rdquo; through &ldquo;causal&rdquo; explanations that go in only one direction. Why?</p>
<blockquote>
<p>There are countless different types of networks that contain loops. But all networks that contain no loops are basically the same: each has the form of a simple chain.</p>
</blockquote>
<p>Any directed acyclic graph (DAG) can be linearized, i.e. topologically sorted. And we can apply the same types of reasoning to <em>everything</em> we can represent in terms of chains of causes and effects. We can proceed from start to end without any need for a novel thought.</p>
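<p>A minimal sketch of that linearization, assuming Python 3.9+ and a made-up dependency graph (the node names are hypothetical): the standard library&rsquo;s <code>graphlib.TopologicalSorter</code> orders a DAG into a chain and refuses a graph that contains a loop.</p>

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical graph: each key maps to the set of nodes it depends on.
dag = {
    "report": {"analysis"},
    "analysis": {"cleaned_data"},
    "cleaned_data": {"raw_data"},
    "raw_data": set(),
}

def linearize(graph):
    """Return one cause-before-effect ordering, or None if the graph has a loop."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None

chain = linearize(dag)
print(chain)

# A two-node loop cannot be straightened out.
loopy = {"a": {"b"}, "b": {"a"}}
print(linearize(loopy))
```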
<p>But frequently, to construct such a path, we have to ignore important interactions and dependencies that run in other directions.</p>
<p>In loopy situations, one may find success in shifting from &ldquo;causal&rdquo; learning to &ldquo;clausal&rdquo; learning. If data values are annotated with dependencies, e.g. labeled with external provenance, with justifications for data-processing decisions, etc., then dependency-directed backtracking may help to path-find by avoiding sets of premises that support previously discovered contradictions.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
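<p>The following is a deliberately simplified sketch, not the propagator machinery of the cited book: it shows only the bookkeeping of recording contradictory premise sets (&ldquo;nogoods&rdquo;) and skipping any later candidate that contains one. The function <code>search</code>, the choice names, and the contradiction rules in <code>check</code> are all assumptions made up for illustration.</p>

```python
from itertools import product

def search(choices, check):
    """Try combinations of choices, skipping any that contain a known nogood.

    choices: dict mapping a name to a list of options.
    check(assignment): returns None if consistent, else the set of
    (name, option) premises that jointly cause the contradiction.
    Returns (assignment or None, number of combinations actually checked).
    """
    nogoods = []
    tested = 0
    names = list(choices)
    for combo in product(*(choices[n] for n in names)):
        premises = set(zip(names, combo))
        if any(ng <= premises for ng in nogoods):
            continue  # contains premises already known to contradict
        tested += 1
        culprit = check(dict(premises))
        if culprit is None:
            return dict(premises), tested
        nogoods.append(frozenset(culprit))
    return None, tested

# Hypothetical data-processing decisions and contradiction rules.
choices = {"unit": ["m", "ft"], "calibration": ["v1", "v2"], "filter": ["low", "high"]}

def check(a):
    if a["unit"] == "m" and a["calibration"] == "v1":
        return {("unit", "m"), ("calibration", "v1")}
    if a["calibration"] == "v2" and a["filter"] == "low":
        return {("calibration", "v2"), ("filter", "low")}
    return None

result, tested = search(choices, check)
print(result, tested)
```

<p>Because each contradiction is blamed on only the premises that support it, the second candidate here is skipped without being checked at all.</p>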
<p>In this way, annotating data with provenance metadata can formally help to &ldquo;straighten things out&rdquo;.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>C. Hanson and G. J. Sussman, <em>Software Design for Flexibility: How to Avoid Programming Yourself into a Corner</em>. Cambridge: The MIT Press, 2021.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/data-ventriloquy/</id><link rel="alternate" href="https://donnywinston.com/posts/data-ventriloquy/"/><title>Data Ventriloquy</title><published>2022-04-25T11:41:48-04:00</published><updated>2022-04-25T11:41:48-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<blockquote>
<p><em>Punch and Judy, to their audience:</em><br/>
<br/>
Our puppet strings are hard to see,<br/>
So we perceive ourselves as free,<br/>
Convinced that no mere objects could<br/>
Behave in terms of bad or good.<br/>
<br/>
To you, we mannikins seem less<br/>
than live, because our consciousness<br/>
is that of dummies, made to sit<br/>
on laps of gods and mouth their wit;<br/>
<br/>
Are you, our transcendental gods,<br/>
likewise dangled from your rods,<br/>
and need, to show spontaneous charm,<br/>
some higher god&rsquo;s inserted arm?<br/>
<br/>
We seem to form a nested set,<br/>
with each the next one&rsquo;s marionette,<br/>
who, if you asked him, would insist<br/>
that he&rsquo;s the last ventriloquist.<br/>
<br/>
&ndash; Theodore Melnechuk</p>
</blockquote>
<p>Who&rsquo;s the last ventriloquist when it comes to a dataset? You pull data and accrete it with other data, reshape it, and reduce it, and ultimately make it dance and speak via structured representation, software action, narrative exposition, etc.</p>
<p>Can someone further contribute to the life journey of that processed data, repurposing your representation, inserting their arm to make an adjustment with a tool of their choice?</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/crossing-the-inter-lab-chasm/</id><link rel="alternate" href="https://donnywinston.com/posts/crossing-the-inter-lab-chasm/"/><title>Crossing the Inter-Lab Chasm</title><published>2022-04-24T11:39:00-04:00</published><updated>2022-04-24T11:39:00-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<blockquote>
<p>Without enduring self-ideals, our [research] would lack coherence. As individuals, we&rsquo;d never be able to trust ourselves to carry out our [protocols]. In a social group, no one person would be able to trust the others. A working society must evolve mechanisms that stabilize ideals &ndash; and many of the social principles that each of us regards as personal are really &ldquo;long-term memories&rdquo; in which our cultures store what they have learned across the centuries.</p>
</blockquote>
<p>Your electronic lab notebook (ELN) and/or laboratory<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> information management system (LIMS)<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> embodies ideals in various implicit or explicit forms: data formats and validators, software transformation functions and tests, planning/recording/reporting document templates, etc.</p>
<p>Are these ideals siloed in your ELN/LIMS, or are they shared? If they are shared, how, and is the sharing robust?</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>whether &ldquo;controlled physical / wet&rdquo;, &ldquo;field&rdquo;, or &ldquo;computer&rdquo; laboratory&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>and you do have one, whether personal (and perhaps embarrassing), project-wide, lab-group-wide, institute-wide, etc.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/place-oriented-publishing-versus-value-oriented/</id><link rel="alternate" href="https://donnywinston.com/posts/place-oriented-publishing-versus-value-oriented/"/><title>Place Oriented Publishing Versus Value Oriented</title><published>2022-04-23T11:37:22-04:00</published><updated>2022-04-23T11:37:22-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
&lt;p>In place-oriented publishing, as in place-oriented programming, you allocate places to push things, and you pull from places. Places steward values. “Where” something is, is important. Did you publish to a reputable place?&lt;/p>
&lt;p>In value-oriented publishing, as in value-oriented programming, you pass around values, or references to values, not to places. You deal directly in values. Dereferencing services steward values. “What” something is, is important, not “where” something is. Did you publish something valued as reputable by other resources - peer reviews, other publications, etc. - that reference it and in turn are valued as reputable (cf. PageRank)?&lt;/p></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/slot-long-range-plans-into-ecosystems/</id><link rel="alternate" href="https://donnywinston.com/posts/slot-long-range-plans-into-ecosystems/"/><title>Slot Long Range Plans Into Ecosystems</title><published>2022-04-22T11:36:21-04:00</published><updated>2022-04-22T11:36:21-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>A <a href="https://donnywinston.com/elements_of_clojure/#principled">principled</a> system has predictable relationships between its modules, whereas an <a href="https://donnywinston.com/elements_of_clojure/#adaptable">adaptable</a> system has sparse and flexible relationships between its modules.</p>
<p>In a <a href="https://donnywinston.com/elements_of_clojure/#selfconscious">selfconscious</a> culture, design and construction is a specialized task, taught in schools using abstract principles, whereas in an <a href="https://donnywinston.com/elements_of_clojure/#unselfconscious">unselfconscious</a> culture, design and construction is taught using direct demonstration and reflects the constraints and variation of an environment.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>The structures of a selfconscious culture are principled; they are not meant to change. If the environment changes, the structure is hardened against the change rather than adapting to it. Some principled structures, like skyscrapers or stadiums, can hold thousands of people.</p>
<p>The structures of an unselfconscious culture are adaptable; they reflect the present needs of the inhabitants. Examples include igloos &ndash; there is no &ldquo;architect&rdquo; &ndash; each person builds their own home. If an igloo grows too warm, someone can poke a hole in the wall. When it grows too cold, the hole can be filled in. Such structures tend to be only large enough to hold a single family.</p>
<p>How can we be principled, having long-range plans, and also be adaptable? How can we balance what we want to be versus what we want now?</p>
<p>An interface pulled in many directions is intrinsically stable, but an interface pulled in a single direction tends to shift &ndash; the interface itself will become vestigial. For example, the mitochondrion is now just another interdependent part of a principled whole &ndash; it is no longer an independent organism.</p>
<p>It is the ecosystem, not the organism, that adapts to change. An organism may disappear, its niche filled by something else. Roles are fungible because organisms consume and emit the same resources; they share a common interface.</p>
<p>Consider Overton windows: lawmakers often position proposed legislation (principled components) with respect to an observed ecosystem of discourse (a currently stable interface).</p>
<p>We cannot have a system that is wholly principled. We can have a collection of principled components, built to be discarded, slotted into / separated by interfaces that can last only given a rich ecosystem of alternatives.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>C. Alexander, <em>Notes on the synthesis of form</em>, 1964.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/schemes-for-indirect-control/</id><link rel="alternate" href="https://donnywinston.com/posts/schemes-for-indirect-control/"/><title>Schemes for Indirect Control</title><published>2022-04-20T10:55:57-04:00</published><updated>2022-04-20T10:55:57-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>There are two fundamental approaches to indirect control in code.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>In an <em>open</em> approach, we can change the behavior of dereferencing code by conveying different values. The decision-making mechanism must be unordered. Typically, this is implemented in code using a data structure with a distinct set of keys, i.e. a <em>lookup table</em>.</p>
<p>In a <em>closed</em> approach, we can only change the decision-making process by changing the underlying code. A conditional, e.g. an expression that uses an <code>if/elif/else</code> or <code>match/case</code> form in Python, is closed. It is ordered, and if predicates aren&rsquo;t disjoint, order matters.</p>
<p>While an open table <em>conveys</em> values, a conditional <em>decides</em> based on values. For a table to be useful, it must avoid conflicts. Conditionals avoid conflicts by making explicit, fixed decisions. An open approach must avoid conflicts in a dynamic way.</p>
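<p>To make the contrast concrete, here is a small sketch under assumed names (<code>classify_closed</code>, <code>serialize_open</code>, and the table entries are hypothetical). The conditional has overlapping predicates, so its order matters and changing it means editing the code; the lookup table has distinct keys, so order is irrelevant and behavior changes by conveying a different table.</p>

```python
import json

def classify_closed(n):
    """Closed: predicates overlap (n > 100 implies n > 10), so order matters."""
    if n > 100:
        return "large"
    elif n > 10:
        return "medium"
    else:
        return "small"

def serialize_open(record, fmt, table):
    """Open: the caller conveys the table; this code never has to change."""
    return table[fmt](record)

# Distinct keys mean lookups cannot conflict, whatever the insertion order.
table = {
    "json": lambda r: json.dumps(r, sort_keys=True),
    "kv": lambda r: ";".join(f"{k}={v}" for k, v in sorted(r.items())),
}

record = {"b": 2, "a": 1}
print(classify_closed(500))
print(serialize_open(record, "json", table))
print(serialize_open(record, "kv", table))

# New behavior arrives as a new value, not as a code edit.
extended = {**table, "keys": lambda r: ",".join(sorted(r))}
print(serialize_open(record, "keys", extended))
```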
<p>Consider all the kinds of tricks we use to try to force ourselves to work when we&rsquo;re tired or distracted:</p>
<ul>
<li><em>Willpower</em>: Tell yourself, &ldquo;Don&rsquo;t give in to that,&rdquo; or, &ldquo;Keep on trying.&rdquo;</li>
<li><em>Activity</em>: Move around. Exercise. Inhale. Shout.</li>
<li><em>Expression</em>: Set jaw. Stiffen upper lip. Furrow brow.</li>
<li><em>Chemistry</em>: Take coffee, amphetamines, or other brain-affecting drugs.</li>
<li><em>Emotion</em>: &ldquo;If I win, there&rsquo;s much to gain, but more to lose if I fail!&rdquo;</li>
<li><em>Attachment</em>: Imagine admiration if you succeed &ndash; or disapproval if you fail &ndash; especially from those to whom you are attached.</li>
</ul>
<p>So many schemes for self-control! How do we choose which ones to use? There isn&rsquo;t any easy way. Self-discipline takes years to learn; it grows inside us stage by stage.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Z. Tellman, <em>Elements of Clojure</em>. Monee, IL: Lulu.com, 2019.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/directness-is-dangerous/</id><link rel="alternate" href="https://donnywinston.com/posts/directness-is-dangerous/"/><title>Directness Is Dangerous</title><published>2022-04-19T13:17:02-04:00</published><updated>2022-04-19T13:17:02-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>If self-control were easy, we might end up accomplishing nothing at all. Extinction would be swift indeed for species that could simply switch off hunger or pain. Instead, there must be checks and balances.</p>
<p>Imagine if any one process could seize and hold control over all the rest &ndash; we wouldn&rsquo;t complete many tasks. So, for processes to exploit each other&rsquo;s skills, roundabout pathways have to be discovered.</p>
<p>Fantasies can provide missing paths. You may not be able to make yourself angry simply by deciding to be angry, but you can still imagine objects or situations that <em>make</em> you angry, that, in effect, <em>arouse</em> your <em>Anger</em> and, for example, its tendency to counter <em>Sleep</em> as a self-control &ldquo;trick&rdquo; to continue <em>Work</em>.</p>
<p>For conscious schemes for self-control to work, e.g. in which we offer rewards to ourselves, incentives need to be discovered &ndash; our processes&rsquo; dispositions must be learned. This may involve bargaining and deception. But self-incentive tricks often don&rsquo;t work because, again, directness is dangerous.</p>
<p>Indirection is a hallowed design technique in computer programming. Code and processes can be made more robust through separation between <em>what</em> and <em>how</em>, when &ldquo;how does this work?&rdquo; is best answered, &ldquo;it depends.&rdquo;</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/the-conservative-self/</id><link rel="alternate" href="https://donnywinston.com/posts/the-conservative-self/"/><title>The Conservative Self</title><published>2022-04-18T22:00:01-04:00</published><updated>2022-04-18T22:00:01-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<blockquote>
<p>To understand what we call the Self, we first must see what Selves are for. <em>One function of the Self is to keep us from changing too rapidly.</em>..If we changed our minds too recklessly, we could never know what we might want next. We&rsquo;d never get much done because we could never depend on ourselves.</p>
</blockquote>
<p>Consider Hickey&rsquo;s <a href="https://donnywinston.com/posts/the-materials-paradigm-and-epochal-time/">epochal time model</a>:</p>
<p><img src="https://files.polyneme.xyz/dropshare/epochal-time-model-JOZA7dl2S8.png" alt="epochal-time-model"></p>
<p>What is the function here of Identity (i.e., the Self)? What makes a succession of states more than just a sequence of values?</p>
<p>Consider <em>identity</em> = <em>id</em> + <em>entity</em>. That is, an identity is a unique instance (with a primary key <em>id</em>) of an <em>entity</em> of a certain type.</p>
<p>What makes something an entity of a certain type? It must satisfy an <a href="https://docs.datomic.com/on-prem/schema/schema.html#entity-specs">entity spec</a>, i.e. maintain model invariants, including but not limited to the syntactic schema.</p>
<p>Can something have multiple &ldquo;id-entities&rdquo;? Sure. Something can be a Study with ID 123 as well as an Activity with ID 234 if (a) each value over time passes validation as both a Study and an Activity, and (b) each value over time resolves to the same unique individual within each of the Study and Activity model abstractions &ndash; i.e., as a function of the data attributes that these models choose to consider in order to distinguish individuals, each value is a &ldquo;state&rdquo; of &ldquo;the same&rdquo; individual.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-ttl" data-lang="ttl"><span style="display:flex;"><span><span style="color:#75715e"># expression of a non-unique name assumption</span>
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">study</span>:<span style="color:#ae81ff">123</span> owl:<span style="color:#f92672">sameAs</span> <span style="color:#960050;background-color:#1e0010">activity</span>:<span style="color:#ae81ff">234</span> .
</span></span></code></pre></div><p>To achieve conservation of id-entity, to successfully associate an id-entity &ndash; an identity &ndash; with a sequence of values, to proclaim that sequence to be a succession of states, is to feel confident that certain model invariants are being maintained across the values&rsquo; history, that change is not reckless, that one can depend on something.</p>
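<p>A rough sketch of that check, with loudly hypothetical pieces: the predicates <code>is_study</code> and <code>is_activity</code> stand in for entity specs, the key functions stand in for how each model distinguishes individuals, and the field names are invented. This is not Datomic&rsquo;s API, only an illustration of conditions (a) and (b).</p>

```python
def is_study(value):
    # Hypothetical invariants for a Study: an integer id and a non-empty hypothesis.
    return isinstance(value.get("study_id"), int) and bool(value.get("hypothesis"))

def is_activity(value):
    # Hypothetical invariants for an Activity: an integer id and a start date.
    return isinstance(value.get("activity_id"), int) and value.get("started") is not None

def conserves_identity(values, spec, key):
    """True if every value satisfies the spec and all resolve to one individual."""
    return all(spec(v) for v in values) and len({key(v) for v in values}) == 1

# A made-up sequence of values over time.
history = [
    {"study_id": 123, "activity_id": 234, "hypothesis": "H1", "started": "2022-04-01"},
    {"study_id": 123, "activity_id": 234, "hypothesis": "H1 (revised)", "started": "2022-04-01"},
]

as_study = conserves_identity(history, is_study, lambda v: v["study_id"])
as_activity = conserves_identity(history, is_activity, lambda v: v["activity_id"])
print(as_study, as_activity)

# A reckless change: the hypothesis is blanked, breaking a Study invariant.
broken = history + [
    {"study_id": 123, "activity_id": 234, "hypothesis": "", "started": "2022-04-01"}
]
print(conserves_identity(broken, is_study, lambda v: v["study_id"]))
```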
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/core-versus-crust/</id><link rel="alternate" href="https://donnywinston.com/posts/core-versus-crust/"/><title>Core Versus Crust</title><published>2022-04-18T16:18:06-04:00</published><updated>2022-04-18T16:18:06-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>The art of a great painting is not in any one idea, nor in a multitude of separate tricks for placing all those pigment spots, but in the great network of relationships among its parts.</p>
<p>Similarly, the collection of data and code that comprise a digital resource are by themselves as valueless as aimless, scattered daubs of paint. What counts is what we make of them, in operational and functional phases of processes in systems.</p>
<p>The value of a FAIR resource lies not in some small, precious core, but in its vast, constructed crust.</p>
<p>A fierce belief in conceptual cores &ndash; in spirits, souls, or essences &ndash; may insinuate a helplessness to improve. To seek virtue in such cores may be as wrongly aimed a search as seeking art in canvas cloths by scraping off the painter&rsquo;s work.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/one-self-or-many/</id><link rel="alternate" href="https://donnywinston.com/posts/one-self-or-many/"/><title>One Self or Many</title><published>2022-04-15T10:22:47-04:00</published><updated>2022-04-15T10:22:47-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Is a self a centralized entity? Is it a society that includes both images of what is (&ldquo;data&rdquo;) and ideals about what ought to be (&ldquo;schema&rdquo;)?</p>
<p>Sometimes we feel decentralized or dispersed, as though we were made of many different parts with different tendencies: <em>&ldquo;One part of me wants this, another part wants that. I must get better control of myself.&rdquo;</em> We sense feelings of disunity, conflicting motives, compulsions, internal tensions, and dissensions. We carry on negotiations in our head.</p>
<p>We may also have a single view at times: <em>&ldquo;I think, I want, I feel. It&rsquo;s me, myself, who thinks my thoughts. It&rsquo;s not some nameless crowd or cloud of selfless parts.&rdquo;</em> And the times we feel most reasonably unified can be just the times that others see us as the most confused.</p>
<p>In <a href="https://donnywinston.com/posts/the-materials-paradigm-and-epochal-time/">an epochal time model</a>, a &ldquo;self&rdquo; &ndash; an identity &ndash; is a succession of images &ndash; of states; observers/processors may interpret the state-image data via a succession of contexts &ndash; of ideals / schema-versions.</p>
<p>Recent renewed interest in domain-team-driven distributed data-product governance, i.e. “data mesh”, may too be an expression of one-self-or-many tension.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/what-functions-do-ideas-about-selves-serve/</id><link rel="alternate" href="https://donnywinston.com/posts/what-functions-do-ideas-about-selves-serve/"/><title>What Functions Do Ideas About Selves Serve?</title><published>2022-04-13T08:32:51-04:00</published><updated>2022-04-13T08:32:51-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>One must not mistake defining things for knowing what they are. You can know what a tiger is without defining it. You may define a tiger, yet know scarcely anything about it.</p>
<p>&ldquo;Self&rdquo; is a term used to talk about a sense of identity. Instead of asking, <em>&ldquo;What are selves?&rdquo;</em> we can ask, instead, <em>&ldquo;What are our ideas about Selves?&rdquo;</em> &ndash; and then we can ask, <em>&ldquo;What psychological functions do those ideas serve?&rdquo;</em></p>
<p>Our ideas about our Selves include beliefs about what we <em>are</em> &ndash; both what we are capable of doing and what we may be disposed to do. We may refer to such beliefs as <em>self-images</em>, as opposed to <em>self-ideals</em>, that is, ideas about what we&rsquo;d <em>like</em> to be or about what we <em>ought</em> to be.</p>
<p>When dealing with digital resources &ndash; datasets, models, workflows, schema &ndash; there are subtle semiotics at play in representing and communicating these selves and their identities:<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p><img src="https://files.polyneme.xyz/dropshare/semiotic-information-triad-ZiIhqWAvNf.png" alt="semiotic-information-triad"></p>
<p>There are real things that occupy a given domain and scope of inquiry that are, unfortunately, neither understandable nor transmittable as fully correct messages (in the Shannon information sense).</p>
<p>Consider a dynamic digital object that represents the total information theoretic potential of &ndash; that is, all that one might say about &ndash; a real object. In representing our dynamic objects, we can only convey them as somewhat incomplete immediate objects &ndash; there is information loss.</p>
<p>So too is there loss in how these immediate objects are pointed to or signified &ndash; as something iconic like an image, described in words, etc. Our signification, our message, is imperfect.</p>
<p>And there is information loss at the response level by the interpreter that must decode the message that had to be encoded.</p>
<p>Finally, this may be the case not just for real things, but for ideal things &ndash; not just what is, but what ought to be.</p>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. K. Bergman, <em>A Knowledge Representation Practionary: Guidelines Based on Charles Sanders Peirce.</em> Springer International Publishing, 2018. <a href="https://doi.org/10.1007/978-3-319-98092-8">doi:10.1007/978-3-319-98092-8</a>.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/appearing-opposed-related-goals/</id><link rel="alternate" href="https://donnywinston.com/posts/appearing-opposed-related-goals/"/><title>Appearing Opposed, Related Goals</title><published>2022-04-12T16:02:49-04:00</published><updated>2022-04-12T16:02:49-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Pain can simplify point of view. When you&rsquo;re in pain, it&rsquo;s hard to think of anything else.</p>
<p>Pleasure too can simplify point of view. You may feel that nothing is more important than finding a way to make that pleasure last.</p>
<p>We think of pain and pleasure as opposites &ndash; pleasure makes us draw its object near, whereas pain impels us to reject its object. They are also similar &ndash; they both distract, making rival goals seem small.</p>
<p>Fear and courage each do best by knowing both. Whether on offense or defense, you seek to guess the opponent&rsquo;s plan.</p>
<p><img src="https://files.polyneme.xyz/dropshare/extreme-extreme-ooIwls4njK.png" alt="extreme-extreme"></p>
<p>Sometimes, one of two seeming opposites is nothing but the absence of the other: sound and silence, light and darkness, interest and unconcern.</p>
<p><img src="https://files.polyneme.xyz/dropshare/presence-absence-Y3g46nqHFX.png" alt="presence-absence"></p>
<p>In appearing opposed, two things may serve related goals, or may engage selfsame agencies.</p>
<p>In contexts of data and code, opposition may signal an opportunity for structural sharing, for reuse.</p>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/destructive-acts-serving-constructive-goals/</id><link rel="alternate" href="https://donnywinston.com/posts/destructive-acts-serving-constructive-goals/"/><title>Destructive Acts Serving Constructive Goals</title><published>2022-04-11T15:19:36-04:00</published><updated>2022-04-11T15:19:36-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Let&rsquo;s say that the urges of the <em>Play</em> process compete with those of other processes, like <em>Sleep</em>:</p>
<p><img alt="destructiveness-play-in-control"
width="80%"
src="https://files.polyneme.xyz/dropshare/destructiveness-play-in-control-hZHo3acyNz.png" /></p>
<p>If <em>Sleep</em> wrests control, then perhaps a <em>Wrecker</em>-process urge, previously constrained and now freed from <em>Play</em>&rsquo;s constraint, need only persist for one more kick to gain the satisfaction of a final crash:</p>
<p><img alt="destructiveness-sleep-in-control"
width="80%"
src="https://files.polyneme.xyz/dropshare/destructiveness-sleep-in-control-U17MXuQdMQ.png"/></p>
<p>This destructiveness may seem senseless, but it may both communicate frustration at the loss of a goal and serve constructive goals by leaving fewer problems to be solved &ndash; the kick may leave a mess &ldquo;outside&rdquo;, yet it may tidy the process orchestration.</p>
<p>It isn&rsquo;t true that when <em>Sleep</em> starts, <em>Play</em> must quit and all its agents have to cease. A mind can &ldquo;go to bed, yet still build towers in its head.&rdquo;</p>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/hierarchies-heterarchies-and-agent-memory/</id><link rel="alternate" href="https://donnywinston.com/posts/hierarchies-heterarchies-and-agent-memory/"/><title>Hierarchies, Heterarchies, and Agent Memory</title><published>2022-04-10T10:34:01-04:00</published><updated>2022-04-10T10:34:01-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>In a hierarchy, each agent acts on behalf of at most one other agent:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">from</span> rdflib <span style="color:#f92672">import</span> Graph
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> rdflib.namespace <span style="color:#f92672">import</span> RDF, PROV
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">hierarchy</span>(graph):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> all(
</span></span><span style="display:flex;"><span>        len(set(graph<span style="color:#f92672">.</span>objects(agent, PROV<span style="color:#f92672">.</span>actedOnBehalfOf))) <span style="color:#f92672">&lt;=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">for</span> agent <span style="color:#f92672">in</span> graph<span style="color:#f92672">.</span>subjects(RDF<span style="color:#f92672">.</span>type, PROV<span style="color:#f92672">.</span>Agent)
</span></span><span style="display:flex;"><span>    )
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>hierarchy(Graph()<span style="color:#f92672">.</span>parse(data<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">@prefix prov: &lt;http://www.w3.org/ns/prov#&gt; .
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">@prefix :     &lt;http://example.com/&gt; .
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">:doc
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    a prov:Agent;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">:sleepy
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    a prov:Agent;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    prov:actedOnBehalfOf :doc;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">:sneezy
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    a prov:Agent;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    prov:actedOnBehalfOf :doc;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">&#34;&#34;&#34;</span>)) <span style="color:#75715e"># True</span>
</span></span></code></pre></div><p>Hierarchies do not always work. Sometimes, for example, agents need to use each other&rsquo;s skills:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">heterarchy</span>(graph):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> <span style="color:#f92672">not</span> hierarchy(graph)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>heterarchy(Graph()<span style="color:#f92672">.</span>parse(data<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;&#34;&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">@prefix prov: &lt;http://www.w3.org/ns/prov#&gt; .
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">@prefix :     &lt;http://example.com/&gt; .
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">:doc
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    a prov:Agent;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">:sleepy
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    a prov:Agent;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    prov:actedOnBehalfOf :sneezy, :doc;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">:sneezy
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    a prov:Agent;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    prov:actedOnBehalfOf :sleepy, :doc;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">&#34;&#34;&#34;</span>)) <span style="color:#75715e"># True</span>
</span></span></code></pre></div><p>In heterarchies, per-agent working memory becomes critical: an agent must keep track of what next to do in a job A if it starts a job B before A is done. In hierarchies, priority and preemption can straightforwardly waterfall down from the top supervisor.</p>
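<p>One way to picture that working memory, as a sketch with hypothetical names (<code>Agent</code>, <code>start</code>, <code>step</code>) rather than a prescribed design: the agent keeps a stack of suspended jobs, each remembering its remaining steps, so that it can resume job A once job B is done.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"># Hypothetical sketch: an agent whose working memory is a stack of jobs.

class Agent:
    def __init__(self):
        self.stack = []   # working memory: [job name, remaining steps]
        self.log = []     # record of the steps actually performed

    def start(self, name, steps):
        """Begin a job; any job already in progress stays suspended beneath it."""
        self.stack.append([name, list(steps)])

    def step(self):
        """Perform the next step of the most recently started unfinished job."""
        while self.stack and not self.stack[-1][1]:
            self.stack.pop()          # drop finished jobs
        if not self.stack:
            return None
        name, steps = self.stack[-1]
        done = (name, steps.pop(0))
        self.log.append(done)
        return done

sleepy = Agent()
sleepy.start("A", ["a1", "a2"])
sleepy.step()                         # performs a1
sleepy.start("B", ["b1"])             # B starts before A is done
sleepy.step()                         # performs b1
sleepy.step()                         # resumes A at a2, thanks to the stack
</code></pre></div>
<p>Without the stack, the agent would have no record that step <code>a2</code> was still pending when job B arrived.</p>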
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/agents-and-hierarchies/</id><link rel="alternate" href="https://donnywinston.com/posts/agents-and-hierarchies/"/><title>Agents and Hierarchies</title><published>2022-04-08T10:39:46-04:00</published><updated>2022-04-08T10:39:46-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Designing any society, be it human or computational, involves decisions like these:</p>
<ul>
<li>Which agents choose which others to do what jobs?</li>
<li>Who will decide which jobs are done at all?</li>
<li>Who decides what efforts to expend?</li>
<li>How will conflicts be settled?</li>
</ul>
<p>Furthermore, roles even in a hierarchy are always relative. To <a href="https://donnywinston.com/posts/the-world-of-blocks/"><em>Builder</em></a>, <em>Add</em> is a subordinate, but to <em>Find</em>, <em>Add</em> is a boss. As for yourself, which sorts of thoughts concern you most &ndash; the orders you are supposed to take or those you&rsquo;re supposed to give?</p>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/the-principle-of-noncompromise/</id><link rel="alternate" href="https://donnywinston.com/posts/the-principle-of-noncompromise/"/><title>The Principle of Noncompromise</title><published>2022-04-07T09:52:04-04:00</published><updated>2022-04-07T09:52:04-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<blockquote>
<p>The longer an internal conflict persists among an agent&rsquo;s subordinates, the weaker becomes that agent&rsquo;s status among its own competitors. If such internal problems aren&rsquo;t settled soon, other agents will take control and the agents formerly involved will be &ldquo;dismissed.&rdquo;</p>
</blockquote>
<p>Whenever several agents have to compete for the same resources, they are likely to get into conflicts.</p>
<p>Those agents&rsquo; superiors, too, may be under competitive pressure and likely to grow weak themselves whenever their subordinates are slow in achieving their goals, no matter whether because of conflicts between them or because of individual incompetence.</p>
<p>However, an agency that has &ldquo;lost control&rdquo; can continue to work inside itself &ndash; and thus become prepared to seize a later opportunity.</p>
<p>Must every &ldquo;mind&rdquo; contain some topmost center of control? Not necessarily. We sometimes settle conflicts by appealing to superiors, but other conflicts never end and never cease to trouble us.</p>
<p>Good human supervisors plan ahead to avoid conflicts in the first place, and &ndash; when they can&rsquo;t &ndash; they try to settle quarrels locally before appealing to superiors. But tiny mental/computational agents simply cannot know enough to be able to negotiate with one another or to find effective ways to adjust to each other&rsquo;s interference. Only larger agencies could be resourceful enough to do such things, to become versatile enough to negotiate by offering support for their subordinates&rsquo; goals.</p>
<blockquote>
<p>&ldquo;Please, <em>Wrecker</em>, wait a moment more till <em>Builder</em> adds just one more block: it&rsquo;s worth it for a louder crash!&rdquo;</p>
</blockquote>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/migrating-conflicts-between-agents-to-higher/</id><link rel="alternate" href="https://donnywinston.com/posts/migrating-conflicts-between-agents-to-higher/"/><title>Migrating Conflicts Between Agents to Higher Levels</title><published>2022-04-06T16:46:31-04:00</published><updated>2022-04-06T16:46:31-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Many children not only like to build, they also like to knock things down &ndash; to hear the complicated noises and watch so many things move at once.</p>
<p>Let&rsquo;s imagine a sibling agent to <a href="https://donnywinston.com/posts/the-world-of-blocks/"><em>Builder</em></a> called <em>Wrecker</em>, whose specialty is knocking things down:</p>
<p><img src="https://files.polyneme.xyz/dropshare/wrecker-cRAwagc53l.png" alt="wrecker"></p>
<p>Suppose <em>Wrecker</em> gets aroused, but there&rsquo;s nothing in sight to smash. Then <em>Wrecker</em> will have to get some help &ndash; by putting <em>Builder</em> to work, for example.</p>
<p>But what if, at some later time, <em>Wrecker</em> considers the tower to be high enough to smash, while <em>Builder</em> wants to make it taller still? Who could settle that dispute?</p>
<p>Is the decision left to <em>Wrecker</em>, who activated <em>Builder</em> in the first place? What if both were activated by a higher-level agent, <em>Play-with-Blocks</em>? What if that agent in turn was activated by a <em>Play</em> agent, who may be in conflict with <em>Eat</em> and <em>Sleep</em>?</p>
<p><img src="https://files.polyneme.xyz/dropshare/conflict-builder-wrecker-OciBzr2npH.png" alt="conflict-builder-wrecker"></p>
<p>A child&rsquo;s play is not an isolated thing. It always happens in the context of other real-life concerns. Whatever we may choose to do, there are always other things we&rsquo;d also like to do.</p>
<p>In single-thread, synchronous computer programming, prolonged conflict may be avoided. A function A may call another function B in its body and wait for B to return. If B encounters trouble, it can raise an exception for A to catch. The chain of command for conflict resolution can be clear.</p>
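<p>As a sketch of that chain of command (the function names <code>a</code> and <code>b</code> and the <code>Trouble</code> exception are illustrative assumptions):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"># Illustrative names only: a, b, and Trouble are assumptions for this sketch.

class Trouble(Exception):
    """Raised by a subordinate that cannot finish its job."""

def b(blocks):
    if blocks == 0:
        raise Trouble("no blocks left")
    return blocks - 1

def a(blocks):
    # a calls b and waits for it to return; nothing else runs in between.
    try:
        return ("ok", b(blocks))
    except Trouble as reason:
        # The caller, not a sibling, decides what happens next.
        return ("handled", str(reason))
</code></pre></div>
<p>Calling <code>a(0)</code> settles the trouble at once, inside <code>a</code>: there is no window in which <code>a</code> and <code>b</code> remain in dispute.</p>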
<p>In the case of asynchronous, independent agents like <em>Builder</em> and <em>Wrecker</em>, what is the effect of prolonged conflict? Perhaps the conflict tends to weaken their mutual superior, <em>Play-with-Blocks</em>. In turn, this could reduce <em>Play-with-Blocks</em>&rsquo;s ability to suppress <em>its</em> rivals. Next, if that conflict isn&rsquo;t settled soon, it could weaken <em>Play</em> at the next-highest level. Then, <em>Eat</em> or <em>Sleep</em> might seize control.</p>
<p><em>Data, data everywhere, but not a value to validate&hellip;</em></p>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/are-people-machines/</id><link rel="alternate" href="https://donnywinston.com/posts/are-people-machines/"/><title>Are People Machines?</title><published>2022-04-05T14:43:25-04:00</published><updated>2022-04-05T14:43:25-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Are people machines?</p>
<p>&ldquo;Everyone knows that machines can behave only in lifeless, mechanical ways.&rdquo;</p>
<p>This objection seems reasonable. A person ought to feel offended at being likened to any <em>trivial</em> machine. But it seems to me that the word &ldquo;machine&rdquo; is getting to be out of date.</p>
<p>We ought to recognize that we&rsquo;re still in an early era of machines, with virtually no idea of what they may become. What if some visitor from outer space had come a billion years ago to judge the fate of earthly life from watching clumps of cells that hadn&rsquo;t even learned to crawl?</p>
<p>Our first intuitions about computers came from experiences with machines of the 1940s, which contained only thousands of parts. But a human brain contains billions of cells, each one complicated by itself and connected to many thousands of others.</p>
<p>Present-day computers represent an intermediate degree of complexity. And yet, we continue to use old words as though there had been no change at all. We need to adapt our attitudes to phenomena that work on scales never before conceived. Does the term &ldquo;machine&rdquo; take us far enough?</p>
<p>Rhetoric won&rsquo;t settle anything. In trying to understand what the vast mechanisms of the human brain may do, we can find self-respect in knowing what wonderful machines we are (and what a FAIR, internetworked &ldquo;electronic brain&rdquo; could be).</p>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/apis-are-human-interfaces/</id><link rel="alternate" href="https://donnywinston.com/posts/apis-are-human-interfaces/"/><title>GUIs and APIs Are Both Human Interfaces</title><published>2022-04-05T11:18:11-04:00</published><updated>2022-04-05T11:18:11-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>GUIs and APIs are both human interfaces. They both frame perspectives on data/operation service
offerings so that human beings can navigate and consume them. The human being in the case of APIs is
the application programmer, a subset of users. GUIs are applications, so it is natural to expect an
API’s capabilities to be a superset of a corresponding GUI’s — application programmers program the
GUI using the API.</p>
<p>Your resource service interface is not necessarily understandable &ndash; discoverable, crawl-able &ndash; by
machines. APIs are generally not machine-actionable interfaces.</p>
<p>Nor is it necessarily wise that a given API be made machine-actionable. This would result in a
two-audience problem. With two different target audiences, humans and machines, how could an API
serve both well?</p>
<p>I used to think that GUIs were for humans and APIs were for machines. I now have a SICP-esque
perspective on APIs: they &ldquo;must be written for people to read, and only incidentally for machines to
execute.&rdquo;<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>H. Abelson, G. J. Sussman, and J. Sussman, <em>Structure and Interpretation of Computer
Programs</em>, 2nd ed. MIT Press, 2002.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/easy-things-are-hard/</id><link rel="alternate" href="https://donnywinston.com/posts/easy-things-are-hard/"/><title>Easy Things Are Hard</title><published>2022-04-04T13:30:15-04:00</published><updated>2022-04-04T13:30:15-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>In general, we&rsquo;re least aware of what our minds do best. It&rsquo;s mainly when systems start to fail that we engage the special agencies involved with what we call &ldquo;consciousness.&rdquo;</p>
<p>Accordingly, we&rsquo;re more aware of simple processes that don&rsquo;t work well than of complex ones that work flawlessly.</p>
<p>This phenomenon helps to explain the poor performance of many so-called <em>expert systems</em> in the 1980s. There were attempts to fully rationalize human expertise as calculative rules. The effect was often to regress an expert&rsquo;s <em>knowing how</em> to a novice practitioner&rsquo;s <em>knowing that</em>.</p>
<p>Skill acquisition in unstructured domains moves not towards abstract rules, but rather from abstract rules to particular cases. And &ldquo;the distinction between education, a process aimed at drawing out the abilities of the student, and training, in which the student is learning to negotiate a structured domain, is crucial.&rdquo;</p>
<p>This may help shed light on much of the recent mixed success of &ldquo;unexplainable&rdquo; neural-network-based decision systems.</p>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/holes-and-parts/</id><link rel="alternate" href="https://donnywinston.com/posts/holes-and-parts/"/><title>Holes and Parts</title><published>2022-04-03T11:20:34-04:00</published><updated>2022-04-03T11:20:34-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>What keeps a mouse contained in a box?</p>
<p>It is the way a box prevents motion in all directions. Each board bars escape in a certain direction. The left side keeps the mouse from going left, the right from going right, the top keeps it from leaping out, and so on.</p>
<p>The secret of a box is simply in how the boards are arranged to prevent motion in <em>all</em> directions. That&rsquo;s what <em>containing</em> means.</p>
<p>It&rsquo;s silly to expect any separate board by itself to contain any <em>containment</em>, even though each contributes to the containing. It is like the cards of a straight flush in poker; only the full hand has any value at all.</p>
<p>The same applies to words like <em>life</em> and <em>mind</em>. It is foolish to use these words for describing the smallest components of living things because these words were invented to describe how larger assemblies interact. Like <em>boxing-in</em>, words like <em>living</em> and <em>thinking</em> are useful for describing phenomena that result from certain combinations of relationships.</p>
<p>None of the 15 FAIR principles<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> <em>contain</em> FAIR. A digital resource will not become &ldquo;more FAIR&rdquo; when it adheres to one rather than none of the principles.</p>
<p>However, just as <em>life</em> has gradually lost much of its mystery &ndash; at least for modern biologists, because they understand so many of the important interactions among the chemicals in cells &ndash; FAIR can be demystified by understanding how the components of a well-made FAIR resource interact to facilitate reuse and repurposing.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>M. D. Wilkinson et al., “The FAIR Guiding Principles for scientific data management and stewardship,” Sci Data, vol. 3, no. 1, p. 160018, Mar. 2016, doi: 10/bdd4.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>I&rsquo;ve been going over the FAIR principles one by one on <a href="https://podcast.polyneme.xyz/">my podcast</a>. Each such episode has averaged about five minutes.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/parts-and-wholes/</id><link rel="alternate" href="https://donnywinston.com/posts/parts-and-wholes/"/><title>Parts and Wholes</title><published>2022-04-01T11:19:24-04:00</published><updated>2022-04-01T11:19:24-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<blockquote>
<p>We&rsquo;re often told that certain wholes are &ldquo;more than the sum of their parts.&rdquo; We hear this expressed with reverent words like &ldquo;holistic&rdquo; and &ldquo;gestalt,&rdquo; whose academic tones suggest that they refer to clear and definite ideas. But I suspect the actual function of such terms is to anesthetize a sense of ignorance. We say &ldquo;gestalt&rdquo; when things combine to act in ways we can&rsquo;t explain, &ldquo;holistic&rdquo; when we&rsquo;re caught off guard by unexpected happenings and realize we understand less than we thought we did.</p>
</blockquote>
<p>What makes a tower more than separate blocks, or a wall more than a set of many bricks? Every block/brick is held in place by its neighbors and gravity. Why is a chain more than its various links? To explain why chain-links cannot come apart, we can demonstrate how each would get in its neighbors&rsquo; way.</p>
<p>In graphical diagrams of such physical situations, the edges drawn between nodes are &ndash; implicitly or explicitly &ndash; labeled, qualified relations. An arrow is not a mystery &ndash; it is, for example, gravitational force.</p>
<p>Sometimes, giving names to things can help by leading us to focus on some mystery. It&rsquo;s harmful, though, when naming leads the mind to think that names alone bring meaning close.</p>
<p>With Linked Data, all edges relating parts are labeled, and those labels are things, not strings. Such discipline can help us to not fool ourselves.</p>
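<p>A small sketch of the difference, using plain Python tuples rather than any particular Linked Data library; the <code>example.com</code> IRIs and the <code>describe</code> helper are hypothetical. Because an edge label is itself an identifier, it can appear as the subject of further statements:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"># Hypothetical IRIs under example.com; plain tuples stand in for triples.
EX = "http://example.com/"

triples = {
    # Edges between parts: each label (middle position) is an IRI.
    (EX + "tower", EX + "hasPart", EX + "block1"),
    (EX + "block1", EX + "restsOn", EX + "block2"),
    # The labels are things too: they can be described like any other node.
    (EX + "restsOn", EX + "comment", "held in place by gravity"),
    (EX + "hasPart", EX + "comment", "whole-to-part relation"),
}

def describe(thing):
    """Collect what the graph says about a thing, edge labels included."""
    return sorted((p, o) for s, p, o in triples if s == thing)

# Labels that the graph uses but never describes.
labels = {p for s, p, o in triples}
undescribed = sorted(p for p in labels if not describe(p))
</code></pre></div>
<p>In this toy graph, <code>undescribed</code> holds only the <code>comment</code> label, which is used but never itself explained. A bare string label could not be checked this way.</p>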
<div style="margin-top: 2em;">
<small>

    <a href="https://buttondown.email/donny" target="_blank">Subscribe</a> to get short notes
    like this on Machine-Centric Science delivered to your email.

</small>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/novelists-and-reductionists/</id><link rel="alternate" href="https://donnywinston.com/posts/novelists-and-reductionists/"/><title>Novelists and Reductionists</title><published>2022-03-31T11:18:14-04:00</published><updated>2022-03-31T11:18:14-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>Some like to focus on the new. They like to invent theories.</p>
<p>Some are adamant about reducing to what has come before. This has worked remarkably well for the core of physics.</p>
<p>These inclinations are not incompatible given some kind of &ldquo;leveling&rdquo;, with discipline about connections. Standing on the shoulders of giants and all that.</p>
<p>Much apparent &ldquo;novelty&rdquo; may be reducible to the structured annotation and (re-)configuration of core mechanisms, much as various organisms’ genetic inheritances have been modded over millennia.</p>
<p>Linked Data may be framed as a way to get novelists and reductionists to sit at the table of FAIR together.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/components-and-connections/</id><link rel="alternate" href="https://donnywinston.com/posts/components-and-connections/"/><title>Components and Connections</title><published>2022-03-30T15:05:42-04:00</published><updated>2022-03-30T15:05:42-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>An agent like <em>Builder</em> is not merely a collection of parts like <em>Find</em>, <em>Get</em>, <em>Put</em>, and all the rest. <em>Builder</em> would not work at all unless those agents were linked to one another by a suitable network of interconnections:</p>
<p><img src="https://files.polyneme.xyz/dropshare/agents-by-themselves-and-in-bureaucracy-WDrJoLTNjp.png" alt="agents-by-themselves-and-in-bureaucracy"></p>
<p>Could you predict what <em>Builder</em> does from knowing just that left-hand list above? Of course not. First, we must know how each separate part works. Second, we must know how each part interacts with those to which it is connected. And third, we have to understand how all these local interactions combine to accomplish what that system <em>does</em> &ndash; as seen from the outside.</p>
<p>There is lots of prior art for understanding combinations of component interactions, whether as expression trees or wiring diagrams. Computer programming has traditionally emphasized the former, but note how <em>Move</em> has two &ldquo;parents&rdquo; in the diagram above.</p>
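<p>To make the tree-versus-wiring distinction concrete, here is a small sketch. The connections in <code>wiring</code> are an assumed wiring chosen for illustration, not a transcription of the diagram; the only detail taken from the post is that <em>Move</em> has two parents. The helper names <code>parents_of</code> and <code>shared_children</code> are likewise hypothetical.</p>

```python
from collections import defaultdict

# Assumed wiring for illustration only; not a transcription of the diagram.
wiring = {
    "Builder": ["Find", "Get", "Put"],
    "Find": [],
    "Get": ["Move"],
    "Put": ["Move"],
    "Move": [],
}


def parents_of(graph):
    """Invert the child lists: map each agent to the agents that point at it."""
    parents = defaultdict(list)
    for parent, children in graph.items():
        for child in children:
            parents[child].append(parent)
    return dict(parents)


def shared_children(graph):
    """Return the agents with more than one parent, along with those parents."""
    return {child: ps for child, ps in parents_of(graph).items() if len(ps) > 1}


print(shared_children(wiring))  # {'Move': ['Get', 'Put']}
```

<p>In an expression tree every node has at most one parent, so any non-empty result here means the structure is better read as a wiring diagram.</p>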
<p>I leave you with the opening of the last chapter of Hanson and Sussman&rsquo;s <em>Software Design for Flexibility</em><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>:</p>
<blockquote>
<p>Decades of programming experience have taken a toll on our collective imagination. We come from a culture of scarcity, where computation and memory were expensive, and concurrency was difficult to arrange and control. This is no longer true. But our languages, our algorithms, and our architectural ideas are based on those assumptions. Our languages are basically sequential and directional &ndash; even functional languages assume that computation is organized around values percolating up through expression trees. Multidirectional constraints are hard to express in functional languages.</p>
</blockquote>
<blockquote>
<h3 id="escaping-the-von-neumann-straitjacket">Escaping the Von Neumann straitjacket</h3>
<p>The propagator model of computation provides one avenue of escape. The propagator model is built on the idea that the basic computational elements are propagators, autonomous independent machines interconnected by shared cells through which they communicate. Each propagator machine continuously examines the cells it is connected to, and adds information to some cells based on computations it can make from information it can get from others. Cells accumulate information and propagators produce information.</p>
</blockquote>
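<p>A heavily simplified sketch of cells and propagators, to show how a multidirectional constraint can be wired up. This is not the book&rsquo;s implementation: the names <code>Cell</code>, <code>propagator</code>, and <code>sum_constraint</code> are hypothetical, a cell here holds at most one value rather than merging partial information, and propagators are notified on change instead of continuously examining their cells.</p>

```python
class Cell:
    """Holds a value once one is known, and notifies the propagators watching it."""

    def __init__(self, name):
        self.name = name
        self.content = None
        self.watchers = []

    def add_content(self, value):
        if value is None or value == self.content:
            return  # nothing new to learn
        if self.content is not None:
            raise ValueError(f"contradiction in cell {self.name}")
        self.content = value
        for run in self.watchers:
            run()


def propagator(inputs, output, fn):
    """Attach a machine that writes fn(*inputs) to output once all inputs are known."""

    def run():
        values = [cell.content for cell in inputs]
        if all(v is not None for v in values):
            output.add_content(fn(*values))

    for cell in inputs:
        cell.watchers.append(run)
    run()


def sum_constraint(a, b, total):
    """Wire a + b = total in every direction, using three one-way propagators."""
    propagator([a, b], total, lambda x, y: x + y)
    propagator([total, a], b, lambda t, x: t - x)
    propagator([total, b], a, lambda t, y: t - y)


a, b, total = Cell("a"), Cell("b"), Cell("total")
sum_constraint(a, b, total)

# Information may arrive at any two of the three cells, in any order.
total.add_content(10)
a.add_content(3)
print(b.content)  # 7
```

<p>The same network would fill in <code>total</code> from <code>a</code> and <code>b</code>, or <code>a</code> from <code>total</code> and <code>b</code>, which is the kind of constraint the quoted passage calls hard to express as values percolating up an expression tree.</p>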
<blockquote>
<p>Since the propagator infrastructure is based on propagation of data through interconnected independent machines, propagator structures are better expressed as wiring diagrams than as expression trees. In such a system partial results are useful, even though they are not complete. For example, the usual way to compute a square root is by successive refinement using Heron&rsquo;s method. In traditional programming, the result of a square root computation is not available to subsequent computations until the required error tolerance is achieved. By contrast, in an analog electrical circuit that performed the same function, the partial results could be used by the next stages as first approximations to their computations. This is not an analog/digital problem—it is organizational. In a propagator mechanism the partial results of a digital process can be made available without waiting for the final result.</p>
</blockquote>
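<p>The point about partial results can be sketched with a Python generator. This is an illustration of the organizational idea only, not a propagator system; the function name <code>heron_sqrt</code> and the starting guess of <code>1.0</code> are assumptions made for the example.</p>

```python
from itertools import islice


def heron_sqrt(x, guess=1.0):
    """Yield successive approximations to sqrt(x) using Heron's method."""
    if x <= 0:
        raise ValueError("x must be positive")
    while True:
        yield guess  # each partial result is available immediately
        guess = (guess + x / guess) / 2.0


# A downstream stage can start working from rough estimates
# instead of waiting for an error tolerance to be met.
approximations = list(islice(heron_sqrt(2.0), 5))
doubled = [2 * estimate for estimate in approximations]

for estimate in approximations:
    print(estimate)
```

<p>The first two values are <code>1.0</code> and <code>1.5</code>; later ones converge quickly toward the square root of 2, and the consumer decides for itself how much refinement it needs.</p>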
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>C. Hanson and G. J. Sussman, <em>Software design for flexibility: how to avoid programming yourself into a corner</em>. Cambridge: The MIT Press, 2021.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/wholes-and-parts-in-fair-and-mind/</id><link rel="alternate" href="https://donnywinston.com/posts/wholes-and-parts-in-fair-and-mind/"/><title>Wholes and Parts, in FAIR and Mind</title><published>2022-03-29T08:43:41-04:00</published><updated>2022-03-29T08:43:41-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<blockquote>
<p>It is the nature of the mind that makes individuals kin, and the differences in the shape, form, or manner of the material atoms out of whose intricate relationships that mind is built are altogether trivial.</p>
<p>&ndash; Isaac Asimov</p>
</blockquote>
<p>It is the nature of FAIR that can make digital resources kin &ndash; interoperable in fugues of machine action. The differences in the schema, serialization, or domain-specificity of the digital datoms out of whose intricate relationships any given resource is built are altogether trivial.</p>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/agents-and-agencies/</id><link rel="alternate" href="https://donnywinston.com/posts/agents-and-agencies/"/><title>Agents and Agencies</title><published>2022-03-28T23:42:18-04:00</published><updated>2022-03-28T23:42:18-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>We want to explain complicated things as a combination of simpler things. You must be prepared to feel a certain sense of loss. When we break things down to their smallest parts, they may each seem as dry as dust at first, as though some essence has been lost.</p>
<p>Where does the &ldquo;knowing-how-to-build&rdquo; of a <em>Builder</em> agent reside? It is not in any part, so it is not enough to explain what each separate agent does. We must understand how parts are interrelated &ndash; how <em>groups</em> of agents can accomplish things.</p>
<p>Seen by itself, as an agent, <em>Builder</em> is just a simple process that turns other agents on and off. Seen from outside, as an agency, <em>Builder</em> does whatever all its sub-agents accomplish, using one another&rsquo;s help:</p>
<p><img title="agents-and-agencies" alt="agents-and-agencies" width="100%" src="https://files.polyneme.xyz/dropshare/agents-and-agencies-xWtuG28LbO.png"></p>
<p><em>Builder</em> seems to lead a double life. As agency, it seems to know its job. As agent, it cannot know anything at all.</p>
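<p>A rough sketch of that double life. The <code>Agent</code> class, the flat list of sub-agents, and the <code>tower</code> list are hypothetical, chosen for illustration. As an agent, <em>Builder</em> has no action of its own and only switches others on; as an agency, activating it results in a block being placed.</p>

```python
class Agent:
    """A simple process: optionally do one small thing, then switch on sub-agents."""

    def __init__(self, name, subagents=(), action=None):
        self.name = name
        self.subagents = list(subagents)
        self.action = action

    def activate(self, log):
        log.append(self.name)
        if self.action is not None:
            self.action()
        for sub in self.subagents:
            sub.activate(log)


tower = []

# Hypothetical sub-agents; only Put changes the world in this sketch.
find = Agent("Find")
get = Agent("Get")
put = Agent("Put", action=lambda: tower.append("block"))

# Seen as an agent, Builder has no action: it just turns other agents on.
builder = Agent("Builder", subagents=[find, get, put])

log = []
builder.activate(log)
print(log)    # ['Builder', 'Find', 'Get', 'Put']
print(tower)  # ['block']
```

<p>Nothing in <code>builder</code> itself mentions blocks or towers; the &ldquo;knowing how to build&rdquo; shows up only in how the agents are connected.</p>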
<p>And knowing how is not the same as knowing that. If while performing an activity expertly you find yourself consciously reflecting on what you are doing and the rules for doing it, chances are you will experience a severe degradation of performance &ndash; you fall victim to &ldquo;knowing that&rdquo; as it interrupts and replaces your &ldquo;knowing how&rdquo;.</p>
<p>Hermans <sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> highlights three kinds of confusion when reading code: lack of knowledge, lack of information, and lack of processing power. These also reflect a gradual descent from know-how focus to know-that focus, from seeing agencies to tracing the wiring-together and actions of individual agents.</p>
<p>So it may be with a society of FAIR digital resources &ndash; agents may be distributed, and yet agency may be coherent.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>F. Hermans, <em>The programmer’s brain: what every programmer needs to know about cognition</em>. Shelter Island, NY: Manning, 2021.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><author><name>Donny Winston</name><uri>https://orcid.org/0000-0002-8424-0604</uri></author><id>tag:donnywinston.com,2022:/posts/common-sense/</id><link rel="alternate" href="https://donnywinston.com/posts/common-sense/"/><title>Common Sense</title><published>2022-03-27T09:32:59-04:00</published><updated>2022-03-27T09:32:59-04:00</updated><content type="html" xml:base="https://donnywinston.com" xml:lang="en">
<![CDATA[<p>We <a href="https://donnywinston.com/posts/the-world-of-blocks/">found a way</a> to make a tower builder out of parts. But <em>Builder</em> is really far from done.</p>
<p>For example, how could <em>Find</em> determine which blocks are still available for use? It would have to &ldquo;understand&rdquo; the scene in terms of what it is trying to do.</p>
<p>We&rsquo;ll need theories both about what it means to understand and about how a machine could have a goal.</p>
<p>Consider all the <em>practical</em> judgments that an actual <em>Builder</em> would have to make. It would have to decide whether there are enough blocks to accomplish its goal and whether they are strong and wide enough to support the others that will be placed on them.</p>
<p>By the time we are adults, we regard all of this as simple &ldquo;common sense&rdquo;. But that deceptive pair of words conceals almost countless different skills.</p>
<blockquote>
<p>Common sense is not a simple thing. Instead, it is an immense society of hard-earned practical ideas &ndash; of multitudes of life-learned rules and exceptions, dispositions and tendencies, balances and checks.</p>
</blockquote>
<p>Dreyfus calls the ability to intuitively respond to patterns without decomposing them into component features &ldquo;holistic discrimination and association&rdquo;<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>:</p>
<blockquote>
<p>When things are proceeding normally, experts don&rsquo;t solve problems and don&rsquo;t make decisions; they do what normally works.</p>
</blockquote>
<p>As each new group of skills matures, we build more layers on top of them. As time goes on, the layers below become increasingly remote until, when we try to speak of them in later life, we find ourselves with little more to say than <em>&ldquo;I don&rsquo;t know.&rdquo;</em></p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>H. L. Dreyfus and S. E. Dreyfus, <em>Mind over machine: the power of human intuition and expertise in the era of the computer</em>. New York: The Free Press, 1988.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry></feed>