Indexing Identifiers
Indexing identifiers is key to disambiguating entities.
Wikipedia has disambiguation pages. For example, there are various concepts in mathematics and computing, various computing products, and various companies that identify with the term “Precision”. I made disambiguation pages for same-chemical-formula inorganic crystal structures for the Materials Project.
Indexing identifiers is also key to unifying entities. It’s an open world after all,[1] with a concomitant non-unique naming assumption. OpenAlex indexes various ID types for a work. For example, http://api.openalex.org/works/https://doi.org/10.7717/peerj.4375 will funnel you to the payload for https://openalex.org/W2741809807, which has an ids field with openalex, doi, mag (Microsoft Academic Graph), pmid (PubMed), and pmcid (PubMed Central) IDs.
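To make the unification idea concrete, here is a minimal sketch of an in-memory identifier index. It is not how OpenAlex implements its lookup; the function names build_id_index and resolve are hypothetical, and the sample record carries only the two ID values quoted above. Lowercasing values is a simplifying assumption so that differently cased DOIs land on the same key.

```python
def normalize(value):
    # Simplifying assumption: treat identifiers as case-insensitive strings.
    return str(value).strip().lower()


def build_id_index(records):
    """Map every (id_type, value) pair to the record's canonical OpenAlex ID."""
    index = {}
    for record in records:
        canonical = record["ids"]["openalex"]
        for id_type, value in record["ids"].items():
            index[(id_type, normalize(value))] = canonical
    return index


def resolve(index, id_type, value):
    """Return the canonical ID for any known identifier, or None if unknown."""
    return index.get((id_type, normalize(value)))


# Sample record using only the identifiers quoted in the text above.
records = [
    {
        "ids": {
            "openalex": "https://openalex.org/W2741809807",
            "doi": "https://doi.org/10.7717/peerj.4375",
        }
    }
]

index = build_id_index(records)
print(resolve(index, "doi", "https://doi.org/10.7717/PEERJ.4375"))
```

The same structure would extend to mag, pmid, and pmcid keys: each additional ID type is just another entry pointing at the same canonical ID.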
Finally, indexing identifiers is key to registering and resolving metadata, i.e. relationships between identifiers. Registries include Linked Open Vocabularies (LOV), the Ontology Lookup Service (OLS), the Zazuko Prefix Server, and the OBO Foundry. Resolvers include Identifiers.org and Name-To-Thing (n2t). There is even at least one metaregistry, Bioregistry.io.
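As a rough illustration of what a resolver does with a prefix registry, the sketch below expands a compact identifier (prefix:local_id) into a URL. The PREFIX_MAP dictionary and the expand_curie function are hypothetical stand-ins for this example; real registries and resolvers hold far more prefixes and richer metadata, and this is not their API.

```python
# Hypothetical local prefix map, standing in for a real registry.
PREFIX_MAP = {
    "doi": "https://doi.org/{}",
    "pubmed": "https://pubmed.ncbi.nlm.nih.gov/{}",
    "orcid": "https://orcid.org/{}",
}


def expand_curie(curie, prefix_map=PREFIX_MAP):
    """Expand 'prefix:local_id' into a URL using the prefix map."""
    # Split on the first colon only, since local IDs may contain colons.
    prefix, sep, local_id = curie.partition(":")
    if not sep or not local_id:
        raise ValueError(f"not a compact identifier: {curie!r}")
    template = prefix_map.get(prefix.lower())
    if template is None:
        raise KeyError(f"unregistered prefix: {prefix!r}")
    return template.format(local_id)


print(expand_curie("doi:10.7717/peerj.4375"))
# prints https://doi.org/10.7717/peerj.4375
```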
Any time you encounter a web service using a “remote data access” style, i.e. exposing a query language via a single access point (SQL, SPARQL, GraphQL, MongoDB’s query language, etc.), it’s highly likely that all entity identifiers are indexed to support efficient retrieval and combination/joining.
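The following sketch uses SQLite (via Python's standard sqlite3 module) to show what an indexed identifier column buys you. The ids table, its columns, and the index name idx_ids_lookup are assumptions made up for this example, not the schema of any service mentioned here. It checks the query plan for an identifier lookup and then uses a self-join to collect every identifier that shares a canonical ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ids ("
    "canonical_id TEXT NOT NULL, id_type TEXT NOT NULL, value TEXT NOT NULL)"
)
# Index the (id_type, value) pair so lookups avoid a full table scan.
conn.execute("CREATE INDEX idx_ids_lookup ON ids (id_type, value)")
conn.executemany(
    "INSERT INTO ids VALUES (?, ?, ?)",
    [
        ("W2741809807", "openalex", "https://openalex.org/W2741809807"),
        ("W2741809807", "doi", "https://doi.org/10.7717/peerj.4375"),
    ],
)

lookup = "SELECT canonical_id FROM ids WHERE id_type = ? AND value = ?"
params = ("doi", "https://doi.org/10.7717/peerj.4375")

# The last column of each EXPLAIN QUERY PLAN row describes the access path.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + lookup, params)]
canonical = conn.execute(lookup, params).fetchone()[0]

# Self-join: from one identifier, fetch all identifiers of the same entity.
siblings = conn.execute(
    "SELECT b.id_type, b.value FROM ids AS a "
    "JOIN ids AS b ON a.canonical_id = b.canonical_id "
    "WHERE a.id_type = ? AND a.value = ? ORDER BY b.id_type",
    params,
).fetchall()

print(plan)
print(canonical)
print(siblings)
```

The printed plan should mention idx_ids_lookup rather than a scan of the whole table, which is the practical meaning of "indexed to support efficient retrieval" in this setting.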