Modeling metaclasses #2454

Open
cthoyt opened this issue Sep 27, 2023 · 1 comment
Labels
ontology metadata Issues related to ontology metadata

Comments

@cthoyt
Collaborator

cthoyt commented Sep 27, 2023

This is an epic issue about how to model metaclasses in ontologies.

Charlie's original question

I asked a question on the OBO Foundry Slack, but I think it would be valuable to duplicate it here:

Do we have a standard pattern for modeling metaclass relationships? For example, say I wanted to create an ontology that contained both gene families and actual genes (I'm not going to do this since HGNC exists, so this is just for the sake of argument). I have children of my parent "Gene" class, which consist of some gene families and other groupings that all have "is a" relationships between them. Eventually I have an actual specific gene, which has an "is a" relationship to one or more gene families. How do I make it clear which terms in my ontology are grouping terms and which ones are bottom-level genes? I guess I can look at which terms have no child "is a" relations, but this can also break in certain edge cases.
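To make the edge case concrete, here is a minimal sketch of the "no child is-a relations" heuristic in plain Python. The toy hierarchy and the `leaf_terms` helper are hypothetical and only illustrate the idea; they are not taken from any real ontology.

```python
# Hypothetical toy hierarchy as (child, parent) "is a" edges.
IS_A = [
    ("kinase gene family", "Gene"),
    ("tyrosine kinase gene family", "kinase gene family"),
    ("EGFR", "tyrosine kinase gene family"),
    ("ERBB2", "tyrosine kinase gene family"),
    # A grouping term for which no member genes have been added yet.
    ("orphan gene family", "Gene"),
]


def leaf_terms(edges):
    """Return every term that is never the parent in an "is a" edge."""
    terms = {term for edge in edges for term in edge}
    parents = {parent for _child, parent in edges}
    return terms - parents


print(sorted(leaf_terms(IS_A)))
```

The result contains EGFR and ERBB2, but also "orphan gene family": a grouping term with no children looks exactly like a bottom-level gene, so the heuristic alone cannot separate the two kinds of term.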

More generally, I guess the question is how do I denote which classes are metaclasses and which ones are regular classes? And does this question even make sense, since most modeling is pretty meta anyway?

One thing I thought of was to model the relationships between gene families and genes not with "is a" but with another relationship like "part of" or "component of". Maybe this works sometimes, but I think there might be situations where this won't make sense either.

Another example: NCBITaxon has some well-defined "ranks", which are annotated with a has_rank relation.

I'm also going to copy @cmungall's response here, since this probably warrants at least a blog post from his side (I hope!) and potentially some documentation into OBO Foundry best practices. Here's what he said:

There are two questions here: what property to use to link classes to metaclasses, and what vocabulary to use for the metaclasses themselves. I’ll stick to the former for now. This is all assuming you want to keep modeling as classes at all (which is not a foregone conclusion; many people find OBO unintuitive here). Here is how it is currently done:

  • PRO uses rdfs:comment and a tacit controlled vocabulary of string values. See this PRO issue: Use a standard vocabulary and annotation property for indicating metaclass PROconsortium/PRoteinOntology#150
  • As you point out, NCBITaxon uses its own has_rank annotation property, and its values are URIs sneakily introduced into the main NCBITaxon namespace itself
  • CHEBI absolutely needs to use metaclasses, as metaclass thinking permeates chemical classification (e.g. “species”), and confusion about levels lies at the heart of many CHEBI problems. See chemrof slides
  • We should probably have these in GO MF in order to distinguish reaction entities (EC level 4) from reaction classes (EC levels 3 and below). For now you can hack this with oboInOwl:hasDbXref
  • Many ontologies use Design Patterns (all ontologies should). These are essentially metaclasses. dcterms:conformsTo is used to connect OBO classes to “metaclasses” (but the object URIs are not well standardized here)
  • Mondo is likely to have a better way of distinguishing disease “entities” from “classes” in the future. You can sort of get these by oboInOwl:subset tags now but this is not a good mechanism
  • In Monarch we use biolink:category to link classes to biolink metaclasses
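Several of the mechanisms above share one shape: an annotation on a class whose value names its metaclass. The following is a rough sketch of that shape in plain Python. The property names are the ones mentioned in the list, but all `EX:` identifiers, the example values, and the `metaclass_assertions` helper are hypothetical. PRO's rdfs:comment convention is left out because it would need string parsing of the comment text.

```python
from collections import defaultdict

# Properties named in the list above. Treating them as interchangeable
# "class -> metaclass" links is an assumption of this sketch.
METACLASS_PROPERTIES = (
    "has_rank",
    "dcterms:conformsTo",
    "oboInOwl:subset",
    "biolink:category",
)

# Hypothetical annotations as (subject, property, value) tuples.
ANNOTATIONS = [
    ("EX:taxon1", "has_rank", "EX:species"),
    ("EX:disease1", "oboInOwl:subset", "EX:disease_entity_subset"),
    ("EX:gene1", "biolink:category", "biolink:Gene"),
    ("EX:gene1", "rdfs:label", "example gene"),  # not a metaclass link
]


def metaclass_assertions(annotations, properties=METACLASS_PROPERTIES):
    """Group metaclass-like annotation values by the class they annotate."""
    result = defaultdict(set)
    for subject, prop, value in annotations:
        if prop in properties:
            result[subject].add(value)
    return dict(result)


for term, metaclasses in sorted(metaclass_assertions(ANNOTATIONS).items()):
    print(term, sorted(metaclasses))
```

A consumer that wants one uniform view currently has to know this list of properties up front, which is the cost of each ontology choosing its own.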

This is a mess. Most people don’t realize it’s a mess because they don’t realize there is a common shape to what is currently a lot of bespoke hacks. This becomes quite pressing for things like genes, where there is clearly a distinction between gene entities as denoted by HGNC identifiers and gene classes, yet the OBO approach lacks a formal way to distinguish between the likes of “eukaryotic gene” and “sonic hedgehog gene upstream of a specifically modified region of DNA in the epidermal cell of my left pinky”.
I would advocate for biolink:category or an analogous property. It’s simple yet theoretically sound, and it avoids many metamodeling pitfalls (unfortunately W3C standards have a lot of unnecessary traps for us here, and while annoying, they can’t be ignored).
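As an illustration of the category approach, here is a small sketch that stores an explicit category annotation next to the "is a" hierarchy and queries it directly. Triples are plain Python tuples of CURIE strings; the `EX:` identifiers and the `terms_in_category` helper are hypothetical, and using biolink:Gene and biolink:GeneFamily as the category values is an assumption made for the example.

```python
# (subject, predicate, object) triples with hypothetical EX: identifiers.
TRIPLES = [
    # "is a" hierarchy: a gene under a family under the root class.
    ("EX:family1", "rdfs:subClassOf", "EX:gene_root"),
    ("EX:gene1", "rdfs:subClassOf", "EX:family1"),
    # A family that has no member genes yet, so it is a leaf.
    ("EX:family2", "rdfs:subClassOf", "EX:gene_root"),
    # Explicit category annotations, independent of the hierarchy.
    ("EX:family1", "biolink:category", "biolink:GeneFamily"),
    ("EX:family2", "biolink:category", "biolink:GeneFamily"),
    ("EX:gene1", "biolink:category", "biolink:Gene"),
]


def terms_in_category(triples, category, predicate="biolink:category"):
    """Return the subjects annotated with the given category value."""
    return {s for s, p, o in triples if p == predicate and o == category}


print(sorted(terms_in_category(TRIPLES, "biolink:Gene")))
print(sorted(terms_in_category(TRIPLES, "biolink:GeneFamily")))
```

Because the category is asserted rather than inferred from position in the hierarchy, the childless family EX:family2 is still reported as a family and not mistaken for a gene.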

Other Resources

@cmungall
Contributor

Further thoughts in this slide deck

Note that this is an active area of research, and it's easy to get into modeling muddles here. See, for example, this paper:
https://www.semantic-web-journal.net/system/files/swj3480.pdf

I contacted the authors and they are interested in our use case. One of them is speaking at this conference on metaclass modeling next week:
https://jku-win-dke.github.io/MULTI2023/

They also said that in their Wikidata analysis "gene" was the most problematic concept :-)


3 participants