How to tell if a prefix is for an ontology, database, etc.? #926
Comments
Hi @webermn, thanks for the comment. The primary goal of the Bioregistry is to index semantic spaces. Often, this corresponds 1-to-1 with ontologies or databases, but this isn't always the case. A good counterexample is the Uber Anatomy Ontology (UBERON), which has both the eponymous semantic space for anatomical entities (uberon) and an additional one for properties (ubprop). Similarly, you can see that the KEGG database results in many different semantic spaces (https://bioregistry.io/kegg.compound, https://bioregistry.io/kegg.disease, etc.).

In many ways, the difference between what's an "ontology" and what's a "database" is merely the curation philosophy of the maintainers and the kind of export format they use. For example, people typically consider HGNC a database, but it can also be easily dumped to an ontology. You can find many examples of "databases" that are dumped as "ontologies" here: https://github.com/biopragmatics/obo-db-ingest. See further discussion about databases to ontologies at https://docs.google.com/presentation/d/1aySEHTgkags7UPJYHyvQ9frYvAIqr1G5A3u7dGF26Y4 and OBOFoundry/OBOFoundry.github.io#1981.

Where available, the Bioregistry keeps track of links to ontology artifacts, so checking for those links could be a simple way to tell which prefixes are distributed as ontologies. In https://bioregistry.io/api/registry/uberon?format=json, you can find three links to ontology artifacts:

Similarly, the Python API allows for getting these:

>>> import bioregistry as br
>>> br.get_obo_download("uberon")
'http://purl.obolibrary.org/obo/uberon.obo'

From what you wrote, I think you just want a way of filtering the registry page from the GUI. If you can better describe why you want to do this, we might be able to work towards a solution.
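To make the "links to ontology artifacts" idea concrete, here is a minimal sketch of splitting a list of prefixes by whether any artifact link is available. The helper `split_by_artifact` and the stand-in lookup `fake_get_obo_download` are hypothetical names made up for illustration, and the lookup table is toy data so the sketch runs offline. With the real package you would presumably pass `br.get_obo_download` (shown above) instead of the stand-in.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# A getter maps a prefix to an artifact URL, or None when no link is known.
Getter = Callable[[str], Optional[str]]


def split_by_artifact(
    prefixes: Iterable[str], getters: Iterable[Getter]
) -> Tuple[List[str], List[str]]:
    """Partition prefixes into (has at least one artifact link, has none)."""
    getters = list(getters)
    with_artifact: List[str] = []
    without_artifact: List[str] = []
    for prefix in prefixes:
        if any(getter(prefix) for getter in getters):
            with_artifact.append(prefix)
        else:
            without_artifact.append(prefix)
    return with_artifact, without_artifact


# Toy data (assumption): only "uberon" has an OBO link in this stand-in table.
_FAKE_OBO = {"uberon": "http://purl.obolibrary.org/obo/uberon.obo"}


def fake_get_obo_download(prefix: str) -> Optional[str]:
    """Hypothetical offline stand-in for br.get_obo_download."""
    return _FAKE_OBO.get(prefix)


with_artifact, without_artifact = split_by_artifact(
    ["uberon", "kegg.compound"], [fake_get_obo_download]
)
print(with_artifact, without_artifact)
```

Note that a missing link only means the registry has no artifact recorded for that prefix. It is a proxy for "distributed as an ontology", not a statement about what the resource is.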
Thank you, @cthoyt, for the helpful response. The examples you provided on "dumping" a database to an ontology, as well as Chris's slides and some of your blog posts I came across, were great context. I think what I am ultimately trying to do/find might not be the purpose of the Bioregistry resource. It may be the purpose of other resources, and I'm just not aware. I'll try to explain:
Again, this may not really be the purpose of the Bioregistry and could instead be something other resources are attempting to achieve. That said, my logic is that the simpler it is for the average person to navigate resources like the Bioregistry and see how this might help inform things like DMSPs, the more people will become comfortable with these important curation concepts and engage with this community, which I ultimately think will better enable biomedical research. (Please feel free to close this issue as a "Won't Do"!)
@webermn wow, that's really helpful, thanks for writing such a detailed response. As you mentioned, there are lots of repositories that cover various resources with different metadata standards related to their different scopes. Bioregistry actually already imports and aligns with re3data, FAIRsharing, and several others (listed here). These two specific examples both have different focuses than Bioregistry, but are nonetheless valuable for aligning. I will look into whether this can be done with DataCite as well (again, even though the goal of that resource is different than Bioregistry's).

Overall, I think using Bioregistry as a resource to help write DMPs is a great idea. I would love to see people taking a principled approach to what kinds of PIDs they use and especially making sure they're written/stored/communicated in a standard way.

Another registry that has a similar focus to Bioregistry (and is also itself standardized then incorporated in the Bioregistry) is https://registry.bio2kg.org/. They have a really nice navigation/faceted search that I would like to replicate and that would address some of your concerns.

Additional context: we are likely going to get some dedicated funding for the Bioregistry soon, and addressing your use cases would be a great use of this opportunity. Do you think you would be able to meet next week or in the near future to discuss further? Feel free to send an email to cthoyt@gmail.com (I'm on European time).
Noob question with probably a simple answer staring me in the face, but how can I determine which registry entries are formal ontologies vs. controlled vocabularies vs. databases/repositories vs. [other]?
I have been looking for a 'type' attribute (or equivalent) for this, but my primary workaround has been to use the search button to narrow the list by the terms above. Being able to filter on 'type' would be helpful for me — if not in the GUI, then as an attribute that could be parsed from the JSON/YAML/TSV export.
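For illustration, this is roughly the kind of parsing I have in mind for a JSON export. Since there is no 'type' attribute that I can find, the sketch falls back on which ontology artifact formats a record lists. The key names `download_obo`, `download_owl`, and `download_json`, as well as the two sample records, are assumptions made up for the example and may not match the real export.

```python
import json

# Hypothetical excerpt of a registry export; real records and key names may differ.
EXPORT = """
{
  "uberon": {
    "name": "Uber Anatomy Ontology",
    "download_obo": "http://purl.obolibrary.org/obo/uberon.obo"
  },
  "kegg.compound": {
    "name": "KEGG Compound"
  }
}
"""

# Assumed keys under which ontology artifact links would be stored.
ARTIFACT_KEYS = ("download_obo", "download_owl", "download_json")


def artifact_formats(record: dict) -> list:
    """Return the artifact formats (e.g. 'obo') for which the record has a link."""
    return [key.split("_", 1)[1] for key in ARTIFACT_KEYS if record.get(key)]


registry = json.loads(EXPORT)
summary = {prefix: artifact_formats(record) for prefix, record in registry.items()}

for prefix, formats in summary.items():
    label = ", ".join(formats) if formats else "no ontology artifact listed"
    print(f"{prefix}: {label}")
```

An explicit 'type' field would still be easier to filter on than inferring it this way.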