Epic: GREI 2 Task 1 - Publish benchmarks for amount of Harvard Dataverse ORCIDs, ROR IDs, and related research object metadata #225

Open
cmbz opened this issue Apr 10, 2024 · 1 comment
Labels
GREI Year 4 Year 4 GREI task GREI 2 Consistent Metadata GREI 2.1 GREI 2: Task 1

cmbz commented Apr 10, 2024

These benchmarks will help us determine if more ORCIDs, ROR IDs, and related research object metadata are being published as a result of our implementation of version 1 of the GREI Metadata Recommendations.

ORCIDs
In the 12 months before we make the relevant changes to Dataverse, we'll determine what percentage of author metadata published in Harvard Dataverse includes ORCIDs. This counts both "valid" ORCID metadata, where "ORCID" is chosen for the Identifier Type and what's entered in the Identifier field follows the xxxx-xxxx-xxxx-xxxx format, and "invalid" ORCID metadata.

For the 12 months after we've made the changes, which will let depositors include ORCIDs for other people associated with deposits, such as other types of contributors, we'll determine this percentage again to see whether it has increased compared to the previous 12-month period.

For example, using metadata collected in August 2023 from most known Dataverse installations (https://doi.org/10.7910/DVN/8FEGUV), we can determine that of all of the author metadata that Harvard Dataverse published in 2022, 36.5 percent includes an ORCID. The remaining author metadata may include no identifiers or may include other types of identifiers for people or organizations.

The Jupyter notebook shows how we got this metric. The CSV file produced by that notebook lists the same metric for many of the other Dataverse installations that the community knew of in August 2023.
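As a rough illustration of the ORCID metric described above (not the notebook's actual code), here is a minimal Python sketch. The record layout and the field names `identifier_type` and `identifier` are assumptions made for this example, as are the function names. The pattern allows a trailing `X`, since the last character of an ORCID iD is a checksum that can be `X`.

```python
import re

# Four hyphen-separated groups of four characters; the final
# checksum character may be a digit or "X".
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def classify_author(record):
    """Return 'valid', 'invalid', or 'none' for one author metadata record.

    `record` is a hypothetical dict with 'identifier_type' and
    'identifier' keys; real exports may be shaped differently.
    """
    if record.get("identifier_type") != "ORCID":
        return "none"
    identifier = (record.get("identifier") or "").strip()
    return "valid" if ORCID_PATTERN.fullmatch(identifier) else "invalid"


def orcid_percentage(records):
    """Percentage of author records that include ORCID metadata.

    Both 'valid' and 'invalid' ORCID metadata count toward the total,
    matching the description in this issue.
    """
    if not records:
        return 0.0
    with_orcid = sum(1 for r in records if classify_author(r) != "none")
    return 100 * with_orcid / len(records)
```

Running `orcid_percentage` once over author records published in the 12 months before the changes and once over the 12 months after would give the two figures to compare.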

ROR IDs
Harvard Dataverse and most known installations of Dataverse do not record ROR IDs.

For the 12 months before and the 12 months after we've made changes to Dataverse that let it record ROR IDs, we'll determine what percentage of metadata about organizations associated with published deposits includes ROR IDs.

Related research object metadata
In the 12 months before we make the relevant changes to Dataverse, we'll determine what percentage of deposits published in Harvard Dataverse include the DOIs of related research articles (that is, DOIs in a Related Publication field).

For the 12 months after we've made the changes, which will also let depositors include DOIs for other types of related research objects, such as related datasets, we'll determine this percentage again to see whether it has increased compared to the previous 12-month period.
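The before-and-after comparison for related publication DOIs could be sketched along these lines. This is only an illustration under assumptions: the deposit layout, the keys `published`, `related_publications`, `id_type`, and `id_number`, and the function names are all hypothetical, and the window boundaries would be set from whatever date the changes actually go live.

```python
from datetime import date


def has_related_doi(deposit):
    """True if any related publication entry carries a non-empty DOI."""
    return any(
        (pub.get("id_type") or "").lower() == "doi"
        and bool((pub.get("id_number") or "").strip())
        for pub in deposit.get("related_publications", [])
    )


def percent_with_related_doi(deposits, start, end):
    """Percentage of deposits published in [start, end) with a related DOI."""
    in_window = [d for d in deposits if start <= d["published"] < end]
    if not in_window:
        return 0.0
    return 100 * sum(has_related_doi(d) for d in in_window) / len(in_window)


def change_in_percentage_points(deposits, before_start, change_date, after_end):
    """Difference between the 'after' and 'before' percentages."""
    before = percent_with_related_doi(deposits, before_start, change_date)
    after = percent_with_related_doi(deposits, change_date, after_end)
    return after - before
```

A positive result from `change_in_percentage_points` would suggest that a larger share of deposits included related DOIs after the changes, though it would not by itself show that the changes caused the increase.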

Year 4 task


cmbz commented Apr 10, 2024

2024/04/10

  • Create epic for planned Year 4 work

@cmbz cmbz added the GREI Year 4 Year 4 GREI task label May 20, 2024