As a researcher, Dataverse admin or curator, I want more information about my data sent to DataCite so that it's more discoverable #2917

Closed
posixeleni opened this issue Feb 4, 2016 · 30 comments
Labels
Feature: Metadata
Type: Feature (a feature request)
User Role: Superuser (has access to the superuser dashboard and cares about how the system is configured)

Comments

@posixeleni
Contributor

posixeleni commented Feb 4, 2016

To expand on the DataCite DOI support we are adding in #24, we will also send some additional DataCite fields, which will help make our datasets more discoverable in DataCite's index.

For example, this is what we currently send:

<?xml version="1.0" encoding="UTF-8"?>
<resource
	xmlns="http://datacite.org/schema/kernel-3"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://datacite.org/schema/kernel-3
    http://schema.datacite.org/meta/kernel-3/metadata.xsd">
	<identifier identifierType="DOI">10.7910/DVN/29606</identifier>
	<creators>
		<creator>
			<creatorName>Blackwell, Matthew, Honaker, James, King, Gary</creatorName>
		</creator>
	</creators>
	<titles>
		<title>Replication data for: A Unified Approach To Measurement Error And Missing Data: Overview</title>
	</titles>
	<publisher>:unav</publisher>
	<publicationYear>2015</publicationYear>
	<resourceType resourceTypeGeneral="Text"/>
</resource>

This will need to be updated to the following, which includes metadata suggested by Martin Fenner from DataCite and by the Helmsley project (add contributors).

<?xml version="1.0" encoding="UTF-8"?>
<resource
	xmlns="http://datacite.org/schema/kernel-3"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://datacite.org/schema/kernel-3 http://schema.datacite.org/meta/kernel-3/metadata.xsd">
	<identifier identifierType="DOI">10.7910/DVN/XYZ02</identifier>
	<creators>
		<creator>
			<creatorName>Castro, Eleni</creatorName>
			<nameIdentifier schemeURI="http://orcid.org/" nameIdentifierScheme="ORCID">0000-0001-9767-8536</nameIdentifier>
			<affiliation>IQSS</affiliation>
		</creator>
		<creator>
			<creatorName>Barbosa, Sonia</creatorName>
		</creator>
	</creators>
	<titles>
		<title>Replication data for: Testing out DataCite</title>
	</titles>
	<publisher>Harvard Dataverse</publisher>
	<publicationYear>2015</publicationYear>
	<resourceType resourceTypeGeneral="Dataset"/>
	<descriptions>
		<description descriptionType="Abstract">It was a good idea to try testing out this metadata before implementation
    </description>
	</descriptions>
	<contributors>
		<contributor contributorType="ProjectLeader">
			<contributorName>Starr, Joan</contributorName>
			<affiliation>California Digital Library</affiliation>
		</contributor>
	</contributors>
</resource>
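
A minimal sketch of how a record like the one above could be generated, written in Python purely for illustration (it is not the Dataverse implementation, which is Java). The `build_datacite_xml` function and the shape of the `dataset` dict are hypothetical; only the element and attribute names come from the XML in this comment.

```python
import xml.etree.ElementTree as ET

DATACITE_NS = "http://datacite.org/schema/kernel-3"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
SCHEMA_LOCATION = DATACITE_NS + " http://schema.datacite.org/meta/kernel-3/metadata.xsd"


def build_datacite_xml(dataset):
    """Serialize a plain dict (hypothetical shape) to DataCite kernel-3 XML."""
    # Emit the DataCite namespace as the default namespace, as in the sample.
    ET.register_namespace("", DATACITE_NS)
    ET.register_namespace("xsi", XSI_NS)

    def el(parent, tag, text=None, **attrs):
        # Helper: append a namespaced child element with optional text.
        child = ET.SubElement(parent, "{%s}%s" % (DATACITE_NS, tag), attrs)
        child.text = text
        return child

    root = ET.Element("{%s}resource" % DATACITE_NS,
                      {"{%s}schemaLocation" % XSI_NS: SCHEMA_LOCATION})
    el(root, "identifier", dataset["doi"], identifierType="DOI")

    # One <creator> per author, instead of one concatenated creatorName.
    creators = el(root, "creators")
    for person in dataset["creators"]:
        creator = el(creators, "creator")
        el(creator, "creatorName", person["name"])
        if person.get("orcid"):
            el(creator, "nameIdentifier", person["orcid"],
               schemeURI="http://orcid.org/", nameIdentifierScheme="ORCID")
        if person.get("affiliation"):
            el(creator, "affiliation", person["affiliation"])

    titles = el(root, "titles")
    el(titles, "title", dataset["title"])
    el(root, "publisher", dataset["publisher"])
    el(root, "publicationYear", str(dataset["year"]))
    el(root, "resourceType", resourceTypeGeneral="Dataset")

    if dataset.get("description"):
        descriptions = el(root, "descriptions")
        el(descriptions, "description", dataset["description"],
           descriptionType="Abstract")

    if dataset.get("contributors"):
        contributors = el(root, "contributors")
        for item in dataset["contributors"]:
            contributor = el(contributors, "contributor",
                             contributorType=item["type"])
            el(contributor, "contributorName", item["name"])
            if item.get("affiliation"):
                el(contributor, "affiliation", item["affiliation"])

    return ET.tostring(root, encoding="unicode")


# Values copied from the proposed record above.
example = {
    "doi": "10.7910/DVN/XYZ02",
    "creators": [
        {"name": "Castro, Eleni", "orcid": "0000-0001-9767-8536",
         "affiliation": "IQSS"},
        {"name": "Barbosa, Sonia"},
    ],
    "title": "Replication data for: Testing out DataCite",
    "publisher": "Harvard Dataverse",
    "year": 2015,
    "description": "It was a good idea to try testing out this metadata before implementation",
    "contributors": [
        {"type": "ProjectLeader", "name": "Starr, Joan",
         "affiliation": "California Digital Library"},
    ],
}
xml_text = build_datacite_xml(example)
print(xml_text)
```

Optional fields are written only when present, so a creator without an ORCID or affiliation still produces a valid `<creator>` entry.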

A complete mapping to DataCite will still need to be done, as described in #2774 and #2778, but that will require additional resources and time.

@pameyer
Contributor

pameyer commented Feb 8, 2016

One suggestion: allow for 'particular kind of dataset' (in addition to just "Dataset") and the "subject" tag.

If that belongs in a different issue (i.e., it's not relevant for this indexing), please let me know (or move it wherever it fits better).

@posixeleni
Contributor Author

Thanks for the suggestions, @pameyer. We are tracking which mappings we can support for the first phase here: https://docs.google.com/spreadsheets/d/1uADPbtVUEIXz5phtThjxU6gkAg-5jojxSlHtF959lEU/edit?usp=sharing

@posixeleni
Contributor Author

posixeleni commented Mar 2, 2016

This feature needs to be enabled to send metadata to both:

  • EZID API
  • DataCite MDS

@posixeleni posixeleni removed their assignment Jun 21, 2016
@pdurbin pdurbin assigned bmckinney and unassigned bmckinney Oct 6, 2016
@djbrooke
Contributor

djbrooke commented Oct 7, 2016

@pdurbin @bmckinney

Going to move this out of In Progress for now, feel free to re-add the label and an owner if this is incorrect. Thanks!

@pdurbin
Member

pdurbin commented Oct 7, 2016

@djbrooke sorry, I should have left a comment. On Thursday I said I'd created an issue called something like "DataCite XML: send more fields on published". Then I realized that this issue already exists and @pameyer has already commented on it. @bmckinney and I discussed it and he agreed he'd work on it over the next two weeks or so, so I assigned it to him and put it in "Development" in https://waffle.io/IQSS/dataverse . My understanding is that it's relatively low effort but I'll defer to @bmckinney to estimate it after talking more with @pameyer about his requirements. I'm fine with whatever Waffle column but I believe this is one of the issues that @pameyer would say is a show stopper for DNS cutover day, when https://data.sbgrid.org switches to being powered by Dataverse. I just created a GitHub label called "SBGrid" and applied it to this issue so that I can find it again.

@djbrooke
Contributor

djbrooke commented Oct 7, 2016

Sounds good - feel free to move it over when work begins on it!

@pdurbin
Member

pdurbin commented Oct 13, 2016

@bmckinney is working on this so I moved it to Development in Waffle.

@pdurbin
Member

pdurbin commented Oct 13, 2016

@sekmiller
Contributor

@bmckinney The action happens in the DOIDataCiteServiceBean, specifically in getUpdateMetadataFromDataset and getMetadataFromStudyForCreateIndicator. (There could be a little cleanup done here to reduce repeated code.) There are methods on the DatasetVersion object that return specific field values, which you can use as a pattern for the methods needed for the additional metadata. See getTitle() and getDescription().

@pdurbin
Member

pdurbin commented Oct 21, 2016

@scolapasta and @bmckinney discussed this issue fairly extensively yesterday in an SBGrid sprint planning meeting and it sounded like the decision was that @bmckinney will beef up the existing class rather than adding a new one. I'm happy to be corrected if I misunderstood.

@djbrooke djbrooke added this to the 4.8 - Large Data Upload Integration milestone Dec 20, 2016
@djbrooke
Contributor

djbrooke commented Jan 3, 2017

Hi @bmckinney - we have this marked as "In Progress" - feel free to link to any branches or other work that explains the current state of this, or just move it back to "Ready" if it's not started. Thanks!

@djbrooke
Contributor

@jggautier - do you think we should close this one? Is this work covered in #4318?

@jggautier
Contributor

jggautier commented Jan 10, 2018

I've always thought this issue was about sending DataCite more metadata in the DataCite schema. #4318 is about making it harvestable with OAI-PMH. It would be ideal if the same metadata in DataCite schema is harvestable over OAI-PMH and sent to DataCite on dataset publish.

One question I have is: Will the same metadata be harvestable and sent to DataCite?
It seems it wouldn't be, based on discussion in today's sprint planning.

So my other question is: Will the development work to make DataCite metadata available in both ways be easy enough that the effort can be part of #4318? Or should there be more than one (one issue to make it harvestable, another to send to DataCite)?

@jggautier
Contributor

jggautier commented Jan 11, 2018

To keep the scope of the other issue, #4318, focused on providing value to @LauraHuisintveld at DANS, @shlake at UVA and others interested in harvesting using the DataCite metadata format, let's keep this issue, which is about sending more metadata to DataCite, separate. I can rename the title to make it clear that it's about sending additional metadata to DataCite.

Thanks to @sekmiller and @pameyer who helped me understand a little better how the metadata is sent to DataCite versus made available through OAI-PMH. Seems like #4318 gets Dataverse closer to sending more metadata.

There's more mapping to do, either as part of this issue or an issue following this one, as part of a broader discussion about improving the connections between data and publications to take advantage of services like the Scholarly Link eXchange framework (related to #2778).

@jggautier jggautier changed the title from "Make available additional DataCite Metadata" to "As a researcher and admin, I want to send to DataCite additional metadata so that my data is more discoverable" Jan 11, 2018
@jggautier jggautier changed the title from "As a researcher and admin, I want to send to DataCite additional metadata so that my data is more discoverable" to "As a researcher, Dataverse admin or curator, I want more information about my data sent to DataCite so that it's more discoverable" Jan 11, 2018
@pdurbin
Member

pdurbin commented Jan 11, 2018

I got the impression from today's discussion that Dataverse sends data to DataCite on publish when either EZID or DataCite is used as a persistent ID provider. What about Handles? Is data sent to DataCite on publish when Handles are used rather than DOIs?

@pdurbin
Member

pdurbin commented Jul 11, 2018

What about Handles? Is data sent to DataCite on publish when Handles are used rather than DOIs?

I assume that data is sent to DataCite on publish for Handles too.

Mostly I'm leaving a comment here because #4782 which is in flight is also about sending more data to DataCite.

Also, code has been written for this issue #2917 at https://github.com/sbgrid/sbgrid-dataverse/blob/feature/datacite-xml/mod-sbgrid/src/main/java/edu/harvard/iq/dataverse/export/datacite/DataciteDataModel.java but it seems to be more in the context of export. A flavor of DataCite export has been implemented in pull request #4664 for #4257.

@bencomp
Contributor

bencomp commented Jul 11, 2018

DataCite doesn't register Handles, as far as I know – the local Handle server does. Metadata is a requirement for getting DOIs for datasets from DataCite, but I don't think they would accept metadata for datasets with Handles.

@pdurbin
Member

pdurbin commented Jul 11, 2018

@bencomp that makes sense. Thanks!

@jggautier
Contributor

jggautier commented Sep 14, 2018

When #5029 is released, Dataverse will know whether creators are people or organizations, and can include DataCite's "nameType" attribute. (See Martin Fenner's comment about nameTypes.)
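
As a rough illustration of what that could look like, here is a small Python sketch (not Dataverse code) that sets the attribute on `creatorName`. It assumes the kernel-4 schema, where `nameType` was introduced in version 4.1 with the values `Personal` and `Organizational`; the kernel-3 records earlier in this thread have no such attribute. The `creator_element` helper and its `is_organization` flag are hypothetical, as is the use of IQSS as an organizational creator.

```python
import xml.etree.ElementTree as ET

# Assumption: kernel-4 namespace, since nameType is not part of kernel-3.
DATACITE_4_NS = "http://datacite.org/schema/kernel-4"


def creator_element(name, is_organization):
    """Build a <creator> whose <creatorName> carries a nameType attribute."""
    creator = ET.Element("{%s}creator" % DATACITE_4_NS)
    creator_name = ET.SubElement(creator, "{%s}creatorName" % DATACITE_4_NS)
    creator_name.text = name
    # DataCite's controlled values for nameType.
    creator_name.set("nameType",
                     "Organizational" if is_organization else "Personal")
    return creator


person = creator_element("Castro, Eleni", is_organization=False)
organization = creator_element("IQSS", is_organization=True)
print(ET.tostring(person, encoding="unicode"))
print(ET.tostring(organization, encoding="unicode"))
```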

@cmbz

cmbz commented Sep 9, 2024

Closing because it will be addressed by #5889

@cmbz cmbz closed this as not planned Sep 9, 2024
@pdurbin
Member

pdurbin commented Sep 17, 2024

Yes, this PR specifically:

Projects
Status: Done
Development

No branches or pull requests

9 participants