Align or merge DataCite metadata exports #5889

jggautier · 2019-05-28T18:17:53Z

This issue is meant to record the differences between Dataverse's two newest metadata exports as of v4.14, "DataCite"/"Datacite" and "OpenAIRE"/"oai_datacite", and discussion about how to align (or possibly merge) the very similar exports.

As part of v4.10 (released in Dec. 2018), Dataverse makes available through the UI, API and over OAI-PMH dataset metadata in the DataCite schema (#5043). This lets Dataverse export dataset metadata in a widely-used, discipline-agnostic schema that's more standardized than Schema.org and has more metadata than Dublin Core.

As part of v4.14 (released in May 2019), Dataverse makes available through the UI, API and over OAI-PMH DataCite metadata that complies with OpenAIRE requirements (#4257). Repositories need to follow these requirements in order for their dataset metadata to be made discoverable (harvested) by OpenAIRE (OpenAIRE EXPLORE). The OpenAIRE metadata requirements follow the DataCite schema, with some differences between OpenAIRE and DataCite listed in their documentation.

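What both exports are called depending on the export method:

- UI: "DataCite" and "OpenAIRE"
- API: "Datacite" and "oai_datacite"
- OAI-PMH: "Datacite" and "oai_datacite"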

Both metadata exports are based on DataCite 4 and are meant to be valid against the DataCite 4 schema (although the xml records available over OAI-PMH in "Datacite" format reference DataCite's 3.1 schema). But Dataverse exports them as separate formats for several reasons:

The two metadata exports were worked on at different times by different groups
When work on making Dataverse OpenAIRE compliant started, I thought the OpenAIRE export would follow the DataCite 3.1 schema since the OpenAIRE guidelines for data repositories follow DataCite 3.1. And I knew that Dataverse would eventually export DataCite 4 metadata, so it made sense to make them separate exports. But we're told the OpenAIRE folks plan to update their guidelines, so our 4Science colleagues created the OpenAIRE export following the DataCite 4 schema. (For example, a notable difference between DataCite 3 and 4 is how funder information is handled. The OpenAIRE guidelines mandate that the contributorType property is used, which is how DataCite 3 handles funder info. But Dataverse's OpenAIRE export is using the DataCite 4 fundingReferences property instead.)
The "OpenAIRE" metadata export uses an algorithm that adds metadata about whether dataset authors and contact persons are people or organizations (in DataCite's nameType attribute). The algorithm was the last thing discussed in the OpenAIRE GitHub issue.
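To make the funder difference mentioned above concrete, here is a minimal sketch of the two shapes. It is an illustration only: the element and attribute names follow the DataCite schema, but the helper function names are made up for this example, namespaces are omitted, and this is not the code Dataverse uses.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def funder_as_contributor(funder_name: str) -> ET.Element:
    """DataCite 3 style: the funder is a contributor with contributorType="Funder"."""
    contributors = ET.Element("contributors")
    contributor = ET.SubElement(contributors, "contributor", contributorType="Funder")
    ET.SubElement(contributor, "contributorName").text = funder_name
    return contributors


def funder_as_funding_reference(funder_name: str, award_number: Optional[str] = None) -> ET.Element:
    """DataCite 4 style: the funder goes into a dedicated fundingReferences block."""
    references = ET.Element("fundingReferences")
    reference = ET.SubElement(references, "fundingReference")
    ET.SubElement(reference, "funderName").text = funder_name
    if award_number:
        # awardNumber is optional, so it is only written when a value exists
        ET.SubElement(reference, "awardNumber").text = award_number
    return references


# Hypothetical funder name and award number, used only to show the output shape
print(ET.tostring(funder_as_contributor("Example Funder"), encoding="unicode"))
print(ET.tostring(funder_as_funding_reference("Example Funder", "ABC-123"), encoding="unicode"))
```

The second form keeps the award number next to the funder name, which the contributor-based form has no dedicated place for.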

Ideally, Dataverse would export only one metadata record, made available through the UI, API and over OAI-PMH, that follows the DataCite schema and is also OpenAIRE compliant. As things stand, Dataverse exports two similar but not identical records based on DataCite, and people have been confused about the differences between the two exports, which are called "DataCite" and "OpenAIRE" in the UI and "Datacite" and "oai_datacite" in the API endpoints and over OAI-PMH.
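For anyone comparing the two records, the sketch below builds the request URLs for each format name. The format identifiers come from this issue; the `/api/datasets/export` and `/oai` paths are my understanding of the Dataverse endpoints and should be checked against the guides for your version. The server URL is a placeholder and the function names are hypothetical.

```python
from urllib.parse import urlencode

# UI label -> identifier used by the API and as the OAI-PMH metadataPrefix
EXPORT_FORMATS = {"DataCite": "Datacite", "OpenAIRE": "oai_datacite"}


def export_api_url(base_url: str, ui_name: str, persistent_id: str) -> str:
    """Build an export URL for one dataset (endpoint path is an assumption)."""
    query = urlencode({"exporter": EXPORT_FORMATS[ui_name], "persistentId": persistent_id})
    return f"{base_url}/api/datasets/export?{query}"


def oai_list_records_url(base_url: str, ui_name: str) -> str:
    """Build an OAI-PMH ListRecords URL (endpoint path is an assumption)."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": EXPORT_FORMATS[ui_name]})
    return f"{base_url}/oai?{query}"


# Placeholder server; substitute a real installation before making requests
server = "https://dataverse.example.edu"
for label in EXPORT_FORMATS:
    print(export_api_url(server, label, "doi:10.7910/DVN/0PMZC6"))
    print(oai_list_records_url(server, label))
```

Fetching both URLs for the same dataset and diffing the XML is a quick way to see where the two exports diverge.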

But we may want to maintain two metadata exports because:

the OpenAIRE export is using the nameType algorithm, which was tested during QA but only tested for evidence that the algorithm would work in at least some cases. We haven't tried to estimate how often it will correctly figure out if author/contact names of actual datasets are people or organizations (although it's based on an algorithm DataCite uses that we're told is right over 90% of the time). Would people want to be able to export or harvest metadata that does not include the nameType metadata (maybe because they find that it's not correct often enough)?
the OpenAIRE export uses one of four mandatory Access Rights terms. The rules that Dataverse uses to determine this are discussed in a GitHub issue comment. But I realized recently that the rules are too simple and lead to cases where datasets are marked as closedAccess when restricted access is more appropriate (e.g. https://doi.org/10.7910/DVN/0PMZC6, where file request is disabled, but people can request access through a process that happens outside of Dataverse). A GitHub issue about this has been opened (Access Rights metadata in OpenAIRE metadata export is being misapplied #5920), so we can figure out how to assign more appropriate access rights to datasets. Until then, would people want to be able to export or harvest metadata that does not include these sometimes misleading Access Rights?

We should decide if:

Dataverse should maintain one export or two and
If maintaining only one export, make sure that it has all of the metadata available in the current two exports.
If maintaining two exports, make sure that the amount of metadata in one export is as close to the same amount in the other (and continues to be as synced as possible) and document what the differences are. (As of v4.14 the "OpenAIRE" export has more metadata than the "DataCite" export but there are things missing in both.)

mheppler · 2021-02-10T15:49:37Z

Related? Silent publishing failure when not all fields required by Datacite are present #7551

jggautier · 2021-02-10T16:04:03Z

Good point. It could be related if/when Dataverse repositories start sending more metadata to DataCite and the dependencies among the child fields of any of that metadata are the same as the dependencies of the child fields in the Producer compound field (which right now is the only field causing those silent failures).

adam3smith · 2021-03-09T19:23:24Z

@qqmyers and I are also looking at this given that what we're currently sending to DataCite is indeed rather inadequate.
Looking at the Crosswalk Julian put together, it seems to me that the current OpenAIRE export is strictly better. The only field I'm seeing where DataCite has something and OpenAire doesn't is Name Identifier schemeURI and that's either just not documented or an oversight that should be fixed.

I'm not at all concerned about the naming algorithm. If anything, I think it's a good idea to try to guess organizational names.
I think the closed vs. restricted data categorization is something that should get addressed, but I don't see it as a blocker.

Given this, I think a single export format makes sense.

In terms of items missing from both exports, the citation metadata looks complete, but the individual subject blocks seem to have some stuff missing. From @philippconzett 's list at #7072 that's most notably the geography data, which we'd also like to capture.

We're viewing this as pretty high priority given how widely DataCite data are used (e.g. the fact that we're not linking up our funding information to the PID graph isn't great) -- is there anything we can do to help move this along?

djbrooke · 2021-03-09T19:34:05Z

Thanks @adam3smith.

@jggautier if we bring this into the sprint starting tomorrow, would you have some time over the next two weeks to get into this with a developer that picks it up? If not, the sprint starting in two weeks? I know you're spread a bit thin right now with the UI/UX work starting up, so if it makes sense to wait two weeks I think that's fine. What do you think?

poikilotherm · 2021-03-09T19:37:17Z

Is my #7077 related here, too? (Going to work on that, you folks know... Funding...)

djbrooke · 2021-03-09T19:39:49Z

@poikilotherm May be related, but I think we'd want to move these forward independently IMHO. I think much of the discussion around #7077 will happen as part of the Software Metadata WG.

adam3smith · 2021-03-09T19:42:29Z

Awesome! @jggautier -- I think you have this covered, but if there's anything you'd like another set of eyes on or a 2nd opinion just tag and/or email me.

jggautier · 2021-03-10T13:53:43Z

Thanks @adam3smith. Great to hear there's more interest in prioritizing this! I'm all on board with saving the closed vs. restricted data categorization problem (#5920) for another day if it moves this issue forward. I think there are a few other things we should consider:

Is there any reason why repositories wouldn't like the nameType algorithm? Is there a way to test how well it's been working generally and for certain types of names? @adam3smith or @qqmyers, would you happen to know how DataCite figured out that the algorithm they use works 90% of the time? Or should we include the nameType algorithm and later on, separate from this, figure out how well it's working?
When more metadata is sent to DataCite, we should make sure we don't run into the compound field dependency issues that the Producer metadata field had (discussed in Silent publishing failure when not all fields required by Datacite are present #7551). (For example, the OpenAIRE export deals with missing Related Publication metadata by including it only if certain fields are filled.)
The OpenAIRE export uses the IsCitedBy relationship when including metadata from the Related Publication field. We never really resolved how to use DataCite's relation terms (discussed in Add "Relation Type" to related publication metadata fields to send DataCite related publication metadata #2778). I think we could:
- Work out how to allow depositors to define different types of relationships between their datasets and related text-based publications (like articles) and/or make it easier for repositories to choose what types of relationships they want their depositors to use. This might involve UI changes.
- Decide with the Dataverse community which one relation term to use and expect people and other systems (harvesters, indexers, etc) to interpret that term very broadly (like "this publication is somehow related to this dataset"). Then I think this term could logically be applied to the Related Publication metadata in datasets that Dataverse repositories have already published.
- Decide with the Dataverse community to use one term that we define more narrowly (like "this dataset is cited by this publication"). But does it make sense to apply that term to the metadata of existing datasets? Not all repositories know what types of relationships their depositors had in mind when entering Related Publication metadata. I'd guess a majority of the time, the dataset is used to support findings/conclusions made in an article, but the article may not be citing the dataset. Could there be other reasons why a dataset is associated with something like a journal article? And will people and other systems ever care about/rely on the differences between the relationship types? (I think for MakeDataCount, the answer right now is no: when citations are counted, any one of several types of relation terms are valid because repositories are using the terms in different ways, so the standard's designers don't want to be too strict about which relation term or terms signal a "citation".)
- Not include Related Publication metadata in this new, merged DataCite metadata export and tackle Add "Relation Type" to related publication metadata fields to send DataCite related publication metadata #2778 separately.

adam3smith · 2021-03-10T14:44:40Z

Thanks Julian.

I have no insight into the Datacite algorithm for distinguishing between corporate and personal authors. Maybe @mfenner would be willing to chime in?
There are a number of the fields that only make sense as conditionals (e.g. all the scheme/identifier fields). The solution described in Custom Metadata: Allow Dataverse Installations to Define Conditionally Required Fields for Compound Fields #7606 looks good to me and would appear to solve this and seems to be scheduled to land in 5.4?
My view would be that we need to allow some more flexibility in related terms, which makes Add "Relation Type" to related publication metadata fields to send DataCite related publication metadata #2778 fairly complex (there's a reason it's been open for so long) and we should not let it block the low hanging fruits, i.e. go with Julian's last option and tackle it separately.

mfenner · 2021-03-10T14:51:24Z

Users can set Personal or Organizational authors via nameType. Otherwise DataCite is doing the following:

if there is an ORCID associated with the author, it is a person
if there is a givenName, it is a person
if the creatorName has something that looks like a givenName, and that givenName is in a dictionary of known given names (using https://github.com/berkmancenter/namae), it is a person. This is where the 90% comes from. The dictionary is not so good in non-European names, and there are organization names that contain a given name (e.g. "Alfred P. Sloan Foundation").
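The three rules above can be sketched roughly as follows. This is a simplified illustration, not DataCite's code: the tiny given-name set stands in for the real dictionary, the name splitting is a crude stand-in for what a name parser does, and returning `None` when no rule matches is my assumption, since the rules above don't say what happens in that case.

```python
from typing import Optional

# Tiny hypothetical stand-in for a dictionary of known given names
KNOWN_GIVEN_NAMES = {"alfred", "jane", "maria"}


def guess_name_type(creator_name: str,
                    given_name: Optional[str] = None,
                    orcid: Optional[str] = None) -> Optional[str]:
    """Return "Personal" if any rule matches, otherwise None (undetermined)."""
    # Rule 1: an ORCID means the creator is a person
    if orcid:
        return "Personal"
    # Rule 2: an explicit givenName means the creator is a person
    if given_name:
        return "Personal"
    # Rule 3: take the token that looks like a given name and look it up.
    # "Family, Given" -> first token after the comma; otherwise the first token.
    if "," in creator_name:
        tokens = creator_name.split(",", 1)[1].split()
    else:
        tokens = creator_name.split()
    candidate = tokens[0].strip(".").lower() if tokens else ""
    if candidate in KNOWN_GIVEN_NAMES:
        return "Personal"
    return None


for name in ["Doe, Jane", "Alfred P. Sloan Foundation", "Mao Zedong"]:
    print(name, "->", guess_name_type(name))
```

With this stand-in dictionary, "Alfred P. Sloan Foundation" is classified as a person and "Mao Zedong" is left undetermined, which mirrors the two weaknesses described above.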

adam3smith · 2021-03-10T15:18:18Z

Thanks! Dataverse currently doesn't have a nameType option, which is why we need some sort of algorithmic solution to determine this.

The ORCID option makes sense
Since Dataverse doesn't have separate given/family name fields, I'm guessing the option here is to use the presence of a comma as a heuristic (that's what Zotero would do on import and it generally works pretty well). The problem is that this will have a fair number of false positives with non-Western names, as it's common to enter names without a comma and often in familyname/givenname order (e.g., Mao Zedong).

Since the name list also sounds like it works less well for non-Western names, I'd actually now be somewhat nervous about this. Do you have contacts at some of the Chinese DV installations we could ask or are there Dataverse Collections at Harvard more likely to contain non-Western creator names so we could check?

If this is indeed fairly common, labeling a significant number of people with non-Western names as institutions seems a lot more problematic than the reverse and I'd go back on my opinion above...

mfenner · 2021-03-10T15:41:44Z

The presence of a comma is unfortunately not a good heuristic for DataCite, as many repositories use "givenName familyName", instead of "familyName, givenName".
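A minimal sketch of the comma heuristic shows why it falls short. The function name is hypothetical and "Jane Doe" is a made-up example; the other two names are the ones already mentioned in this thread.

```python
def looks_personal_by_comma(name: str) -> bool:
    """Treat a name as personal only if it is written as "familyName, givenName"."""
    return "," in name


examples = [
    "Doe, Jane",                   # personal, detected
    "Jane Doe",                    # personal, missed: "givenName familyName" order
    "Mao Zedong",                  # personal, missed: no comma
    "Alfred P. Sloan Foundation",  # organization, correctly not flagged
]
for name in examples:
    print(f"{name!r}: personal={looks_personal_by_comma(name)}")
```

Any name entered without a comma falls through, so people whose names are written that way would be treated the same as organizations.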

The best solution is really using givenName and familyName. The reason we use a name dictionary is mainly that adoption of givenName/familyName is too low.

adam3smith · 2021-03-10T15:50:39Z

Just to be clear -- what we're after here is not to change what Datacite does but what Dataverse does in creating metadata submitted to Datacite -- Datacite just comes in because the Dataverse algorithm for handling names is derived from your code.

I think adding separate name fields would be quite challenging at this point, though I agree that it'd be much preferable.

mfenner · 2021-03-10T16:27:46Z

I understand. One important reason for "guessing" personal names is citation styles and formatted citations (as you of course know). DataCite introduced givenName and familyName a few years ago and it is still optional as it is indeed challenging to implement.

jggautier · 2021-03-10T17:05:49Z

Thanks @mfenner as always!

@adam3smith, there was a lot of discussion in #4257 about figuring out the nameType and adapting DataCite's algorithm to address failure cases discovered during QA, but I think the summary at #4257 (comment) still holds, and includes looking for an ORCID but I don't think we looked at how well it works for non-Western names. I think we could contact folks from installations where non-Western names are common, and possibly where they're running 4.14+ Dataverse repositories, and could look at Dataverse Collections at Harvard more likely to contain non-Western creator names.

Maybe the outcome of this investigation would be to figure out whether or not we need to make it possible/easier for installations to turn off the nameType algorithm for the DataCite export. @adam3smith, @qqmyers, @djbrooke. How does that sound? And work to figure out how Dataverse repositories can better determine nameType can be done as part of another issue?

@adam3smith wrote:

There are a number of the fields that only make sense as conditionals (e.g. all the scheme/identifier fields). The solution described in #7606 looks good to me and would appear to solve this and seems to be scheduled to land in 5.4?

I agree and spoke with @scolapasta about the use cases and limits of #7606. My understanding is that it wouldn't address cases like the Related Publication field. @scolapasta could confirm, but from what I understand #7606 wouldn't let repositories say that if the ID Type is filled, the ID Number must also be filled (or vice versa), because that compound field also has two other fields, "Citation" and "URL", which I think should remain optional for the purposes of exporting metadata in the DataCite schema.

The code for the OpenAIRE export already handles Related Publication in a different way, only including that metadata if both ID Type and ID Number are filled (instead of taking the approach of #7606 to prompt depositors to enter the metadata the way the software/installation admins expect). I'm not sure if there are other fields to consider, but I don't think looking out for these cases will make this issue take any longer to work on.
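The "only include it if both are filled" behaviour could look roughly like this. It is a sketch, not the exporter's code: the dictionary keys are hypothetical, namespaces are omitted, and a real export would also need to map Dataverse's ID Type values onto DataCite's controlled list, which is skipped here.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def related_identifier(publication: dict) -> Optional[ET.Element]:
    """Return a relatedIdentifier element, or None if the field is incomplete."""
    id_type = (publication.get("idType") or "").strip()
    id_number = (publication.get("idNumber") or "").strip()
    # Skip the element entirely unless both child fields are filled;
    # "citation" and "url" stay optional and are not checked.
    if not (id_type and id_number):
        return None
    element = ET.Element("relatedIdentifier",
                         relatedIdentifierType=id_type,
                         relationType="IsCitedBy")
    element.text = id_number
    return element


# Hypothetical Related Publication entries
complete = {"idType": "DOI", "idNumber": "10.1234/example", "citation": "Some article"}
incomplete = {"idType": "DOI", "idNumber": "", "url": "https://example.org/article"}

for entry in (complete, incomplete):
    element = related_identifier(entry)
    print("skipped" if element is None else ET.tostring(element, encoding="unicode"))
```

Dropping incomplete entries at export time avoids sending invalid metadata, at the cost of silently omitting whatever the depositor did enter.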

@djbrooke wrote:

@jggautier if we bring this into the sprint starting tomorrow, would you have some time over the next two weeks to get into this with a developer that picks it up? If not, the sprint starting in two weeks? I know you're spread a bit thin right now with the UI/UX work starting up, so if it makes sense to wait two weeks I think that's fine. What do you think?

Based on all of this I'm thinking two things should be done before implementation work starts, and I'd have time in the next two weeks to help do them:

a review of the metadata mapping. Like @adam3smith wrote, that shouldn't be too much trouble
a look into how well the nameType algorithm is working for non-Western creator names and if installations need a way to turn the algorithm off (and not include in the DataCite export a guess about if a creator is a person or organization)

Then maybe we could aim for working on implementation in the following sprint? What do you all think?

mfenner · 2021-03-10T17:41:13Z

One small comment: the author of the library we use for names (https://github.com/berkmancenter/namae) is @inukshuk who @adam3smith knows from citationstyles work, maybe it is worth reaching out to him, e.g. to ask about handling of non-Western names.

qqmyers · 2021-03-10T18:17:48Z

One quick thought: It might be simple to add a person/org choice field and just use ‘the algorithm’ to pre-populate that for existing data, i.e. we only use it to handle legacy info rather than in an ongoing way. (Could even make it something that could be optional if admins don’t think it works well for their installations.)


adam3smith · 2021-03-11T15:22:02Z

a review of the metadata mapping. Like @adam3smith wrote, that shouldn't be too much trouble

a look into how well the nameType algorithm is working for non-Western creator names and if installations need a way to turn the algorithm off (and not include in the DataCite export a guess about if a creator is a person or organization)

Then maybe we could aim for working on implementation in the following sprint? What do you all think?

That sounds good to me.

It might be simple to add a person/org choice field and just use ‘the algorithm’ to pre-populate that for existing data, i.e. we only use it to handle legacy info rather than in an ongoing way.

We'd be happy with this -- the more control we have over metadata the better -- but there may be concern about too many UI elements for self-deposit repositories.

abollini · 2021-09-19T10:52:07Z

Sorry for joining the discussion so late. I just want to add a reference to the in-progress update to the OpenAIRE DataArchive guidelines that will be based on the Datacite version 4 schema: https://openaire-guidelines-for-data-archive-managers.readthedocs.io/en/latest/index.html

This is essentially the new version of the guidelines that we were asked to develop for in 2018 (to be more specific, at that time we looked at the Datacite schema v4.1) and was contributed to Dataverse in 4.14.

The OpenAIRE team is still working on the new version. I'm taking the liberty of pinging them on this thread, openaire/guidelines-data-archives#2, so that they will be aware of the work in progress in the Dataverse community.

jggautier · 2021-10-06T15:42:44Z

Hi @abollini. I don't think you're late at all. The status of this issue was brought up in a recent Dataverse community meeting, so I thought it would be helpful to write here that the plans being discussed in this GitHub issue for how to proceed haven't been started or finalized. I think it's great that the OpenAIRE team will be aware of this discussion. Thanks!

jggautier · 2022-02-24T17:50:06Z

Just noticed that in the DataCite exports of installations running Dataverse software v5.9 and maybe all earlier versions, parentheses are added to the Author Affiliation values that are put in DataCite's creator > affiliation element:

The screenshot is from an export from Demo Dataverse, running v5.9. It's also done in this export from DataverseNL (v5.9)

Maybe this is because the code is getting what's displayed on the dataset page instead of what's entered in the field on the edit metadata page? Looks like that was the issue when Author Affiliation values were wrapped in parentheses in the search API results (#6570 (comment))

The OpenAIRE export doesn't include the parentheses, so I mention this bug in this issue since it seems natural that merging these two exports, or aligning them more, would also fix this parentheses bug.

jggautier · 2024-04-05T14:56:07Z

Just an update about what I wrote last week about consulting @abollini about the related comments he left in a GitHub issue at openaire/guidelines-data-archives#2.

In that issue I commented to let @abollini know about this proposal to merge the two exports, asked for feedback about having the merged export's schemaLocation point to the xsd of DataCite's 4.5 schema, and asked for more information about bringing "arguments from the OpenAIRE team" to this effort.

adam3smith · 2024-04-05T15:06:53Z

Thanks Julian -- we'd be very happy to see this merged and I think it'd have significant downstream benefits to improve the Dataverse-deposited metadata with Datacite this way.

poikilotherm · 2024-04-07T13:34:56Z

I'm still having this crazy idea about generating model classes from the Schema XSDs and creating mappers from our internal metadata model to the target model...

jggautier · 2024-04-08T13:59:59Z

Hey @poikilotherm, would this be a better way to change the exports? Would it take a lot of time to do?

sbarbosadataverse · 2024-04-09T20:26:09Z

Ceilyn and Sonia prioritized this and moved it to sprint ready as part of GREI Y3 planning. @jggautier @scolapasta Please weigh in if you have objections.


poikilotherm · 2024-04-10T10:59:55Z

@jggautier I put together a very simple demonstrator for the generator part, using the DataCite 4.5 Kernel. (It does not include the mapper part, where we map our internal to the generated model. I could create an example exporter for that if you want.) To run the example, use this:

git clone --branch 5889-gen-schema-pojos https://github.com/IQSS/dataverse.git dataverse
cd dataverse
mvn -f modules/dataverse-schemas package

Aside from that, here's the comparison: https://github.com/IQSS/dataverse/compare/5889-gen-schema-pojos

jggautier · 2024-04-15T14:00:15Z

Thanks @sbarbosadataverse. I don't have any objections to this being prioritized and moved to sprint ready. I'm worried we won't hear back from folks from OpenAIRE by the end of the sprint next Wednesday. I'll reach out to @abollini again in openaire/guidelines-data-archives#2

@poikilotherm I'm hesitant to try to better understand what generators are. But could you write about the benefits? For example, does it make it easier to change the exporters?

poikilotherm · 2024-04-15T15:35:05Z

Currently, for DataCite we use a template approach, combined with XML processing. For DDI we use AFAIK an XML only processing approach. For our JSON based exports we use mostly JSON processing.

The point is: all of this is hand crafted. The implementation is done by us and we need to make sure the serialized output matches the specifications involved. We also provide the mapping from our internal model to the target model with these serializers.

When using generators, parts of the process are turned upside down. You start with the spec (XML XSD, JSON Schema, OpenAPI...) and you use a tool to generate model classes out of it.

The result is a set of classes that can be serialized to the target output data using the Jakarta standard included data binding mechanisms. Beyond that, these classes can also be used for the inverted process: deserialization from some data to the model. An example would be importing DataCite XML from OAI-PMH: use the data binding to get a populated Java model of the data.

As the model classes are generated from the spec, they are known to fully transform all of the spec into the model. We might not use all of the available modeling, but at least we can easily extend without much hassle.

As long as the generator tools don't make mistakes, the data binding will always produce valid output data and will always map correct input data back to the model.
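To give a feel for the idea without the Java tooling, here is a small Python analogy. It is hand-written, so it only imitates what a generator would produce from the XSD: one model class that carries a constraint from the spec and converts to and from XML. The class and method names are made up for this example, and only the creator name and its nameType attribute are modelled.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

# Allowed values of the nameType attribute in the DataCite schema
NAME_TYPES = {"Personal", "Organizational"}


@dataclass
class Creator:
    """Stand-in for a generated model class for the creator element."""
    creator_name: str
    name_type: Optional[str] = None

    def __post_init__(self) -> None:
        # The constraint from the spec travels with the model
        if self.name_type is not None and self.name_type not in NAME_TYPES:
            raise ValueError(f"invalid nameType: {self.name_type!r}")

    def to_xml(self) -> ET.Element:
        """Serialize the model to XML."""
        creator = ET.Element("creator")
        name = ET.SubElement(creator, "creatorName")
        name.text = self.creator_name
        if self.name_type:
            name.set("nameType", self.name_type)
        return creator

    @classmethod
    def from_xml(cls, element: ET.Element) -> "Creator":
        """Deserialize XML back into the model."""
        name = element.find("creatorName")
        return cls(creator_name=name.text, name_type=name.get("nameType"))


original = Creator("Example University", "Organizational")
round_tripped = Creator.from_xml(original.to_xml())
print(round_tripped == original)
```

In the generated-classes approach, the serialization and constraint code would come from the tool rather than being written by hand, which is where the reduction in manual work comes from.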

Using our own implementations for de-/serialization requires extensive testing and also a lot of manual work to implement every change etc.

The availability of schemas and model classes for them allows a much stricter enforcing of data validity at compile and runtime. Constraints about the data from the spec are transported into the data model, allowing for simpler interaction with the model from code as well as the Java compiler assisting you to build it.
Example: most generators will allow you to create a Fluent API for the model.

For the exporters, having schemas around (and I'm talking about more than just DataCite) will also allow for a clearer defined data exchange between the core application and plugged in exporters. The model classes provide Data Transfer Objects as a side product.

Also, upgrading schemas is improved. We can include a generated data model version for any version of a schema. If we want to change the supported schema version, the Java code can help us determine what to change and how. It's much clearer in code what is supported and what isn't. Changing a version means changing the import path for the classes.

Brain dump out.

jggautier · 2024-07-01T19:25:42Z

@cmbz asked me to add a status update to this GitHub issue. There's discussion and related work in the pull request at #10615 that addresses at least some of what's been proposed in this GitHub issue.

DS-INRA · 2024-07-22T10:26:41Z

Another related issue :

As a researcher, Dataverse admin or curator, I want more information about my data sent to DataCite so that it's more discoverable #2917

cmbz · 2024-08-20T15:22:29Z

To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'.

If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.

pdurbin · 2024-08-20T20:33:38Z

This issue has an open PR...

Datacite xml improvements #10615

... so I'm reopening it. It'll be closed when we merge it.

pdurbin · 2024-09-17T19:34:15Z

We're now using this PR instead to close this issue:

DataciteXML changes Plus RelationType field #10632

jggautier · 2024-09-18T13:34:33Z

Thanks for the heads up @pdurbin. I'm going to keep this issue open, or I guess re-open it after that PR is merged, so that I can see what decisions were made and what goals and questions aren't addressed yet.

pdurbin · 2024-09-18T13:41:44Z

@jggautier sounds good. Perhaps we can create a new issue with any remaining items.

jggautier added the Feature: Metadata label May 28, 2019

jggautier mentioned this issue Jul 9, 2019

If dataset depositors choose a contributor type that isn't one of DataCite's contributorTypes, in Dataverse's DataCite and OpenAIRE metadata exports, map to DataCite's "Other" contributorType #6003

Closed

jggautier mentioned this issue Oct 6, 2020

Support for Crossref Grant ID #7286

Open

jggautier mentioned this issue Feb 9, 2021

Some Dataverse metadata fields seem not to be indexed correctly by DataCite #7072

Open

abollini mentioned this issue Sep 19, 2021

Solve compatibility issue for Dataverse software openaire/guidelines-data-archives#2

Open

jggautier mentioned this issue Oct 6, 2021

Improve datacite metadata #8108

Open

pdurbin added the Feature: Harvesting label Apr 12, 2022

cmbz added the "Size: 30" label (a percentage of a sprint: 21 hours; formerly size:33) Apr 4, 2024

poikilotherm added a commit that referenced this issue Apr 10, 2024

feat(schemas): introduce schema generation submodule #5889

cfe5378

- Generate model classes for DataCite 4.5 Metadata Schema
- Add a simple test to demonstrate usage and basic validity.

jggautier mentioned this issue Apr 24, 2024

issue #5277: first implementation step for exporting related publicat… #8357

Closed

qqmyers mentioned this issue Jun 5, 2024

Datacite xml improvements #10615

Closed

jggautier mentioned this issue Jul 8, 2024

Affiliations entered in affiliation fields are parenthesized in "Datacite" and Schema.org exports #9330

Open

cmbz closed this as completed Aug 20, 2024

pdurbin reopened this Aug 20, 2024

cmbz mentioned this issue Sep 9, 2024

As a researcher, Dataverse admin or curator, I want more information about my data sent to DataCite so that it's more discoverable #2917

Closed

pdurbin mentioned this issue Sep 17, 2024

DataciteXML changes Plus RelationType field #10632

Merged

landreev closed this as completed in #10632 Sep 23, 2024

pdurbin added this to the 6.4 milestone Sep 23, 2024

jggautier reopened this Sep 24, 2024

pdurbin removed this from the 6.4 milestone Sep 25, 2024
