Add author authority control metadata from MARC #7724

Open
hornc opened this issue Mar 24, 2023 · 3 comments
Labels
Lead: @hornc Issues overseen by Charles (Staff: Data Engineering Lead) [managed] Module: Import Issues related to the configuration or use of importbot and other bulk import systems. [managed] Priority: 3 Issues that we can consider at our leisure. [managed] Theme: MARC records

Comments

@hornc
Collaborator

hornc commented Mar 24, 2023

From @tfmorris' comment on #7652

Add $0 & $1 for 700, 710, 711

See:
https://www.loc.gov/marc/bibliographic/bd700.html

  • $0 - Authority record control number or standard number (R)
  • $1 - Real World Object URI (R)

And https://www.loc.gov/marc/bibliographic/ecbdcntf.html for more details about $0 and $1 in general. They may appear on a range of fields, so there could be other opportunities to utilize them for our metadata. (Needs investigation)
and my response:

This is interesting, I didn't know about these. I'm not sure exactly where to extract these ids to in OL's data model.

Both $0 and $1 are repeatable, so there could be multiple values. It also seems like the ids and URIs could be anything in general.

Existing test data only has 7xx$0 examples:

700 1  $6 880-05 $a El Moudden, Abderrahmane. $0 http://id.loc.gov/authorities/names/nr92001540
700 1  $6 880-06 $a Bin-Ḥāddah, ʻAbd al-Raḥīm. $0 http://id.loc.gov/authorities/names/nr97026593
700 1  $6 880-07 $a Gharbi, Mohamed Lazhar. $0 http://id.loc.gov/authorities/names/nr96019749
710 2  $6 880-08 $a Jāmiʻat Muḥammad al-Khāmis. $b Kullīyat al-Ādāb wa-al-ʻUlūm al-Insānīyah. $0 http://id.loc.gov/authorities/names/n83213755
700 1  $6 880-04 $a Hayashiya, Tatsusaburō, $d 1914-1998. $0 http://id.loc.gov/authorities/names/n81047233
700 1  $6 880-05 $a Yokoi, Kiyoshi. $0 http://id.loc.gov/authorities/names/n81089234
700 1  $6 880-06 $a Narabayashi, Tadao, $d 1940-1960. $0 http://id.loc.gov/authorities/names/n85206624

They all happen to be LOC URIs. It looks like we might be able to assume all URIs in this field will be valid web URLs, but I'm not sure that is strictly a given.
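As a rough illustration of pulling the repeatable $0 / $1 values out of the 700, 710 and 711 lines above, here is a minimal sketch. It works on the display-format text shown in this issue, not on binary MARC, and it assumes no subfield value contains a literal "$" followed by a letter or digit. The function names and the `AUTHOR_ADDED_ENTRY_TAGS` constant are hypothetical and not part of Open Library's import code.

```python
import re
from collections import defaultdict

# Tags named in this issue; other fields may also carry $0 / $1.
AUTHOR_ADDED_ENTRY_TAGS = {"700", "710", "711"}


def parse_display_line(line):
    """Split a display-format MARC line into (tag, [(code, value), ...])."""
    tag = line[:3]
    # re.split with a capturing group yields:
    # [indicators, code1, value1, code2, value2, ...]
    parts = re.split(r"\$(\w)", line[3:])
    subfields = [
        (code, value.strip()) for code, value in zip(parts[1::2], parts[2::2])
    ]
    return tag, subfields


def authority_refs(line):
    """Collect every $0 and $1 value from a 700/710/711 line.

    Both subfields are repeatable, so each code maps to a list.
    Lines with other tags return an empty dict.
    """
    tag, subfields = parse_display_line(line)
    refs = defaultdict(list)
    if tag in AUTHOR_ADDED_ENTRY_TAGS:
        for code, value in subfields:
            if code in ("0", "1"):
                refs[code].append(value)
    return dict(refs)


example = (
    "700 1  $6 880-04 $a Hayashiya, Tatsusaburō, $d 1914-1998. "
    "$0 http://id.loc.gov/authorities/names/n81047233"
)
print(authority_refs(example))
```

Keeping the values as lists leaves the decision about which identifier to trust, or whether to store all of them, to a later step.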

I think adding and making use of these $0 / $1 author subfields should be a separate feature, building on this one (#7652, 880 linkages).

Originally posted by @hornc in #7652 (comment)

@hornc hornc added Theme: MARC records Module: Import Issues related to the configuration or use of importbot and other bulk import systems. [managed] labels Mar 24, 2023
@tfmorris
Contributor

Note that $0 doesn't have to contain a URI. It can also be text with a leading parenthesized code identifying the source system, e.g.

100 1#$aBach, Johann Sebastian.$4aut$0(DE-101c)310008891
100 1#$aTrollope, Anthony,$d1815-1882.$0(isni)0000000121358464

for DNB and ISNI respectively. Ones that OpenLibrary supports include isni, viaf, wikidata, goodra

I'm not sure exactly where to extract these ids to in OL's data model.

I think they go in identifiers. I'm not sure why LC NAF isn't one of OpenLibrary's identifier options, but that's where I'd put all the ones listed above (in their primitive, not URI, form i.e. nr92001540 not http://id.loc.gov/authorities/names/nr92001540), assuming that LC NAF gets added as a valid identifier.
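The two $0 shapes discussed so far (an id.loc.gov URI, or a parenthesized source code followed by the id) could be reduced to a primitive form along these lines. This is only a sketch: the `lc_naf` key is a placeholder assumption for whatever identifier name Open Library ends up using, and `normalize_authority_id` is a hypothetical helper, not an existing function.

```python
import re

# Matches LC name authority URIs such as
# http://id.loc.gov/authorities/names/nr92001540
LC_NAF_URI = re.compile(r"^https?://id\.loc\.gov/authorities/names/(?P<value>\w+)$")

# Matches prefixed values such as (isni)0000000121358464
PREFIXED = re.compile(r"^\((?P<source>[^)]+)\)\s*(?P<value>.+)$")


def normalize_authority_id(raw):
    """Return (source, primitive_id) for a $0 value, or None if unrecognized.

    "lc_naf" is an assumed key name used for illustration only.
    """
    raw = raw.strip()
    match = LC_NAF_URI.match(raw)
    if match:
        return ("lc_naf", match.group("value"))
    match = PREFIXED.match(raw)
    if match:
        return (match.group("source"), match.group("value"))
    return None


for value in (
    "http://id.loc.gov/authorities/names/nr92001540",
    "(DE-101c)310008891",
    "(isni)0000000121358464",
):
    print(normalize_authority_id(value))
```

Returning None for anything unrecognized keeps unknown URI schemes or free text out of the identifiers until someone decides how to map them.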

You could also ping VIAF and resolve the redirects to get the VIAF ID, although that might be a post-process, rather than part of the import.

$ curl -LI http://viaf.org/viaf/sourceID/LC%7Cnr+97026593#skos:Concept
HTTP/1.0 301 Moved Permanently
Location: https://viaf.org/viaf/sourceID/LC%7Cnr+97026593
Server: BigIP
Connection: Keep-Alive
Content-Length: 0

HTTP/1.1 301 Moved Permanently
Server: Apache-Coyote/1.1
srwRequestMethod: HEAD
[....]
Location: http://viaf.org/viaf/39664214
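
The same lookup could be scripted as a post-process step. The sketch below builds the sourceID URL in the shape used by the curl call above, follows the redirects with the standard library, and reads the VIAF ID off the final URL. It assumes VIAF still redirects the way the output above shows, and that LC NAF ids are a lowercase letter prefix followed by digits; all function names are hypothetical.

```python
import re
import urllib.request
from urllib.parse import quote_plus

VIAF_SOURCE_URL = "http://viaf.org/viaf/sourceID/"


def viaf_source_url(lc_naf_id):
    """Build the VIAF sourceID URL for an LC NAF id such as 'nr97026593'.

    VIAF's LC source ids put a space between the letter prefix and the
    digits ('LC|nr 97026593'), which quote_plus encodes as 'LC%7Cnr+97026593'.
    """
    match = re.fullmatch(r"([a-z]+)(\d+)", lc_naf_id)
    if not match:
        raise ValueError(f"unexpected LC NAF id: {lc_naf_id!r}")
    prefix, digits = match.groups()
    return VIAF_SOURCE_URL + quote_plus(f"LC|{prefix} {digits}")


def viaf_id_from_url(url):
    """Extract the numeric VIAF ID from a URL like http://viaf.org/viaf/39664214."""
    match = re.search(r"/viaf/(\d+)/?$", url)
    return match.group(1) if match else None


def resolve_viaf_id(lc_naf_id, timeout=10):
    """Follow VIAF's redirects and return the VIAF ID, or None if not found.

    Performs a network request; urllib follows the 301 redirects itself.
    """
    request = urllib.request.Request(viaf_source_url(lc_naf_id), method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return viaf_id_from_url(response.geturl())


print(viaf_source_url("nr97026593"))
print(viaf_id_from_url("http://viaf.org/viaf/39664214"))
```

Only `resolve_viaf_id` touches the network, so the URL building and ID extraction can be tested offline and the lookup itself can be batched or rate-limited separately from the import.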

@hornc
Collaborator Author

hornc commented May 9, 2023

Another example with LOC URI:
https://openlibrary.org/show-records/marc_columbia/Columbia-extract-20221130-007.mrc:0:703

marc_columbia might be a consistent source of LOC author identifiers.

@mekarpeles mekarpeles added Needs: Triage This issue needs triage. The team needs to decide who should own it, what to do, by when. [managed] Needs: Lead Priority: 3 Issues that we can consider at our leisure. [managed] Lead: @hornc Issues overseen by Charles (Staff: Data Engineering Lead) [managed] and removed Needs: Triage This issue needs triage. The team needs to decide who should own it, what to do, by when. [managed] Needs: Lead labels Sep 15, 2023
@hornc
Collaborator Author

hornc commented Aug 29, 2024

See #9812 for the newly added author field that holds the LC NAF id.
