JSON-LD profiles #163

Closed
fils opened this issue Mar 6, 2023 · 12 comments

Comments

@fils
Member

fils commented Mar 6, 2023

This is just a potential "nice to have" but there are some questions that would need to be answered first.

In JSON-LD you can declare a profile, like:

<script type="application/ld+json;profile=http://www.w3.org/ns/json-ld#frame">

see: https://www.w3.org/TR/json-ld/#iana-considerations

Questions

  1. Are type strings like hash URLs? i.e., is everything past the ; sorta ignored? I ask since there are many profiles and I wouldn't want to see this break the content negotiation. Likely the content negotiation will not contain the profile?
  2. So this means it is really a case of dealing with this profile in the HTML parsing. We are using "github.com/PuerkitoBio/goquery" for this, so the question really needs to be raised there.
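
As a rough illustration of question 1: per the JSON-LD media type registration linked above, profile is an optional parameter on application/ld+json, so a consumer can separate it from the base type before comparing. Below is a minimal Go sketch of that split. The helper name splitScriptType is hypothetical (it is not part of Gleaner or goquery), and the parsing is deliberately simplified: it assumes parameter values contain no semicolons.

```go
package main

import "strings"

// splitScriptType separates a script type attribute such as
// "application/ld+json;profile=..." into its base media type and the
// raw value of a profile parameter, if one is present.
// Hypothetical helper for illustration; simplified parsing that assumes
// no ';' appears inside a parameter value.
func splitScriptType(typeAttr string) (base, profile string) {
	// Everything before the first ';' is the media type itself.
	base, params, _ := strings.Cut(typeAttr, ";")
	base = strings.ToLower(strings.TrimSpace(base))

	// Walk the remaining ';'-separated parameters looking for profile=...
	for _, p := range strings.Split(params, ";") {
		name, value, ok := strings.Cut(p, "=")
		if ok && strings.EqualFold(strings.TrimSpace(name), "profile") {
			// Drop surrounding whitespace and optional double quotes.
			profile = strings.Trim(strings.TrimSpace(value), `"`)
		}
	}
	return base, profile
}
```

With the frame example above, this yields the base type application/ld+json and the profile http://www.w3.org/ns/json-ld#frame. A type attribute without parameters yields an empty profile.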
@fils fils added the enhancement New feature or request label Mar 6, 2023
@fils
Member Author

fils commented Mar 6, 2023

I asked about this over in the goquery repo: PuerkitoBio/goquery#440

@valentinedwv
Member

Yeah, damn hashes break things.... end up needing to custom encode with a pattern like

@fils
Member Author

fils commented Mar 7, 2023

@valentinedwv got a reply in the issue in goquery. Sounds like it is easy to select based on the prefix (regex starts with) string. So we could allow groups to leverage profiles while not breaking Gleaner. I need to run this by Google though to make sure it is not a spanner in the works for them.
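
A minimal sketch of the "starts with" check in plain Go, for illustration only. With goquery the equivalent would presumably be an attribute-prefix selector along the lines of script[type^="application/ld+json"]; that selector is an assumption here and is not exercised by this code. The helper name isJSONLDScriptType is hypothetical.

```go
package main

import "strings"

// jsonLDType is the media type registered for JSON-LD.
const jsonLDType = "application/ld+json"

// isJSONLDScriptType reports whether a script element's type attribute
// names JSON-LD, with or without trailing parameters such as ;profile=...
// Hypothetical helper for illustration.
func isJSONLDScriptType(typeAttr string) bool {
	t := strings.ToLower(strings.TrimSpace(typeAttr))
	if !strings.HasPrefix(t, jsonLDType) {
		return false
	}
	// Accept an exact match or a parameter list; reject longer type names
	// such as "application/ld+jsonx" that merely share the prefix.
	rest := strings.TrimSpace(t[len(jsonLDType):])
	return rest == "" || strings.HasPrefix(rest, ";")
}
```

The extra check after the prefix match is a design choice: a bare prefix test would also accept unrelated types that happen to begin with the same characters.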

I'd hate to recommend something to people that ends up getting them ignored by Google.

@fils
Member Author

fils commented Mar 7, 2023

rats!

First, a page without a profile. (Note: I use a SHACL link as the "profile" in this test. That might be a bit of overloading, but it serves to test the idea at least.)

<script type="application/ld+json">
    JSON-LD HERE
</script>

demo page without profile
schema.org validation test results

It works fine.

However, with a profile:

<script type="application/ld+json;profile=https://github.com/ESIPFed/science-on-schema.org/master/validation/shapegraphs/soso_common_v1.2.3.ttl">
    JSON-LD HERE
</script>

demo page with profile
schema.org validation test results

Anyway, the results are kinda sad, since they imply this approach is likely not something the majority of tooling is going to be able to handle correctly. It's almost a non-starter to recommend using profiles if they are going to break tools as common and popular as validator.schema.org.

@valentinedwv
Member

valentinedwv commented Mar 7, 2023

So, different pain point... headless gleaner.

const elements = document.querySelectorAll('script[type="application/ld+json"]');

I know it's not perfect, but it can be tested with your examples...
https://github.com/gleanerio/gleaner/blob/dev/internal/summoner/acquire/headless_test.go

Looks like this can be addressed with:
[id^='someId']

@smrgeoinfo

smrgeoinfo commented Mar 7, 2023

<script type="application/ld+json" 
        profile="https://github.com/ESIPFed/science-on-schema.org/master/validation/shapegraphs/soso_common_v1.2.3.ttl">

works in validator.

We need a better URI to identify the SOSO SDO profile.... :)

@fils
Member Author

fils commented Mar 7, 2023

@smrgeoinfo that is against the spec though.. so I think that is a non-starter..

Signposting with the resource type might be the best bet.

Link: https://github.com/ESIPFed/science-on-schema.org/master/validation/shapegraphs/soso_common_v1.2.3.ttl ; rel="resource"

A URI is allowed, but a DOI or handle could be used too.

SHACL isn't a semantic type though, it's a validation. So some thought might need to go in there to set a type.

I've been playing with a "pattern asset catalog" idea.. which might work to declare a type via, but I'd rather stay in web architecture world rather than define a whole catalog convention.

@valentinedwv
Member

Seems like it's about content negotiation, and script tags.
https://www.w3.org/TR/json-ld/#iana-examples

@fils
Member Author

fils commented Mar 7, 2023

@valentinedwv agreed.. I think the profile approach is a dead end for our use case...

The data- attribute like @smrgeoinfo has suggested would work, but it's a bit of a dive into setting a community convention that others might not use, or might use differently. That doesn't mean it's not OK for a given community, though.

The signposting might be an approach but again more for declaring a semantic type rather than a validation resource.

@fils
Member Author

fils commented Mar 7, 2023

@smrgeoinfo so I revisited the LDN spec as I knew it had something with SHACL..

https://www.w3.org/TR/ldn/#:~:text=5.1%20Constraints,4xx%20error%20code.

They are using the Link header pattern:

Link: <http://example.org/id/x> ; rel="http://www.w3.org/ns/ldp#constrainedBy",

So this could be used as well as the signposting for the semantic type.

This has the beauty of being all web architecture, but it does require that:

  1. the provider can mod their headers
  2. the consumer reads and acts on them

Still, since the fallback is the core structured data on the web approach we use now, this is a fine enhancement in those cases where both parties can leverage it, and it still allows basic clients to work as normal.
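
To make the consumer side of this concrete, here is a minimal Go sketch that pulls the target URIs for a given relation out of a Link header value, for example the ldp#constrainedBy relation from the LDN spec or the Signposting "type" relation. The function name linkTargets is hypothetical, and the parser is simplified: it assumes targets and parameter values contain no commas or semicolons, which a production parser could not assume.

```go
package main

import "strings"

// linkTargets returns the target URIs in a Link header value whose rel
// parameter contains the given relation type.
// Hypothetical helper; a simplified parser that assumes no ',' or ';'
// appears inside a target URI or a quoted parameter value.
func linkTargets(header, rel string) []string {
	var targets []string
	for _, link := range strings.Split(header, ",") {
		parts := strings.Split(link, ";")
		target := strings.TrimSpace(parts[0])
		if !strings.HasPrefix(target, "<") || !strings.HasSuffix(target, ">") {
			continue // not a <URI> reference, e.g. after a trailing comma
		}
		for _, param := range parts[1:] {
			name, value, ok := strings.Cut(param, "=")
			if !ok || !strings.EqualFold(strings.TrimSpace(name), "rel") {
				continue
			}
			// rel may hold several space-separated relation types.
			value = strings.Trim(strings.TrimSpace(value), `"`)
			for _, r := range strings.Fields(value) {
				if r == rel {
					targets = append(targets, strings.Trim(target, "<>"))
				}
			}
		}
	}
	return targets
}
```

Given the header value shown above and the relation http://www.w3.org/ns/ldp#constrainedBy, this returns the single target http://example.org/id/x. A client that finds no matching relation simply falls back to reading the embedded JSON-LD as it does today.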

@fils
Member Author

fils commented Mar 7, 2023

Visual overview to this point

image

@smrgeoinfo

The use case I was thinking about for a JSON-LD profile is to communicate that a metadata document conforms to some profile, so that an application parsing that metadata can safely make assumptions about the metadata. This requires a couple things:

  1. some convention about how profiles are identified. A profile specification should assert an identifier string by which documents conforming to the profile can be identified. Currently ISO and OGC specification practices defining identified conformance classes are the most unambiguous approach to this that I know of.
  2. a convention to bind a metadata instance to the profile it conforms to. For metadata embedded in html as a <script> </script> element, the attributes of the script element are the obvious way to do this. Having a link header that specifies a metadata profile only works if there is an assumption that there is only one metadata script and the parser can guess which script in the html document that is.

I'm guessing that the reason using a 'profile' attribute on script 'is against the spec' is that it's not listed in the spec. Since I don't think many (any?) users are XML-validating HTML, does this really break anything?

Another approach that makes sense to me is to use <script type="application/ld+json"> and in the JSON-LD use schema:encodingFormat or (better yet...) dcterms:conformsTo to provide an identifier for the profile. This would require metadata scrapers to look at all scripts with the expected MIME type to see what profile they use. Practically speaking, it's unlikely there would be more than one...
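
A minimal Go sketch of what a scraper might do under this approach: unmarshal the script body and read the profile identifier from a top-level key. The function name conformsTo is hypothetical, and the sketch makes several assumptions: the document is a single JSON object (no top-level array or @graph), the key appears exactly as written (no JSON-LD context expansion is done), and the value is a string, an object with @id, or an array of those.

```go
package main

import "encoding/json"

// conformsTo returns the profile identifiers a JSON-LD document declares
// under the given top-level key, e.g. "dcterms:conformsTo".
// Hypothetical helper; it compares the key literally and does not apply
// JSON-LD context processing.
func conformsTo(doc []byte, key string) ([]string, error) {
	var top map[string]interface{}
	if err := json.Unmarshal(doc, &top); err != nil {
		return nil, err
	}

	var ids []string
	// collect accepts either a plain string or an object carrying "@id".
	collect := func(v interface{}) {
		switch x := v.(type) {
		case string:
			ids = append(ids, x)
		case map[string]interface{}:
			if id, ok := x["@id"].(string); ok {
				ids = append(ids, id)
			}
		}
	}

	switch v := top[key].(type) {
	case []interface{}:
		for _, item := range v {
			collect(item)
		}
	default:
		collect(v) // also covers a missing key, which yields no ids
	}
	return ids, nil
}
```

A scraper could run this over each JSON-LD script it finds and keep the ones whose identifiers include the profile it understands. The identifier itself would still need the agreed convention described in point 1.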

valentinedwv added a commit that referenced this issue May 2, 2023
Prep for json profiles #163
valentinedwv added a commit that referenced this issue May 3, 2023
Prep for json profiles #163
#128. For Validation just unmarshal an interface
valentinedwv added a commit that referenced this issue May 9, 2023
Prep for json profiles #163
#128. For Validation just unmarshal an interface
valentinedwv added a commit that referenced this issue Aug 31, 2023
* accept content types #192.
Prep for json profiles #163
#128. For Validation just unmarshal an interface
3 participants