External secondary instances: I/O support for `jr:` URLs #201

eyelidlessness · 2024-08-27T17:56:47Z

This design issue is part of broader support for external secondary instances:

The intent in this issue is to decide on a direction for how we want to handle retrieval of form attachments generally, with the specific goal of supporting external secondary instance functionality.

These are the pertinent aspects of the ODK XForms spec.

Broadly: Secondary Instances - External
More specifically: File Endpoints

The issue will focus on the engine/client interface to support the retrieval of jr: URLs. As I tend to do for engine/client interface design, I will present a few options which we can choose from or iterate on. Implementation in the engine will be derived from there.

Note: it is expected that the design we choose here will also lay groundwork for supporting other jr: URL use cases—i.e. media attachments—so I've tried to be mindful of that (as we should in discussion).

Note: this issue does not currently address the jr://instance/last-saved virtual endpoint, but I believe nothing in any of the proposed options would block or impede that functionality, when we're ready to address it.

Option 0: client/host application handles resource resolution

This is a true null option: in the narrowest sense, we could claim this work is already done with the provision of a fetchResource configuration option. This option is technically sufficient to satisfy the engine's spec responsibilities.

How this would work:

- The engine will call fetchResource with any external secondary instance's jr: URL
- If the client has provided a fetchResource option, the client is then responsible for resolving that jr: URL to the referenced resource, for the active form instance
- If the client has not provided a fetchResource option, the engine will produce a well-defined error result (with errors to be discussed in a separate design)
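
From a client's perspective, the flow above might look roughly like the following. This is only a sketch: the `FetchResource` signature, the `attachmentBaseUrl` parameter, and the filename-based mapping rule are all assumptions made for illustration, not the engine's actual API.

```typescript
// Assumed shape of the option: given a URL, resolve to a fetch-style Response.
type FetchResource = (resource: URL) => Promise<Response>;

// Minimal fetch-like dependency, injectable so the resolver can be tested.
type FetchLike = (input: string | URL) => Promise<Response>;

// Hypothetical mapping rule: `jr://file-csv/cities.csv` resolves to
// `<attachmentBaseUrl>/cities.csv`. Real deployments will likely differ.
const resolveAttachmentUrl = (jrUrl: URL, attachmentBaseUrl: string): string => {
  const fileName = jrUrl.pathname.split('/').pop() ?? '';

  return `${attachmentBaseUrl.replace(/\/+$/, '')}/${fileName}`;
};

const createFetchResource = (
  attachmentBaseUrl: string,
  fetchImpl: FetchLike = (input) => fetch(input)
): FetchResource => {
  return (resource) => {
    // Only `jr:` URLs need client-specific resolution; pass others through.
    if (resource.protocol !== 'jr:') {
      return fetchImpl(resource);
    }

    return fetchImpl(resolveAttachmentUrl(resource, attachmentBaseUrl));
  };
};
```

Every client would need to write (and maintain) something like `resolveAttachmentUrl` for itself, which is where the drift discussed below would come from.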

At least until we support offline mode (or any other functionality that would imply runtime-level caching/persistence), this would leave resource resolution entirely to clients.

This is sort of the opposite of a "pit of success" option, with its primary appeal being limited engine-side work for this aspect of the targeted feature.

Beyond obvious non-"pit of success" drawbacks, I'll specifically note that it's the most likely option to result in disparities and drift between clients. It's also likely to promote disparities/drift between different functionality which intersect with it.¹

Option 0.1: engine does not handle this aspect at all

Another null option variant.

This would effectively mean that clients must resolve jr: URLs before initializing a form. They'd probably supply the resources as data: or blob: URLs, substituted directly in the form definition provided by clients to the engine.
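
To make the cost of this variant concrete, here is a rough sketch of the substitution a client would have to perform before handing the form definition to the engine. The helper names are hypothetical, and the plain string replacement on `src="..."` is an assumption for brevity; a real client would more likely work on a parsed XML document.

```typescript
// Encode text content as a `data:` URL. `encodeURIComponent` escapes `"`, `<`
// and `&`, so the result is safe inside a double-quoted XML attribute.
const toDataUrl = (mimeType: string, text: string): string => {
  return `data:${mimeType};charset=utf-8,${encodeURIComponent(text)}`;
};

// Replace each `src="jr:..."` reference in the form definition with the
// corresponding pre-resolved URL. Keys are full `jr:` URLs.
const inlineAttachments = (
  formXml: string,
  resolved: ReadonlyMap<string, string>
): string => {
  let result = formXml;

  for (const [jrUrl, replacementUrl] of resolved) {
    result = result.split(`src="${jrUrl}"`).join(`src="${replacementUrl}"`);
  }

  return result;
};
```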

This option does not appeal to me, but I think it's worth mentioning so we can make a thoroughly informed decision.

Option 0.5: engine provides resolution handler(s) for common cases

An extension of option 0, this is similar in spirit to the submission API proposal (#188), and some of the discussion ongoing there. The idea would be that we recognize one or more typical resource mapping schemes, and expose default fetchResource implementations to address those (likely as some kind of factory function so clients can parameterize them for per-instance usage appropriately).

I would imagine starting with handlers for:

- the OpenRosa Protocol media manifest
- perhaps also Central's form attachment list API
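
As a sketch of what such a factory could look like for the manifest case: the function below assumes the manifest has already been parsed into entries carrying the `filename` and `downloadUrl` fields, and that attachments are matched by the last path segment of the `jr:` URL. The factory name, the `FetchResource` shape, and the choice to answer unknown files with a 404 `Response` are assumptions, not a settled design.

```typescript
// Subset of an OpenRosa manifest `mediaFile` entry, assumed pre-parsed.
interface ManifestEntry {
  readonly filename: string;
  readonly downloadUrl: string;
}

type FetchResource = (resource: URL) => Promise<Response>;
type FetchLike = (input: string | URL) => Promise<Response>;

// Hypothetical factory: clients call it once per form instance, passing the
// manifest entries for that form.
const createManifestFetchResource = (
  entries: readonly ManifestEntry[],
  fetchImpl: FetchLike = (input) => fetch(input)
): FetchResource => {
  const downloadUrls = new Map<string, string>();

  for (const entry of entries) {
    downloadUrls.set(entry.filename, entry.downloadUrl);
  }

  return async (resource) => {
    if (resource.protocol !== 'jr:') {
      return fetchImpl(resource);
    }

    // Match on the file name, e.g. `cities.csv` in `jr://file-csv/cities.csv`.
    const fileName = decodeURIComponent(resource.pathname.split('/').pop() ?? '');
    const downloadUrl = downloadUrls.get(fileName);

    if (downloadUrl == null) {
      // Assumption: a missing attachment is reported as an ordinary 404.
      return new Response(null, { status: 404, statusText: 'Not Found' });
    }

    return fetchImpl(downloadUrl);
  };
};
```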

Option 1: engine provides one or more explicit mechanisms for form attachment resolution, tailored to feature-specific use cases

Instead of the engine calling a generalized fetchResource option with a jr: URL, the engine would instead accept a configuration mapping between specific jr: URLs to one of:

- the resource's real URL, or some other fetch-able URL (blob:, data:, ?); this mapped URL would then be accessed by the same fetchResource option
- anything else capable of resolving a Blob of the resource's data (Promise<Blob>, () => Promise<Blob>, ?)

The mapping itself could be any of:

- the OpenRosa manifest format
- the response type from Central's form attachment list API
- a simple Map-like object (or even Record<string, T> if we're feeling really loosey goosey about it)
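
One way the engine could treat these value shapes uniformly is to normalize each into a single loader function. The sketch below assumes a Map-like mapping and the three value kinds listed above; the type and function names are made up for illustration.

```typescript
// The value shapes proposed above: a fetch-able URL string, an in-flight
// Promise<Blob>, or a thunk producing one.
type AttachmentSource = string | Promise<Blob> | (() => Promise<Blob>);
type AttachmentMapping = ReadonlyMap<string, AttachmentSource>;

type FetchResource = (resource: URL) => Promise<Response>;

// Normalize any supported source into a `() => Promise<Blob>` loader.
const toBlobLoader = (
  source: AttachmentSource,
  fetchResource: FetchResource
): (() => Promise<Blob>) => {
  if (typeof source === 'string') {
    // Mapped URLs go through the same fetchResource option as everything else.
    return async () => {
      const response = await fetchResource(new URL(source));

      return response.blob();
    };
  }

  if (typeof source === 'function') {
    return source;
  }

  return () => source;
};

// Look up a `jr:` URL in the mapping; `null` means the client did not map it.
const getBlobLoader = (
  mapping: AttachmentMapping,
  jrUrl: string,
  fetchResource: FetchResource
): (() => Promise<Blob>) | null => {
  const source = mapping.get(jrUrl);

  return source == null ? null : toBlobLoader(source, fetchResource);
};
```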

¹ This has been a particular pain point in Enketo. Support for jr: URLs is spread across three packages, and difficult to iterate on even after moving the projects to a monorepo.


sadiqkhoja · 2024-09-03T18:16:41Z

I was thinking the host application could just provide secondary attachments along with the Form XML, so the engine doesn't have to make network calls and deal with all the network-related errors. I am saying this with these assumptions:

- Engine doesn't really need media files like images/audio
- Host application would have the list of all Form attachments (CSVs and XMLs only) and it can fetch them when it is fetching Form XML.

This is closer to the Option 1 presented above, except required attachments are resolved at the host application level before anything else happens.

eyelidlessness · 2024-09-04T15:50:57Z

I want to make sure to sum up a couple key conclusions from our discussion yesterday:

Design choice: Option 1, possibly supplemented by Option 0.5

We decided to go with Option 1. As a stretch goal, we may also include aspects of Option 0.5 as an additive aid to clients/host applications.

We refined the proposed Option 1 interface. Putting aside naming (included here so the type will be valid syntax/highlighted as such), this is the interface we anticipate clients/host applications to provide for all form attachments:

type FormAttachmentMapping = Record<`jr:${string}`, () => Promise<Response>>;

Open for bikesheddy discussion: is there any openness to making this a Map rather than a Record? @sadiqkhoja your call. I would generally like to move away from using plain objects for bag-of-stuff collections where the keys aren't at least partially known/fixed at design time. But I can also understand this may be less convenient at the package boundary.

As for the value side of the mapping, providing a thunk per resource:

- Allows clients and/or host applications to provide arbitrary, opaque implementations. Requesting a resource may simply pass through a fetch call, might perform some specialized error handling, may even handle fault tolerance (e.g. retries, which we determined not to do in the engine, at least for now).
- Allows the engine to perform essential requests upfront, ensuring those resources' availability for form operations, and consistently surfacing any errors encountered in a format suitable for designs coming out of External secondary instances: Error conditions #202.
- Allows the engine to throttle repeat requests (whether scheduled upfront or otherwise), ensuring consistency and predictability for a given form session.
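
To illustrate both sides of that boundary, here is a minimal sketch. `withRetries` is something a client might wrap around its own thunks, and `requestOnce` is one way the engine could avoid repeat requests for the same resource. Both helper names, the retry policy (retry only on rejected requests), and the example URL are assumptions, not part of the agreed interface.

```typescript
type FormAttachmentMapping = Record<`jr:${string}`, () => Promise<Response>>;
type ResourceRequest = () => Promise<Response>;

// Client side: retry a request when it rejects (e.g. a network failure).
// Any Response that does arrive, including error statuses, is returned as-is.
const withRetries = (request: ResourceRequest, maxAttempts = 3): ResourceRequest => {
  return async () => {
    let lastError: unknown = new Error('No attempts were made');

    for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
      try {
        return await request();
      } catch (error) {
        lastError = error;
      }
    }

    throw lastError;
  };
};

// Engine side: invoke the client's thunk at most once per form session. Each
// caller gets a clone, because a Response body can only be consumed once.
const requestOnce = (request: ResourceRequest): ResourceRequest => {
  let pending: Promise<Response> | null = null;

  return async () => {
    if (pending == null) {
      pending = request();
    }

    const response = await pending;

    return response.clone();
  };
};

// What a client might pass to the engine (URL is a placeholder).
const attachments: FormAttachmentMapping = {
  'jr://file-csv/cities.csv': withRetries(() => {
    return fetch('https://example.test/attachments/cities.csv');
  }),
};
```

Because the thunks are opaque, the engine does not need to know whether a given entry retries, reads from a cache, or simply calls fetch.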

We decided to represent each resource as a Response because it has semantics appropriate for likely usage scenarios, and suitable for error-reporting designs in #202 (without placing undue burden on clients to conform to a stricter Result-like representation as we've chosen there).

On engine invocation of form requests broadly (i.e. including media)

While this design is primarily focused on support for external secondary instances, it has obvious implications for other form attachments. We discussed this, which was also raised in the above comment. We determined it makes sense for the engine to invoke all such requests, largely for reasons discussed in the last section. Notably, we expect that the engine will perform media requests earlier in a form session (either as part of form load, or perhaps immediately following resolution of the initial form state).

Addendum to yesterday's discussion of this point

As an additional point not covered yesterday, but which I think helps to bolster this decision: insofar as the engine expects clients/host applications to provide resource data, there's another very good reason for the engine to invoke requests, and in particular for the engine to get those requests back as a Response. Namely, streaming. If we expected:

- Promise<Blob>, we'd block form load on potentially very large resources which typically load progressively
- () => Promise<Blob>, we'd block either form load or other arbitrary form operations (potentially even infecting large amounts of the engine/client API with asynchrony in the process!)

And the engine does need to access at least some resource data for media attachments, as they may be associated with node values.
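
For reference, this is roughly what progressive consumption of a `Response` looks like with the standard streams API. The function name and the `onProgress` callback are hypothetical; the point is only that the engine can observe data as it arrives rather than waiting for a complete Blob.

```typescript
// Read a Response body chunk by chunk, reporting progress as bytes arrive,
// then return the fully assembled bytes.
const readProgressively = async (
  response: Response,
  onProgress: (bytesReceived: number) => void
): Promise<Uint8Array> => {
  // Responses without a body (e.g. 204) have a null `body` stream.
  if (response.body == null) {
    return new Uint8Array(0);
  }

  const reader = response.body.getReader();
  const chunks: Uint8Array[] = [];
  let bytesReceived = 0;

  for (;;) {
    const { done, value } = await reader.read();

    if (done) {
      break;
    }

    chunks.push(value);
    bytesReceived += value.byteLength;
    onProgress(bytesReceived);
  }

  // Concatenate the chunks into one contiguous buffer.
  const result = new Uint8Array(bytesReceived);
  let offset = 0;

  for (const chunk of chunks) {
    result.set(chunk, offset);
    offset += chunk.byteLength;
  }

  return result;
};
```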

In the future, we might consider expanding this interface to provide multiple representations, such as something like Record<'jr:*', { request: () => Promise<Response>, url?: string }>. But I think it's best to stick to the simpler interface for now until we have some time to integrate it and understand from experience where there are real gaps.

eyelidlessness added design/architecture external-secondary-instances labels Aug 27, 2024

This was referenced Aug 27, 2024

External secondary instances: Error conditions #202

Open

External secondary instances: Engine representation, XPath support #203

Open
