Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability for ConfigSourceProvider to use Config #470

Open
kenfinnigan opened this issue Nov 8, 2019 · 37 comments
Open

Ability for ConfigSourceProvider to use Config #470

kenfinnigan opened this issue Nov 8, 2019 · 37 comments
Labels
use case 💡 An issue which illustrates a desired use case for the specification
Milestone

Comments

@kenfinnigan
Copy link
Contributor

It would be nice if it was possible for ConfigSourceProvider instances to read the current Config of higher ordinal sources for use in defining it.

For instance, a config property in system or microprofile-config.properties that specifies a list of directories that contain config elements, which a ConfigSourceProvider can reference to retrieve. At present there's no way to do this, meaning values need to be hard coded into the instance directly as to where something might be.

Might require a "piecemeal" construction of Config so that each lower ordinal can take the Config up to that point and use it

@Emily-Jiang
Copy link
Member

At today's hangout with @kenfinnigan @carlosdlr , below is the user case.
user defined config source - default (microrofile-config.properites (100), env var (300), system properties (400))

ordinal= 700
able to see all config sources with the ordinal less than 700 so that some config properties can be used

@smillidge
Copy link
Contributor

I like the concept as we do currently have the issue of how you configure a ConfigSource. However how this works needs careful specification. Does this work for ConfigSources packaged in the same application. Do they need sorting by ordinal before intialising?

@kenfinnigan
Copy link
Contributor Author

@smillidge Initial thought is it would require a two-phase creation of ConfigSource and ConfigSourceProvider. The initial construction through the service loader, and then ordering by ordinal so that we start with the lowest number and work to the higher ordinal while calling some "initialize()" type method that takes the current Config as a parameter.

Thus enabling higher ordinals to use values in the lower ones to assist in construction

@kenfinnigan
Copy link
Contributor Author

Small update, we're going to explore implementing this in SmallRye Config

@dteleguin
Copy link

There seems to be some confusion here - the OP mentions lower to higher ordinal access, whilst the use case refers to the opposite (correct me if I'm wrong).

Actually, both cases are relevant. Here's a real-world example (this is what we'd love to have in Keycloak.X):

env variable (ordinal=300):
KEYCLOAK_HOME=/path/to/keycloak

application.properties (ordinals=250-260):

keycloak.config.dir=${keycloak.home}/config
keycloak.config.file=${keycloak.config.dir}/keycloak.cfg

ConfigSourceProvider for keycloak.cfg (custom config source, ordinal=270):

String config = config.getValue("keycloak.config.file", String.class);

The user might provide just KEYCLOAK_HOME, in this case everything will be resolved as defined in application.properties; or he/she could specify config file location using KEYCLOAK_CONFIG_FILE env var or -Dkeycloak.config.file=.... Hence, a ConfigSourceProvider with intermediate ordinal needs to be able to access config values from sources with ordinals both lower and higher. Therefore, ordering ConfigSources by ordinal alone seems to be insufficient here. But we can augment it with some means to explicitly declare dependencies between the sources, like e.g. @Before / @After / @Depends annotations. The loader then must resolve exact loading order using those annotations (or fallback to ordinals if there are none). Obviously, attention should be paid to detecting circular dependencies.

@emattheis
Copy link
Contributor

I like the idea of allowing a config source to specify the set of property keys that it depends on. I think the common use case here is the idea of using the config API itself to bootstrap other config sources. If a source implementation can declare the bootstrap properties it requires, then the loader can defer instantiation until it can satisfy the requirements. If it cannot reach a state where the requirements are met, then the source will never be instantiated. This has the pleasant side effect of being able to conditionally instantiate a config source.

We could accomplish this in a backwards compatible manner by allowing a source to declare a constructor that takes a Config instance as a sole argument with an optional accompanying annotation to define the bootstrap requirements. Then the loader can pass an intermediate view of the config, backed by the already instantiated sources, to the source constructor.

The loader must instantiate all the sources that do not declare bootstrap requirements first, so that the ones that do can still take advantage of the natural ordinal behavior (e.g. if the same property is available from environment variables and system properties we don't want to hand out the config as a bootstrap config until we've instantiated both sources).

After instantiating all the "normal" sources, the loader should iterate over any discovered sources that have bootstrap requirements and attempt to instantiate them with the current config if the bootstrap requirements are satisfied. This process should repeat until no more sources are left to instantiate or the bootstrap requirements for all any remaining sources cannot be satisfied. Ordinal can't be taken into account until a source is instantiated, so we should be clear in specification that there can be no guarantee about the order of instantiation and the availability of other sources that have bootstrap requirements at bootstrap time. We could certainly invent a way for sources to indicate what properties they can provide at bootstrap time in order to make a more intelligent ordering decision in the loader, but I think that is a rabbit hole best avoided.

Proposed Annotation:

public @interface BootstrapConfig {
	String[] requires() default {};
	String[] requiresOne() default {};
}

So, the most basic use case would be a config source that just wants to take advantage of any "normal" sources:

public class MyConfigSource implements ConfigSource {
    public MyConfigSource(Config config) {
        // initialize source
    }
}

Which is functionally equivalent to:

public class MyConfigSource implements ConfigSource {
    public MyConfigSource(@BootstrapConfig Config config) {
        // initialize source
    }
}

A more practical example would be a source that needs a particular property defined:

public class MyConfigSource implements ConfigSource {
    public MyConfigSource(@BootstrapConfig(requires = "my.config.boostrap") Config config) {
        // initialize source
    }
}

Another example where a source can be configured in two different ways

public class MyConfigSource implements ConfigSource {
    public MyConfigSource(@BootstrapConfig(requiresOne = { "my.config.option1", "my.config.option2" }) Config config) {
        // initialize source
    }
}

@kenfinnigan
Copy link
Contributor Author

Interesting idea.

I like the different method of determining the load order, my concern is it results in a lot of repetition over the ConfigSource instances that we don't have today. Would this have a performance impact?

@emattheis
Copy link
Contributor

emattheis commented Jan 9, 2020

I think in practice it would almost always boil down to exactly two iterations over a very small list. I imagine that sources requiring bootstrap config will prefer to rely on config coming from sources which do not have bootstrap requirements. So, their requirements will be satisfied on the first pass and the second pass will simply confirm that no more instantiations can be done (more likely the set is simply empty on the second pass attempt).

Even in a pathological case with an extreme number of sources having interdependent bootstrap requirements, the cost of successive iterations should be quite small because the logic is trivial. Any significant time would be spent by the sources actually providing the required properties and that cost is unavoidable.

Either way, it should be easy to test. Any platform config that uses this functionality should obviously be designed to be as efficient as possible. Application developers don't assume much greater risk than they already have with poorly implemented config sources, and they get a more powerful set of options in the bargain.

@emattheis
Copy link
Contributor

I realized today that my proposal above is flawed due to the use of the service loader mechanism to discover config sources. This means that config source must have a no-args constructor and that is the only one that ever gets invoked by the service loader. Thus, the whole idea of having a special constructor doesn't work.

With the service loader mechanism in mind, it seems to me that introducing a new factory type might be the way to go instead:

public interface ConfigSourceFactory {
    Optional<ConfigSource> getConfigSource(Config bootstrapConfig);
}

Implementations would be discovered after the current config building process and then the getConfigSource method would be called on each instance with the intermediate config as described in my previous proposal. Once a factory implementation yields a source, the source is added to the config and the factory is discarded. Factories will be called repeatedly with the updated config until no factories remain, or all remaining factories yield empty optionals.

Config bootstrapConfig;        // start with a view of the config loaded thus far

List<ConfigSourceFactory> factories = new LinkedList<>();

ServiceLoader.load(ConfigSourceFactory.class)
             .forEach(factories::add);

Iterator<ConfigSourceFactory> iterator;
Optional<ConfigSource> source;

for (boolean newSource = true; newSource && factories.size() > 0; ) {
    newSource = false;         // don't repeat unless we get another source on this pass
    iterator = factories.iterator();
    
    while (iterator.hasNext())
    {
        source = iterator.next().getConfigSource(bootstrapConfig);
        
        if (source.isPresent())
        {
            source.get();      // add this source to bootstrapConfig
            iterator.remove(); // discard this factory
            newSource = true;  // ensure we repeat for any remaining factories
        }
    }
}

In this manner, the factory implementation is free to use whatever mechanism it wants to determine whether or not it can successfully create a source and the loading mechanism doesn't need any special handling to order instantiations.

@kenfinnigan
Copy link
Contributor Author

I think ConfigSourceProvider covers what you're thinking of with a ConfigSourceFactory, granted with different parameters for retrieving ConfigSource.

I think it's a bit more complicated because there are pure ConfigSource instances to be discovered, and also ConfigSourceProvider instances to be discovered that in turn provide a ConfigSource instance. So a ConfigSourceProvider could require the presence of some config before creating a ConfigSource.

@emattheis
Copy link
Contributor

emattheis commented Jan 10, 2020

The problem with the existing mechanisms - and the genesis of this issue - is the lack of a standard way to access the config provided by already-registered sources.

Introducing a third mechanism means we can keep the existing ones completely intact and still provide the desired functionality.

Also, I think ConfigSourceProvider is very narrowly applicable to cases where duplicate resources exist across classloaders that sources want to utilize. Using it for the much broader case of sources that need access to other sources, or simply want to conditionally provide a source, doesn't makes sense. I think in such cases there is no desire to be classloader-specific, for example.

@kenfinnigan
Copy link
Contributor Author

Is that purely for a "backwards compatible" position?

In June we have a breaking change MP platform release, so now's the time to actually break things if we need to

@emattheis
Copy link
Contributor

I suppose it's a combination of backwards-compatibility and pragmatism 😛

All things being equal, I would prefer a plan that reduces the surface area of of the SPI rather than expands it. The issues I've run into personally are the inability to conditionally provide a single source (although I see now how I could do this with ConfigSourceProvider using an empty or singleton iterator) and the desire to have access to previously instantiated sources from within a provider. I could see unifying all the source creation use cases into a single interface, but that leaves us with the sticky problem of how to order instantiation when everything is the same. The big advantage of the bolt-on solution is the ability for the new kind of factory to rely on the existing stuff working as normal and first.

Also, I'd love to see something before June...

@dmlloyd
Copy link
Contributor

dmlloyd commented Jan 10, 2020

Just for posterity, I'd like to point out that it is possible to have a configuration source which is itself configured by other properties today (it's just not quite as easy as possible). Here's an example (feel free to cut & paste if desired):

public class MyConfigSource implements ConfigSource {
    enum State {
        NOT_INITIALIZED,
        INITIALIZING,
        INITIALIZED,
    }

    private volatile State state = State.NOT_INITIALIZED;

    private boolean ensureInitialized() {
        if (state != INITIALIZED) {
            synchronized (this) {
                if (state == NOT_INITIALIZED) {
                    state = INITIALIZING;
                    // ... do the work of initial setup here ...
                    // ... this may include recursively reading config properties ...
                    state = INITIALIZED;
                    return true;
                } else if (state == INITIALIZING) {
                    // recursive
                    return false;
                } else {
                    assert state == INITIALIZED;
                    return true;
                }
            }
        } else {
            return true;
        }
    }

    @Override
    public String getValue(String key) {
        if (! ensureInitialized()) {
            return null;
        }
        // .. get the value ..
    }

    @Override
    public Set<String> getPropertyNames() {
        if (! ensureInitialized()) {
            return Collections.emptySet();
        }
        // return our loaded property names
    }

    // etc.
}

This results in a 0% API surface area increase.

@emattheis
Copy link
Contributor

One problem with the lazy-initialization approach is that you may not be able to successfully initialize, but your source will be enlisted in the config anyway. Probably better to use a provider:

public class MyConfigSourceProvider implements ConfigSourceProvider {
    public Iterable<ConfigSource> getConfigSources(ClassLoader forClassLoader) {
        MyConfigSource source = null;
        // ... initialize source, if possible ...
        return source == null ? Collections.emptySet() : Collections.singleton(source);
    }
}

Either way, you still have the problem of actually using the config API to get your properties. I suppose you can do something like this:

ConfigProviderResolver configProviderResolver = ConfigProviderResolver.instance();
Config config = configProviderResolver.getBuilder()
                                      .addDefaultSources()
                                      .build();

try {
    // ... use the config in some way ...
} finally {
    configProviderResolver.releaseConfig(config);
}

Then you can at least have access to the built-in sources, albeit at the cost of creating a separate config, but if you want to use .addDiscoveredSources() you have a cycle to contend with.

@dmlloyd
Copy link
Contributor

dmlloyd commented Jan 10, 2020

If inability to initialize is a possibility, you could choose to throw exceptions or add a "failed" state which acts the same as NOT_INITIALIZED without trying to reinit. This would be equivalent to not registering the config source.

Using the config API isn't a problem because recursion is handled by the state machine (synchronization is thread local), so you can just call ConfigProvider.getConfig(). You don't have to create a temporary one.

@emattheis
Copy link
Contributor

emattheis commented Jan 10, 2020

@dmlloyd I fail to see how you can safely use ConfigProvider.getConfig() from within a ConfigSource. There is no guarantee that you will get a Config that uses the current instance of the ConfigSource, therefore you can't rely on the state machine you outlined to be in the INITIALIZING state.

The following implementation (along with the necessary META-INF/services/org.eclipse.microprofile.config.spi.ConfigSource file) results in a StackOverflowError using SmallRye Config 1.5.1:

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.eclipse.microprofile.config.ConfigProvider;
import org.eclipse.microprofile.config.spi.ConfigSource;

public class MyConfigSource implements ConfigSource {
    private static enum State {
    	NOT_INITIALIZED,
    	INITIALIZING,
    	INITIALIZED
    }

    private final Map<String, String> properties = new HashMap<String, String>();

    private volatile State state = State.NOT_INITIALIZED;

    private boolean ensureInitialized() {
        if (state != State.INITIALIZED) {
            synchronized (this)    {
                if (state == State.NOT_INITIALIZED) {
                    state = State.INITIALIZING;
                    ConfigProvider.getConfig();
                    state = State.INITIALIZED;
                    return true;
                } else if (state == State.INITIALIZING) {
                    return false;
                } else {
                    assert state == State.INITIALIZED;
                    return true;
                }
            }
        } else {
            return true;
        }
    }

    public String getValue(String key) {
        if (! ensureInitialized()) {
            return null;
        }

        return properties.get(key);
    }

    public Map<String, String> getProperties() {
        if (! ensureInitialized()) {
            return Collections.emptyMap();
        }

        return properties;
    }

    public String getName() {
        return getClass().getName();
    }

    public static void main(String[] args) {
        ConfigProvider.getConfig();
    }
}

@dmlloyd
Copy link
Contributor

dmlloyd commented Jan 11, 2020

@dmlloyd I fail to see how you can safely use ConfigProvider.getConfig() from within a ConfigSource. There is no guarantee that you will get a Config that uses the current instance of the ConfigSource, therefore you can't rely on the state machine you outlined to be in the INITIALIZING state.

The following implementation (along with the necessary META-INF/services/org.eclipse.microprofile.config.spi.ConfigSource file) results in a StackOverflowError using SmallRye Config 1.5.1:

Ah, well that part of the issue could be fixed if we (either in SmallRye or in the spec as well) require that ConfigProvider.getConfig() yield the current configuration during configuration operations.

@emattheis
Copy link
Contributor

Right, so that brings us full circle to the start of this issue: there's no specified way for a ConfigSourceProvider to access the Config API for its own initialization.

Ah, well that part of the issue could be fixed if we (either in SmallRye or in the spec as well) require that ConfigProvider.getConfig() yield the current configuration during configuration operations.

How would you propose doing this? Seems like there's a lot of opportunity for ambiguity...

@dmlloyd
Copy link
Contributor

dmlloyd commented Jan 13, 2020

Ah, well that part of the issue could be fixed if we (either in SmallRye or in the spec as well) require that ConfigProvider.getConfig() yield the current configuration during configuration operations.

How would you propose doing this? Seems like there's a lot of opportunity for ambiguity...

I think it would be reasonable for the Config implementation to use a scoped thread local for this purpose. This way, if the nested ConfigSource delegates to a different Config instance for some reason, the scoping would be updated.

To support this in the specification would require an SPI to be added to ConfigProvider, something like this:

public static void withConfig(Config config, Runnable action) { ... }

To support this in an implementation rather than in the specification, it can all be internal (but it would only work correctly if implementations are not mixed - which is normally not a problem because it is only allowed to have one ConfigProviderResolver implementation present today).

Either way, if no "current" config is set, then the configuration would be selected based on application as it is done today.

@emattheis
Copy link
Contributor

@dmlloyd digging a little deeper, the reason for the stack overflow in the example above is because the default for ConfigSource#getOrdinal() checks the properties for the ordinal value, and SmallRye needs the ordinal to order the source. So, FWIW, it IS possible to use your proposal in SmallRye as long as you provide a default ordinal while instantiating.

Anyway, I still feel like this is better solved by NOT enabling the use of this pattern. I think it would cleaner to insist that sources be ready to provide actual properties before they are enlisted into a config and provide a way for source providers to have access to already enlisted sources when they are invoked.

I think there are basically two categories of ConfigSourceProvider: those that rely on Config and those that don't. Perhaps the thing to do is simple introduce a mixin interface that implementations can use to indicate their need for configuration:

public interface Configurable {
    void configure(Config config);
}

Any discovered source providers that also implement Configurable would deferred until all ConfigSource implementations and non-configurable ConfigSourceProvider implementations were used to bootstrap the config, then that config would be set on all the configurable providers and they would be invoked (in no specific order) to obtain the remaining sources for the config.

Then we could simply specify that configurable source providers can ONLY count on directly instantiated sources or those provided by non-configurable providers.

@dmlloyd
Copy link
Contributor

dmlloyd commented Jan 13, 2020

@dmlloyd digging a little deeper, the reason for the stack overflow in the example above is because the default for ConfigSource#getOrdinal() checks the properties for the ordinal value, and SmallRye needs the ordinal to order the source. So, FWIW, it IS possible to use your proposal in SmallRye as long as you provide a default ordinal while instantiating.

Anyway, I still feel like this is better solved by NOT enabling the use of this pattern. I think it would cleaner to insist that sources be ready to provide actual properties before they are enlisted into a config and provide a way for source providers to have access to already enlisted sources when they are invoked.

I agree that it's best if sources are ready to provide properties before being enlisted into a Config. In Quarkus we take this farther and actually create a new Config when we have phased configuration of config sources (for example the discussion on smallrye/smallrye-config#215 which is applicable right now).

I think there are basically two categories of ConfigSourceProvider: those that rely on Config and those that don't. Perhaps the thing to do is simple introduce a mixin interface that implementations can use to indicate their need for configuration:

public interface Configurable {
    void configure(Config config);
}

Any discovered source providers that also implement Configurable would deferred until all ConfigSource implementations and non-configurable ConfigSourceProvider implementations were used to bootstrap the config, then that config would be set on all the configurable providers and they would be invoked (in no specific order) to obtain the remaining sources for the config.

Then we could simply specify that configurable source providers can ONLY count on directly instantiated sources or those provided by non-configurable providers.

So this would appear to imply that the Config itself is mutable (in terms of the sources that are registered). I'm really on the fence as to whether or not this is a good idea in general.

@emattheis
Copy link
Contributor

So this would appear to imply that the Config itself is mutable (in terms of the sources that are registered). I'm really on the fence as to whether or not this is a good idea in general.

No, I don't want that either. I think there should be a single immutable snapshot of the config created just prior to invoking any configurable source providers. This snapshot gets passed to the source providers via the configure method and is only ever used there.

@dmlloyd
Copy link
Contributor

dmlloyd commented Jan 13, 2020

So this would appear to imply that the Config itself is mutable (in terms of the sources that are registered). I'm really on the fence as to whether or not this is a good idea in general.

No, I don't want that either. I think there should be a single immutable snapshot of the config created just prior to invoking any configurable source providers. This snapshot gets passed to the source providers via the configure method and is only ever used there.

OK, so this begs the question of whether we actually need any spec-level support for this to begin with? After all, one can always create a configuration, and then use that one to make another one, and register only the final configuration with the application.

@emattheis
Copy link
Contributor

As a developer of a portable microprofile application, I want to @Inject config into my components. As a deployer of said application, if I want to be able to introduce alternative config sources, the only portable way to do that is with the specified discovery mechanism. There's no specified way to get in between the actual microprofile config implementation and the application, is there?

@dmlloyd
Copy link
Contributor

dmlloyd commented Jan 13, 2020

No, there is no specified way to do this that I am aware of.

@emattheis
Copy link
Contributor

So, I would argue that we DO need spec level support for this concept. Quarkus is a good example of where the lack of configuration "tiers" is problematic. If I deploy a custom config source in Quarkus, it gets instantiated 3 times per deployment and the lifecycle is not clearly specified or enforced (AFAIK). As more frameworks unify around the microprofile config model, I think it's important to try and fold some of these concepts into the core spec so portability can be preserved.

@emattheis
Copy link
Contributor

As I'm back working on my application code, it occurs to me that my particular friction with the Config API could be solved in the CDI space. MicroProfile embraces CDI so it's natural for application developers who want to augment the Config to do so using CDI.

#125 started some thinking along those lines, but doesn't seem to have gone anywhere...

@radcortez
Copy link
Contributor

Not sure if CDI would be the proper way to go either. While it would definitely solve the issue, remember that the Config API must also work without a CDI container, so I believe a solution for this should cover both cases (with and without CDI).

@emattheis
Copy link
Contributor

I agree. I think the idea of a tiered system should be addressed in both places. However, in my particular use case (and the bulk of application use cases, I believe) could be more easily accommodated in CDI. We can reasonably assume that there is a Config available at CDI initialization that uses the existing source/converter discovery mechanisms, so it would be possible to produce a CDI-specific, injectable Config instance that layers in sources/converters provided as CDI beans. Those implementations could then rely on specified mechanisms to inject any values they need at initialization time from the non-CDI Config.

@radcortez
Copy link
Contributor

Hum, maybe I didn't understand correctly. I would you handle the tiered system when you don't have CDI?

@emattheis
Copy link
Contributor

emattheis commented Jan 24, 2020

I'm just suggesting that the need for a tiered system exists in both places - the core API with the service loader mechanism, and CDI with scoped/typed beans.

I've already presented some ideas in this thread for a non-CDI approach, but I've shifted my personal focus to a CDI solution since that's where my particular interest lies.

@radcortez
Copy link
Contributor

I understand. So, you think that both should different solutions? Ideally if we could find something that works for both it would be great, but yes, in CDI it might be tricky.

@emattheis
Copy link
Contributor

Yes. I think whatever we come up with in the base API is naturally available at the CDI level, but there's an additional opportunity in CDI to embrace the use of CDI beans to contribute sources/converters. I'm working on POC that I'm pretty happy with and hope to share it soon.

@dmlloyd dmlloyd added the use case 💡 An issue which illustrates a desired use case for the specification label Mar 4, 2020
@radcortez
Copy link
Contributor

Implemented a feature in SmallRye Config to support this:
smallrye/smallrye-config#313

@Emily-Jiang
Copy link
Member

I'm keen to see this issue resolved as I also heard other users wanting to have this kind of support. Feel free to do a PR for further discussion!

@radcortez radcortez added this to the MP Config 2.1 milestone Mar 4, 2021
@wicksim
Copy link

wicksim commented Aug 5, 2021

Not exactly this use case, but maybe helpful for people landing here: If you just need a value from an other property as a part for any property you wish to define in a ConfigSource, you could simply use this syntax: ${quarkus.datasource.db-kind}.

overheadhunter added a commit to cryptomator/hub that referenced this issue Nov 25, 2021
to be replaced, once there is an official way to access higher-order config (see eclipse/microprofile-config#470)
overheadhunter added a commit to cryptomator/hub that referenced this issue Nov 26, 2021
to be replaced, once there is an official way to access higher-order config (see eclipse/microprofile-config#470)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
use case 💡 An issue which illustrates a desired use case for the specification
Projects
None yet
Development

No branches or pull requests

8 participants