Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File config on small set #1346

Closed
wants to merge 2 commits into from
Closed

File config on small set #1346

wants to merge 2 commits into from

Conversation

yhmtsai
Copy link
Member

@yhmtsai yhmtsai commented Jun 12, 2023

This PR picks some components from different base type of Ginkgo to show the necessary components in file config.
It's not header-only like #800 because we need to instantiate all combinations of each template parameter and base type in compile time when the file includes the header. Thus, it leads the slow compilation issue.
Moreover, Ginkgo only provides users to extend the base type like LinOp or Criterion, not the template type directly.

This pr provides two libraries file_config_custom and file_config.
file_config contains the selection implementation in the library to avoid multiple definition issue when it is included from different places, but file_config_custom implements it in the header to make users easily extend the selection.
There's no difference in practice if the file_config is only used in one file or the final application.
Note. Using file_custom in two distinct objects in the same library may lead the multiple definition issue. Users should first pack the custom type and the selection function into another library with file_config_custom like file_config.

current usage:

[
    {
        "base": "Csr", // it can also be Csr<float, int> directly
        "ValueType": "double",
        "IndexType": "int", // these two contains default like Csr<>
        "exec": "given", // it can also be optional or construct different executor
        "dim": [
            3,
            4
        ], // optional, it can also use 3 for square matrix
        "num_nonzeros": 123, // optional
        "strategy": "load_balance", // optional
        "read": "file", // optional
        "name": "A" // required in ResourceManager but no effect from function call
    },
    {
        "base": "Isai",
        "factory": {
            "skip_sorting": true, // all factory's param are optional
            "excess_solver_factory": {
                // some linop factory construction or refer to existed data
            },
            "exec": "given" // can be construced in config, or given from outside.
        },
        "generate": "A" // or construct the matrix here or from outside
        "add_loggers": {} // optional
    }
]

TODO and discussion:

  • where should we put the file config? should it be in ginkgo repo?
  • type name suggestion?
  • folder structure

@yhmtsai yhmtsai added 1:ST:ready-for-review This PR is ready for review 1:ST:need-feedback The PR is somewhat ready but feedback on a blocking topic is required before a proper review. labels Jun 12, 2023
@yhmtsai yhmtsai requested review from a team June 12, 2023 07:57
@yhmtsai yhmtsai self-assigned this Jun 12, 2023
@ginkgo-bot ginkgo-bot added reg:build This is related to the build system. reg:testing This is related to testing. mod:core This is related to the core module. type:preconditioner This is related to the preconditioners type:matrix-format This is related to the Matrix formats reg:helper-scripts This issue/PR is related to the helper scripts mainly concerned with development of Ginkgo. type:stopping-criteria This is related to the stopping criteria labels Jun 12, 2023
@yhmtsai yhmtsai changed the title File config small set demonstration File config on small set Jun 12, 2023
@MarcelKoch
Copy link
Member

Perhaps we should define how the file config should look like? That would mean, which keywords are used to create a CSR matrix, etc. I think it would help the discussion to see what kind of file configs we are talking about.

@upsj
Copy link
Member

upsj commented Jun 12, 2023

I think as a starting point, we should be thinking about what's the minimal set of features we need to have a viable integration into our applications, as long as the approach is extensible enough. For configuring a solver, the only types we need to know about are LinOpFactories and maybe stopping criterion factories, no matrix types or executors. For the file config schema, I would try to make as much as possible implied by context. An example would be

{
    "type": "solver::Cg", // maybe we can omit the solver::, since there is only one Cg type
    "value_type": "double",
    "stop": {
        "iterations": 1000, // there are only 5 types of stopping criteria we can instantiate,
        "time": 100,        // we can add keywords for all of them
        "residual_reduction": 1e-6
    },
    "preconditioner": {
        "type": "preconditioner::Jacobi", // maybe we can omit the preconditioner::, as it is implied by context
        "value_type": "double",           // value_type: double could be inherited
        "index_type": "int32",
        "max_block_size": 8
    }
}

@yhmtsai
Copy link
Member Author

yhmtsai commented Jun 12, 2023

for me, the matrix type and executor are necessary. Like CG using Csr or Cg using Coo gives different performance due to SpMV and we put the executor mapping in all our examples.
for stopping criterion, the shortcut version will be there in the end. I will also say the shortcut version should be also in ginkgo itself.
Also, in some cases, the preconditioner does not use the same matrix as the solver.
The initial design idea is to keep the same usage as ginkgo API such that users does not need to learn two rules for setting and give the same flexibility.

the default value type is like our template default not inherited from the previous part, but I am thinking about adding alias section to change the value_type easily

@upsj
Copy link
Member

upsj commented Jun 12, 2023

If a file object acts like a Factory::parameters type and require a call to .on(exec) in the end, there would be no need to involve executors at all, see #1336. As soon as we involve a preconditioner, we need to convert the matrix to Csr again, anyways. A potential way to deal with this would be allowing users to specify which matrix type should be used for the SpMVs in an iterative solver, and implement the corresponding functionality in SolverBase, or provide a LinOpFactory that converts the input LinOp to another type. We already discussed the same pattern for handling precision conversions in factory hierarchies.

@yhmtsai
Copy link
Member Author

yhmtsai commented Jun 12, 2023

no, I mainly mean the selection of the executor. The executor can be given from outside or constructed in the config.
Preconditioner requiring matrix in Csr is the limitation from the preconditioner, not the limitation to avoid using other format in other places. The preconditioner changes the format should not affect the format in the solver.
If the factory supports without .on(exec) from the same parent class, it can be used in file_config for sure. Currently, it can already be hidden from the config because it also allows inheritance from the above executor.
Ginkgo supports the setting method -> file_config should also support it unless it can not be handled by json file. (For example, the full flexibility of user-defined lambda function in multigrid may not be possible)

@yhmtsai
Copy link
Member Author

yhmtsai commented Jun 12, 2023

@MarcelKoch do you mean the json-schema you mentioned in another pull request?
or some readme/documentation for different class

@MarcelKoch
Copy link
Member

@yhmtsai I meant what @upsj posted. Some examples of how the .json could look like.

@MarcelKoch
Copy link
Member

Personally, I would go even further than @upsj and would omit the value_type and index_type completely from the config. I think this is not something that application users want to define. For example, for hyteg I had a fixed index_type and a templated value_type for my wrapper class that used the file config. These types should be used together with the file config.

@yhmtsai
Copy link
Member Author

yhmtsai commented Jun 12, 2023

omitting value type and index type is also supported and optional as I said. you need a way to fix the valuetype for the config, which is close to the alias section.
putting them in template makes user need to recompile when they want to try different precision.
there's no hurt with supporting the selection of type.
we also have mixed precision support.

@yhmtsai
Copy link
Member Author

yhmtsai commented Jun 12, 2023

@MarcelKoch I updated the usage in the description.

@upsj
Copy link
Member

upsj commented Jun 13, 2023

I wrote a small prototype for what I would imagine a minimal approach would look like in the minimal_file_config_prototype branch, moving the complexity of ValueType etc. into template parameters like @MarcelKoch suggested. The interesting part is in config_parser.cpp, everything is based on #1336.

@MarcelKoch
Copy link
Member

I'm not sure if I really want to open this up, but should we consider formats other than json? IMO json is not ideal for human written files. For example, it does not allow for comments, and you need lots of ", {, and such.
I'm not saying we definitely need to switch, there are quite a few advantages of json, most importantly that it's widely used. But maybe we could discuss it.

@upsj
Copy link
Member

upsj commented Jun 13, 2023

I believe that was the idea behind #1295, I think architecturally we should consider the representation of the parsed data independent of the object construction implementation.

@upsj
Copy link
Member

upsj commented Jun 13, 2023

Oh and BTW, nlohmann-json can deal with comments as well

@MarcelKoch
Copy link
Member

@upsj I'm concerned with the step before that (unless I misunderstood you). This is again the question of how a config file should look like, and if json is the best tool to represent that.

One feature that json is missing, would be references within the document. For example in yaml you could have

.criteria: &criteria
  iterations: 100

cg:
  type: solver::Cg
  stop: 
    *criteria

gmres
  type: solver::Gmres
  stop: 
    *criteria

which I personally think is neat. But again, we should discuss if we want this or not.

@upsj
Copy link
Member

upsj commented Jun 13, 2023

@MarcelKoch as long as we can transform it into a tree with resolved references, any configuration language should be suitable. JSON is just a cleaner language from the parser perspective.

@MarcelKoch
Copy link
Member

For me, the user perspective is way more important than the parser perspective for the config files, that's why I bring this whole issue up.

@upsj
Copy link
Member

upsj commented Jun 13, 2023

Agreed, to me the more contentious part of this design lies in the mapping from parsed config to generated object though. If we have a JSON object as the top-level entry in the table, we can get references the same way by doing

{
    "cg": {
        "type": "solver::Cg",
        "stop": "criteria",
        "preconditioner": "jacobi-precond"
    },
    "gmres": {
        "type": "solver::Gmres",
        "stop": "criteria",
        "preconditioner": "jacobi-precond"
    }
    "jacobi-precond": {
        "type": "preconditioner::Jacobi",
        "max_block_size": 8
    },
    "criteria": {
        "iterations": 1000,
        "time": 100,
        "residual_reduction": 1e-6
    }
}

Basically any place we expect an object we also accept a string that references a top-level object from the parsed input.

@MarcelKoch
Copy link
Member

I think in the JSON case, we would need to carry around some state. So when parsing the cg solver for example:

auto cg = parse(config["cg"]);

then parse needs to have the whole config available. That should not be too difficult, but it would move some work from the external parser to our side.

@upsj
Copy link
Member

upsj commented Jun 13, 2023

To me the question is: Do we need a user-accessible format-agnostic representation of the structure? If not, we can have a parse_json and parse_yaml function that only require an istream/string and the name of the object config we want to instantiate, and only share what code can be reused between them.

@yhmtsai
Copy link
Member Author

yhmtsai commented Jun 14, 2023

In yaml reference usage, will config["cg"] still get information from criterion? or, will it show a non-found reference?
Another way is to preprocess the json first (hard copy the reference into corresponding block) such that the user can use config["cg"] directly.

currently, implementation also requires the whole json to achieve reference-like usage.

manager.put(full_config); // lazy construction
// manager.build(full_config); build everything
manager.search("cg"); // build the item when calling it

@tcojean
Copy link
Member

tcojean commented Jun 14, 2023

Just to make sure it's here, here is how we use the current ressource manager in OpenCARP which infinitely simplifies the interfacing to Ginkgo.

https://git.opencarp.org/openCARP/openCARP/-/blob/master/numerics/ginkgo/SF_ginkgo_solver.cc

It's unclear for preconditioners like BDDC, and also unclear how it would change for distributed.

These are the current files for the openCARP config:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1:ST:need-feedback The PR is somewhat ready but feedback on a blocking topic is required before a proper review. 1:ST:ready-for-review This PR is ready for review mod:core This is related to the core module. reg:build This is related to the build system. reg:helper-scripts This issue/PR is related to the helper scripts mainly concerned with development of Ginkgo. reg:testing This is related to testing. type:matrix-format This is related to the Matrix formats type:preconditioner This is related to the preconditioners type:stopping-criteria This is related to the stopping criteria
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants