kit-data-manager/pit-service

Typed PID Maker

The Typed PID Maker enables the creation, maintenance, and validation of PIDs. It ensures the PID contains typed, machine-actionable information using validation. This is especially helpful in the context of FAIR Digital Objects (FAIR DOs / FDOs). To make this work, our validation strategy requires a reference to a registered Kernel Information Profile within the PID record, as defined by the recommendations of the Research Data Alliance (RDA). In the RDA context, this kind of service is called a "PIT service". We use Handle PIDs, which can be created using a Handle Prefix (not included). For testing or other local purposes, we support sandboxed PIDs, which require no external service.

See also: Documentation | Configuration details | Features | Build | Run | License

Features

  • ✅ Create PIDs containing typed key-value-pairs for easy, fast, and automated decision-making.
  • ✅ Maintain the information within these PIDs.
  • ✅ Validate PIDs.
  • ✅ Resolve PIDs.
  • ✅ Store the created PIDs in your database and query them.
    • ✅ Pagination support
    • ✅ Tabulator.js support
  • ✅ Build & use your own search index
    • ✅ Search for information stored within PIDs. This includes PIDs you created, updated or resolved at some point.
    • ✅ Supports the full elastic DSL (and requires an Elasticsearch 8 instance).
  • ✅ Authentication via JWT or KeyCloak
  • ✅ Bootstrap with existing PIDs in your PID Prefix (see command line options).
  • ✅ Extract all your PIDs to CSV files (see command line options).
  • ✅ Make your PIDs distinguishable with a customizable branding-prefix and other customization options.

Some of the features are described in more detail in the following sections.

Search example

The search can be executed via the provided Swagger interface (default location: http://localhost:8090/swagger-ui.html). For example, the following request body returns the information of all records:

{
  "query": {
    "regexp": {
      "pid": {
        "value": ".*",
        "flags": "ALL",
        "case_insensitive": true
      }
    }
  }
}

You can also use other HTTP clients, such as curl. A curl request (which Swagger can generate for you) may look like this:

curl -X 'POST' \
  'http://localhost:8090/api/v1/search?page=0&size=20' \
  -H 'accept: application/hal+json' \
  -H 'Content-Type: application/json' \
  -d '{
  "query": {
    "regexp": {
      "pid": {
        "value": ".*",
        "flags": "ALL",
        "case_insensitive": true
      }
    }
  }
}'
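The same request can also be sent from a script. The following is a minimal sketch using only the Python standard library; it assumes the service runs at the default location shown above and that the response body is JSON. The function names `build_search_request` and `search` are illustrative and not part of the service.

```python
import json
from urllib.request import Request, urlopen

# Query from the example above: a regular expression that matches every PID.
MATCH_ALL_PIDS = {
    "query": {
        "regexp": {
            "pid": {"value": ".*", "flags": "ALL", "case_insensitive": True}
        }
    }
}


def build_search_request(query, base_url="http://localhost:8090", page=0, size=20):
    """Build (but do not send) a POST request for the search endpoint."""
    url = f"{base_url}/api/v1/search?page={page}&size={size}"
    return Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={
            "accept": "application/hal+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def search(query, **kwargs):
    """Send the request and return the decoded response (assumed to be JSON)."""
    with urlopen(build_search_request(query, **kwargs)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a running service and a configured search index, `search(MATCH_ALL_PIDS, page=0, size=20)` should return the first page of results.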

The look of a PID (customization)

The detailed configuration documentation lists the available properties that influence the way PIDs look. Customizing the look of a PID has no functional benefit. Some examples of PIDs that the Typed PID Maker can generate:

  • ProjectX--1d6c-152c-c9e0-c136-1509
    • branding-prefix = ProjectX--
    • mode = HexChunk
    • num-chunks = 4
    • casing = lower
  • d08a-3c11-8e8a-55f7-76a6
    • without branding-prefix
  • AADF-46A0-661F-9CAF-43A2
    • casing = upper
  • 8d819cba-ba84-4080-b86c-8d2d318c240f
    • (default configuration)
    • mode = UUID4
    • casing = lower

If you have an interest in more customization, feel free to contact us.

In general, a PID is built according to the following scheme:

PID = prefix + suffix
  1. The prefix is a string which is prepended to all your PIDs. It can be considered a namespace and is given to you by your PID system (here: the handle system). It usually ends with a slash (/) as a separator.
  2. The suffix is a random string, generated by the Typed PID Maker. This generation can be customized.

As the suffix is flexible, we can prepend a branding prefix to it, to show some relation to a project or institution. Please note that the branding is then part of the suffix, and therefore part of the whole PID. It cannot be changed once the PID has been registered, but it can be changed for new PIDs. If a branding is applied, the scheme of a PID can be represented as follows:

PID = prefix + (branding + uniquely-generated-string)
               ^------------- <suffix> -------------^

All other configuration properties affect only the uniquely-generated-string. For example, you may choose a different generation method (UUID (default) or Hex Chunks) or enforce a casing (lower-case or upper-case).
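To make the scheme concrete, here is a small illustrative sketch of how such a PID could be composed. It is not the service's actual generator. It assumes that each hex chunk has four characters (as in the examples above) and that `num_chunks` directly sets the number of chunks. The function names and the prefix `my-prefix/` are hypothetical placeholders.

```python
import secrets
import uuid


def generate_unique_string(mode="UUID4", num_chunks=4, casing="lower"):
    """Return a random string in one of the two modes named above."""
    if mode == "UUID4":
        raw = str(uuid.uuid4())
    elif mode == "HexChunk":
        # Assumption: every chunk consists of four hexadecimal characters.
        raw = "-".join(secrets.token_hex(2) for _ in range(num_chunks))
    else:
        raise ValueError(f"unknown mode: {mode}")
    return raw.upper() if casing == "upper" else raw.lower()


def build_pid(prefix, branding="", **options):
    """PID = prefix + (branding + uniquely-generated-string)."""
    return prefix + branding + generate_unique_string(**options)
```

For example, `build_pid("my-prefix/", "ProjectX--", mode="HexChunk")` yields a PID that starts with `my-prefix/ProjectX--`, followed by four random lower-case hex chunks separated by hyphens.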

How to build

In order to build the Typed PID Maker, you'll need:

  • Java SE Development Kit 11 (or openjdk 11) or higher

After obtaining the sources, change to the folder where they are located and run the build:

user@localhost:/home/user/typed-pid-maker$ ./gradlew build
> Configure project :
Using release profile for building notification-service
<-------------> 42% EXECUTING [0s]
[...]
user@localhost:/home/user/typed-pid-maker$

The Gradle wrapper will now take care of downloading the configured version of Gradle and finally build the Typed PID Maker microservice. As a result, a jar file containing the entire service is created at build/libs/TypedPIDMaker-$(version).jar.

How to start

For development purposes, the easiest way to run the service with your configuration file is:

./gradlew run --args="--spring.config.location=config/application.properties"

Before you can start the microservice, you have to modify the file 'application.properties' according to your local setup. To do so, copy the file conf/application.properties to your project folder and customize it. For the Collection API you just have to adapt the properties of spring.datasource and you may change the server.port property. All other properties can be ignored for the time being.

Once you have finished modifying 'application.properties', you may start the service by executing the following command inside the project folder, i.e. where the service has been built before:

user@localhost:/home/user/typed-pid-maker$ ./build/libs/TypedPIDMaker-$(version).jar

  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::        (v2.0.5.RELEASE)
[...]
1970-01-01 00:00:00.000  INFO 56918 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 8070 (http) with context path ''

As soon as the microservice is started, you can browse to

http://localhost:8090/swagger-ui.html

in order to see available RESTful endpoints and their documentation. You may have to adapt the port according to your local settings. Furthermore, you can use this Web interface to test single API calls in order to get familiar with the service.

Details on the version number and other build information can be found at http://localhost:8090/actuator/info.
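The build information can also be read from a script, for example to check which version is deployed. The sketch below uses only the Python standard library and assumes that the endpoint returns a JSON document; the exact fields depend on the build and are not assumed here. The helper names `info_url` and `fetch_build_info` are illustrative.

```python
import json
from urllib.request import urlopen


def info_url(base_url="http://localhost:8090"):
    """Return the address of the info endpoint for the given base URL."""
    return base_url.rstrip("/") + "/actuator/info"


def fetch_build_info(base_url="http://localhost:8090", timeout=5.0):
    """Fetch the info endpoint and return its content (assumed to be JSON)."""
    with urlopen(info_url(base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Adapt `base_url` if you changed the port in your configuration.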

Command line options

  • --spring.config.location=config/application.properties sets the location of the configuration file to be used. Not required if the file is in the same directory as the jar file.
  • bootstrap all-pids-from-prefix starts the service and bootstraps all PIDs. This means:
    • store the PIDs as "known PIDs" in the local database (as configured)
    • send one message per PID to the message broker (if configured)
    • (WIP, #128) store the PID records in the search index (if configured)
    • after the bootstrap, the application will continue to run
  • bootstrap known-pids same as above, but:
    • not using all PIDs from prefix, but only the ones stored in the local database ("known PIDs")
    • useful to, for example, re-send PIDs via messaging to notify new services
  • write-file all-pids-from-prefix writes all PIDs of the configured PID prefix to a CSV file (one PID per line).
  • write-file known-pids same as above but:
    • only with the PIDs stored in the local database ("known PIDs").
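If you start the service from a script, the valid combinations of the options above can be assembled as follows. This is a sketch under assumptions: the order of the arguments (configuration location first, then the action) is a guess, and the helper `build_command` as well as the jar path used in the usage note are hypothetical.

```python
ACTIONS = ("bootstrap", "write-file")
PID_SOURCES = ("all-pids-from-prefix", "known-pids")


def build_command(jar_path, action, pid_source, config_location=None):
    """Return the argument list for starting the service with one action."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if pid_source not in PID_SOURCES:
        raise ValueError(f"unknown PID source: {pid_source}")
    command = [jar_path]
    if config_location is not None:
        command.append(f"--spring.config.location={config_location}")
    # Assumption: the action and its PID source come after the Spring option.
    command += [action, pid_source]
    return command
```

The resulting list can be passed to `subprocess.run`, for example `subprocess.run(build_command("./service.jar", "write-file", "known-pids"), check=True)`, where `./service.jar` stands for the path of your built jar file.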

License

The KIT Data Manager is licensed under the Apache License, Version 2.0.