[BP] Upgrade guidance for geonetwork 3 users (#7644)
* Upgrade guidance for geonetwork 3 users

- Includes mitigation guidance for removed services
- housekeeping: Address mkdocs warnings due to changing emoji library
- change maven build to use --strict flag to prevent mkdocs regressions

Signed-off-by: Jody Garnett <jody.garnett@gmail.com>

* Update docs/manual/docs/maintainer-guide/updating/index.md

* Update docs/manual/docs/maintainer-guide/updating/index.md

* Update docs/manual/docs/maintainer-guide/updating/index.md

* Update docs/manual/docs/maintainer-guide/updating/index.md

* Update docs/manual/docs/maintainer-guide/updating/index.md

* Update doc typos

---------

Signed-off-by: Jody Garnett <jody.garnett@gmail.com>
Co-authored-by: Jose García <josegar74@gmail.com>
jodygarnett and josegar74 committed Jan 22, 2024
1 parent ed14595 commit 79c8c7f
Showing 25 changed files with 496 additions and 264 deletions.
16 changes: 12 additions & 4 deletions docs/manual/docs/api/csw.md
@@ -1,11 +1,15 @@
# Catalog Service for the Web (CSW) {#csw-api}
# OGC Catalog Service (CSW) {#csw-api}

The CSW end point exposes the metadata records in your catalog in XML format using the OGC CSW protocol (version 2.0.2).

Two protocols are available:
Two Catalogue Service profiles are available:

- CSW: Provides the ability to search and publish metadata for data, services and related information.
- CSW-T: Provides an interface for creating, modifying and deleting catalog records via the CSW protocol.
- Catalogue Services for the Web (CSW): Provides the ability to search and publish metadata for data, services and related information.
- Catalogue Services for the Web Transaction (CSW-T): Provides additional operations for creating, modifying and deleting catalog records via the CSW protocol.

Reference:

* [Catalogue Service](https://www.ogc.org/standard/cat/) (OGC)
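
As a rough illustration of how a client might address the CSW end point, the sketch below builds key-value-pair GET URLs for two standard CSW 2.0.2 operations. The endpoint path `/srv/eng/csw`, the helper name `csw_url`, and the record identifier are assumptions for illustration and are not taken from this change; substitute the values for your own catalog.

```python
from urllib.parse import urlencode

# Assumption: CSW endpoint of a local catalog; substitute your own URL.
CSW_ENDPOINT = "http://localhost:8080/geonetwork/srv/eng/csw"


def csw_url(request, **extra):
    """Return a CSW 2.0.2 key-value-pair GET URL for the given operation."""
    params = {"service": "CSW", "version": "2.0.2", "request": request}
    params.update(extra)  # operation-specific parameters, e.g. id=...
    return CSW_ENDPOINT + "?" + urlencode(params)


print(csw_url("GetCapabilities"))
print(csw_url("GetRecordById", id="my-record-uuid"))  # hypothetical identifier
```

The URLs are only constructed here, not fetched, so the sketch runs without a catalog being available.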

## Configuration

@@ -69,3 +73,7 @@ Example of a request using an index field name:
```

The mapping between CSW standard queryable and the index fields are defined in **`web/src/main/webapp/WEB-INF/config-csw.xml`**.

## Upgrading from GeoNetwork 3.0 Guidance

Configuration of "Virtual CSW" end-points is replaced by [sub-portals](../administrator-guide/configuring-the-catalog/portal-configuration.md).
Binary file added docs/manual/docs/api/img/admin-console-api.png
Binary file added docs/manual/docs/api/img/geonetwork-api-html.png
Binary file added docs/manual/docs/api/img/geonetwork-api-test.png
Binary file not shown.
Binary file removed docs/manual/docs/api/img/opensearch.png
Binary file not shown.
11 changes: 5 additions & 6 deletions docs/manual/docs/api/index.md
@@ -14,12 +14,11 @@ The API guide describes entry points that can be used to interact with the catal

The OGC industry standard for searching and retrieving records in XML format. It can also be used to manage records with transaction operations.

GeoNetwork 3.12.x only:

- [OpenSearch and INSPIRE ATOM](opensearch.md)
- [RDF DCAT](rdf-dcat.md)
- [Open Archive Initiative](oai-pmh.md)

No longer supported:

- [INSPIRE ATOM](inspire_atom.md)
- [Open Archive Initiative](oai-pmh.md)
- [OpenSearch](opensearch.md)
- [Q Search](q-search.md)
- [RDF DCAT](rdf-dcat.md)
- [Z39-50](z39-50.md)
18 changes: 18 additions & 0 deletions docs/manual/docs/api/inspire_atom.md
@@ -0,0 +1,18 @@
## INSPIRE ATOM

The INSPIRE technical guideline for download services facilitates an option to set up a download service based on OpenSearch and Atom. A separate OpenSearch endpoint is created for every Atom-based download service.

!!! note

Only records based on the ISO19139 standard can be used with Atom. ISO19115-3 records are not supported.


A remote ATOM feed can be registered in a metadata record (see [Linking data using ATOM feeds](../user-guide/associating-resources/linking-online-resources.md#linking-data-using-atom-feed)), but the catalog can also create ATOM feeds from records describing datasets and services.

For a service metadata record, the corresponding ATOM feed is accessed at: `http://localhost:8080/geonetwork/srv/atom/describe/service?uuid=8b719ebd-646e-4963-b9e0-16b3c2a6d94e`. If the service is attached to one or more datasets (see [Linking a dataset with a service](../user-guide/associating-resources/linking-dataset-or-service.md)), then the feed will also expose each dataset as an `entry` in the feed. Check that the service type is set to `download` (if not, the dataset feed will return an exception).

The dataset feed is accessible at: `http://localhost:8080/geonetwork/srv/atom/describe/dataset?spatial_dataset_identifier_code=b795de68-726c-4bdf-a62a-a42686aa5b6f`. Links will be created for each online resource flagged with a `function` set to `download`.

Examples:

- <https://catalog.inspire.geoportail.lu/geonetwork>
25 changes: 10 additions & 15 deletions docs/manual/docs/api/oai-pmh.md
@@ -2,24 +2,19 @@

!!! warning

Not yet available in version 4.
Unavailable since version 4.0.0.

There is some interest in migrating OAI-PMH to the Elasticsearch engine used by GeoNetwork 4.
Interested parties are encouraged to contact the project team for direction on this topic.

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) standard exposes the metadata records in your catalog in an XML format defined by version 2.0 of the OAI-PMH protocol.

The OAI-PMH end point exposes the metadata records in your catalog in XML format using the version 2.0 of the OAI-PMH protocol.
Reference:

## Configuration
* [Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)](https://www.openarchives.org/OAI/openarchivesprotocol.html)

The following URL is the standard end point for the catalog (substitute your GeoNetwork URL): <http://localhost:8080/geonetwork/srv/eng/oaipmh>?
## Upgrading from GeoNetwork 3.0 Guidance

## Requests
The OAI-PMH API is no longer available.

Standard OAI-PMH requests can be done using the url above and the 6 verbs provided by the standard:

- GetRecord
- Identify
- ListIdentifiers
- ListMetadataFormats
- ListRecords
- ListSets

Please see <https://www.openarchives.org/OAI/openarchivesprotocol.html> for further details.
We recommend migrating to the [Catalog Service for the Web (CSW)](csw.md) API, which provides XML document access.
48 changes: 18 additions & 30 deletions docs/manual/docs/api/opensearch.md
@@ -1,44 +1,32 @@
# OpenSearch and INSPIRE ATOM {#opensearch-and-atom}
# OpenSearch

!!! warning

Not yet available in version 4.
Unavailable since version 4.0.0.

There is no known sponsor or interested party for implementing OpenSearch.

## OpenSearch
The OpenSearch API provides a service description advertised in the HTML.

The catalog provides an opensearch entry point at <http://localhost:8080/geonetwork/srv/eng/portal.opensearch>. This service is advertised in the HTML.
Browsers detect the availability of OpenSearch by checking the index page at the root of the (sub)domain. Setup required defining a rewrite rule that forwards requests to the GeoNetwork application.

![](img/opensearch.png)
Reference:

Browsers detect the availability of opensearch by checking the index page at the root of the (sub)domain. If you install geonetwork in a subfolder, consider to set up a rewrite rule forwarding the index request to the subfolder.
* [OpenSearch](https://www.ogc.org/standard/opensearch/) (Open Geospatial Consortium)

An example of such a rewrite rule in Apache:
## Upgrading from GeoNetwork 3.0 Guidance

``` text
RewriteEngine on
RewriteRule "^/$" "/geonetwork/" [R]
```
OpenSearch API is no longer available.

Verify in a browser if opensearch is detected by typing the url and then a space. The url bar should then give an indication that you're searching within the site.
* We recommend migrating to [GeoNetwork OpenAPI](the-geonetwork-api.md) if HTML discoverability is of primary importance.

This provides a self-describing service, and automation tools for developer access in different programming languages.
However, the result is specific to the GeoNetwork application and is not an industry standard for interoperability.

## INSPIRE ATOM
* We recommend migrating to [OpenGIS Web Catalogue Service (CSW)](csw.md) if standards compliance is of primary importance.

The INSPIRE technical guideline for download services facilitates an option to set up a download service based on OpenSearch and Atom. A separate OpenSearch endpoint is created for every Atom-based download service.

!!! note

Only records based on the ISO19139 standard can be used with Atom. ISO19115-3 records are not.


A remote ATOM feed can be registered in a metadata record (see [Linking data using ATOM feeds](../user-guide/associating-resources/linking-online-resources.md#linking-data-using-atom-feed)), but the catalog can also create ATOM feeds from records describing datasets and services.

For a service metadata record, the corresponding ATOM feed is accessed at: `http://localhost:8080/geonetwork/srv/atom/describe/service?uuid=8b719ebd-646e-4963-b9e0-16b3c2a6d94e`. If the service is attached to one or more datasets (see [Linking a dataset with a service](../user-guide/associating-resources/linking-dataset-or-service.md)), then the feed will also expose each dataset as an `entry` in the feed. Check that the service type is set to `download` (if not, the dataset feed will return an exception).

The dataset feed is accessible at: `http://localhost:8080/geonetwork/srv/atom/describe/dataset?spatial_dataset_identifier_code=b795de68-726c-4bdf-a62a-a42686aa5b6f`. Links will be created for each online resource flagged with a `function` set to `download`.

Examples:

- <https://catalog.inspire.geoportail.lu/geonetwork>
!!! note OGC API - Records

The OGC API - Records standard is not yet ready, but is expected to provide the best of both worlds: HTML discoverability and standards compliance.

Interested parties are encouraged to contribute towards this roadmap activity.
13 changes: 13 additions & 0 deletions docs/manual/docs/api/q-search.md
@@ -0,0 +1,13 @@
# Q Search {#q-search}

!!! warning

Unavailable since version 4.0.0.

The Q Search endpoint was built using the GeoNetwork 3.0 Lucene search engine and is no longer available.

## Upgrading from GeoNetwork 3.0 Guidance

The Q Search endpoint is replaced by the Elasticsearch ``/srv/api/search/records/_search`` endpoint.

GeoNetwork 3.0 scripts will need to be migrated to the Elasticsearch API, using POST requests in the Elasticsearch syntax.
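
As a hedged sketch of what such a migration could look like, the Python below rewrites a simple GeoNetwork 3 style `any=water&from=1&to=20` query as a POST request body in Elasticsearch syntax. The base URL, the helper name `build_search_request`, and the choice of a `query_string` query are assumptions for illustration; the right query type depends on what the original script searched for.

```python
import json
import urllib.request

# Assumption: base URL of a local catalog; substitute your own.
BASE_URL = "http://localhost:8080/geonetwork"


def build_search_request(text, start=0, size=20):
    """Build (but do not send) a POST request for the Elasticsearch endpoint."""
    body = {
        "from": start,  # zero-based offset (the Q Search examples started at from=1)
        "size": size,   # page size, replacing the Q Search `to` upper bound
        "query": {"query_string": {"query": text}},
    }
    return urllib.request.Request(
        BASE_URL + "/srv/api/search/records/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


request = build_search_request("water")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Depending on how the catalog is configured, additional headers or authentication may be needed.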
21 changes: 13 additions & 8 deletions docs/manual/docs/api/rdf-dcat.md
@@ -2,18 +2,23 @@

!!! warning

Not yet available in version 4.

Unavailable since version 4.0.0.

There is no known sponsor or interested party for implementing RDF DCAT.
Interested parties may contact the project team for guidance and to express their intent.

The RDF DCAT end point provides a way of getting information about the catalog, the datasets and services, and links to distributed resources in a machine-readable format. The formats of the output are based on DCAT, an RDF vocabulary that is designed to facilitate interoperability between web-based data catalogs.

## URLS
Reference:

* [Data Catalog Vocabulary (DCAT)](https://www.w3.org/TR/vocab-dcat-3/)

The following URLs are available (substitute your GeoNetwork URL):
## Upgrading from GeoNetwork 3.0 Guidance

- <http://localhost:8080/geonetwork/srv/eng/rdf.metadata.get?uuid=> : returns an RDF record for the given UUID
- <http://localhost:8080/geonetwork/srv/eng/rdf.search>?: returns a dcat:Catalog record. By default this will describe all the records in the catalog, but query filters are available (see below)
RDF DCAT API is no longer available.

## Query parameters
1. We recommend migrating to use of [Catalog Service for the Web (CSW)](csw.md) API to query and explore data.

- `_cat`: Metadata Category
2. When downloading using `GetRecord` make use of the `application/rdf+xml; charset=UTF-8` output format.

This will allow retrieving records in the same document format as previously provided by the RDF DCAT API.
110 changes: 5 additions & 105 deletions docs/manual/docs/api/search.md
@@ -1,109 +1,9 @@
# Search Service {#q-search}
# Search Service

## OpenAPI Search
## Elasticsearch

!!! note
GeoNetwork provides a full Elasticsearch end-point: ``/srv/api/search/records/_search``

GeoNetwork 4
Reference


The Q Search endpoint is replaced by the ``/srv/api/search/records/_search`` endpoint.

Parameter Reference:

- [Serach API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html) (Elasticsearch)

## Q Search

!!! note

GeoNetwork 3


The Q Search endpoint allows you to query the catalog programmatically. It is available in the local catalog at `http://localhost:8080/geonetwork/srv/eng/q` (otherwise substitute your catalog URL).

### Query results parameters

The following parameters can be appended to your request to format the results:

- `_content_type=json`: returns results in json format. If this parameter is not provided, then the results are returned in xml format.
- `sortBy`: sorts the results by different criteria (example: `sortBy=relevance`):
- `relevance` (default sorting method if not provided)
- `title` (metadata title)
- `changeDate` (metadata datestamp)
- `rating`
- `popularity`
- `denominatorDesc`
- `denominatorAsc`
- `sortOrder=reverse`: Used to sort alphabetically. Note this will sort in **ASCENDING** order (eg A - Z)
- `from`, `to`: Used to return a subset of the results, usually for pagination (example: `from=1&to=20`)
- `fast`: Used to indicate the information to return. Possible values:
- `index`: returns the metadata information from the Lucene index (a subset of the information). In most cases this is the best option as the retrieval of information from the Lucene index is very fast.

The fields returned are configured in the `dumpFields` section in <https://github.com/geonetwork/core-geonetwork/blob/master/web/src/main/webapp/WEB-INF/config-lucene.xml#L107>

- `false`: returns the raw (full) metadata. This is slower as it will retrieve every metadata attribute from the database. If this parameter is not provided, it returns a minimal set of information for each record: uuid, internal id, metadata schema, create/change dates
- `buildSummary`: Returns a summary element with search facets that can be used to filter the metadata, typically used to provide quick filters (facets) on the search results page. Values:
- `true` (default, if the parameter is not provided).
- `false`: does not return the summary.
- `summaryOnly`: Returns the summary (depending on the value of the parameter `buildSummary`) and results. Values:
- `0` (default, if the parameter is not provided).
- Any other value returns the summary only.
- `resultType`: type of summary to return. Summaries are configured in the `summaryTypes` section in <https://github.com/geonetwork/core-geonetwork/blob/master/web/src/main/webapp/WEB-INF/config-summary.xml#L132-L249>
- `hits` (default value if not provided), returns the fields configured in the `hits` section in <https://github.com/geonetwork/core-geonetwork/blob/master/web/src/main/webapp/WEB-INF/config-summary.xml#L185>
- `details` (recommended value to send), returns the fields configured in the `details` section in <https://github.com/geonetwork/core-geonetwork/blob/master/web/src/main/webapp/WEB-INF/config-summary.xml#L133>
- `extraDumpFields`: a comma-separated list of additional fields that you wish to return alongside the fields returned according to the resultType you have chosen. The wildcard character `*` can be used to match multiple fields. For example `extraDumpFields=mycustomfield*` would match mycustomfield1 and mycustomfield2.
- Other values in the summaries section are allowed

### Query filter parameters

You can search on any field(s) indexed in Lucene. For a complete reference see <https://github.com/geonetwork/core-geonetwork/blob/master/schemas/iso19139/src/main/plugin/iso19139/index-fields/default.xsl>

Note you can query the Lucene index graphically,using a Java-based graphical tool such as [Luke](https://github.com/DmitryKey/luke). Version [4.10.4](https://github.com/DmitryKey/luke/releases/tag/luke-4.10.4.1/) is required to work with the version of Lucene bundled with GeoNetwork. Download the jar file where you can access the GeoNetwork index files, then execute with:

`java -jar luke-with-deps.jar`

Then follow the instructions in the tool.

Most relevant fields:

- `any`: A special Lucene field that indexes all the text content in the metadata. Example: <http://localhost:8080/geonetwork/srv/eng/q?any=water&from=1&to=20&resultType=details&fast=index&_content_type=json>

There are some additional query fields, that use the content from the Lucene field `any`.

- `or`: extract the tokens of the query parameter to return the results that contain at least 1 of the tokens
- `without`: extract the tokens of the query parameter to return the results that don't contain any of the tokens.
- `phrase`: return the results that contain the exact text as provided in the search query parameter.
- `title`: metadata title.
- `abstract`: metadata abstract.
- `topicCat`: metadata topic categories.
- `keyword`: metadata keywords.
- `type`: hierarchyLevel (dataset, service, etc.)

If several tokens are included in the query, an AND query with all the tokens is executed. For example, `title=roads&topicCat=biota`. This query will return the results that contain roads in the title AND have the topic category biota.

An OR query of several fields can be executed using the format: `field1_OR_field2_OR_... =value`. For example, `title_OR_abstract=roads` returns the metadata that contain roads in the title OR the abstract.

Additionally an OR query of several values for a single field can be executed, if the Lucene configuration for that field allows it, with the following format: `field=value1 or value2 or ...` For example `topicCat=biota or farming`, returns the metadata where the topic category is either biota OR farming. If the query was executed as `topicCat=biota&topicCat=farming` then only the metadata with BOTH topic categories would be returned.

### Date Searches

There are a number of ways that you can search by date. Date searches should be of the form YYYY-MM-DD

- dateFrom/dateTo: uses the changeDate parameter in the index.
- creationDateFrom/To: uses the creation date.
- revisionDateFrom/To: uses the revision date.

### Query examples

Query with any field for metadata containing the string 'infrastructure', returning json, using the fast index to return results, and returning the fields configured in `config-summary.xml`:

<http://localhost:8080/geonetwork/srv/eng/q?any=infrastructure&_content_type=json&fast=index&from=1&resultType=details&sortBy=relevance&to=20>

Query datasets with title containing the string 'infrastructure', returning json, using the fast index to return results, returning the fields configured in `config-summary.xml` and returning only the first 20 results (ordered by relevance):

<http://localhost:8080/geonetwork/srv/eng/q?title=infrastructure&type=dataset&_content_type=json&fast=index&from=1&resultType=details&sortBy=relevance&to=20>

Query datasets with a revision date in June 2019 using the fast index to return results, returning the fields configured in `config-summary.xml` and returning only the first 20 results (ordered by relevance):

<http://localhost:8080/geonetwork/srv/eng/q?_content_type=json&revisionDateFrom=2019-06-01&revisionDateTo=2019-06-30&fast=index&from=1&resultType=details&sortBy=relevance&to=20>
- [Search API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html) (Elasticsearch)
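
Responses from this end-point follow the usual Elasticsearch search response layout, with matching documents listed under `hits.hits`. The sketch below, which assumes a hypothetical and heavily trimmed response, shows one way a script might pull the document identifiers out of such a response; the helper name `record_ids` and the sample data are illustrative only.

```python
import json

# Hypothetical, heavily trimmed response used only to illustrate the shape;
# real responses carry many more fields in "_source".
SAMPLE_RESPONSE = """
{"hits": {"total": {"value": 1},
          "hits": [{"_id": "example-uuid", "_source": {}}]}}
"""


def record_ids(response_text):
    """Return the `_id` of every hit in an Elasticsearch search response."""
    payload = json.loads(response_text)
    return [hit["_id"] for hit in payload["hits"]["hits"]]


print(record_ids(SAMPLE_RESPONSE))
```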