feat: integrate cloud event for webhook
Signed-off-by: chlins <chenyuzh@vmware.com>
chlins committed Feb 9, 2023
1 parent 413a43e commit 0356872
Showing 1 changed file with 303 additions and 0 deletions.
303 changes: 303 additions & 0 deletions proposals/new/cloudevent.md
@@ -0,0 +1,303 @@
# Proposal: Support Cloud Event For Webhook

Author: ChenYu Zhang/[chlins](https://github.com/chlins)

## Abstract

This proposal aims to enhance the webhook feature in Harbor. The major changes are a refactor of the webhook code and support for the CloudEvents format in webhook payloads.

## Background

CloudEvents is a specification for describing event data in a common way, and it is now hosted by the Cloud Native Computing Foundation (CNCF). Harbor already supports sending webhook notifications to remote endpoints when events happen, but the payload format is currently fixed internally: users cannot modify the request body without changing the source code and recompiling the component images. Because CloudEvents is the CNCF community specification for events, Harbor should integrate with it.

## Goals

- Refactor the webhook codebase to migrate the job processing to the common task framework.
- Add the integration of Cloud Event.

## Non-Goals

- Retain all webhook job histories (only the last job for every event type is migrated).
- Implement all combinations of the CloudEvents specification (only one standard format, JSON, is provided).

## Implementation

### Frontend

TBD

### Backend

#### Refactor webhook job

In a previous version, Harbor unified the schedule/task framework. Other job vendors such as replication, tag retention, scan, and garbage collection have all migrated to this framework, but the webhook job is still a legacy one. Migrating it gives better lifecycle management and debugging capabilities, such as querying trigger histories and job logs from the Harbor UI or API. The migration can be summarized in the following steps.

##### 1. Create webhook jobs through the manager/controller provided by the task package

##### 2. Update the job implementations of WEBHOOK and SLACK, and print more useful logs for debugging

##### 3. Introduce new APIs in the unified job style for operating on webhook jobs

*List the executions for a specific policy*

```rest
GET /api/v2.0/projects/{project_name_or_id}/webhook/policies/{policy_id}/executions
```

Response

```json
[
  {
    "end_time": "2023-01-19T07:06:14Z",
    "id": 46,
    "metrics": {
      "success_task_count": 1,
      "task_count": 1
    },
    "start_time": "2023-01-19T07:06:12Z",
    "status": "Success",
    "trigger": "EVENT",
    "vendor_id": 2,
    "vendor_type": "WEBHOOK"
  },
  {
    "end_time": "2023-01-19T07:05:46Z",
    "id": 45,
    "metrics": {
      "success_task_count": 1,
      "task_count": 1
    },
    "start_time": "2023-01-19T07:05:44Z",
    "status": "Success",
    "trigger": "EVENT",
    "vendor_id": 2,
    "vendor_type": "WEBHOOK"
  }
]
```
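
As a rough illustration of how a client might consume this response, the sketch below decodes the JSON array into Go structs. The `Execution` type and the `ParseExecutions` helper are hypothetical names introduced here; only the field names come from the sample response, and the real API may return additional fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Execution mirrors the fields shown in the sample response.
// The type name is an assumption made for this sketch.
type Execution struct {
	ID         int64     `json:"id"`
	VendorType string    `json:"vendor_type"`
	VendorID   int64     `json:"vendor_id"`
	Status     string    `json:"status"`
	Trigger    string    `json:"trigger"`
	StartTime  time.Time `json:"start_time"`
	EndTime    time.Time `json:"end_time"`
	Metrics    struct {
		TaskCount        int `json:"task_count"`
		SuccessTaskCount int `json:"success_task_count"`
	} `json:"metrics"`
}

// ParseExecutions decodes the JSON array returned by the executions endpoint.
func ParseExecutions(body []byte) ([]Execution, error) {
	var execs []Execution
	if err := json.Unmarshal(body, &execs); err != nil {
		return nil, fmt.Errorf("decode executions: %w", err)
	}
	return execs, nil
}

func main() {
	sample := []byte(`[{"id":46,"status":"Success","trigger":"EVENT","vendor_id":2,
		"vendor_type":"WEBHOOK","start_time":"2023-01-19T07:06:12Z",
		"end_time":"2023-01-19T07:06:14Z",
		"metrics":{"success_task_count":1,"task_count":1}}]`)

	execs, err := ParseExecutions(sample)
	if err != nil {
		panic(err)
	}
	for _, e := range execs {
		// Report each execution with its wall-clock duration.
		fmt.Printf("execution %d: %s in %s\n", e.ID, e.Status, e.EndTime.Sub(e.StartTime))
	}
}
```

The RFC 3339 timestamps decode directly into `time.Time`, so durations can be computed without extra parsing.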

*List the tasks for a specific execution*

```rest
GET /api/v2.0/projects/{project_name_or_id}/webhook/policies/{policy_id}/executions/{execution_id}/tasks
```

Response

```json
[
  {
    "creation_time": "2023-01-19T07:06:12Z",
    "end_time": "2023-01-19T07:06:14Z",
    "execution_id": 46,
    "id": 46,
    "run_count": 1,
    "start_time": "2023-01-19T07:06:12Z",
    "status": "Success",
    "update_time": "2023-01-19T07:06:14Z"
  }
]
```

*Get the log of a specific webhook task*

```rest
GET /api/v2.0/projects/{project_name_or_id}/webhook/policies/{policy_id}/executions/{execution_id}/tasks/{task_id}/log
```

Response

```text
2023-01-31T09:12:42Z [INFO] [/jobservice/job/impl/notification/webhook_job.go:88]: start to run webhook job.
2023-01-31T09:12:42Z [INFO] [/jobservice/job/impl/notification/webhook_job.go:103]: request body:
{"type":"DELETE_ARTIFACT","occur_at":1675156360,"operator":"admin","event_data":{"resources":[{"digest":"sha256:f271e74b17ced29b915d351685fd4644785c6d1559dd1f2d4189a5e851ef753a","tag":"latest","resource_url":"192.168.8.107/library/alpine:latest"}],"repository":{"date_created":1675155445,"name":"alpine","namespace":"library","repo_full_name":"library/alpine","repo_type":"public"}}}
2023-01-31T09:12:44Z [INFO] [/jobservice/job/impl/notification/webhook_job.go:118]: receive response, status code: 200.
2023-01-31T09:12:44Z [INFO] [/jobservice/job/impl/notification/webhook_job.go:125]: send webhook successfully.
```

##### 4. Adjust the legacy API handler logic to stay compatible with the old API

The following legacy APIs are not widely used, so we will mark them as deprecated in the Swagger definition and remove them in a future release.

```rest
# This endpoint returns webhook jobs of a project.
GET /api/v2.0/projects/{project_name_or_id}/webhook/jobs
```

```rest
# This endpoint returns last trigger information of project webhook policy.
GET /api/v2.0/projects/{project_name_or_id}/webhook/lasttrigger
```

##### 5. Migrate the old job rows to the unified tables in the database

*NOTICE: We may need to pay attention to performance, because the old table has never been cleaned up.*

Previous notification jobs are stored in the table `notification_job`, which has never been cleaned up, so we cannot restore all old rows into the `execution` and `task` tables. Instead, we migrate only the last job for every event type of each webhook policy. The following SQL performs the migration.

```sql
DO $$
DECLARE
    unique_job RECORD;
    job RECORD;
    vendor_type varchar(32);
    status varchar(32);
    status_code integer;
    execid integer;
BEGIN
    FOR unique_job IN select distinct policy_id,event_type from notification_job
    LOOP
        select * into job from notification_job where policy_id=unique_job.policy_id and event_type=unique_job.event_type order by creation_time desc limit 1;
        /* convert vendor type */
        if job.notify_type = 'http' then
            vendor_type = 'WEBHOOK';
        elsif job.notify_type = 'slack' then
            vendor_type = 'SLACK';
        else
            vendor_type = 'WEBHOOK';
        end if;
        /* convert status */
        if job.status = 'pending' then
            status = 'Pending';
            status_code = 0;
        elsif job.status = 'scheduled' then
            status = 'Scheduled';
            status_code = 1;
        elsif job.status = 'running' then
            status = 'Running';
            status_code = 2;
        elsif job.status = 'stopped' then
            status = 'Stopped';
            status_code = 3;
        elsif job.status = 'error' then
            status = 'Error';
            status_code = 3;
        elsif job.status = 'finished' then
            status = 'Success';
            status_code = 3;
        else
            status = '';
            status_code = 0;
        end if;

        insert into execution (vendor_type,vendor_id,status,trigger,start_time,end_time,update_time) values (vendor_type,job.policy_id,status,'EVENT',job.creation_time,job.update_time,job.update_time) returning id into execid;

        insert into task (execution_id,job_id,status,status_code,run_count,extra_attrs,creation_time,start_time,update_time,end_time,vendor_type) values (execid,job.job_uuid,status,status_code,1,to_json(job.job_detail),job.creation_time,job.update_time,job.update_time,job.update_time,vendor_type);
    END LOOP;
END $$;
```
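
The conversion rules embedded in the script can be easier to review as a lookup table. The following Go sketch restates the same vendor-type and status mapping; `MapStatus`, `MapVendorType`, and `MappedStatus` are hypothetical names used only for illustration, and the values are copied from the SQL above rather than from Harbor's code.

```go
package main

import "fmt"

// MappedStatus is the (status, status_code) pair written to the task table.
type MappedStatus struct {
	Status string
	Code   int
}

// statusMap restates the if/elsif chain of the migration script.
var statusMap = map[string]MappedStatus{
	"pending":   {"Pending", 0},
	"scheduled": {"Scheduled", 1},
	"running":   {"Running", 2},
	"stopped":   {"Stopped", 3},
	"error":     {"Error", 3},
	"finished":  {"Success", 3},
}

// MapStatus converts a legacy notification_job status.
// Unknown values fall back to an empty status with code 0, as in the script.
func MapStatus(legacy string) MappedStatus {
	if m, ok := statusMap[legacy]; ok {
		return m
	}
	return MappedStatus{Status: "", Code: 0}
}

// MapVendorType converts notify_type; anything other than "slack"
// becomes WEBHOOK, matching the script's else branch.
func MapVendorType(notifyType string) string {
	if notifyType == "slack" {
		return "SLACK"
	}
	return "WEBHOOK"
}

func main() {
	for _, s := range []string{"pending", "finished", "unknown"} {
		m := MapStatus(s)
		fmt.Printf("%q -> status=%q code=%d\n", s, m.Status, m.Code)
	}
	fmt.Println(MapVendorType("http"), MapVendorType("slack"))
}
```

Note that the three terminal states (`stopped`, `error`, `finished`) share status code 3 and differ only in the status string.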

The migration script can only process data in the database; runtime jobs should be managed or cleaned up through the `Jobservice Dashboard`.

##### 6. Drop the table `notification_job`

#### Integrate Cloud Event

> CloudEvents is a specification for describing event data in common formats to provide interoperability across services, platforms and systems.

Define a new event data format for Harbor webhooks by following the [spec](https://github.com/cloudevents/spec/blob/v1.0.2/cloudevents/spec.md).

| Attribute | [Type](https://github.com/cloudevents/spec/blob/v1.0.2/cloudevents/spec.md#type-system) | Description | REQUIRED/OPTIONAL |
| --- | --- | --- | --- |
| id | String | Identifies the event | REQUIRED |
| source | URI-reference | Identifies the context in which an event happened | REQUIRED |
| specversion | String | The version of the CloudEvents specification which the event uses | REQUIRED |
| type | String | This attribute contains a value describing the type of event related to the originating occurrence | REQUIRED |
| datacontenttype | String | Content type of data value | OPTIONAL |
| dataschema | URI | Identifies the schema that data adheres to | OPTIONAL |
| data | Data | The event payload | OPTIONAL |
| subject | String | This describes the subject of the event in the context of the event producer (identified by source) | OPTIONAL |
| time | Timestamp | Timestamp of when the occurrence happened | OPTIONAL |

##### Event Type Mapping

| Original Type | Cloud Event Type |
| --- | --- |
| DELETE_ARTIFACT | io.goharbor.artifact.deleted |
| PULL_ARTIFACT | io.goharbor.artifact.pulled |
| PUSH_ARTIFACT | io.goharbor.artifact.pushed |
| DELETE_CHART | io.goharbor.chart.deleted |
| DOWNLOAD_CHART | io.goharbor.chart.downloaded |
| UPLOAD_CHART | io.goharbor.chart.uploaded |
| QUOTA_EXCEED | io.goharbor.quota.exceeded |
| QUOTA_WARNING | io.goharbor.quota.warned |
| REPLICATION | io.goharbor.replication |
| SCANNING_FAILED | io.goharbor.scan.failed |
| SCANNING_COMPLETED | io.goharbor.scan.completed |
| SCANNING_STOPPED | io.goharbor.scan.stopped |
| TAG_RETENTION | io.goharbor.tag_retention.finished |

##### Interface

Define an interface to handle the formatting of event data.

```go
type Formatter interface {
	// Format converts the payload into the required format.
	Format(payload *Payload) ([]byte, error)
}
```

Implement a JSON format driver for the original payload format and a CloudEvents format driver for the new cloud event format.

```go
type JsonFormatter struct{}

func (*JsonFormatter) Format(payload *Payload) ([]byte, error) {
	// do something...

	data, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}

	return data, nil
}
```

```go
type CloudEventFormatter struct{}

func (*CloudEventFormatter) Format(payload *Payload) ([]byte, error) {
	// do something...

	// cloudevent.Marshal stands in for the CloudEvents encoding step.
	data, err := cloudevent.Marshal(payload)
	if err != nil {
		return nil, err
	}

	return data, nil
}
```

##### Example

Push Artifact

```json
{
  "specversion": "1.0",
  "type": "io.goharbor.artifact.pushed",
  "source": "/projects/1/webhook",
  "id": "e18c74f8-188e-47ee-861a-bfcd81c3509b",
  "time": "2020-04-05T17:31:00Z",
  "operator": "harbor-jobservice",
  "datacontenttype": "application/json",
  "data": "{\"resources\":[{\"digest\":\"sha256:f271e74b17ced29b915d351685fd4644785c6d1559dd1f2d4189a5e851ef753a\",\"tag\":\"latest\",\"resource_url\":\"demo.goharbor.io\/library\/alpine:latest\"}],\"repository\":{\"date_created\":1675155445,\"name\":\"alpine\",\"namespace\":\"library\",\"repo_full_name\":\"library\/alpine\",\"repo_type\":\"public\"}}"
}
```
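
Because `data` is carried as a JSON-encoded string in this example, a receiver has to decode twice: once for the envelope and once for the string inside `data`. The sketch below shows one way a webhook endpoint might do that and check the REQUIRED attributes; the `Event`, `ArtifactData`, and `DecodeArtifactEvent` names are hypothetical, and only the JSON field names are taken from the example.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Event holds the envelope attributes used by this sketch.
type Event struct {
	SpecVersion string `json:"specversion"`
	ID          string `json:"id"`
	Source      string `json:"source"`
	Type        string `json:"type"`
	Operator    string `json:"operator"`
	Data        string `json:"data"` // JSON document encoded as a string
}

// ArtifactData covers part of the artifact event payload.
type ArtifactData struct {
	Resources []struct {
		Digest      string `json:"digest"`
		Tag         string `json:"tag"`
		ResourceURL string `json:"resource_url"`
	} `json:"resources"`
	Repository struct {
		Name         string `json:"name"`
		Namespace    string `json:"namespace"`
		RepoFullName string `json:"repo_full_name"`
	} `json:"repository"`
}

// DecodeArtifactEvent parses the envelope, verifies the REQUIRED
// attributes, then parses the JSON string stored in "data".
func DecodeArtifactEvent(body []byte) (*Event, *ArtifactData, error) {
	var ev Event
	if err := json.Unmarshal(body, &ev); err != nil {
		return nil, nil, fmt.Errorf("decode envelope: %w", err)
	}
	if ev.ID == "" || ev.Source == "" || ev.SpecVersion == "" || ev.Type == "" {
		return nil, nil, errors.New("missing REQUIRED CloudEvents attribute")
	}
	var data ArtifactData
	if err := json.Unmarshal([]byte(ev.Data), &data); err != nil {
		return nil, nil, fmt.Errorf("decode data: %w", err)
	}
	return &ev, &data, nil
}

func main() {
	body := []byte(`{
		"specversion": "1.0",
		"type": "io.goharbor.artifact.pushed",
		"source": "/projects/1/webhook",
		"id": "e18c74f8-188e-47ee-861a-bfcd81c3509b",
		"datacontenttype": "application/json",
		"data": "{\"resources\":[{\"tag\":\"latest\",\"resource_url\":\"demo.goharbor.io\/library\/alpine:latest\"}],\"repository\":{\"name\":\"alpine\",\"namespace\":\"library\",\"repo_full_name\":\"library\/alpine\"}}"
	}`)

	ev, data, err := DecodeArtifactEvent(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(ev.Type, data.Repository.RepoFullName, data.Resources[0].ResourceURL)
}
```

The escaped `\/` sequences in the payload are valid JSON and decode to plain slashes.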

Replication

```json
{
  "specversion": "1.0",
  "type": "io.goharbor.replication.finished",
  "source": "/projects/1/webhook",
  "id": "e8bae503-d320-4c9d-b189-912dc182e6e0",
  "time": "2020-04-05T17:31:00Z",
  "operator": "admin",
  "datacontenttype": "application/json",
  "data": "{\"replication\":{\"harbor_hostname\":\"demo.goharbor.io\",\"job_status\":\"Success\",\"artifact_type\":\"image\",\"authentication_type\":\"basic\",\"override_mode\":true,\"trigger_type\":\"MANUAL\",\"policy_creator\":\"admin\",\"execution_timestamp\":1675304619,\"src_resource\":{\"registry_name\":\"dockerhub\",\"registry_type\":\"docker-hub\",\"endpoint\":\"https:\/\/hub.docker.com\",\"namespace\":\"library\"},\"dest_resource\":{\"registry_type\":\"harbor\",\"endpoint\":\"http:\/\/demo.goharbor.io\",\"namespace\":\"library\"},\"successful_artifact\":[{\"type\":\"image\",\"status\":\"Success\",\"name_tag\":\"alpine [1 item(s) in total]\"}]}}"
}
```
