forked from elastic/kibana
Commit
[Alerting] retry internal OCC calls within alertsClient
During development of elastic#75553, some issues came up with the optimistic concurrency control (OCC) we were using internally within the alertsClient, via the `version` option/property of the saved object. The referenced PR updates new fields in the alert from the taskManager task after the alertType executor runs. Some alertsClient methods, which are invoked via user requests, use OCC when updating the alert. As a result, version conflict errors sometimes came up when task manager updated the alert in the middle of one of these methods. Note: the SIEM functional test cases stress test this REALLY well.

In this PR, we wrap all the methods using OCC with a function that retries them a small number of times, with a short delay in between. If the original method STILL has a conflict error, the error is thrown once the retry limit is reached. In practice, this eliminated the version conflict errors that were occurring with the SIEM tests once we started updating the saved object in the executor.

For cases where we know only attributes not contributing to AAD are being updated, a new function is provided that does a partial update on just those attributes, making partial updates for those attributes a bit safer. That function will also be used by PR elastic#75553.
Showing 5 changed files with 250 additions and 33 deletions.
@@ -0,0 +1,59 @@

```ts
/*
 * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
 * or more contributor license agreements. Licensed under the Elastic License;
 * you may not use this file except in compliance with the Elastic License.
 */

// This module provides a helper to perform retries on a function if the
// function ends up throwing a SavedObject 409 conflict. This can happen
// when alert SO's are updated in the background, and will avoid having to
// have the caller make explicit conflict checks, where the conflict was
// caused by a background update.

import { random } from 'lodash';
import { Logger, SavedObjectsErrorHelpers } from '../../../../../src/core/server';

type RetryableForConflicts<T> = () => Promise<T>;

// number of times to retry when conflicts occur
const RetryForConflictsAttempts = 5;

// milliseconds to wait before retrying when conflicts occur
const RetryForConflictsDelayMin = 100;
const RetryForConflictsDelayMax = 250;

// retry an operation if it runs into 409 Conflict's, up to a limit
export async function retryIfConflicts<T>(
  logger: Logger,
  name: string,
  operation: RetryableForConflicts<T>,
  retries: number = RetryForConflictsAttempts
): Promise<T> {
  let error: Error;

  // run the operation, return if no errors or throw if not a conflict error
  try {
    return await operation();
  } catch (err) {
    error = err;
    if (!SavedObjectsErrorHelpers.isConflictError(err)) {
      throw err;
    }
  }

  // must be a conflict; if no retries left, log and throw it
  if (retries <= 0) {
    logger.error(`alertClient ${name} conflict, exceeded retries`);
    throw error;
  }

  // delay a bit before retrying
  logger.warn(`alertClient ${name} conflict, retrying ...`);
  await waitBeforeNextRetry();
  return await retryIfConflicts(logger, name, operation, retries - 1);
}

// wait a random number of milliseconds within the configured delay range
async function waitBeforeNextRetry(): Promise<void> {
  const millis = random(RetryForConflictsDelayMin, RetryForConflictsDelayMax);
  await new Promise((resolve) => setTimeout(resolve, millis));
}
```
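The retry helper above can be exercised without Kibana. The following is a minimal sketch under stated assumptions, not the alertsClient code: `ConflictError`, `isConflict`, `retrySketch`, the in-memory `store`, and `enableWithOcc` are hypothetical stand-ins for the saved-object 409 error, `SavedObjectsErrorHelpers.isConflictError`, `retryIfConflicts`, and an OCC-guarded alertsClient method. The delay is shortened so the example runs quickly.

```typescript
// Hypothetical stand-in for a saved-object 409 conflict error (not a Kibana API).
class ConflictError extends Error {
  readonly statusCode = 409;
}

const isConflict = (err: unknown): boolean => err instanceof ConflictError;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Same shape as the helper above: retry only on conflicts, up to `retries` times.
async function retrySketch<T>(operation: () => Promise<T>, retries: number = 5): Promise<T> {
  try {
    return await operation();
  } catch (err) {
    // non-conflict errors, and conflicts with no retries left, go to the caller
    if (!isConflict(err) || retries <= 0) throw err;
  }
  await sleep(1 + Math.floor(Math.random() * 5)); // short randomized delay
  return retrySketch(operation, retries - 1);
}

// Hypothetical OCC-guarded store: an update fails if the version it read is stale.
const store = { version: 1, enabled: false };

async function enableWithOcc(attemptLog: number[]): Promise<number> {
  const readVersion = store.version;
  attemptLog.push(readVersion);
  // simulate a background writer bumping the version during the first two attempts
  if (attemptLog.length <= 2) store.version += 1;
  if (readVersion !== store.version) throw new ConflictError('version conflict');
  store.enabled = true;
  store.version += 1;
  return store.version;
}

async function demo(): Promise<{ attempts: number[]; finalVersion: number }> {
  const attempts: number[] = [];
  const finalVersion = await retrySketch(() => enableWithOcc(attempts));
  return { attempts, finalVersion };
}
```

In this sketch the operation re-reads the current version on every attempt. That is why the whole method is retried rather than only the final update call: a retry that reused the stale version would conflict again.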
55 changes: 55 additions & 0 deletions
x-pack/plugins/alerts/server/saved_objects/partially_update_alert.ts
@@ -0,0 +1,55 @@

```ts
/*
 * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
 * or more contributor license agreements. Licensed under the Elastic License;
 * you may not use this file except in compliance with the Elastic License.
 */

import { pick } from 'lodash';
import { RawAlert } from '../types';

import {
  SavedObjectsClientContract,
  ISavedObjectsRepository,
  SavedObjectsErrorHelpers,
  SavedObjectsUpdateOptions,
} from '../../../../../src/core/server';

import { AlertAttributesExcludedFromAAD, AlertAttributesExcludedFromAADType } from './index';

type PartiallyUpdateableAlertAttributes = Pick<RawAlert, AlertAttributesExcludedFromAADType>;

interface PartiallyUpdateAlertSavedObjectOptions {
  version?: string;
  ignore404?: boolean;
  namespace?: string; // only should be used with ISavedObjectsRepository
}

type SavedObjectClient = SavedObjectsClientContract | ISavedObjectsRepository;

// direct, partial update to an alert saved object via scoped SavedObjectsClient
// using namespace set in the client
export async function partiallyUpdateAlert(
  savedObjectsClient: SavedObjectClient,
  id: string,
  attributes: PartiallyUpdateableAlertAttributes,
  options?: PartiallyUpdateAlertSavedObjectOptions
): Promise<void | ReturnType<SavedObjectsClientContract['update']>> {
  const attributeUpdates = pick(attributes, AlertAttributesExcludedFromAAD);

  const updateOptions: SavedObjectsUpdateOptions = {};
  if (options?.namespace) {
    updateOptions.namespace = options.namespace;
  }
  if (options?.version) {
    updateOptions.version = options.version;
  }

  try {
    return await savedObjectsClient.update<RawAlert>('alert', id, attributeUpdates, updateOptions);
  } catch (err) {
    if (options?.ignore404 && SavedObjectsErrorHelpers.isNotFoundError(err)) {
      return;
    }
    throw err;
  }
}
```
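The partial-update function above can be illustrated with a small sketch that runs without Kibana. It is only an approximation under stated assumptions: the contents of `EXCLUDED_FROM_AAD` are chosen for illustration and are not taken from this commit, and `AlertAttrs`, `FakeClient`, `NotFoundError`, `pickExcluded`, and `partialUpdateSketch` are hypothetical stand-ins for `RawAlert`, the saved objects client, the not-found error check, lodash `pick`, and `partiallyUpdateAlert`.

```typescript
// Assumed for illustration only: which attributes are excluded from AAD.
const EXCLUDED_FROM_AAD = ['muteAll', 'mutedInstanceIds'] as const;
type ExcludedKey = typeof EXCLUDED_FROM_AAD[number];

interface AlertAttrs {
  name: string;
  muteAll: boolean;
  mutedInstanceIds: string[];
}

// Hypothetical stand-in for a saved-object 404 error.
class NotFoundError extends Error {}

// Hypothetical in-memory client; it only mimics the merge semantics of an update.
class FakeClient {
  private docs: Map<string, AlertAttrs>;
  constructor(docs: Map<string, AlertAttrs>) {
    this.docs = docs;
  }
  async update(id: string, attrs: Partial<AlertAttrs>): Promise<AlertAttrs> {
    const doc = this.docs.get(id);
    if (!doc) throw new NotFoundError(id);
    const next = { ...doc, ...attrs };
    this.docs.set(id, next);
    return next;
  }
}

// Keep only allow-listed keys, like lodash `pick` with a fixed key list.
function pickExcluded(attrs: Partial<AlertAttrs>): Partial<Pick<AlertAttrs, ExcludedKey>> {
  const out: Partial<Pick<AlertAttrs, ExcludedKey>> = {};
  for (const key of EXCLUDED_FROM_AAD) {
    if (key in attrs) Object.assign(out, { [key]: attrs[key] });
  }
  return out;
}

async function partialUpdateSketch(
  client: FakeClient,
  id: string,
  attrs: Partial<AlertAttrs>,
  options?: { ignore404?: boolean }
): Promise<AlertAttrs | undefined> {
  try {
    return await client.update(id, pickExcluded(attrs));
  } catch (err) {
    // optionally swallow "not found", mirroring the ignore404 option
    if (options?.ignore404 && err instanceof NotFoundError) return undefined;
    throw err;
  }
}

async function demoPartialUpdate(): Promise<[AlertAttrs | undefined, AlertAttrs | undefined]> {
  const client = new FakeClient(
    new Map<string, AlertAttrs>([['1', { name: 'cpu', muteAll: false, mutedInstanceIds: [] }]])
  );
  // `name` is not on the allow-list in this sketch, so it is dropped before the update
  const updated = await partialUpdateSketch(client, '1', { muteAll: true, name: 'renamed' });
  const missing = await partialUpdateSketch(client, 'nope', { muteAll: true }, { ignore404: true });
  return [updated, missing];
}
```

The sketch filters at runtime as well as through the type, so a caller that passes extra attributes cannot change fields outside the allow-list.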