Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add docs about process isolation in Node SDK #11378

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

mydea
Copy link
Member

@mydea mydea commented Sep 17, 2024

This adds docs for how to use withIsolationScope.

Additionally, it also adds a new section (for server SDKs) under "enriching events" describing how process isolation works. We lack docs for this as of now, and this is something users may def. fall into.

It would be ideal if we could somehow auto-fork this for middlewares, but in the meanwhile...

See getsentry/sentry-javascript#12191 (reply in thread) for how this affects users.

@mydea mydea requested review from lforst, Lms24, AbhiPrasad and a team September 17, 2024 07:33
@mydea mydea self-assigned this Sep 17, 2024
Copy link

vercel bot commented Sep 17, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
changelog ✅ Ready (Inspect) Visit Preview 💬 Add feedback Sep 18, 2024 9:22am
sentry-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Sep 18, 2024 9:22am
1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
develop-docs ⬜️ Ignored (Inspect) Visit Preview Sep 18, 2024 9:22am

Copy link

codecov bot commented Sep 17, 2024

Bundle Report

Changes will increase total bundle size by 537 bytes (0.0%) ⬆️. This is within the configured threshold ✅

Detailed changes
Bundle name Size Change
sentry-docs-server-cjs 7.43MB 546 bytes ⬆️
sentry-docs-edge-server-array-push 257.07kB 3 bytes ⬇️
sentry-docs-client-array-push 6.4MB 6 bytes ⬇️

Copy link
Member

@Lms24 Lms24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit concerned about the middleware thing tbh. I always thought the http instrumentation would create the root span and fork the isolation scope before any middleware is evaluated. After all, we have middleware child spans inside an http.server span 🤔

Docs-wise, I just had a couple of suggestions but good idea to more add more content for the isolation scope!

@@ -0,0 +1,47 @@
---
title: Process Isolation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m: I'm not sure if "Process" isolation is a good term for the heading. We worry about isolating certain jobs/tasks/requests within one process.

I'd rather call this page "Request Isolation" or more generally "Task Isolation". WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was also not 100% sure about this. My first instinct was also to go with Request Isolation but this does not really cover e.g. isolation background jobs 🤔 but maybe request isolation is still more understandable 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good to me, at least better than process isolation.

Maybe another angle to this: What about "Data Isolation"? I mean generally this page explains how to isolate data within a unit of work (job/task/request).


In server-side environments, the <PlatformLink to='/enriching-events/scopes'>isolation scope</PlatformLink> is automatically forked around request boundaries. This means that each request will have its own isolation scope, and data set on the isolation scope will only apply to events captured during that request. This is done automatically by the SDK.

However, the request isolation happens when the request itself is processed. This means that if you e.g. have a middleware where you want to set Sentry data (e.g. `Sentry.setUser()` in an auth middleware), you have to manually fork the isolation scope with `Sentry.withIsolationScope()` - see [Using withIsolationScope](../scopes/#using-withisolationscope).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not actionable for docs but
that's a massive limitation actually 😬 I was under the impression that the OTEL http instrumentation would fork the isolation scope automatically before any express middleware is evaluated. Is that not the case?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my app, this is the case with the "standard" OTel instrumentation packages. I can't speak to Sentry's internals, but it's very very strange that this doesn't just work out of the box.

Regardless of that: the language "when the request is processed" is, to me, imprecise and maybe even inaccurate. I consider middleware execution to very much be part of "request processing" - after all, I have middleware parsing request bodies, performing authorization checks, redirecting authenticated requests to the login page, and so on. My app has many requests that are completely handled by middleware!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so in otel you do not have this concept of scopes and capturing data on them for a request etc., so this problem does not really exist there, as far as I can tell.

We have the problem that the way that the auto-isolation works is actually forking the isolation scope after a http.server span is emitted. And this is of course only emitted once the request is actually "done" (or at least started), not when the middlewares run. I agree that this kind of sucks and we should see if we find a way to solve this better. I'll adjust the wording for this for now!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I did some more digging - actually this is not correct! Middlewares are actually correctly isolated in express, so I will remove this section again (the general content is still good).

@nwalters512 the problem in your app is not a fundamental problem with middlewares, but it happens because you added http instrumentation yourself, which overwrites ours (and this is what handles the auto isolation). See https://docs.sentry.io/platforms/javascript/guides/node/opentelemetry/custom-setup/#custom-http-instrumentation for details on this. Sentry relies on the http instrumentation for a bunch of things. It would be nice to find a way to warn about this, though 🤔

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm.. if we can't create an instance of of OTEL http instrumentation, how are we supposed to go about registering it with our existing OTEL setup?

You're suggesting:

// do not import/create a new HttpInstrumentation instance
// import { HttpInstrumentation } from '@opentelemetry/instrumentation-http';
const sentryHttpIntegration = Sentry.httpIntegration();

const sentryClient = Sentry.init({
    dsn: process.env.SENTRY_DSN,
    integrations: (defaultIntegration) =>
        defaultIntegration.map((integration) => (integration.name === 'Http' ? sentryHttpIntegration : integration)),
    release,
    environment,
    skipOpenTelemetrySetup: true,
    tracesSampleRate: 1.0,
});

if that's the case what we're unable to register the instrumentation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to add the http instrumentation yourself at all - just do not run registerInstrumentations({ instrumentation: [new HttpInstrumentation()] }), this happens automatically under the hood. Does that lead to problems/does not work as expected for you, with your custom setup? 🤔 We could expose our HttpInstrumentation and allow you to add this yourself, I guess, if that's necessary!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated my demo sentry + OTEL setup as follows:

import * as Sentry from '@sentry/node';
import { SentryPropagator, SentrySampler } from '@sentry/opentelemetry';
// import { registerInstrumentations } from '@opentelemetry/instrumentation';
// import { ExpressInstrumentation } from '@opentelemetry/instrumentation-express';
// import { HttpInstrumentation } from '@opentelemetry/instrumentation-http';
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';

const release = 'sentry-scope-testing';
const environment = 'local';

const sentryClient = Sentry.init({
    dsn: process.env.SENTRY_DSN,
    beforeSend: (event, hint, ...args) => {
        const { contexts, exception, extra, tags, message, user } = event;
        console.dir(
            {
                whoami: 'sentry:beforeSend',
                event: { contexts, exception, extra, tags, message, user },
                hint,
                args,
            },
            { depth: null }
        );
        return event;
    },
    release,
    environment,
    skipOpenTelemetrySetup: true,
    tracesSampleRate: 1.0,
});

const provider = new NodeTracerProvider({
    sampler: new SentrySampler(sentryClient!),
});

provider.addSpanProcessor(
    // mock processor to avoid polluting console output
    new BatchSpanProcessor({
        export: (_spans, cb) => cb({ code: 0 }),
        shutdown: () => Promise.resolve(),
        forceFlush: () => Promise.resolve(),
    })
);

provider.register({
    propagator: new SentryPropagator(),
    contextManager: new Sentry.SentryContextManager(),
});

// registerInstrumentations({
//     instrumentations: [new ExpressInstrumentation(), new HttpInstrumentation()],
// });

Sentry.validateOpenTelemetrySetup();

export const initializedSentry = Sentry;

Then for the express middleware, i removed forking of the isolation scope to see if Middleware is correctly isolated in express:

const auth: RequestHandler = (req, res, next) => {
    const authUser = Users.find((u) => u.id === req.headers['authorization']);
    // Sentry.withIsolationScope(() => {
    if (!authUser) {
        Sentry.setTag('Authenticated', false);
        Sentry.setTag('UserID', null);
        Sentry.setUser(null);
        next(new Error('Authentication Error'));
    } else {
        Sentry.setTag('Authenticated', true);
        Sentry.setTag('UserID', authUser.id);
        Sentry.setUser(authUser);
        res.locals.authUser = authUser;
        next();
    }
    // });
};

However, my output reverted to showing the issue:

{
  whoami: 'sentry:beforeSend',
  event: { ... },
    exception: { ... },
    extra: {
      expectedUser: { id: '4', email: 'foo4@example.com', name: 'foo example4' }
    },
    tags: { Authenticated: true, UserID: '1' },
    message: undefined,
    user: { id: '1', email: 'foo@example.com', name: 'foo example' }
  },
  hint: { ... },
  args: []
}

Copy link

@sbriceland sbriceland Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore, switching my "mock exporter" to the ConsoleSpanExporter does not print an OTEL spans

provider.addSpanProcessor(
    new BatchSpanProcessor(new ConsoleSpanExporter())
    // mock processor to avoid polluting console output
    // new BatchSpanProcessor({
    //     export: (_spans, cb) => cb({ code: 0 }),
    //     shutdown: () => Promise.resolve(),
    //     forceFlush: () => Promise.resolve(),
    // })
);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened a PR here: sbriceland/sentry-debug#1 where we can discuss this more!

Comment on lines 26 to 47
This is also necessary if you want to isolate a non-request process, e.g. a background job.

The following example shows how you can use `withIsolationScope` to attach a user and a tag in an auth middleware:

```javascript
const auth = (req, res, next) => {
Sentry.withIsolationScope(() => {
const authUser = findUserForHeader(req.headers["authorization"]);
if (!authUser) {
Sentry.setTag("Authenticated", false);
Sentry.setUser(null);
next(new Error("Authentication Error"));
} else {
Sentry.setTag("Authenticated", true);
Sentry.setUser(authUser);
next();
}
});
};
```

This way, the user & tag will only be attached to events captured during the request that passed the auth middleware.
Copy link
Member

@Lms24 Lms24 Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

l: I'd move the sentence about bg jobs after the user/middleware example for consistency

Suggested change
This is also necessary if you want to isolate a non-request process, e.g. a background job.
The following example shows how you can use `withIsolationScope` to attach a user and a tag in an auth middleware:
```javascript
const auth = (req, res, next) => {
Sentry.withIsolationScope(() => {
const authUser = findUserForHeader(req.headers["authorization"]);
if (!authUser) {
Sentry.setTag("Authenticated", false);
Sentry.setUser(null);
next(new Error("Authentication Error"));
} else {
Sentry.setTag("Authenticated", true);
Sentry.setUser(authUser);
next();
}
});
};
```
This way, the user & tag will only be attached to events captured during the request that passed the auth middleware.
The following example shows how you can use `withIsolationScope` to attach a user and a tag in an auth middleware:
\```javascript
const auth = (req, res, next) => {
Sentry.withIsolationScope(() => {
const authUser = findUserForHeader(req.headers["authorization"]);
if (!authUser) {
Sentry.setTag("Authenticated", false);
Sentry.setUser(null);
next(new Error("Authentication Error"));
} else {
Sentry.setTag("Authenticated", true);
Sentry.setUser(authUser);
next();
}
});
};
\```
This way, the user & tag will only be attached to events captured during the request that passed the auth middleware.
Manually calling `withIsolationScope` is also necessary if you want to isolate a non-request task, like background jobs or queue task.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argh the nested ``` lines destroy the diff. I just moved the bg job sentence to the bottom and slightly adjusted it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I do not want to bury this at the very bottom because it is actually a very important aspect that we don't really cover anywhere 🤔 like, if you have a background job or similar you will definitely always have to do this manually 🤔

@sbriceland
Copy link

Thanks for taking the time to improve the docs! I just wanted to add my two cents in case it may help some folks down the road.

From the current documentation:
image

This portion of the existing docs is what lead me to believe that I could use the Sentry.setXXX methods such that when a Sentry.captureXXX is called it would auto-magically have the appropriate scoped data.

While this is an accurate description of Isolation Scope behavior, I believe it would be helpful to call out when you are inside of an isolation scope and a reference to your updated docs here. This way developers have a heads up that Isolation Scope isn't guaranteed e.g. when inside of middleware, unless the isolation scope is manually forked/setup.

@sbriceland
Copy link

It would be ideal if we could somehow auto-fork this for middlewares, but in the meanwhile...

On that point, either the @sentry/node lib could include a helper method like the following. Or since it's simple enough, the docs could let people know something like this should work for Express

import * as Sentry from '@sentry/node';
import type { RequestHandler } from 'express';

export const setupExpressIsolationScopeMiddleware: RequestHandler = (_req, _res, next) => {
    Sentry.withIsolationScope(() => {
        next();
    });
};

Set up isolation scope pre request for Express middleware:

const app: Application = express();

// before routes are added to app
app.use(setupExpressIsolationScopeMiddleware);

@mydea
Copy link
Member Author

mydea commented Sep 18, 2024

After some more digging, I figured out that actually middlewares do work as expected, the problem was a different one - see #11378 (comment).

However, this PR is still good to have because this content was missing, which was an oversight anyhow!

Copy link
Member

@Lms24 Lms24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the changes! LGTM now!

@@ -0,0 +1,47 @@
---
title: Process Isolation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good to me, at least better than process isolation.

Maybe another angle to this: What about "Data Isolation"? I mean generally this page explains how to isolate data within a unit of work (job/task/request).

…ation/index.mdx

Co-authored-by: Lukas Stracke <lukas.stracke@sentry.io>
mydea added a commit to getsentry/sentry-javascript that referenced this pull request Sep 18, 2024
With this change, it is easier to figure out from logs if the correct or
incorrect http instrumentation is added.

Now, if you see e.g. this in the logs, if users have enabled logs
(`debug: true` if not using `skipOpenTelemetrySetup: true`, else using
native OTEL debug logs with e.g. `diag.setLogger(new
DiagConsoleLogger(), DiagLogLevel.DEBUG)`):

```js
@opentelemetry/instrumentation-http-sentry Applying instrumentation patch for nodejs core module on require hook { module: 'http' }
@opentelemetry/instrumentation-http Applying instrumentation patch for nodejs core module on require hook { module: 'http' }
```

you can tell that that it has been double instrumenting this
incorrectly. You should never see the
`@opentelemetry/instrumentation-http` entry anymore, otherwise something
is wrong there.

This came out of getsentry/sentry-docs#11378, I
looked into various ways to debug this but there is not really an API
provided by OTEL that allows us to figure this out 😬
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants