Fiware - data loss with orion-ld and a stress test #1650

Open
truijllo opened this issue Jul 29, 2024 · 3 comments
@truijllo

During a series of tests we realised that we have a problem with notifications generated by subscriptions. In particular, some changes that Orion receives and applies correctly to the entities monitored by the subscriptions do not generate notifications to the downstream systems.
To identify the problem, the test environment was reduced to:

  • orion-ld (both versions 1.5.1 and 1.6.0)
  • mongodb (both versions 3.3 and 4.4)
  • a fake http server that receives notifications
  • Load tests from github/fiware

The services run on the same machine in Docker containers.
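
For readers who want to reproduce the setup, a fake notification receiver can be as small as the sketch below. It is not the server used in these tests; the class name, helper names and the `/notify` path are assumptions made for illustration. It only counts incoming POST requests so the total can be compared with the number of updates sent.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class NotificationCounter(BaseHTTPRequestHandler):
    """Counts every POST it receives; stands in for the downstream system."""

    received = 0
    lock = threading.Lock()

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # drain the request body
        with NotificationCounter.lock:
            NotificationCounter.received += 1
        self.send_response(204)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # silence per-request logging


def start_receiver(port=0):
    """Start the receiver in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), NotificationCounter)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_once(port, body=b"{}"):
    """Send one test POST to the receiver (hypothetical /notify path)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("POST", "/notify", body, {"Content-Type": "application/json"})
    status = conn.getresponse().status
    conn.close()
    return status
```

After a test run, `NotificationCounter.received` holds the number of notifications that actually arrived.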

During the test, 100 entities are created with a single subscription covering all of them, and 5 updates are sent at 1 s intervals. The test is run for both LD and V2. In both cases, a significant number of notifications are not sent to the fake server.

Analysing the logs with debug and trace enabled, we found that notifications are not sent when this condition occurs:

[1318]:addTriggeredSubscriptions_withCache | msg=Subscription:         66964843ed9b9fa7aa4d3bc8
[1319]:addTriggeredSubscriptions_withCache | msg=NOW:                  1721124935.260880
[1320]:addTriggeredSubscriptions_withCache | msg=lastNotificationTime: 1721124935.265951
[1321]:addTriggeredSubscriptions_withCache | msg=DIFF:                 -0.005070
[1322]:addTriggeredSubscriptions_withCache | msg=throttling:           0.000000
[1323]:addTriggeredSubscriptions_withCache | msg=lastSuccess:          1721124935.265951
[1324]:addTriggeredSubscriptions_withCache | msg=lastFailure:          0.000000
[1329]:addTriggeredSubscriptions_withCache | msg=No notification due to throttling (last: 1721124935.260880 vs now: 1721124935.265951)

throttling is set to 0.
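
The values in this log are consistent with a check of the form `now - lastNotificationTime < throttling`, where a concurrent request has already recorded a slightly later `lastNotificationTime`. This is an assumption about the broker's logic, not a quote from the Orion-LD source, and the function name below is hypothetical. A minimal sketch of how such a check would behave with the logged values:

```python
def is_throttled(now, last_notification_time, throttling):
    """Hypothetical check: suppress the notification if fewer than
    `throttling` seconds have elapsed since the last one."""
    diff = now - last_notification_time
    return diff < throttling


# Values taken from the log above.
now = 1721124935.260880
last_notification_time = 1721124935.265951

# DIFF is negative, so with throttling = 0 the notification is suppressed.
print(is_throttled(now, last_notification_time, 0.0))   # True

# A negative throttling value is never reached by a small negative DIFF.
print(is_throttled(now, last_notification_time, -1.0))  # False
```

Under this assumption, a negative DIFF caused by concurrent updates is enough to trip a throttling value of 0.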

By enabling the -experimental flag on the same configuration, no notifications are lost on the LD side, but obviously the problem persists on the V2 side, which does not benefit from the flag changes.
Another test was done using fiware/orion instead of orion-ld and the V2 side showed no problems.

The issue appears even with relatively small numbers, such as 20 records submitted and only 14 notified to the fake server (6 lost), which makes us think we are doing something wrong that we cannot identify.

We need both APIs, V2 and LD. This issue is pushing us to split the work across two different services, which seems like overkill.

With the test described above (100 entities, a single subscription for all of them, and 5 updates at 1 s intervals), the fake HTTP server should receive all the changes.

When pushing a JSON file sequentially with the Apache benchmarking tool ("ab"), everything works fine.
With concurrent ingestion (e.g. 20 inputs, 10 requests at a time), I observe only about 14 records notified to the fake server.
Other tests were run with other tools (including https://github.com/FIWARE/load-tests).

Does anyone know what the problem could be or how to fix it?

@thinkingmik

thinkingmik commented Aug 8, 2024

I have the same problem. I'm using orion-ld:1.6.0 and mongodb:4.4.

In my case there is an IoT Agent UL that creates/updates entities in the orion-ld context. It sends about 70 calls at once to Orion.
The orion-ld context is updated successfully and I see all 70 entities created/updated, but the related entity-type subscription does not send all the notifications to the external service. For example, sometimes 57 notifications are sent, sometimes 37, 48, etc. (it's random).

@kzangeli
Collaborator

kzangeli commented Sep 4, 2024

Try setting throttling to -1.
In NGSIv2 that means it is ignored.
In NGSI-LD throttling is ignored if it is zero or less.
Sorry about the inconsistency here between the two APIs.
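
As an illustration of this suggestion, the sketch below builds subscription payloads for both APIs with `throttling` set to -1. The receiver URL, entity type and function names are placeholders, not values from this thread, and the payloads are only constructed, not sent. The NGSI-LD payload assumes the request is posted as `application/json` with the default core context.

```python
import json

NOTIFY_URL = "http://fake-server:8080/notify"  # hypothetical receiver address
ENTITY_TYPE = "Device"                         # hypothetical entity type


def ngsiv2_subscription(throttling=-1):
    """Body for POST /v2/subscriptions (NGSIv2)."""
    return {
        "description": "Notify on every change",
        "subject": {"entities": [{"idPattern": ".*", "type": ENTITY_TYPE}]},
        "notification": {"http": {"url": NOTIFY_URL}},
        "throttling": throttling,
    }


def ngsild_subscription(throttling=-1):
    """Body for POST /ngsi-ld/v1/subscriptions (NGSI-LD)."""
    return {
        "type": "Subscription",
        "entities": [{"type": ENTITY_TYPE}],
        "notification": {
            "endpoint": {"uri": NOTIFY_URL, "accept": "application/json"}
        },
        "throttling": throttling,
    }


if __name__ == "__main__":
    print(json.dumps(ngsiv2_subscription(), indent=2))
    print(json.dumps(ngsild_subscription(), indent=2))
```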

@kzangeli kzangeli self-assigned this Sep 4, 2024
@truijllo
Author

truijllo commented Sep 9, 2024

It seems to work. I used -1 for throttling in both APIs and, in a testing environment, I get as many entries as I push in.
Thanks a lot!
