Use standalone mapper instead of the ORM one #286

marko-bekhta · 2024-06-19T13:58:10Z

fixes #265 (potentially)

While doing this I've noticed that quarkus can now log:

15:38:51 WARN  [i.ve.co.im.BlockedThreadChecker] Thread Thread[vert.x-worker-thread-1,5,main] has been blocked for 61353 ms, time limit is 60000 ms: io.vertx.core.VertxException: Thread blocked
        at java.base@17.0.9/jdk.internal.misc.Unsafe.park(Native Method)
        at java.base@17.0.9/java.util.concurrent.locks.LockSupport.park(LockSupport.java:211)
        at java.base@17.0.9/java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1864)

I haven't seen that before 🤔. And I've tried to run on main when calling /reindex this log also is logged.

Another thing related to the /reindex: after a live-reload something gets messed up resulting in:

java.lang.NullPointerException: Cannot invoke "java.util.Map.size()" because "m" is null
	at java.base/java.util.HashMap.putMapEntries(HashMap.java:495)
	at java.base/java.util.LinkedHashMap.<init>(LinkedHashMap.java:385)
	at io.quarkus.hibernate.search.standalone.elasticsearch.runtime.HibernateSearchStandaloneRecorder$1.apply(HibernateSearchStandaloneRecorder.java:110)
	at io.quarkus.hibernate.search.standalone.elasticsearch.runtime.HibernateSearchStandaloneRecorder$1.apply(HibernateSearchStandaloneRecorder.java:103)
	at org.hibernate.search.mapper.pojo.standalone.mapping.SearchMapping_A3Pwpzj6LZ1U3ryq007Z-Hp-28Q_Synthetic_Bean.createSynthetic(Unknown Source)
	at org.hibernate.search.mapper.pojo.standalone.mapping.SearchMapping_A3Pwpzj6LZ1U3ryq007Z-Hp-28Q_Synthetic_Bean.doCreate(Unknown Source)
	at org.hibernate.search.mapper.pojo.standalone.mapping.SearchMapping_A3Pwpzj6LZ1U3ryq007Z-Hp-28Q_Synthetic_Bean.create(Unknown Source)
	at org.hibernate.search.mapper.pojo.standalone.mapping.SearchMapping_A3Pwpzj6LZ1U3ryq007Z-Hp-28Q_Synthetic_Bean.create(Unknown Source)
	at io.quarkus.arc.impl.AbstractSharedContext.createInstanceHandle(AbstractSharedContext.java:119)
	at io.quarkus.arc.impl.AbstractSharedContext$1.get(AbstractSharedContext.java:38)
	at io.quarkus.arc.impl.AbstractSharedContext$1.get(AbstractSharedContext.java:35)
	at io.quarkus.arc.generator.Default_jakarta_enterprise_context_ApplicationScoped_ContextInstances.c29(Unknown Source)
	at io.quarkus.arc.generator.Default_jakarta_enterprise_context_ApplicationScoped_ContextInstances.computeIfAbsent(Unknown Source)
	at io.quarkus.arc.impl.AbstractSharedContext.get(AbstractSharedContext.java:35)
	at io.quarkus.arc.impl.ClientProxies.getApplicationScopedDelegate(ClientProxies.java:21)
	at org.hibernate.search.mapper.pojo.standalone.mapping.SearchMapping_A3Pwpzj6LZ1U3ryq007Z-Hp-28Q_Synthetic_ClientProxy.arc$delegate(Unknown Source)
	at org.hibernate.search.mapper.pojo.standalone.mapping.SearchMapping_A3Pwpzj6LZ1U3ryq007Z-Hp-28Q_Synthetic_ClientProxy.scope(Unknown Source)
	at io.quarkus.search.app.indexing.IndexingService.createIndexes(IndexingService.java:191)
	at io.quarkus.search.app.indexing.IndexingService.reindex(IndexingService.java:170)
	at io.quarkus.search.app.indexing.IndexingService_Subclass.reindex$$superforward(Unknown Source)
	at io.quarkus.search.app.indexing.IndexingService_Subclass$$function$$1.apply(Unknown Source)
	at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:73)
	at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:62)
	at io.quarkus.arc.impl.ActivateRequestContextInterceptor.invoke(ActivateRequestContextInterceptor.java:129)
	at io.quarkus.arc.impl.ActivateRequestContextInterceptor.aroundInvoke(ActivateRequestContextInterceptor.java:33)
	at io.quarkus.arc.impl.ActivateRequestContextInterceptor_Bean.intercept(Unknown Source)
	at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
	at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
	at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
	at io.quarkus.search.app.indexing.IndexingService_Subclass.reindex(Unknown Source)
	at io.quarkus.search.app.indexing.IndexingService.lambda$registerManagementRoutes$0(IndexingService.java:57)
	at io.vertx.ext.web.impl.BlockingHandlerDecorator.lambda$handle$0(BlockingHandlerDecorator.java:48)
	at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$1(ContextImpl.java:191)
	at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:279)
	at io.vertx.core.impl.ContextImpl.lambda$internalExecuteBlocking$2(ContextImpl.java:210)
	at io.vertx.core.impl.TaskQueue.run(TaskQueue.java:76)
	at org.jboss.threads.ContextHandler$1.runWith(ContextHandler.java:18)
	at org.jboss.threads.EnhancedQueueExecutor$Task.doRunWith(EnhancedQueueExecutor.java:2516)
	at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2495)
	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1521)
	at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:11)
	at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:11)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:840)
Resulted in: jakarta.enterprise.inject.CreationException: Error creating synthetic bean [A3Pwpzj6LZ1U3ryq007Z-Hp-28Q]: java.lang.NullPointerException: Cannot invoke "java.util.Map.size()" because "m" is null
	... 39 more
Resulted in: java.lang.IllegalStateException: Failed to create indexes: Error creating synthetic bean [A3Pwpzj6LZ1U3ryq007Z-Hp-28Q]: java.lang.NullPointerException: Cannot invoke "java.util.Map.size()" because "m" is null
	at io.quarkus.search.app.indexing.IndexingService.createIndexes(IndexingService.java:213)
	... 26 more

have you seen something like that before? maybe it's a known issue?

also sometimes I see:

14:02:07 ERROR [i.qu.ar.im.UncaughtExceptions] Error occurred while destroying instance of SYNTHETIC bean [types=[java.lang.Object, org.hibernate.search.mapper.pojo.standalone.mapping.SearchMapping], qualifiers=[@jakarta.enterprise.inject.Default, @Any], target=n/a]: java.lang.ClassCastException: class org.hibernate.search.mapper.pojo.standalone.mapping.impl.StandalonePojoMapping cannot be cast to class java.io.Closeable (org.hibernate.search.mapper.pojo.standalone.mapping.impl.StandalonePojoMapping is in unnamed module of loader io.quarkus.bootstrap.classloading.QuarkusClassLoader @483010b3; java.io.Closeable is in module java.base of loader 'bootstrap') [Error Occurred After Shutdown]

but I cannot reproduce this one reliably 🙈 😕

There's probably something obvious that I'm missing on why SearchSession is not exposed through CDI with standalone mapper
Loading strategy is not trivial 😃, that's me saying, who missed adding sink.complete(); and spending 15 minutes trying to understand why indexing never stops 😆.
- mass indexing isn't ideal for our case of a data Stream I wonder, whether we could make it simpler to index a stream, rather than pushing ids and then loading "entities"

yrodiere · 2024-06-19T14:33:32Z

have you seen something like that before? maybe it's a known issue?

Not known, we'll need to investigate. Please file a bug?

also sometimes I see:

14:02:07 ERROR [i.qu.ar.im.UncaughtExceptions] Error occurred while destroying instance of SYNTHETIC bean [types=[java.lang.Object, org.hibernate.search.mapper.pojo.standalone.mapping.SearchMapping], qualifiers=[@jakarta.enterprise.inject.Default, @Any], target=n/a]: java.lang.ClassCastException: class org.hibernate.search.mapper.pojo.standalone.mapping.impl.StandalonePojoMapping cannot be cast to class java.io.Closeable (org.hibernate.search.mapper.pojo.standalone.mapping.impl.StandalonePojoMapping is in unnamed module of loader io.quarkus.bootstrap.classloading.QuarkusClassLoader @483010b3; java.io.Closeable is in module java.base of loader 'bootstrap') [Error Occurred After Shutdown]

but I cannot reproduce this one reliably 🙈 😕

Same, please file a bug.

There's probably something obvious that I'm missing on why SearchSession is not exposed through CDI with standalone mapper

What would be the scope of such a CDI bean?

Anyway, you're supposed to create it like documented here: https://quarkus.io/version/main/guides/hibernate-search-standalone-elasticsearch#implementing-the-rest-service

Loading strategy is not trivial 😃, that's me saying, who missed adding sink.complete(); and spending 15 minutes trying to understand why indexing never stops 😆.

:s Any idea how we could make it easier?

mass indexing isn't ideal for our case of a data Stream I wonder, whether we could make it simpler to index a stream, rather than pushing ids and then loading "entities"

Yeah you'd need to hack something together. E.g. make the entities themselves the "IDs" and have the entity loader be a no-op. Maybe that's what you did... I'll have a look :)

src/main/java/io/quarkus/search/app/hibernate/GuideLoadingBinder.java

src/main/java/io/quarkus/search/app/indexing/IndexingService.java

marko-bekhta · 2024-06-19T16:24:41Z

ok 😃 bug reports opened

:s Any idea how we could make it easier?

Yeah, I mean, it is all in the docs and examples; it was just me: oh, I need to implement this interface? ok, IDE, tell me which methods are there, and I'll figure it out from there... instead of READING the doc first !!! 🤦🏻 😄.
I wonder if we could hide the sink and ask the user to return some context object like:

interface SomeMeaningfullNameHere {
   boolean noMoreIds();
   List<T> currentIds()
}
// or
interface SomeMeaningfullNameHere {
   Optional<List<T>> currentIds(); // and if optional is empty => no more ids... 
}

and thanks for taking a look at this one!!

yrodiere

LGTM, feel free to merge if it works :)

yrodiere · 2024-06-19T16:58:34Z

And thanks for working on this!

marko-bekhta · 2024-06-24T07:39:45Z

I've added a copy of a progress monitor; since we do not know the total ahead of time, the built-in one would log a negative % progress.

Hibernate Search isn't really causing the live reload issue (and there's also no live reload in prod). Closing the mapping one shouldn't be a problem, as we have a new container on each deployment.

So.. I think we can merge this one and see how it behaves throughout the week.

yrodiere · 2024-06-24T07:48:28Z

I've added a copy of a progress monitor; since we do not know the total ahead of time, the built-in one would log a negative % progress.

Maybe we should add support for "unkwown total size" in Hibernate Search, both in the ID loader and in the default logging monitor?

So.. I think we can merge this one and see how it behaves throughout the week.

+1, merging, thanks

marko-bekhta · 2024-06-24T08:02:19Z

Maybe we should add support for "unkwown total size" in Hibernate Search, both in the ID loader and in the default logging monitor?

created https://hibernate.atlassian.net/browse/HSEARCH-5180 🙈

yrodiere · 2024-06-25T07:35:06Z

After your last patch (#288), indexing succeeded, and it looks like it works now: https://search-quarkus-io-dev-search-quarkus-io.apps.ospo-osci.z3b1.p1.openshiftapps.com/api/guides/search?q=orm+elasticsearch
Shall we put in production?

BTW, I suspect this PR has lowered the memory footprint of the app significantly... It's hard to say for sure, since OpenShift resource limits and JVM heap size are different between prod and staging, but currently in prod we use 1.5GB after a GC, in staging we use 500MB (also after a GC). If we get even a fraction of this gain, that would be great, but we'll have to think of adjusting OpenShift limits accordingly to actually take advantage of it.

marko-bekhta · 2024-06-25T07:41:01Z

Let's give it a try while we have most of the week ahead of us 😃🤞🏻
I'll create a PR to prod branch 🙈

yrodiere · 2024-06-26T14:25:35Z

BTW, I suspect this PR has lowered the memory footprint of the app significantly... It's hard to say for sure, since OpenShift resource limits and JVM heap size are different between prod and staging, but currently in prod we use 1.5GB after a GC, in staging we use 500MB (also after a GC). If we get even a fraction of this gain, that would be great, but we'll have to think of adjusting OpenShift limits accordingly to actually take advantage of it.

Memory usage is down from 1.5GB to 1GB in prod... :D
Not that there's anything to be proud of (what were those 500MB???), but it's at least good news.

yrodiere reviewed Jun 19, 2024

View reviewed changes

marko-bekhta force-pushed the convert-to-standalone-mapper branch from 4ac1a87 to 8c98f1c Compare June 19, 2024 15:30

yrodiere approved these changes Jun 19, 2024

View reviewed changes

marko-bekhta added 2 commits June 24, 2024 08:54

Use standalone mapper instead of the ORM one

5dc5085

Bump the worker execute time from 60s to 180s

0381d0d

marko-bekhta force-pushed the convert-to-standalone-mapper branch from 8c98f1c to 0381d0d Compare June 24, 2024 06:54

marko-bekhta marked this pull request as ready for review June 24, 2024 06:54

Add a copy of PojoMassIndexingLoggingMonitor for stream data

ad5f18e

yrodiere merged commit f580f6e into quarkusio:main Jun 24, 2024
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use standalone mapper instead of the ORM one #286

Use standalone mapper instead of the ORM one #286

marko-bekhta commented Jun 19, 2024 •

edited

Loading

yrodiere commented Jun 19, 2024

marko-bekhta commented Jun 19, 2024

yrodiere left a comment

yrodiere commented Jun 19, 2024

marko-bekhta commented Jun 24, 2024

yrodiere commented Jun 24, 2024

marko-bekhta commented Jun 24, 2024

yrodiere commented Jun 25, 2024

marko-bekhta commented Jun 25, 2024

yrodiere commented Jun 26, 2024

Use standalone mapper instead of the ORM one #286

Use standalone mapper instead of the ORM one #286

Conversation

marko-bekhta commented Jun 19, 2024 • edited Loading

yrodiere commented Jun 19, 2024

marko-bekhta commented Jun 19, 2024

yrodiere left a comment

Choose a reason for hiding this comment

yrodiere commented Jun 19, 2024

marko-bekhta commented Jun 24, 2024

yrodiere commented Jun 24, 2024

marko-bekhta commented Jun 24, 2024

yrodiere commented Jun 25, 2024

marko-bekhta commented Jun 25, 2024

yrodiere commented Jun 26, 2024

marko-bekhta commented Jun 19, 2024 •

edited

Loading