Skip to content

Commit

Permalink
Fix occasional failed imports due to race conditions
Browse files Browse the repository at this point in the history
Fixes kaitai-io/kaitai_struct#951

Until now, KSC had occasional problems with resolving imported top-level types, reporting `unable to find type '<imported>'` errors from time to time (see kaitai-io/kaitai_struct#951 for more details and reproduction instructions). It turned out that the importing code ran on multiple threads that were concurrently modifying/reading shared non-thread-safe `HashMap`s without any synchronization, which resulted in race conditions.

I thought it would be best to eliminate the race conditions by switching to some thread-safe counterpart to `mutable.HashMap`. After some googling, it turned out that there are two viable options, [`scala.collection.concurrent.TrieMap`](https://www.scala-lang.org/api/2.13.13/scala/collection/concurrent/TrieMap.html) and [`java.util.concurrent.ConcurrentHashMap`](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html).

However, `TrieMap` doesn't seem to be implemented for Scala.js (scala-js/scala-js#4866) and it is `final`, so we cannot use it as a base class of `ClassSpecs` instead of `mutable.HashMap` that we're using now. `ConcurrentHashMap` (despite being in the `java.util.concurrent.*` package, suggesting that it would only be available on JVM) surprisingly appears to be available in Scala.js (scala-js/scala-js#1487), so in theory we could use it even in `shared/src/...` code, but I haven't figured out how to make `ClassSpecs` extend from it. It must be added that since Scala 2.13, the inheritance from `HashMap` was deprecated, so we'll have to find another way, but for now it works.

The unsynchronized hash map accesses in the `JavaClassSpecs.cached()` are in JVM-specific code (`jvm/src/...` folder), so I've just changed the private `relFiles` and `absFiles` fields to use `ConcurrentHashMap`, thus resolving the race conditions here. (`TrieMap` would work too - there's probably no practical difference for our use case.)

For the `LoadImports.loadImport()` method, I've just wrapped the accesses to shared `ClassSpecs` in a manual `synchronized` block. According to some on the internet, using `synchronized` is kind of discouraged due to being error-prone in favor of using existing thread-safe types or immutable types (Scala generally seems to prefer immutable types, but I can't imagine how they could be used in this case), but I believe it's the easiest way to solve the problem here.
  • Loading branch information
GreyCat authored Mar 9, 2024
2 parents d071323 + 1a71310 commit 0c1bcdd
Show file tree
Hide file tree
Showing 3 changed files with 29 additions and 4 deletions.
13 changes: 11 additions & 2 deletions jvm/src/main/scala/io/kaitai/struct/formats/JavaClassSpecs.scala
Original file line number Diff line number Diff line change
Expand Up @@ -8,15 +8,24 @@ import java.io.{File, FileNotFoundException}
import scala.collection.mutable
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.collection.concurrent
import java.util.concurrent.ConcurrentHashMap
import scala.jdk.CollectionConverters._

/**
* Java implementation of ClassSpec container, doing imports from local files.
*/
class JavaClassSpecs(relPath: String, absPaths: Seq[String], firstSpec: ClassSpec)
extends ClassSpecs(firstSpec) {

private val relFiles = mutable.Map[String, ClassSpec]()
private val absFiles = mutable.Map[String, ClassSpec]()
// We're using thread-safe `ConcurrentHashMap` for `relFiles` and `absFiles`,
// because these hash maps may be mutated concurrently by multiple threads in
// `JavaClassSpecs.cached()`. Using a non-thread-safe hash map here could
// occasionally cause `cacheMap.get(name)` in `JavaClassSpecs.cached()` to
// fail internally and throw an `ArrayIndexOutOfBoundsException`, see
// https://github.com/kaitai-io/kaitai_struct/issues/951
private val relFiles: concurrent.Map[String, ClassSpec] = new ConcurrentHashMap[String, ClassSpec]().asScala
private val absFiles: concurrent.Map[String, ClassSpec] = new ConcurrentHashMap[String, ClassSpec]().asScala

override def importRelative(name: String, path: List[String], inFile: Option[String]): Future[Option[ClassSpec]] = Future {
Log.importOps.info(() => s".. importing relative $name")
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -76,8 +76,23 @@ class LoadImports(specs: ClassSpecs) {
//
// In theory, duplicate imports shouldn't be returned at all by
// import* methods due to caching, but we won't rely on it here.
if (!specs.contains(specName)) {
specs(specName) = spec
//
// The `synchronized` block is necessary because this code is run
// concurrently by multiple threads (each resolving different imports)
// and `specs` is a shared non-thread-safe `HashMap`. Without this
// synchronization, a few imports were occasionally missing from
// `specs` due to a race condition, and even (though rarely) the
// implementation of `specs.contains()` could fail internally with an
// `ArrayIndexOutOfBoundsException`. For more details, see
// https://github.com/kaitai-io/kaitai_struct/issues/951
val isNewSpec = specs.synchronized {
val isNew = !specs.contains(specName)
if (isNew) {
specs(specName) = spec
}
isNew
}
if (isNewSpec) {
processClass(spec, ImportPath.updateWorkDir(workDir, impPath))
} else {
Log.importOps.warn(() => s"... we have that already, ignoring")
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,7 @@ class ResolveTypes(specs: ClassSpecs, topClass: ClassSpec, opaqueTypes: Boolean)
(Some(ClassSpec.opaquePlaceholder(typeName)), None)
} else {
// Opaque types are disabled => that is an error
Log.typeResolve.info(() => " => ??? (opaque type are disabled => error)")
(None, Some(TypeNotFoundErr(typeName, curClass, path)))
}
case Some(x) =>
Expand Down

0 comments on commit 0c1bcdd

Please sign in to comment.