Skip to content

Aeron Archive

Martin Thompson edited this page Jan 21, 2020 · 34 revisions

The Aeron Archive service can record and replay messages streams from durable/persistent storage. The service is designed to be high performance and tests show that its major limitation is the performance of the storage for the recordings. With sufficiently fast storage the archive can record or replay a stream of message at full 10GigE line rate.

Samples can be found here and systems tests here.

Archive Service

To start the Archive service you launch the ArchivingMediaDriver which includes the media driver by composition. This can run standalone or be embedded in another service. The configuration options can be found in the Archive.

The service offers these main features:

  • Record: service can record a particular subscription, described by <channel, streamId>. Each resulting image for the subscription will be recorded under a new recordingId. Local network publications are recorded using the spy feature for efficiency. If no subscribers are active then the recording can advance the stream by setting the aeron.spies.simulate.connection system property to true.

  • Extend: service can extend an existing recording by appending.

  • Replay: service can replay a recorded recordingId from a particular position, and for a particular length which can be Aeron.NULL_VALUE for an open ended replay.

  • Query: the catalog for existing recordings and the recorded position of an active recording.

  • Truncate: allows a stopped recording to have its length truncated, and if truncated to the start position then it is effectively deleted.

  • Replay Merge: allows a late joining subscriber of a recorded stream to replay a recording and then merge with the live stream for cut over if the consumer is fast enough to keep up.

  • Replicate: allows for the replication of a recording from one archive to another, plus have the option to merge with a live multicast stream and record it after using the source archive to catchup. When using replication it is necessary to configure the replication channel for the destination archive with aeron.archive.replication.channel.

  • Purge & Restore: allows for history of a long running recording, which is likely to keep extending, to have the oldest history purged, to save disk space, and restored if required later. This works at the granularity of recording segments which are a multiple of term buffers. The oldest segments can be detached from a recording to allow for compression or copying elsewhere, they can also be optionally deleted via the API, or a combination of detach and delete can be achieved with the purge operation. Segments can be copied back into place and then attached. Alternatively, a backed up stream to another archive can be replayed and recorded for the range required and then migrated to the beginning of the recording to restore them.

  • Checksums: allows computing checksums such as CRC32 and CRC32C while recording and verifying them during replay.

The Archive service can be run in one of three threading modes:

  • ArchiveThreadingMode.DEDICATED: 3 threads are used. One for each of the conductor responding to control signals and queries, one for recording streams, and one for replaying streams.
  • ArchiveThreadingMode.SHARED: 1 thread is used for all the control, recording, and replay.
  • ArchiveThreadingMode.INVOKER: No threads are used in the archive and the duty cycle is driven externally by calling the AgentInvoker.invoke() method on the Archive.invoker() object. Each call to invoke with perform one duty cycle of the Archive.

The overall threading usage of the Archive is a combination for the ArchviveThreadingMode and the ThreadingMode for the composed MediaDriver.

Archive Client

The Archive Client can communicate with an Archive using the control protocol and receive events on the progress of recording via the recording events stream. To support dynamic subscribers for the recording events stream then its channel can be multicast or MDC (Multi-Destination-Cast).

Samples using an archive can be found here.

Recording Durability

An archive can be instructed to record streams, i.e. <channel, streamId> pairs. These streams are recorded with the file sync level the archive has been launched with. Progress is reported on the recording events stream.

  • aeron.archive.file.sync.level=0: for normal writes to the OS page cache for background writing to disk.
  • aeron.archive.file.sync.level=1: for forcing the dirty data pages to disk.
  • aeron.archive.file.sync.level=2: for forcing the dirty data pages and file metadata to disk.

When setting file sync level greater than zero it is also important to sync the archive catalog with the aeron.archive.catalog.file.sync.level to the same value.

Recordings will be assigned a recordingId and a full description of the stream is captured in the Archive Catalog. The Catalog chronicles the contents of an archive as RecordingDescriptors which can be queried.

The progress of active recordings can be tracked using AeronStat to view the rec-pos counter for each stream.

Recording When No Subscribers Are Connected

It is possible to record a stream, over a UDP based channel, when no subscribers have connected by enabling a property so that spy subscriptions can simulate a connection on the media driver via either of:

  • aeron.spies.simulate.connection system property to true.
  • MediaDriver.Context.spiesSimulateConnection(boolean) to true.

Spy subscriptions are very efficient when used to record a local outbound network publication without requiring a local subscription to receive the outbound data stream on a inbound connection. IPC based channels will be connected when the archive starts to record regardless of if this property is set or not.

Querying the Archive Catalog

The contents for Archive can be queried by listing the RecordingDescriptors. This can be a simple paging through all descriptors in the catalog, or the query can be filtered by <channel, streamId> to reduce the result set.

The SBE message format for the descriptors is as follows:

    <sbe:message name="RecordingDescriptor"
                 id="22"
                 description="Describes a recording in the catalog">
        <field name="controlSessionId"   id="1"  type="int64"/>
        <field name="correlationId"      id="2"  type="int64"/>
        <field name="recordingId"        id="3"  type="int64"/>
        <field name="startTimestamp"     id="4"  type="time_t"/>
        <field name="stopTimestamp"      id="5"  type="time_t"/>
        <field name="startPosition"      id="6"  type="int64"/>
        <field name="stopPosition"       id="7"  type="int64"/>
        <field name="initialTermId"      id="8"  type="int32"/>
        <field name="segmentFileLength"  id="9"  type="int32"/>
        <field name="termBufferLength"   id="10" type="int32"/>
        <field name="mtuLength"          id="11" type="int32"/>
        <field name="sessionId"          id="12" type="int32"/>
        <field name="streamId"           id="13" type="int32"/>
        <data  name="strippedChannel"    id="14" type="varAsciiEncoding"/>
        <data  name="originalChannel"    id="15" type="varAsciiEncoding"/>
        <data  name="sourceIdentity"     id="16" type="varAsciiEncoding"/>
    </sbe:message>

Note: A stopPosition of -1 means a live recording is in progress.

Replaying Streams

A replay of a recorded stream can be requested by recordingId. It is possible to replay a recording that has stopped or one that is currently in progress. Replays are request from a position and for a length. The replay session id returned from the replay request will be the sessionId of the replayed Aeron stream in the lower 32-bits which can be obtained with a downcast to an int. The full 64-bit return value for the replay session is required to stop the replay early.

If the requested replay is for a recording that is currently in progress and the length goes beyond the current recorded position then replay will track the live recording. This can be useful for a subscriber that wants to track a live stream but does not want to participate in flow control of the transient stream if it make slow the others down.

Replays of live streams should sent to a different <channel, streamId> pairing to avoid congestion.

Checksums

Archive can compute checksums while recording data and verify them during replay. Both features are enabled separately via the properties aeron.archive.record.checksum (for record) and aeron.archive.replay.checksum (for replay) or the corresponding configuration API (i.e. io.aeron.archive.Archive.Context#recordChecksum(io.aeron.archive.checksum.Checksum) and io.aeron.archive.Archive.Context#replayChecksum(io.aeron.archive.checksum.Checksum) respectively). The value to either of the properties should be a fully qualified class name that implements the io.aeron.archive.checksum.Checksum interface.

Note: Enabling checksums for replay should use the same io.aeron.archive.checksum.Checksum implementation as for the recording, otherwise replay will fail with the io.aeron.archive.client.ArchiveException.

Out of the box Aeron provides two implementations of the io.aeron.archive.checksum.Checksum interface:

  • io.aeron.archive.checksum.Crc32 - implements CRC32 checksum algorithm. Available always.
  • io.aeron.archive.checksum.Crc32c - implements CRC32C checksum algorithm. Available on JDK 9+. It is an error to configure it for JDK 8, i.e. java.lang.IllegalStateException will be thrown in this case. This has better performance.

It is also possible to access these implementations programmatically via the io.aeron.archive.checksum.Checksums class.

A custom implementation of the io.aeron.archive.checksum.Checksum interface can be provided but it should (ideally) be stateless or at the very least thread safe since Aeron will use a single instance for all invocations.

Besides being computed and checked during normal archive operations the checksums can be checked and/or re-computed by using the io.aeron.archive.ArchiveTool (CLI and API), i.e.:

  • verify command accepts an optional -checksum className parameter. When specified the recorded checksums in the segment files will be verified using the specified io.aeron.archive.checksum.Checksum implementation. If a mismatch is detected the corresponding recording is marked as invalid.
  • checksum command allows computing and overwriting checksum information in the segment files using the specified io.aeron.archive.checksum.Checksum implementation.

For more information please refer to the JavaDoc of the io.aeron.archive.ArchiveTool class.

Performance impact of using Checksums

When checksums are enabled they have a non-negligible negative performance effect. Although the exact impact depends on lots of factors such as the OS, file system, JDK version, checksum algorithm and the message length.