Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(sn): reuse buffer for ReplicateRequest unmarshaling #808

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ijsong
Copy link
Member

@ijsong ijsong commented Jun 8, 2024

What this PR does

Improve unmarshaling performance by reusing buffers for ReplicateRequest in the
backup replica.

The protobuf message github.com/kakao/varlog/proto/snpb.(ReplicateRequest) has
two slice fields—LLSN ([]uint64) and Data ([][]byte). The backup replica
receives replicated log entries from the primary replica via the gRPC service
github.com/kakao/varlog/proto/snpb.(ReplicatorServer).Replicate, which sends
ReplicateRequest messages.

Upon receiving a ReplicateRequest, the backup replica unmarshals the message,
which involves growing slices for fields such as LLSN and Data. This growth
causes copy overhead whenever the slice capacities need to expand.

To address this, we introduce a new method, ResetReuse, for reusing slices
instead of resetting them completely. The ResetReuse method shrinks the slice
lengths while preserving their capacities, thus avoiding the overhead of
reallocating memory.

Example implementation:

type Message struct {
    Buffer []byte
    // Other fields
}

func (m *Message) Reset() {
    *m = Message{}
}

func (m *Message) ResetReuse() {
    s := m.Buffer[:0]
    *m = Message{}
    m.Buffer = s
}

Risks:

This approach has potential downsides. Since the heap space consumed by the
slices is not reclaimed, the storage node's memory consumption may increase.
Currently, there is no mechanism to shrink the heap usage.

Additionally, this PR changes the generated code. The protobuf compiler can
revert it, which is contrary to our intention. To catch this mistake, this PR
includes a unit test (github.com/kakao/varlog/proto/snpb.TestReplicateRequest)
to verify that the buffer backing the slices is reused.

Which issue(s) this PR resolves

Resolves: #795

See also: #806

@ijsong ijsong requested a review from hungryjang as a code owner June 8, 2024 14:31
@ijsong ijsong force-pushed the replicate_rpc_improvement branch from d5f5e27 to 0e0a36e Compare June 8, 2024 14:32
Copy link

codecov bot commented Jun 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.44%. Comparing base (6a93655) to head (8ef1886).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #808      +/-   ##
==========================================
+ Coverage   68.09%   68.44%   +0.35%     
==========================================
  Files         182      182              
  Lines       17899    17903       +4     
==========================================
+ Hits        12188    12254      +66     
+ Misses       4941     4863      -78     
- Partials      770      786      +16     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ijsong ijsong force-pushed the replicate_rpc_improvement branch from 18c6632 to 770f7ef Compare June 8, 2024 15:16
require.Equal(t, want2, got2)
})

t.Run("GrowLength", func(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to check whether unmarshal or grow was successful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@ijsong ijsong force-pushed the replicate_rpc_improvement branch 2 times, most recently from 3060aa5 to 122afa5 Compare June 13, 2024 15:18
@ijsong ijsong requested a review from hungryjang June 13, 2024 15:18
@ijsong ijsong force-pushed the replicate_rpc_improvement branch 2 times, most recently from 6feab12 to 57f4a1a Compare June 14, 2024 12:34
Base automatically changed from proto_tests to main June 15, 2024 06:54
@ijsong ijsong force-pushed the replicate_rpc_improvement branch from 57f4a1a to 60a1a53 Compare June 15, 2024 06:57
Improve unmarshaling performance by reusing buffers for ReplicateRequest in the
backup replica.

The protobuf message `github.com/kakao/varlog/proto/snpb.(ReplicateRequest)` has
two slice fields—LLSN (`[]uint64`) and Data (`[][]byte`). The backup replica
receives replicated log entries from the primary replica via the gRPC service
`github.com/kakao/varlog/proto/snpb.(ReplicatorServer).Replicate`, which sends
`ReplicateRequest` messages.

Upon receiving a `ReplicateRequest`, the backup replica unmarshals the message,
which involves growing slices for fields such as LLSN and Data. This growth
causes copy overhead whenever the slice capacities need to expand.

To address this, we introduce a new method, `ResetReuse`, for reusing slices
instead of resetting them completely. The `ResetReuse` method shrinks the slice
lengths while preserving their capacities, thus avoiding the overhead of
reallocating memory.

Example implementation:

```go
type Message struct {
    Buffer []byte
    // Other fields
}

func (m *Message) Reset() {
    *m = Message{}
}

func (m *Message) ResetReuse() {
    s := m.Buffer[:0]
    *m = Message{}
    m.Buffer = s
}
```

Risks:

This approach has potential downsides. Since the heap space consumed by the
slices is not reclaimed, the storage node's memory consumption may increase.
Currently, there is no mechanism to shrink the heap usage.

Additionally, this PR changes the generated code. The protobuf compiler can
revert it, which is contrary to our intention. To catch this mistake, this PR
includes a unit test (github.com/kakao/varlog/proto/snpb.TestReplicateRequest)
to verify that the buffer backing the slices is reused.

Resolves: #795

See also: #806
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance Analysis and Improvement of LogStream Primary-Backup Replication
2 participants