Skip to content

Commit

Permalink
eBPF/PSA: Implement caching for ActionSelector, LPM and ternary tables (
Browse files Browse the repository at this point in the history
#3738)

* Implement caching for ActionSelector, LPM and ternary tables

This commit introduces optimization for the PSA/eBPF backend called
`table caching` which adds cache table with only `exact` match keys for time
consuming look-ups. We identified three situations where this optimization
can improve performance:

- Table with `ternary` (and/or `lpm`, `exact`) key - skip slow TSS algorithm
  if the key was earlier matched.
- Table with `lpm` (and/or `exact`) key - skip slow `LPM_TRIE` map (especially
  when there is many entries) if the key was earlier matched.
- `ActionSelector` member selection from group - skip slow checksum
  calculation for `selector` key if it was earlier calculated.

This optimization may not improve performance in every case, so it must be
explicitly enabled by compiler option `--table-caching`.

Limitations/to do:

- `DirectCounter` and `DirectMeter` are not supported by this optimization
  (tables with these externs will not have enabled cache).
- Cache size is a half of table size, but it would be more configurable or
  smart during compilation.
- Updates to tables with cache or `ActionSelector` require cache invalidation
  (`nikss` library will remove all cached entries).

Co-authored-by: Tomasz Osiński <osinstom@gmail.com>

* Replace memcpy

Co-authored-by: Tomasz Osiński <osinstom@gmail.com>
  • Loading branch information
tatry and osinstom committed Nov 30, 2022
1 parent e987655 commit 9036ffc
Show file tree
Hide file tree
Showing 17 changed files with 701 additions and 12 deletions.
5 changes: 5 additions & 0 deletions backends/ebpf/ebpfControl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -355,6 +355,11 @@ void ControlBodyTranslator::processApply(const P4::ApplyMethod* method) {
builder->endOfStatement(true);
builder->blockEnd(true);

if (table->cacheEnabled()) {
table->emitCacheUpdate(builder, keyname, valueName);
builder->blockEnd(true);
}

builder->emitIndent();
builder->appendFormat("if (%s != NULL) ", valueName.c_str());
builder->blockStart();
Expand Down
7 changes: 7 additions & 0 deletions backends/ebpf/ebpfOptions.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -84,4 +84,11 @@ EbpfOptions::EbpfOptions() {
},
"[psa only] Select the mode used to pass metadata from XDP to TC "
"(possible values: meta, head, cpumap).");
registerOption(
"--table-caching", nullptr,
[this](const char*) {
enableTableCache = true;
return true;
},
"[psa only] Enable caching entries for tables with lpm or ternary key");
}
2 changes: 2 additions & 0 deletions backends/ebpf/ebpfOptions.h
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,8 @@ class EbpfOptions : public CompilerOptions {
enum XDP2TC xdp2tcMode = XDP2TC_NONE;
// maximum number of unique ternary masks
unsigned int maxTernaryMasks = 128;
// Enable table cache for LPM and ternary tables
bool enableTableCache = false;

EbpfOptions();

Expand Down
2 changes: 2 additions & 0 deletions backends/ebpf/ebpfTable.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -739,6 +739,8 @@ void EBPFTable::emitInitializer(CodeBuilder* builder) {
}

void EBPFTable::emitLookup(CodeBuilder* builder, cstring key, cstring value) {
if (cacheEnabled()) emitCacheLookup(builder, key, value);

if (!isTernaryTable()) {
builder->target->emitTableLookup(builder, dataMapName, key, value);
builder->endOfStatement(true);
Expand Down
12 changes: 12 additions & 0 deletions backends/ebpf/ebpfTable.h
Original file line number Diff line number Diff line change
Expand Up @@ -129,6 +129,18 @@ class EBPFTable : public EBPFTableBase {
// Whether to drop packet if no match entry found.
// Some table implementations may want to continue processing.
virtual bool dropOnNoMatchingEntryFound() const { return true; }

virtual bool cacheEnabled() { return false; }
virtual void emitCacheLookup(CodeBuilder* builder, cstring key, cstring value) {
(void)builder;
(void)key;
(void)value;
}
virtual void emitCacheUpdate(CodeBuilder* builder, cstring key, cstring value) {
(void)builder;
(void)key;
(void)value;
}
};

class EBPFCounterTable : public EBPFTableBase {
Expand Down
25 changes: 24 additions & 1 deletion backends/ebpf/psa/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -535,6 +535,23 @@ $ bpftool help
Refer to [the bpftool guide](https://manpages.ubuntu.com/manpages/focal/man8/bpftool-prog.8.html) for more examples how to use it.
# Performance optimizations
## Table caching
Table caching optimizes P4 table lookups by adding a cache with all `exact` matches for time-consuming lookups including:
- table with `ternary` (and/or lpm, exact) key - skip slow TSS algorithm if the key was matched earlier.
- table with `lpm` (and/or exact) key - skip slow `LPM_TRIE` map (especially when there is many entries) if the key was matched earlier.
- `ActionSelector` member selection from group - skip slow checksum calculation for `selector` key if it was calculated earlier.
The fast exact-match map is added in front of each instance of a table that contains a `lpm`, `ternary` or `selector` match
key. The table cache is implemented with `BPF_MAP_TYPE_LRU_HASH`, which shares its implementation with the BPF hash map.
The LRU map provides a good lookup performance, but lower performance on map updates due to a maintenance process. Thus,
this optimization fits into use cases, where a value of table key changes infrequently between packets.
This optimization may not improve performance in every case, so it must be explicitly enabled by compiler option. To enable
table caching pass `--table-caching` to the compiler.
# TODO / Limitations
We list the known bugs/limitations below. Refer to the Roadmap section for features planned in the near future.
Expand All @@ -545,14 +562,20 @@ with some NICs. So far, we have verified the correct behavior with Intel 82599ES
- `lookahead()` with bit fields (e.g., `bit<16>`) doesn't work.
- `@atomic` operation is not supported yet.
- `psa_idle_timeout` is not supported yet.
- DirectCounter and DirectMeter externs are not supported for P4 tables with implementation (ActionProfile).
- DirectCounter and DirectMeter externs are not supported for P4 tables with implementation (ActionProfile or ActionSelector).
- The `xdp2tc=head` mode works only for packets larger than 34 bytes (the size of Ethernet and IPv4 header).
- `value_set` only supports the exact match type and can only match on a single field in the `select()` expression.
- The number of entries in ternary tables are limited by the number of unique ternary masks. If a P4 program uses many ternary tables and the `--max-ternary-masks` (default: 128) is set
to a high value, the P4 program may not load into the BPF subsystem due to the BPF complexity issue (the 1M instruction limit exceeded). This is the limitation of the current implementation of the TSS algorithm that
requires iteration over BPF maps. Note that the recent kernel introduced the [bpf_for_each_map_elem()](https://lwn.net/Articles/846504/) helper that should simplify the iteration process and help to overcome the current limitation.
- Setting a size of ternary tables does not currently work.
- DirectMeter cannot be used if a table defines `ternary` match fields, as [BPF spinlocks are not allowed in inner maps of map-in-map](https://patchwork.ozlabs.org/project/netdev/patch/20190124041403.2100609-2-ast@kernel.org/).
- Table cache optimization can't be enabled on tables with DirectCounter or DirectMeter due to two different states of a
table entry. Tables with these externs will not have enabled cache optimization even when enabled by compiler option.
- When table cache optimization is enabled for a table, the number of cached entries is determined as a half of table size.
This would be more configurable or smart during compilation.
- Updates to tables or ActionSelector with enabled table cache optimization require cache invalidation. `nikss` library
will remove all cached entries if it detects cache.
# Roadmap
Expand Down
132 changes: 131 additions & 1 deletion backends/ebpf/psa/ebpfPsaTable.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -225,6 +225,8 @@ EBPFTablePSA::EBPFTablePSA(const EBPFProgram* program, const IR::TableBlock* tab
initDirectCounters();
initDirectMeters();
initImplementation();

tryEnableTableCache();
}

EBPFTablePSA::EBPFTablePSA(const EBPFProgram* program, CodeGenInspector* codeGen, cstring name)
Expand Down Expand Up @@ -320,11 +322,13 @@ void EBPFTablePSA::emitInstance(CodeBuilder* builder) {
program->arrayIndexType, cstring("struct ") + valueTypeName,
1);
}

emitCacheInstance(builder);
}

void EBPFTablePSA::emitTypes(CodeBuilder* builder) {
EBPFTable::emitTypes(builder);
// TODO: placeholder for handling PSA-specific types
emitCacheTypes(builder);
}

/**
Expand Down Expand Up @@ -899,4 +903,130 @@ cstring EBPFTablePSA::addPrefixFunc(bool trace) {
return addPrefixFunc;
}

void EBPFTablePSA::tryEnableTableCache() {
if (!program->options.enableTableCache) return;
if (!isLPMTable() && !isTernaryTable()) return;
if (!counters.empty() || !meters.empty()) {
::warning(ErrorType::WARN_UNSUPPORTED,
"%1%: table cache can't be enabled due to direct extern(s)",
table->container->name);
return;
}

tableCacheEnabled = true;
createCacheTypeNames(false, true);
}

void EBPFTablePSA::createCacheTypeNames(bool isCacheKeyType, bool isCacheValueType) {
cacheTableName = instanceName + "_cache";

cacheKeyTypeName = keyTypeName;
if (isCacheKeyType) cacheKeyTypeName = keyTypeName + "_cache";

cacheValueTypeName = valueTypeName;
if (isCacheValueType) cacheValueTypeName = valueTypeName + "_cache";
}

void EBPFTablePSA::emitCacheTypes(CodeBuilder* builder) {
if (!tableCacheEnabled) return;

builder->emitIndent();
builder->appendFormat("struct %s ", cacheValueTypeName.c_str());
builder->blockStart();

builder->emitIndent();
builder->appendFormat("struct %s value", valueTypeName.c_str());
builder->endOfStatement(true);

// additional metadata fields add at the end of this structure. This allows
// to simpler conversion cache value to value used by table

builder->emitIndent();
builder->append("u8 hit");
builder->endOfStatement(true);

builder->blockEnd(false);
builder->endOfStatement(true);
}

void EBPFTablePSA::emitCacheInstance(CodeBuilder* builder) {
if (!tableCacheEnabled) return;

// TODO: make cache size calculation more smart. Consider using annotation or compiler option.
size_t cacheSize = std::max((size_t)1, size / 2);
builder->target->emitTableDecl(builder, cacheTableName, TableHashLRU,
"struct " + cacheKeyTypeName, "struct " + cacheValueTypeName,
cacheSize);
}

void EBPFTablePSA::emitCacheLookup(CodeBuilder* builder, cstring key, cstring value) {
cstring cacheVal = "cached_value";

builder->appendFormat("struct %s* %s = NULL", cacheValueTypeName.c_str(), cacheVal.c_str());
builder->endOfStatement(true);

builder->target->emitTraceMessage(builder, "Control: trying table cache...");

builder->emitIndent();
builder->target->emitTableLookup(builder, cacheTableName, key, cacheVal);
builder->endOfStatement(true);

builder->emitIndent();
builder->appendFormat("if (%s != NULL) ", cacheVal.c_str());
builder->blockStart();

builder->target->emitTraceMessage(builder,
"Control: table cache hit, skipping later lookup(s)");
builder->emitIndent();
builder->appendFormat("%s = &(%s->value)", value.c_str(), cacheVal.c_str());
builder->endOfStatement(true);
builder->emitIndent();
builder->appendFormat("%s = %s->hit", program->control->hitVariable.c_str(), cacheVal.c_str());
builder->endOfStatement(true);

builder->blockEnd(false);
builder->append(" else ");
builder->blockStart();

builder->target->emitTraceMessage(builder, "Control: table cache miss, nevermind");
builder->emitIndent();

// Do not end block here because we need lookup for (default) value
// and set hit variable at this indent level which is done in the control block
}

void EBPFTablePSA::emitCacheUpdate(CodeBuilder* builder, cstring key, cstring value) {
cstring cacheUpdateVarName = "cache_update";

builder->emitIndent();
builder->appendFormat("if (%s != NULL) ", value.c_str());
builder->blockStart();

builder->emitIndent();
builder->appendLine("/* update table cache */");

builder->emitIndent();
builder->appendFormat("struct %s %s = {0}", cacheValueTypeName.c_str(),
cacheUpdateVarName.c_str());
builder->endOfStatement(true);

builder->emitIndent();
builder->appendFormat("%s.hit = %s", cacheUpdateVarName.c_str(),
program->control->hitVariable.c_str());
builder->endOfStatement(true);

builder->emitIndent();
builder->appendFormat("__builtin_memcpy((void *) &(%s.value), (void *) %s, sizeof(struct %s))",
cacheUpdateVarName.c_str(), value.c_str(), valueTypeName.c_str());
builder->endOfStatement(true);

builder->emitIndent();
builder->target->emitTableUpdate(builder, cacheTableName, key, cacheUpdateVarName);
builder->newline();

builder->target->emitTraceMessage(builder, "Control: table cache updated");

builder->blockEnd(true);
}

} // namespace EBPF
13 changes: 13 additions & 0 deletions backends/ebpf/psa/ebpfPsaTable.h
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,13 @@ class EBPFTablePSA : public EBPFTable {
void initDirectMeters();
void initImplementation();

bool tableCacheEnabled = false;
cstring cacheValueTypeName;
cstring cacheTableName;
cstring cacheKeyTypeName;
void tryEnableTableCache();
void createCacheTypeNames(bool isCacheKeyType, bool isCacheValueType);

void emitTableValue(CodeBuilder* builder, const IR::MethodCallExpression* actionMce,
cstring valueName);
void emitDefaultActionInitializer(CodeBuilder* builder);
Expand Down Expand Up @@ -81,6 +88,12 @@ class EBPFTablePSA : public EBPFTable {
bool dropOnNoMatchingEntryFound() const override;
static cstring addPrefixFunc(bool trace);

virtual void emitCacheTypes(CodeBuilder* builder);
void emitCacheInstance(CodeBuilder* builder);
void emitCacheLookup(CodeBuilder* builder, cstring key, cstring value) override;
void emitCacheUpdate(CodeBuilder* builder, cstring key, cstring value) override;
bool cacheEnabled() override { return tableCacheEnabled; }

EBPFCounterPSA* getDirectCounter(cstring name) const {
auto result = std::find_if(counters.begin(), counters.end(),
[name](std::pair<cstring, EBPFCounterPSA*> elem) -> bool {
Expand Down
Loading

0 comments on commit 9036ffc

Please sign in to comment.