Improve documentation for CPU target and feature selection #15487

ScottTodd · 2023-11-08T19:16:04Z

There has been a lot of great work recently on CPU configuration across multiple CPU architectures and specific device models. We should distill what we've collectively learned into the documentation.

This section: https://iree.dev/guides/deployment-configurations/cpu/#compile-and-run-a-program was written some time ago and pretty much says "--iree-llvmcpu-target-triple and --iree-llvmcpu-target-cpu-features are important but we don't fully understand them yet". I think we're starting to understand them :)

Possible topics to cover:

- How to determine what CPU, CPU features, etc. a particular device has
- How to communicate those features to the compiler (flag recipes, links to benchmark configurations, etc.)
- How to verify that generated code made use of certain features
- What to expect out of the box, and why automatic detection is not sufficient for ahead-of-time scenarios
    - Aside: for JIT scenarios (e.g. TorchDynamo), are we falling short and can we do better?
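
For the first topic, one possible starting point on Linux devices is reading the feature list the kernel reports. The sketch below is only an illustration, not something IREE ships: the helper name `parse_cpuinfo_flags` is hypothetical, and it assumes the `/proc/cpuinfo` layout where x86 kernels use a `flags` line and AArch64 kernels use a `Features` line.

```python
from __future__ import annotations


def parse_cpuinfo_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature names listed for the first CPU in /proc/cpuinfo text.

    Hypothetical helper for illustration only.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 kernels label the line "flags"; AArch64 kernels label it "Features".
        if sep and key.strip() in ("flags", "Features"):
            return set(value.split())
    # No recognizable feature line was found.
    return set()
```

On a Linux machine this could be called as `parse_cpuinfo_flags(open("/proc/cpuinfo").read())`. Kernel names do not always match LLVM feature names (the kernel reports `pni` and `sse4_1` where LLVM uses `sse3` and `sse4.1`), so the result still needs mapping by hand before it goes into `--iree-llvmcpu-target-cpu-features`.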

ScottTodd · 2023-11-08T19:16:39Z

cc @dcaballe @bjacob (others?)

ScottTodd · 2024-04-11T15:25:19Z

Doc/tooling updates here would still be helpful

Recent discussion: https://discord.com/channels/689900678990135345/689932666505920596/1225523467215044742

Does anyone know how to check what --iree-llvmcpu-target-cpu-features=host resolves to on a given machine? Trying to match what a user observed on their machine from mine

Someone suggested llc -march=arm -mattr=help
For us, if you pass that flag along with --compile-to=input, the output should show all of them in the device config:
#executable_target_embedded_elf_x86_64_ = #hal.executable.target<"llvm-cpu", "embedded-elf-x86_64", {cpu = "znver2", cpu_features = "+prfchw,-cldemote,+avx,+aes,+sahf,+pclmul,-xop,+crc32,+xsaves,-avx512fp16,-usermsr,-sm4,-egpr,+sse4.1,-avx512ifma,+xsave,-avx512pf,+sse4.2,-tsxldtrk,-ptwrite,-widekl,-sm3,-invpcid,+64bit,+xsavec,-avx10.1-512,-avx512vpopcntdq,+cmov,-avx512vp2intersect,-avx512cd,+movbe,-avxvnniint8,-avx512er,-ccmp,-amx-int8,-kl,-avx10.1-256,-sha512,-avxvnni,-rtm,+adx,+avx2,-hreset,-movdiri,-serialize,-vpclmulqdq,-avx512vl,-uintr,-cf,+clflushopt,-raoint,-cmpccxadd,+bmi,-amx-tile,+sse,-gfni,-avxvnniint16,-amx-fp16,-ndd,+xsaveopt,+rdrnd,-avx512f,-amx-bf16,-avx512bf16,-avx512vnni,-push2pop2,+cx8,-avx512bw,+sse3,-pku,+fsgsbase,+clzero,+mwaitx,-lwp,+lzcnt,+sha,-movdir64b,-ppx,+wbnoinvd,-enqcmd,-prefetchwt1,-avxneconvert,-tbm,-pconfig,-amx-complex,+ssse3,+cx16,+bmi2,+fma,+popcnt,-avxifma,+f16c,-avx512bitalg,+rdpru,+clwb,+mmx,+sse2,+rdseed,-avx512vbmi2,-prefetchi,+rdpid,-fma4,-avx512vbmi,-shstk,-vaes,-waitpkg,-sgx,+fxsr,-avx512dq,+sse4a", data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128", native_vector_size = 32 : i64, target_triple = "x86_64-unknown-unknown-eabi-elf"}>
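
To compare what two machines resolve to, the `cpu_features` string in an attribute like the one above can be split into enabled and disabled sets and diffed. This is a rough sketch, assuming the `+name` / `-name` comma-separated format shown in the dump; `parse_cpu_features` and `diff_enabled` are hypothetical helper names, not part of any IREE tool.

```python
from __future__ import annotations

import re


def parse_cpu_features(target_attr: str) -> tuple[set[str], set[str]]:
    """Split the cpu_features string of a target attribute into (enabled, disabled)."""
    match = re.search(r'cpu_features\s*=\s*"([^"]*)"', target_attr)
    if match is None:
        return set(), set()
    enabled: set[str] = set()
    disabled: set[str] = set()
    for item in match.group(1).split(","):
        item = item.strip()
        if item.startswith("+"):
            enabled.add(item[1:])
        elif item.startswith("-"):
            disabled.add(item[1:])
    return enabled, disabled


def diff_enabled(attr_a: str, attr_b: str) -> tuple[set[str], set[str]]:
    """Return (enabled only in A, enabled only in B)."""
    enabled_a, _ = parse_cpu_features(attr_a)
    enabled_b, _ = parse_cpu_features(attr_b)
    return enabled_a - enabled_b, enabled_b - enabled_a
```

Passing the attribute text dumped on each machine to `diff_enabled` gives the features one host enables and the other does not, which is usually the part worth looking at when results differ.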

ScottTodd added the "documentation ✏️" and "codegen/llvm" labels · Nov 8, 2023

ScottTodd mentioned this issue · Nov 16, 2023: Revamp Android Benchmarks #15452 (Closed)

ScottTodd mentioned this issue · Sep 20, 2024: Compiling for llvm-cpu without targeting a specific CPU is a bad experience #18561 (Open)
