Skip to content

[lld][ELF] Allow .data.rel.ro.unlikely to be RELRO #148920

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

mingmingl-llvm
Copy link
Contributor

https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744 proposes to partition a static data section (like .data.rel.ro) into two sections, one grouping the cold ones and the other grouping the rest.

lld requires all relro sections to be contiguous. To place .data.rel.ro.unlikely in the middle of all relro sections, this change proposes to add .data.rel.ro.unlikely explicitly as RELRO section.

@llvmbot
Copy link
Member

llvmbot commented Jul 15, 2025

@llvm/pr-subscribers-lld-elf

@llvm/pr-subscribers-lld

Author: Mingming Liu (mingmingl-llvm)

Changes

https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744 proposes to partition a static data section (like .data.rel.ro) into two sections, one grouping the cold ones and the other grouping the rest.

lld requires all relro sections to be contiguous. To place .data.rel.ro.unlikely in the middle of all relro sections, this change proposes to add .data.rel.ro.unlikely explicitly as RELRO section.


Full diff: https://github.com/llvm/llvm-project/pull/148920.diff

2 Files Affected:

  • (modified) lld/ELF/Writer.cpp (+3-3)
  • (added) lld/test/ELF/relro-data-unlikely.s (+50)
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index 15909daf51ab6..dc7f620b9170e 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -629,9 +629,9 @@ static bool isRelroSection(Ctx &ctx, const OutputSection *sec) {
   // magic section names.
   StringRef s = sec->name;
 
-  bool abiAgnostic = s == ".data.rel.ro" || s == ".bss.rel.ro" ||
-                     s == ".ctors" || s == ".dtors" || s == ".jcr" ||
-                     s == ".eh_frame" || s == ".fini_array" ||
+  bool abiAgnostic = s == ".data.rel.ro" || s == ".data.rel.ro.unlikely" ||
+                     s == ".bss.rel.ro" || s == ".ctors" || s == ".dtors" ||
+                     s == ".jcr" || s == ".eh_frame" || s == ".fini_array" ||
                      s == ".init_array" || s == ".preinit_array";
 
   bool abiSpecific =
diff --git a/lld/test/ELF/relro-data-unlikely.s b/lld/test/ELF/relro-data-unlikely.s
new file mode 100644
index 0000000000000..951ed59fb0ef2
--- /dev/null
+++ b/lld/test/ELF/relro-data-unlikely.s
@@ -0,0 +1,50 @@
+# REQUIRES: x86
+# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o
+
+# RUN: echo "SECTIONS { \
+# RUN:  .data.rel.ro : { .data.rel.ro } \
+# RUN:  .data.rel.ro.unlikely : { *(.data.rel.ro.unlikely) } \
+# RUN: } INSERT AFTER .text " > %t.script
+
+# RUN: ld.lld --script=%t.script %t.o -o %t
+# RUN: llvm-readelf -l %t | FileCheck --check-prefix=SEG %s
+# RUN: llvm-readelf -S %t | FileCheck %s
+
+# There are 2 RW PT_LOAD segments. p_offset p_vaddr p_paddr p_filesz of the first
+# should match PT_GNU_RELRO.
+# The .data.rel.ro.unlikely section is in PT_GNU_RELRO segment.
+
+#           Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
+# SEG:      LOAD           0x0001c8 0x00000000002011c8 0x00000000002011c8 0x000001 0x000001 R E 0x1000
+# SEG-NEXT: LOAD           0x0001c9 0x00000000002021c9 0x00000000002021c9 0x001001 0x001e37 RW  0x1000
+# SEG-NEXT: LOAD           0x0011ca 0x00000000002041ca 0x00000000002041ca 0x000001 0x000002 RW  0x1000
+# SEG-NEXT: GNU_RELRO      0x0001c9 0x00000000002021c9 0x00000000002021c9 0x001001 0x001e37 R   0x1
+# SEG-NEXT: GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x0
+
+# SEG:      .text
+# SEG-NEXT: .data.rel.ro .data.rel.ro.unlikely .relro_padding
+# SEG-NEXT: .data .bss
+
+#        [Nr] Name                    Type            Address          Off    Size
+# CHECK:      .data.rel.ro            PROGBITS        00000000002021c9 0001c9 000001
+# CHECK-NEXT: .data.rel.ro.unlikely   PROGBITS        00000000002021ca 0001ca 001000
+# CHECK-NEXT: .relro_padding          NOBITS          00000000002031ca 0011ca 000e36
+# CHECK-NEXT: .data                   PROGBITS        00000000002041ca 0011ca 000001
+# CHECK-NEXT: .bss                    NOBITS          00000000002041cb 0011cb 000001
+
+.globl _start
+_start:
+  ret
+
+
+.section .data.rel.ro, "aw"
+.space 1
+
+.section .data.rel.ro.unlikely, "aw"
+.space 4096
+
+.section .data, "aw"
+.space 1
+
+.section .bss, "aw"
+.space 1

Comment on lines 632 to 633
bool abiAgnostic = s == ".data.rel.ro" || s == ".bss.rel.ro" ||
s == ".ctors" || s == ".dtors" || s == ".jcr" ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposal also mentions adding ".hot" to the end of section names, so maybe it would be better to do something like this:
s.starts_with(".data.rel.ro") || s.starts_with(".bss.rel.ro" || s == ".ctors" ...

Do we also need to slightly revisit the behaviour of LinkerScript::getOutputSectionName to not merge these segments when a linker script is not provided? I'll note that there is an option which keeps some .text-prefixed sections separated, which we might want to extend to data?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for taking a look and sharing the thoughts @lenary ! I replied inline. Let me know your thoughts and I'd be glad to follow up!

The proposal also mentions adding ".hot" to the end of section names

yeah, for the prototype in the rfc, .rodata is indeed partitioned into .rodata.hot, rodata.unlikely and .rodata, and most of the prototyping work are (and will be) kept (e.g., in the codegen pass and asmprinter).

However to have fewer elf sections (4 more instead of 8 more) after partitioning 4 sections, the implementation changed to place cold data in <section>.unlikely and keeping the rest in <section>, and there is no <section>.hot in the eventual executable. Both .hot and .unlikely suffixes are kept in intermediate object files though (more on this on linker-script vs keep-data-section-prefix discussion)

so maybe it would be better to do something like this:: s.starts_with(".data.rel.ro") || s.starts_with(".bss.rel.ro")

The updated change handles both .hot and .unlikely suffix for .data.rel.ro only. The motivation to leave out other relro sections is that the amount of accesses from those sections are very trivial (e.g., 0.002% dTLB miss is from .bss.rel.ro, with 3e-7 accesses). We can get simpler binary with fewer partitioned elf sections, and it doesn't really miss performance opportunities.

Do we also need to slightly revisit the behaviour of LinkerScript::getOutputSectionName to not merge these segments when a linker script is not provided? I'll note that there is an option which keeps some .text-prefixed sections separated, which we might want to extend to data?

For text it'd be

if (ctx.arg.zKeepTextSectionPrefix)
for (StringRef v : {".text.hot", ".text.unknown", ".text.unlikely",
".text.startup", ".text.exit", ".text.split"})
if (isSectionPrefix(v.substr(5), s->name.substr(5)))
return v;
. There isn't an immediate use case to revisit the behavior in upstream lld for the binaries I work on. But I think it's sensible to have a similar functionality for data. Are you interested in extending
if (ctx.arg.zKeepTextSectionPrefix)
for (StringRef v : {".text.hot", ".text.unknown", ".text.unlikely",
".text.startup", ".text.exit", ".text.split"})
if (isSectionPrefix(v.substr(5), s->name.substr(5)))
return v;
for data or having a use case?

fwiw, one upside about using linker script is that it provides flexible section mapping and section ordering besides grouping.

  • By section mapping, symbols in the intermediate object file have .hot, .unlikely or no suffix , and linker script can map intermediate unlikely to executable <section>.unlikely, and map .hot and no-suffix ones to executable <section>.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation to leave out other relro sections is that the amount of accesses from those sections are very trivial

Ok, sounds good, I'm happy to ignore .bss.rel.ro.

I think there are three use-cases that are relevant:

  1. A user that doesn't care about this feature, but might have input files containing .data.rel.ro.unlikely or .data.rel.ro.<something> (maybe from a static library they don't control). They don't use a linker script or provide any custom options to the linker. In this case, all the .data.rel.ro.* input sections should be merged, I think like they currently are.

  2. A user that uses this feature, but doesn't want to really care about layout. They would use your -z keep-data-section-prefix option, which would put the .data.rel.ro.unlikely and .data.rel.ro input sections into separate output sections in the final executable.

  3. A user that uses this feature and deeply cares about exactly where these sections are placed, so uses a linker script to manually place .data.rel.ro and .data.rel.ro.unlikely. So yes they get to care about both address assignment and grouping as you have pointed out.

I expect (but cannot verify) that there would be no problem having lots of .data.rel.ro.* (output) sections in an ELF executable, but I do realise we want to keep to just one RELRO segment (program header), and likely just one .relro_padding.

These usecases effectively echo the same situation for .text.hot and .text.unlikely - if you don't care, you do nothing and get one .text section, if you care a little you can use a flag (and the flag tries to do something reasonable: several .text.* sections, but still only one executable segment), and if you care a lot you can use a linker script to assign addresses for your .text.* sections, and potentially change the section->segment mapping too.

These same 3 usecases should apply equivalently for .rodata and .data (and suffixed versions of those), which can use the same flag you've added here. Maybe that work is in a separate patch?

It's not entirely clear to me in the code how to achieve this, but I'm surprised that your handling of the option is in this function rather than in LinkerScript::getOutputSectionName, like we do for .text - but I also understand that .data.rel.ro needs a bit more processing than .text for assigning into segments.

@mingmingl-llvm mingmingl-llvm requested a review from lenary July 16, 2025 01:09
// Returns true if the section is a data section that's read only and
// relocatable.
static bool isRelRoDataSection(Ctx &ctx, StringRef SectionName) {
// If -z keep-data-section-prefix is given, '<section>.hot' and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<section> in the comment seems misleading, since it only applies to .data.rel.ro. Just spell out .data.rel.ro.hot to make it easy for users to grep the code.

Copy link
Contributor

@snehasish snehasish Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the function name I expected this logic to use the sh_type and sh_flags to determine if it was relocatable and read only. Would that be more robust than relying on the name? If we choose to keep this logic based on the name, maybe note that in the function comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed this helper function to isRelRoDataSectionByName and added a comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of .data,.rodata,.data.rel.ro use SHT_PROGBITS. They don't have distinguishing flags, either.

I'd omit ByName from the function name. isDataRelRoSection to be consistent with ctx.in.bssRelRo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the function name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the informative discussion.

# RUN: .data.rel.ro.unlikely : { *(.data.rel.ro.unlikely) } \
# RUN: } INSERT AFTER .text " > %t.script

# RUN: ld.lld -z keep-data-section-prefix --script=%t.script %t.o -o %t
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a linker script (most people specify -T instead of the long form --script) changes input-section-to-output-section mapping behavior, e.g. no default .text.* => `.text.

We need another ld.lld invocation without a linker script.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done by using -T linker-script.lds and adding three more invocation lines (to cover 4 combinations of {keep-data-section-prefix on/off, linker-script or not}

# RUN: llvm-readelf -l %t | FileCheck --check-prefix=SEG %s
# RUN: llvm-readelf -S %t | FileCheck %s

# There are 2 RW PT_LOAD segments. p_offset p_vaddr p_paddr p_filesz of the first
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use ## for non-RUN-non-CHECK comments to make them stand out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. I'll keep this in mind for future changes.


.section .data.rel.ro, "aw"
.space 1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add .data.rel.ro.hot ``.data.rel.ro.split` to demonstrate the behavior

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@@ -0,0 +1,50 @@
# REQUIRES: x86
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep-text-section-prefix.s

(Make the filename more generic in case this feature applies to .data.unlikely in the future)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

# REQUIRES: x86
# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o

# RUN: echo "SECTIONS { \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are quite a few extra files. We prefer split-file for such tests. Many tests use a.o b.o out as relocatable and executable file names. Consider -o a.o ld.lld -z keep-data-section-prefix -T a.lds a.o -o out

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Contributor Author

@mingmingl-llvm mingmingl-llvm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for reviews! PTAL.

@@ -0,0 +1,50 @@
# REQUIRES: x86
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

# RUN: llvm-readelf -l %t | FileCheck --check-prefix=SEG %s
# RUN: llvm-readelf -S %t | FileCheck %s

# There are 2 RW PT_LOAD segments. p_offset p_vaddr p_paddr p_filesz of the first
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. I'll keep this in mind for future changes.


.section .data.rel.ro, "aw"
.space 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

# RUN: .data.rel.ro.unlikely : { *(.data.rel.ro.unlikely) } \
# RUN: } INSERT AFTER .text " > %t.script

# RUN: ld.lld -z keep-data-section-prefix --script=%t.script %t.o -o %t
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done by using -T linker-script.lds and adding three more invocation lines (to cover 4 combinations of {keep-data-section-prefix on/off, linker-script or not}

# REQUIRES: x86
# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o

# RUN: echo "SECTIONS { \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

// Returns true if the section is a data section that's read only and
// relocatable.
static bool isRelRoDataSection(Ctx &ctx, StringRef SectionName) {
// If -z keep-data-section-prefix is given, '<section>.hot' and
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed this helper function to isRelRoDataSectionByName and added a comment.

@mingmingl-llvm mingmingl-llvm requested a review from MaskRay July 16, 2025 21:01
Comment on lines 632 to 633
bool abiAgnostic = s == ".data.rel.ro" || s == ".bss.rel.ro" ||
s == ".ctors" || s == ".dtors" || s == ".jcr" ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation to leave out other relro sections is that the amount of accesses from those sections are very trivial

Ok, sounds good, I'm happy to ignore .bss.rel.ro.

I think there are three use-cases that are relevant:

  1. A user that doesn't care about this feature, but might have input files containing .data.rel.ro.unlikely or .data.rel.ro.<something> (maybe from a static library they don't control). They don't use a linker script or provide any custom options to the linker. In this case, all the .data.rel.ro.* input sections should be merged, I think like they currently are.

  2. A user that uses this feature, but doesn't want to really care about layout. They would use your -z keep-data-section-prefix option, which would put the .data.rel.ro.unlikely and .data.rel.ro input sections into separate output sections in the final executable.

  3. A user that uses this feature and deeply cares about exactly where these sections are placed, so uses a linker script to manually place .data.rel.ro and .data.rel.ro.unlikely. So yes they get to care about both address assignment and grouping as you have pointed out.

I expect (but cannot verify) that there would be no problem having lots of .data.rel.ro.* (output) sections in an ELF executable, but I do realise we want to keep to just one RELRO segment (program header), and likely just one .relro_padding.

These usecases effectively echo the same situation for .text.hot and .text.unlikely - if you don't care, you do nothing and get one .text section, if you care a little you can use a flag (and the flag tries to do something reasonable: several .text.* sections, but still only one executable segment), and if you care a lot you can use a linker script to assign addresses for your .text.* sections, and potentially change the section->segment mapping too.

These same 3 usecases should apply equivalently for .rodata and .data (and suffixed versions of those), which can use the same flag you've added here. Maybe that work is in a separate patch?

It's not entirely clear to me in the code how to achieve this, but I'm surprised that your handling of the option is in this function rather than in LinkerScript::getOutputSectionName, like we do for .text - but I also understand that .data.rel.ro needs a bit more processing than .text for assigning into segments.

Co-authored-by: Sam Elliott <quic_aelliott@quicinc.com>
Copy link
Contributor

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, though please wait for others to approve.

// Returns true if the section is a data section that's read only and
// relocatable.
static bool isRelRoDataSection(Ctx &ctx, StringRef SectionName) {
// If -z keep-data-section-prefix is given, '<section>.hot' and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the informative discussion.

@@ -0,0 +1,89 @@
# REQUIRES: x86
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this x86 only? The assembly below is consumed with the x86 linux triple so it shouldn't matter?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants