[llvm-objcopy][ELF] -O binary: use LMA instead of sh_offset to decide where to write section contents

.text sh_addr=0x1000 sh_offset=0x1000
.data sh_addr=0x3000 sh_offset=0x2000

In an objcopy -O binary output, the distance between two sections equals
the difference between their LMAs (0x3000-0x1000), not the difference
between their sh_offset values (0x2000-0x1000). This patch changes our
behavior to match GNU objcopy.
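
The arithmetic in the example above can be sketched in a few lines of C++. This is only an illustration: `SectionFields`, `gapByLMA` and `gapByFileOffset` are hypothetical names that do not exist in llvm-objcopy, and the sketch assumes neither section sits in a PT_LOAD that remaps its physical address, so the LMA is simply sh_addr.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical record holding the two section header fields used in the
// example; not an llvm-objcopy type.
struct SectionFields {
  uint64_t ShAddr;   // sh_addr
  uint64_t ShOffset; // sh_offset
};

// Distance between two sections in the output when placement follows the LMA.
// Assumption: no PT_LOAD changes the physical address, so LMA == sh_addr.
uint64_t gapByLMA(const SectionFields &First, const SectionFields &Second) {
  assert(Second.ShAddr >= First.ShAddr && "sections must be ordered by address");
  return Second.ShAddr - First.ShAddr;
}

// Distance between two sections when placement follows sh_offset, which is
// the behavior this patch moves away from.
uint64_t gapByFileOffset(const SectionFields &First,
                         const SectionFields &Second) {
  assert(Second.ShOffset >= First.ShOffset && "sections must be ordered by offset");
  return Second.ShOffset - First.ShOffset;
}
```

With `.text` as `{0x1000, 0x1000}` and `.data` as `{0x3000, 0x2000}`, `gapByLMA` gives 0x2000 (the GNU-compatible distance) while `gapByFileOffset` gives 0x1000.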

This rule gets more complex when the containing PT_LOAD has
p_vaddr!=p_paddr. GNU objcopy essentially computes
sh_offset-p_offset+p_paddr for each candidate section, and removes the
gap before the first address.
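
As a rough sketch of the rule described above, the following C++ computes each section's LMA and then subtracts the minimum LMA to obtain the output offset. The types `LoadSegment`, `Section` and `Layout` and the functions `computeLMA` and `layoutBinary` are simplified, hypothetical stand-ins rather than the classes in Object.cpp, and the sketch ignores SHT_NOBITS sections, which the real code excludes from the total size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-ins for ELF structures.
struct LoadSegment {
  uint64_t POffset; // p_offset
  uint64_t PAddr;   // p_paddr
};

struct Section {
  uint64_t ShAddr;                   // sh_addr
  uint64_t ShOffset;                 // sh_offset
  uint64_t Size;                     // sh_size
  std::optional<LoadSegment> Parent; // containing PT_LOAD, if any
};

// LMA = sh_offset - p_offset + p_paddr inside a PT_LOAD, else sh_addr.
uint64_t computeLMA(const Section &Sec) {
  if (Sec.Parent)
    return Sec.ShOffset - Sec.Parent->POffset + Sec.Parent->PAddr;
  return Sec.ShAddr;
}

struct Layout {
  std::vector<uint64_t> Offsets; // output offset of each section
  uint64_t TotalSize;            // size of the -O binary output
};

// Place every section at (LMA - minimum LMA), which drops the gap before the
// lowest address, and size the output to the end of the furthest section.
Layout layoutBinary(const std::vector<Section> &Secs) {
  assert(!Secs.empty() && "need at least one section");
  uint64_t MinAddr = UINT64_MAX;
  for (const Section &S : Secs)
    MinAddr = std::min(MinAddr, computeLMA(S));

  Layout L{{}, 0};
  for (const Section &S : Secs) {
    uint64_t Off = computeLMA(S) - MinAddr;
    L.Offsets.push_back(Off);
    L.TotalSize = std::max(L.TotalSize, Off + S.Size);
  }
  return L;
}
```

Fed the values from the second test document in binary-paddr.test (a 4-byte `.text` outside any PT_LOAD at sh_addr 0x1000, and a 2-byte `.data` with sh_offset 0x2000 in a PT_LOAD with p_offset 0x2000 and p_paddr 0x4000), this places `.data` at 0x3000 and yields a total size of 12290 bytes, matching CHECK2 and SIZE2.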

Added tests to binary-paddr.test to catch the compatibility problem.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D71035
MaskRay committed Dec 16, 2019
1 parent 9e119ad commit d28c6d5
Showing 2 changed files with 118 additions and 41 deletions.
123 changes: 105 additions & 18 deletions llvm/test/tools/llvm-objcopy/ELF/binary-paddr.test
@@ -1,9 +1,21 @@
# RUN: yaml2obj %s -o %t
# RUN: llvm-objcopy -O binary %t %t2
# RUN: od -t x2 %t2 | FileCheck %s --ignore-case
# RUN: wc -c < %t2 | FileCheck %s --check-prefix=SIZE
## The computed LMA of a section in a PT_LOAD equals sh_offset-p_offset+p_paddr.
## The byte offset difference between two sections equals the difference between their LMAs.

!ELF
## Corollary: if two sections are in the same PT_LOAD, the byte offset
## difference equals the difference between their sh_addr fields.

# RUN: yaml2obj --docnum=1 %s -o %t1
# RUN: llvm-objcopy -O binary %t1 %t1.out
# RUN: od -A x -t x2 %t1.out | FileCheck %s --check-prefix=CHECK1 --ignore-case
# RUN: wc -c %t1.out | FileCheck %s --check-prefix=SIZE1

# CHECK1: 000000 c3c3 c3c3 0000 0000 0000 0000 0000 0000
# CHECK1-NEXT: 000010 0000 0000 0000 0000 0000 0000 0000 0000
# CHECK1-NEXT: *
# CHECK1-NEXT: 001000 3232
# SIZE1: 4098

--- !ELF
FileHeader:
Class: ELFCLASS64
Data: ELFDATA2LSB
@@ -14,32 +26,107 @@ Sections:
Type: SHT_PROGBITS
Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
Address: 0x1000
AddressAlign: 0x0000000000001000
AddressAlign: 0x1000
Content: "c3c3c3c3"
- Name: .data
Type: SHT_PROGBITS
Flags: [ SHF_ALLOC ]
Flags: [ SHF_ALLOC, SHF_WRITE ]
Address: 0x2000
AddressAlign: 0x0000000000001000
AddressAlign: 0x1000
Content: "3232"
ProgramHeaders:
- Type: PT_LOAD
Flags: [ PF_X, PF_R ]
VAddr: 0x1000
PAddr: 0x1000
Align: 0x1000
Flags: [ PF_R, PF_W ]
Sections:
- Section: .text
- Section: .data

## The computed LMA of a section not in a PT_LOAD equals its sh_addr.

# RUN: yaml2obj --docnum=2 %s -o %t2
# RUN: llvm-objcopy -O binary %t2 %t2.out
# RUN: od -A x -t x2 %t2.out | FileCheck %s --check-prefix=CHECK2 --ignore-case
# RUN: wc -c %t2.out | FileCheck %s --check-prefix=SIZE2

## The computed LMA of .data is 0x4000. The minimum LMA of all sections is 0x1000.
## The content of .data will be written at 0x4000-0x1000 = 0x3000.
# CHECK2: 000000 c3c3 c3c3 0000 0000 0000 0000 0000 0000
# CHECK2-NEXT: 000010 0000 0000 0000 0000 0000 0000 0000 0000
# CHECK2-NEXT: *
# CHECK2-NEXT: 003000 3232
# SIZE2: 12290

--- !ELF
FileHeader:
Class: ELFCLASS64
Data: ELFDATA2LSB
Type: ET_EXEC
Machine: EM_X86_64
Sections:
- Name: .text
Type: SHT_PROGBITS
Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
## Not in a PT_LOAD. LMA = sh_addr = 0x1000.
Address: 0x1000
AddressAlign: 0x1000
Content: "c3c3c3c3"
- Name: .data
Type: SHT_PROGBITS
Flags: [ SHF_ALLOC, SHF_WRITE ]
## LMA = sh_offset-p_offset+p_paddr = 0x2000-0x2000+0x4000 = 0x4000.
Address: 0x2000
AddressAlign: 0x1000
Content: "3232"
ProgramHeaders:
- Type: PT_LOAD
Flags: [ PF_R, PF_W ]
VAddr: 0x2000
## p_paddr is increased from 0x2000 to 0x4000.
PAddr: 0x4000
Align: 0x1000
Sections:
- Section: .data

# CHECK: 0000000 c3c3 c3c3 0000 0000 0000 0000 0000 0000
# CHECK-NEXT: 0000020 0000 0000 0000 0000 0000 0000 0000 0000
# CHECK-NEXT: *
# CHECK-NEXT: 0030000 3232
# SIZE: 12290
## Check that we use sh_offset instead of sh_addr to decide where to write section contents.

# RUN: yaml2obj --docnum=3 %s -o %t3
# RUN: llvm-objcopy -O binary %t3 %t3.out
# RUN: od -A x -t x2 %t3.out | FileCheck %s --check-prefix=CHECK3 --ignore-case
# RUN: wc -c %t3.out | FileCheck %s --check-prefix=SIZE3

## The minimum LMA of all sections is 0x1000.
## The content of .data will be written at 0x3000-0x1000 = 0x2000.
# CHECK3: 000000 c3c3 c3c3 0000 0000 0000 0000 0000 0000
# CHECK3-NEXT: 000010 0000 0000 0000 0000 0000 0000 0000 0000
# CHECK3-NEXT: *
# CHECK3-NEXT: 002000 3232
# SIZE3: 8194

--- !ELF
FileHeader:
Class: ELFCLASS64
Data: ELFDATA2LSB
Type: ET_EXEC
Machine: EM_X86_64
Sections:
- Name: .text
Type: SHT_PROGBITS
Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
## Not in a PT_LOAD. LMA = sh_addr = 0x1000.
Address: 0x1000
AddressAlign: 0x1000
Content: "c3c3c3c3"
- Name: .data
Type: SHT_PROGBITS
Flags: [ SHF_ALLOC, SHF_WRITE ]
## sh_addr is increased from 0x2000 to 0x3000, but it is ignored.
## LMA = sh_offset-p_offset+p_paddr = 0x2000-0x2000+0x3000 = 0x3000.
Address: 0x3000
AddressAlign: 0x1000
Content: "3232"
ProgramHeaders:
- Type: PT_LOAD
Flags: [ PF_R, PF_W ]
VAddr: 0x3000
PAddr: 0x3000
Sections:
- Section: .data
36 changes: 13 additions & 23 deletions llvm/tools/llvm-objcopy/ELF/Object.cpp
@@ -2253,38 +2253,28 @@ Error BinaryWriter::finalize() {
std::unique(std::begin(OrderedSegments), std::end(OrderedSegments));
OrderedSegments.erase(End, std::end(OrderedSegments));

uint64_t Offset = 0;

// Modify the first segment so that there is no gap at the start. This allows
// our layout algorithm to proceed as expected while not writing out the gap
// at the start.
if (!OrderedSegments.empty()) {
Segment *Seg = OrderedSegments[0];
const SectionBase *Sec = Seg->firstSection();
auto Diff = Sec->OriginalOffset - Seg->OriginalOffset;
Seg->OriginalOffset += Diff;
// The size needs to be shrunk as well.
Seg->FileSize -= Diff;
// The PAddr needs to be increased to remove the gap before the first
// section.
Seg->PAddr += Diff;
uint64_t LowestPAddr = Seg->PAddr;
for (Segment *Segment : OrderedSegments) {
Segment->Offset = Segment->PAddr - LowestPAddr;
Offset = std::max(Offset, Segment->Offset + Segment->FileSize);
}
// Compute the section LMA based on its sh_offset and the containing segment's
// p_offset and p_paddr. Also compute the minimum LMA of all sections as
// MinAddr. In the output, the contents between address 0 and MinAddr will be
// skipped.
uint64_t MinAddr = UINT64_MAX;
for (SectionBase &Sec : Obj.allocSections()) {
if (Sec.ParentSegment != nullptr)
Sec.Addr =
Sec.Offset - Sec.ParentSegment->Offset + Sec.ParentSegment->PAddr;
MinAddr = std::min(MinAddr, Sec.Addr);
}

layoutSections(Obj.allocSections(), Offset);

// Now that every section has been laid out we just need to compute the total
// file size. This might not be the same as the offset returned by
// layoutSections, because we want to truncate the last segment to the end of
// its last section, to match GNU objcopy's behaviour.
TotalSize = 0;
for (const SectionBase &Sec : Obj.allocSections())
for (SectionBase &Sec : Obj.allocSections()) {
Sec.Offset = Sec.Addr - MinAddr;
if (Sec.Type != SHT_NOBITS)
TotalSize = std::max(TotalSize, Sec.Offset + Sec.Size);
}

if (Error E = Buf.allocate(TotalSize))
return E;