Flash loader rework #1113

Merged: 5 commits, Mar 25, 2021
54 changes: 27 additions & 27 deletions flashloaders/Makefile
@@ -1,38 +1,38 @@
# Note that according to the original GPLed code, compiling is noted to be
# as simple as gcc -c, this fails with my tests where this will lead to a wrong
# address read by the program.
# This makefile will save your time from dealing with compile errors
# Adjust CC if needed
# The flash loader code cannot be compiled by the system gcc. This
# makefile uses arm-none-eabi-gcc for this purpose

CC = /opt/local/gcc-arm-none-eabi-8-2018-q4-major/bin/arm-none-eabi-gcc
CROSS_COMPILE ?= arm-none-eabi-

CFLAGS_thumb1 = -mcpu=Cortex-M0 -Tlinker.ld -ffreestanding -nostdlib
CFLAGS_thumb2 = -mcpu=Cortex-M3 -Tlinker.ld -ffreestanding -nostdlib
CC = $(CROSS_COMPILE)gcc
OBJCOPY = $(CROSS_COMPILE)objcopy

all: stm32vl.o stm32f0.o stm32l.o stm32f4.o stm32f4_lv.o stm32l4.o stm32f7.o stm32f7_lv.o
XXD = xxd
XXDFLAGS = -i -c 4

stm32vl.o: stm32f0.s
$(CC) stm32f0.s $(CFLAGS_thumb2) -o stm32vl.o
stm32f0.o: stm32f0.s
$(CC) stm32f0.s $(CFLAGS_thumb1) -o stm32f0.o
stm32l.o: stm32lx.s
$(CC) stm32lx.s $(CFLAGS_thumb2) -o stm32l.o
stm32f4.o: stm32f4.s
$(CC) stm32f4.s $(CFLAGS_thumb2) -o stm32f4.o
stm32f4_lv.o: stm32f4lv.s
$(CC) stm32f4lv.s $(CFLAGS_thumb2) -o stm32f4_lv.o
stm32l4.o: stm32l4.s
$(CC) stm32l4.s $(CFLAGS_thumb2) -o stm32l4.o
stm32f7.o: stm32f7.s
$(CC) stm32f7.s $(CFLAGS_thumb2) -o stm32f7.o
stm32f7_lv.o: stm32f7lv.s
$(CC) stm32f7lv.s $(CFLAGS_thumb2) -o stm32f7_lv.o
CFLAGS_ARMV6_M = -mcpu=Cortex-M0 -Tlinker.ld -ffreestanding -nostdlib
CFLAGS_ARMV7_M = -mcpu=Cortex-M3 -Tlinker.ld -ffreestanding -nostdlib

clean:
rm *.o
all: stm32vl.h stm32f0.h stm32lx.h stm32f4.h stm32f4lv.h stm32l4.h stm32f7.h stm32f7lv.h


%.h: %.bin
$(XXD) $(XXDFLAGS) $< $@

%.bin: %.o
$(OBJCOPY) -O binary $< $@
rm $<

# separate rule for STM32F0
stm32f0.o: stm32f0.s
$(CC) stm32f0.s $(CFLAGS_ARMV6_M) -o stm32f0.o

# separate rule for STM32F1/F3
stm32vl.o: stm32f0.s
$(CC) stm32f0.s $(CFLAGS_ARMV7_M) -o stm32vl.o

# generic rule for all other ARMv7-M
%.o: %.s
$(CC) $< $(CFLAGS_ARMV7_M) -o $@

clean:
rm -f *.h
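
The `%.h: %.bin` rule runs `xxd -i -c 4`, which writes a C array named after the input file plus a matching length variable. The following is a rough sketch of the shape such a header takes for `stm32f0.bin` and of how host code might consume it. The byte values are placeholders rather than the real loader image, and `stage_loader` is a hypothetical helper invented for this illustration, not a function from this project.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shape of the output of `xxd -i -c 4 stm32f0.bin`.
 * The byte values are placeholders, NOT the real loader image. */
unsigned char stm32f0_bin[] = {
  0x00, 0xbf, 0x00, 0xbf,
  0x00, 0xbe, 0x00, 0x00
};
unsigned int stm32f0_bin_len = 8;

/* Hypothetical helper: copy a loader image into a staging buffer.
 * Returns the number of bytes staged, or 0 if the image does not fit. */
size_t stage_loader(unsigned char *dst, size_t dst_size,
                    const unsigned char *image, unsigned int image_len)
{
    assert(dst != NULL && image != NULL);
    if (image_len > dst_size) {
        return 0;
    }
    memcpy(dst, image, image_len);
    return image_len;
}
```

Generating headers this way keeps the assembled loaders as plain data on the host side, so only the step that regenerates the headers needs the cross compiler.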
38 changes: 14 additions & 24 deletions flashloaders/cleanroom.md
@@ -1,6 +1,4 @@
Original Chinese version can be found below.

# Clean Room Documentation English Version
# Flash Loader Documentation

Code is situated in section `.text`

@@ -12,20 +10,19 @@ All parameters would be passed over registers

`r0`: the base address of the copy source
`r1`: the base address of the copy destination
`r2`: the total word (4 bytes) count to be copied (with exceptions)
`r2`: the count of bytes to be copied
`r3`: flash register offset (used to support two banks)

**What the program is expected to do**:

Copy data from source to destination, after which trigger a breakpoint to exit. Before exit, `r2` must be cleared to zero to indicate that the copy is done.
Copy data from source to destination, after which trigger a breakpoint to exit. Before exit, `r2` must be less than or equal to zero to indicate that the copy is done.

**Limitation**: No stack operations are permitted. Registers ranging from `r3` to `r12` are free to use. Note that `r13` is `sp` (stack pointer), `r14` is `lr` (commonly used to store the return address), `r15` is `pc` (program counter).

**Requirement**: After every single copy, wait until the flash finishes. The detailed single copy length and the way to check can be found below. Address of `flash_base` shall be two-byte aligned.
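
The register contract above can be modelled on the host to see why the exit condition is "less than or equal to zero" rather than "exactly zero". This is only a sketch, assuming the copy-then-decrement loop shape (`subs` followed by `bgt`) used by the reworked loaders in this change. `model_copy` and its `unit` parameter are names made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Host-side model of the loader contract (not target code).
 * src/dst play the role of r0/r1, count is r2 (bytes), and unit is the
 * number of bytes a given loader copies per iteration (1, 2, 4 or 8).
 * The loop copies first and tests afterwards, so both buffers must be
 * padded up to a multiple of unit. */
int32_t model_copy(const uint8_t *src, uint8_t *dst, int32_t count, int32_t unit)
{
    assert(unit > 0);
    do {
        memcpy(dst, src, (size_t)unit);  /* one copy step */
        src += unit;
        dst += unit;
        count -= unit;                   /* subs r2, r2, #unit */
    } while (count > 0);                 /* bgt loop */
    return count;                        /* value of r2 at the breakpoint */
}
```

When the byte count is not a multiple of the copy unit, the last iteration drives the counter below zero, so a host that checks only for `r2 == 0` would misreport a successful copy.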

## stm32f0.s

**Exception**: `r2` stores the total half word (2 bytes) count to be copied

`flash_base`: 0x40022000

`FLASH_CR`: offset from `flash_base` is 16
@@ -37,11 +34,11 @@ Copy data from source to destination, after which trigger a breakpint to exit. B

**Special requirements**:

Before every copy, read a word from FLASH_CR, set the lowest bit to 1 and write back. Copy one half word each time.
Before every copy, read a word from FLASH_CR, set the PG bit to 1 and write back. Copy one half word each time.

How to wait for the write process: read a word from FLASH_SR, loop until the content is not 1. After that, check FLASH_SR, proceed if the content is 4, otherwise exit.
How to wait for the write process: read a word from FLASH_SR, loop until the busy bit is reset. After that, FLASH_SR is checked. The process is interrupted if the error bit (0x04) is set.

Exit: after the copying process and before triggering the breakpoint, clear the lowest bit in FLASH_CR.
Exit: after the copying process and before triggering the breakpoint, clear the PG bit in FLASH_CR.
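
As a rough illustration of the sequence described for `stm32f0.s`, the steps can be written as a host-side C model using mock registers. The `mock_flash` struct, the bit macros and `f0_program` are hypothetical names introduced here. The bit positions follow the text above (PG as the lowest bit of FLASH_CR, busy as the lowest bit of FLASH_SR, error bit 0x04), and a real target would access memory-mapped registers instead.

```c
#include <assert.h>
#include <stdint.h>

#define PG_BIT  0x01u  /* FLASH_CR: programming enable */
#define BSY_BIT 0x01u  /* FLASH_SR: busy flag */
#define ERR_BIT 0x04u  /* FLASH_SR: error bit named in the text */

/* Mock stand-ins for FLASH_CR and FLASH_SR (assumption: the real
 * registers are memory mapped at flash_base plus their offsets). */
typedef struct {
    volatile uint32_t cr;
    volatile uint32_t sr;
} mock_flash;

/* Model of the half-word programming loop. Returns the remaining byte
 * count: <= 0 means done, > 0 means the loop stopped on an error. */
int32_t f0_program(mock_flash *f, const uint16_t *src, uint16_t *dst,
                   int32_t count)
{
    assert(f != 0 && src != 0 && dst != 0);
    do {
        f->cr |= PG_BIT;              /* set PG before the copy */
        *dst++ = *src++;              /* copy one half word */
        while (f->sr & BSY_BIT) { }   /* wait until busy is reset */
        if (f->sr & ERR_BIT) {
            break;                    /* error: leave count untouched */
        }
        count -= 2;
    } while (count > 0);
    f->cr &= ~PG_BIT;                 /* clear PG before the breakpoint */
    return count;
}
```

Leaving the counter positive on the error path lets the host tell an aborted run apart from a completed one using the `r2` check alone.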

## stm32f4.s

@@ -56,7 +53,8 @@ Exit: after the copying process and before triggering the breakpoint, clear the
**Special requirements**:

Copy one word each time.
How to wait for the write process: read a half word from FLASH_SR, loop until the content is not 1.

How to wait for the write process: read a word from FLASH_SR, loop until the busy bit is reset.

## stm32f4lv.s

@@ -71,7 +69,7 @@ How to wait for the write process: read a half word from FLASH_SR, loop until th

Copy one byte each time.

How to wait for the write process: read a half word from FLASH_SR, loop until the content is not 1.
How to wait for the write process: read a half word from FLASH_SR, loop until the busy bit is reset.

## stm32f7.s

@@ -89,16 +87,14 @@ Mostly same with `stm32f4.s`. Require establishing a memory barrier after every

Mostly the same as `stm32f7.s`. Copy one byte each time.

## stm32l0x.s
## stm32lx.s

**Special Requirements**:

Copy one word each time. No wait for write.

## stm32l4.s

**Exception**: r2 stores the double word count to be copied.

`flash_base`: 0x40022000
`FLASH_BSY`: offset from `flash_base` is 0x12

@@ -109,14 +105,10 @@ Copy one word each time. No wait for write.

Copy one double word each time (more than one register may be used).

How to wait for the write process: read a half word from `FLASH_BSY`, loop until the lowest bit turns non-1.

## stm32lx.s
How to wait for the write process: read a half word from `FLASH_BSY`, loop until the busy bit is reset.

Same with stm32l0x.s.


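# Clean Room Engineering Documentation - Original Chinese Version
# Clean Room Engineering Documentation - Original Chinese Version (out of date)

Section the code is located in: `.text`
Add the assembler directive `.syntax unified`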
@@ -139,8 +131,6 @@ Same with stm32l0x.s.

## stm32f0.s

Exception: `r2`: the number of half words (2 bytes) to copy

Special address definitions: `flash_base`: defined as 0x40022000

`FLASH_CR`: offset from `flash_base` is 16
@@ -230,4 +220,4 @@ Same with stm32l0x.s.

## stm32lx.s

Same requirements as stm32l0x.s
Same requirements as stm32l0x.s
78 changes: 44 additions & 34 deletions flashloaders/stm32f0.s
@@ -1,6 +1,14 @@
.syntax unified
.text

/*
* Arguments:
* r0 - source memory ptr
* r1 - target memory ptr
* r2 - count of bytes
* r3 - flash register offset
*/

.global copy
copy:
/*
@@ -17,54 +25,56 @@ copy:
*/
nop
nop
ldr r7, =flash_base
ldr r4, [r7]
ldr r7, =flash_off_cr
ldr r6, [r7]
adds r6, r6, r4
ldr r7, =flash_off_sr
ldr r5, [r7]
adds r5, r5, r4

loop:
# FLASH_CR ^= 1
# load flash control register address
# add r3 to flash_base to support dual bank (see flash_loader.c)
ldr r7, flash_base
add r7, r7, r3
ldr r6, flash_off_cr
add r6, r6, r7
ldr r5, flash_off_sr
add r5, r5, r7

# FLASH_CR |= 0x01 (set PG)
ldr r7, =0x1
ldr r3, [r6]
orrs r3, r3, r7
str r3, [r6]
ldr r4, [r6]
orrs r4, r4, r7
str r4, [r6]

loop:
# copy 2 bytes
ldrh r3, [r0]
strh r3, [r1]
ldrh r4, [r0]
strh r4, [r1]

ldr r7, =2
adds r0, r0, r7
adds r1, r1, r7
# increment address
adds r0, r0, #0x2
adds r1, r1, #0x2

# wait if FLASH_SR == 1
# BUSY flag
ldr r7, =0x01
wait:
ldr r7, =0x1
ldr r3, [r5]
tst r3, r7
beq wait
# get FLASH_SR
ldr r4, [r5]

# exit if FLASH_SR != 4
ldr r7, =0x4
tst r3, r7
# wait until BUSY flag is reset
tst r4, r7
bne wait

# test PGERR or WRPRTERR flag is reset
ldr r7, =0x14
tst r4, r7
bne exit

# loop if r2 != 0
ldr r7, =0x1
subs r2, r2, r7
cmp r2, #0
bne loop
# loop if count > 0
subs r2, r2, #0x2
bgt loop

exit:
# FLASH_CR &= ~1
ldr r7, =0x1
ldr r3, [r6]
bics r3, r3, r7
str r3, [r6]
ldr r4, [r6]
bics r4, r4, r7
str r4, [r6]

bkpt

32 changes: 21 additions & 11 deletions flashloaders/stm32f4.s
@@ -1,6 +1,14 @@
.syntax unified
.text

/*
* Arguments:
* r0 - source memory ptr
* r1 - target memory ptr
* r2 - count of bytes
* r3 - flash register offset
*/

.global copy
copy:
ldr r12, flash_base
@@ -9,22 +17,24 @@ copy:

loop:
# copy 4 bytes
ldr r3, [r0]
str r3, [r1]
ldr r4, [r0]
str r4, [r1]

# increment address
add r0, r0, #4
add r1, r1, #4

# wait if FLASH_SR == 1
wait:
ldrh r3, [r10]
tst r3, #0x1
beq wait

# loop if r2 != 0
sub r2, r2, #1
cmp r2, #0
bne loop
# get FLASH_SR
ldrh r4, [r10]

# wait until BUSY flag is reset
tst r4, #0x1
bne wait

# loop if count > 0
subs r2, r2, #4
bgt loop

exit:
bkpt
38 changes: 21 additions & 17 deletions flashloaders/stm32f4lv.s
@@ -1,36 +1,40 @@
.syntax unified
.text

/*
* Arguments:
* r0 - source memory ptr
* r1 - target memory ptr
* r2 - count of bytes
* r3 - flash register offset
*/

.global copy
copy:
ldr r12, flash_base
ldr r10, flash_off_sr
add r10, r10, r12

# tip 1: original r2 indicates the count of 4 bytes need to copy,
# but we can only copy one byte each time.
# as we have no flash larger than 1GB, we do a little trick here.
# tip 2: r2 is always a power of 2
mov r2, r2, lsl#2

loop:
# copy 1 byte
ldrb r3, [r0]
strb r3, [r1]
ldrb r4, [r0]
strb r4, [r1]

# increment address
add r0, r0, #1
add r1, r1, #1

# wait if FLASH_SR == 1
wait:
ldrh r3, [r10]
tst r3, #0x1
beq wait

# loop if r2 != 0
sub r2, r2, #1
cmp r2, #0
bne loop
# get FLASH_SR
ldrh r4, [r10]

# wait until BUSY flag is reset
tst r4, #0x1
bne wait

# loop if count > 0
subs r2, r2, #1
bgt loop

exit:
bkpt