radshield/soc-radiation-testing

Software SEE Testing for Commodity SoCs

This tool detects single-event upsets and attributes each one to memory, the cache, or the CPU pipeline. The default configuration targets the Snapdragon 801 SoC on the Mars 2020 HBS, but it can be adapted to support other chips.

Usage

radiation_test memory_size_in_mb timeout_seconds output_file

Requirements

  • CMake 3.16 or newer
  • GCC/LLVM toolchain with C++14 support
  • Ninja (the build commands below select the Ninja generator)

Building

mkdir build && cd build
cmake -G Ninja ..
cmake --build .

Design

The program reserves memory_size_in_mb megabytes of memory using malloc and fills it with the repeating byte pattern 0b10101010. It then spawns two threads that check the integrity of this memory. Comparing the two threads' results lets us differentiate between faults in the L1 cache and faults in RAM: a fault in RAM shows up as an incorrect read on both cores, while a fault in an L1 cache shows up on only one. As the L2 cache has SECDED ECC, we assume that SEEs will not affect data there.
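The setup step described above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the names `kPattern`, `reserve_pattern_buffer`, and `count_corrupted` are hypothetical, and the `volatile` read is an assumption made here so that the compiler does not optimize the comparison away.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Pattern named in the text: alternating bits, 0b10101010 (0xAA).
constexpr std::uint8_t kPattern = 0b10101010;

// Hypothetical helper: reserve `size_mb` megabytes with malloc and fill
// them with kPattern. Returns nullptr if the allocation fails; the caller
// releases the buffer with std::free.
std::uint8_t* reserve_pattern_buffer(std::size_t size_mb) {
    assert(size_mb > 0);
    const std::size_t bytes = size_mb * 1024 * 1024;
    auto* buf = static_cast<std::uint8_t*>(std::malloc(bytes));
    if (buf != nullptr) {
        std::memset(buf, kPattern, bytes);
    }
    return buf;
}

// Hypothetical helper: count the bytes in [buf, buf + bytes) that no
// longer hold kPattern. The pointer is volatile so every byte is re-read.
std::size_t count_corrupted(const volatile std::uint8_t* buf,
                            std::size_t bytes) {
    std::size_t bad = 0;
    for (std::size_t i = 0; i < bytes; ++i) {
        if (buf[i] != kPattern) {
            ++bad;
        }
    }
    return bad;
}
```

Under these assumptions, a freshly filled buffer reports zero corrupted bytes, and flipping a single bit in one byte raises the count to one.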

Each thread works through the memory in blocks, where a block takes up the portion of the L1 cache defined by L1_USAGE. After reading a block, the thread runs the following tests on it twice; the second test also exercises the CPU's ALU and multiply-add pipelines:

  • Data read from memory: done vertically to fill up the entire cache
  • Cache read and ALU/multiply-add test: done linearly, as the data has already been loaded into the cache; skipped for the entire block if the data read was incorrect
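The two tests above might look roughly like the sketch below. It rests on several assumptions: `BlockResult`, `test_block_once`, and `test_block` are hypothetical names, the arithmetic checks are illustrative choices rather than the tool's real test vectors, and the data read here is a plain linear scan rather than the vertical traversal mentioned above. A compiler is also free to emit separate multiply and add instructions instead of a fused multiply-add.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint8_t kPattern = 0b10101010;  // 0xAA, decimal 170

enum class BlockResult { Ok, ReadError, ComputeError };

// One pass over a block (hypothetical helper).
BlockResult test_block_once(const volatile std::uint8_t* block,
                            std::size_t len) {
    assert(len > 0);
    // Test 1: data read. Any mismatch ends the pass early, which skips
    // the arithmetic test for the entire block.
    for (std::size_t i = 0; i < len; ++i) {
        if (block[i] != kPattern) {
            return BlockResult::ReadError;
        }
    }
    // Test 2: re-read the (now cached) data and do simple arithmetic
    // whose result is known in advance for the 0xAA pattern.
    for (std::size_t i = 0; i < len; ++i) {
        const std::uint32_t v = block[i];
        const std::uint32_t alu = (v ^ 0xFFu) + v;  // 0x55 + 0xAA = 0xFF
        const std::uint32_t mac = v * 3u + 7u;      // 170 * 3 + 7 = 517
        if (alu != 0xFFu || mac != 517u) {
            return BlockResult::ComputeError;
        }
    }
    return BlockResult::Ok;
}

// Run the pass twice, as the text describes, and report the first failure.
BlockResult test_block(const volatile std::uint8_t* block, std::size_t len) {
    for (int pass = 0; pass < 2; ++pass) {
        const BlockResult r = test_block_once(block, len);
        if (r != BlockResult::Ok) {
            return r;
        }
    }
    return BlockResult::Ok;
}
```

Returning early from the read loop is what implements the "skipped for the entire block" rule: a block with a bad read is never reported as a compute error.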

Possible Errors

  • Incorrect cache/data read on two cores: potential single-event upset in RAM
  • Incorrect cache/data read on one core: potential single-event upset in L1 cache
  • Incorrect ALU/multiply-add result: potential single-event transient in CPU pipeline
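The mapping from observed errors to likely fault locations can be written as a small decision function. The sketch below is only one way to encode the three rules listed above; `Fault`, `CoreReport`, and `classify` are hypothetical names, and it assumes exactly two cores, each reporting whether it saw a read error or an arithmetic error for the same block.

```cpp
// Likely fault location, following the list above.
enum class Fault { None, Ram, L1Cache, CpuPipeline };

// What one core observed for a given block (hypothetical structure).
struct CoreReport {
    bool read_error;     // incorrect cache/data read
    bool compute_error;  // incorrect ALU/multiply-add result
};

// Combine the reports from the two cores into a single verdict.
constexpr Fault classify(CoreReport a, CoreReport b) {
    if (a.read_error && b.read_error) {
        return Fault::Ram;          // both cores saw bad data
    }
    if (a.read_error || b.read_error) {
        return Fault::L1Cache;      // only one core saw bad data
    }
    if (a.compute_error || b.compute_error) {
        return Fault::CpuPipeline;  // data was fine, arithmetic was not
    }
    return Fault::None;
}

// Human-readable label, for example for the output file.
constexpr const char* describe(Fault f) {
    return f == Fault::Ram           ? "potential SEU in RAM"
           : f == Fault::L1Cache     ? "potential SEU in L1 cache"
           : f == Fault::CpuPipeline ? "potential SET in CPU pipeline"
                                     : "no error";
}
```

Read errors are checked before compute errors because the arithmetic test is skipped whenever the data read fails, so a compute error is only meaningful when both reads were correct.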
