Name		Name	Last commit message	Last commit date
parent directory ..
img		img
AllocateDoublesArray.hpp		AllocateDoublesArray.hpp
CMakeLists.txt		CMakeLists.txt
GenerateMesh.cpp		GenerateMesh.cpp
GenerateMesh.hpp		GenerateMesh.hpp
HugePagesSetupTips.md		HugePagesSetupTips.md
README.md		README.md
bench.cpp		bench.cpp
solution.cpp		solution.cpp
solution.hpp		solution.hpp
validate.cpp		validate.cpp

README.md

Finite element operator evaluation

Motivation and problem description

The finite element method is a discretization technique used mainly for physics simulations. The physical domain (usually 2D or 3D) where the physical problem is defined is broken up into a mesh of small, geometrically simple subdomains called elements (much like a surface is broken down into triangles in computer graphics). The continuous fields representing physical quantities (e.g. density, pressure, temperature) are approximated using a finite number of values, which represent the values of the fields at the vertices (nodes) of the elements. For this reason, these values are usually referred to as nodal values. The partial differential equations describing the physics to be simulated are expressed as systems of linear algebraic equations, with the nodal values being the unknowns. These algebraic systems are usually quite large, and therefore must be solved using (super-)computers.

Since the resulting system of algebraic equations is usually solved using iterative methods (e.g. the conjugate gradient method), it is not necessary to explicitly store the matrix describing it in memory. It is sufficient to be able to evaluate the action of this matrix on a vector (compute the matrix-vector product). This approach is usually referred to as matrix-free.

In this lab, you are meant to optimize a piece of code which evaluates the aforementioned matrix-vector product for a structural problem involving a truss. The truss consists of many bars, each of which is represented using a single finite element. To evaluate the matrix-free operator, we iterate over all elements, compute the local 4x4 matrix, gather the local 4-element left-hand side vector, perform the multiplication, and then scatter the result into the global right-hand side vector. This example is rather simple (at the level of a 2nd year mechanical engineering course), but it illustrates the gather-scatter memory access pattern, which is ubiquitous in various simulation codes across the globe.

Note: The above description of the finite element method is extremely simplified, and is only meant to give a high level overview of the problem.

Hint

Note that the compiler generates fairly optimal code for the floating-point computations, so there are little gains to be had by optimizing those. Instead, focus on the memory access pattern, which is extremely random, meaning many distant memory addresses are accessed in rapid succession, which puts stress on the TLB. This situation could be alleviated by allocating the memory which is accessed in a random fashion on huge pages (see HugePagesSetupTips). For the convenience of your solution, all such allocations are done using the allocateDoublesArray function. In fact, this is the only place of the code which is modified in the suggested solution.

Authored-by: @kubagalecki

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

huge_pages_1

huge_pages_1

README.md

Finite element operator evaluation

Motivation and problem description

Hint

Files

huge_pages_1

Directory actions

More options

Directory actions

More options

Latest commit

History

huge_pages_1

Folders and files

parent directory

README.md

Finite element operator evaluation

Motivation and problem description

Hint