Hardware-assisted AddressSanitizer Design Documentation

This page is a design document for hardware-assisted AddressSanitizer (or HWASAN), a tool similar to :doc:`AddressSanitizer` but based on partial hardware assistance.

The document is a draft; suggestions are welcome.

Introduction

:doc:`AddressSanitizer` tags every 8 bytes of the application memory with a 1-byte tag (using shadow memory), uses redzones to find buffer overflows, and uses quarantine to find use-after-free. The redzones, the quarantine, and, to a lesser extent, the shadow are the sources of AddressSanitizer's memory overhead. See the AddressSanitizer paper for details.

AArch64 has Address Tagging (or top-byte-ignore, TBI), a hardware feature that allows software to use the 8 most significant bits of a 64-bit pointer as a tag. HWASAN uses Address Tagging to implement a memory safety tool, similar to :doc:`AddressSanitizer`, but with smaller memory overhead and slightly different (mostly better) accuracy guarantees.

Algorithm

  • Every heap/stack/global memory object is forcibly aligned by N bytes (N is e.g. 16 or 64). We call N the granularity of tagging.
  • For every such object a random K-bit tag T is chosen (K is e.g. 4 or 8).
  • The pointer to the object is tagged with T.
  • The memory for the object is also tagged with T (using an N=>1 shadow memory).
  • Every load and store is instrumented to read the memory tag and compare it with the pointer tag; an exception is raised on tag mismatch.
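The steps above can be sketched in portable C as a small simulation. This is only an illustration under stated assumptions: N = 16 and K = 8 are picked from the example values, addresses are offsets into a 256-byte simulated address space, and the names tag_pointer, tag_memory and check_access are hypothetical, not part of HWASAN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE 16u    /* assumed tagging granularity N */
#define MEM_SIZE 256u  /* size of the simulated address space */
#define ADDR_MASK UINT64_C(0x00FFFFFFFFFFFFFF)

/* N=>1 shadow: one tag byte for every GRANULE bytes of simulated memory. */
static uint8_t sim_shadow[MEM_SIZE / GRANULE];

/* Store tag T in the top byte of a simulated 64-bit address. */
uint64_t tag_pointer(uint64_t addr, uint8_t tag) {
    return (addr & ADDR_MASK) | ((uint64_t)tag << 56);
}

/* Tag every granule covered by the object [addr, addr + size). */
void tag_memory(uint64_t addr, size_t size, uint8_t tag) {
    assert(addr % GRANULE == 0);          /* objects are N-aligned */
    assert(addr + size <= MEM_SIZE);      /* stay inside the simulation */
    size_t granules = (size + GRANULE - 1) / GRANULE;
    memset(&sim_shadow[addr / GRANULE], tag, granules);
}

/* The check done before each load or store: 1 if the pointer tag matches
 * the memory tag, 0 on mismatch (where the real tool raises an exception).
 * The untagged address must be below MEM_SIZE. */
int check_access(uint64_t tagged_ptr) {
    uint8_t pointer_tag = (uint8_t)(tagged_ptr >> 56);
    uint64_t addr = tagged_ptr & ADDR_MASK;
    return sim_shadow[addr / GRANULE] == pointer_tag;
}
```

With two adjacent objects carrying different tags, an access through the first object's pointer that runs past its last granule lands on the neighbour's tag and fails the check.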

Instrumentation

Memory Accesses

All memory accesses are prefixed with an inline instruction sequence that verifies the tags. Currently, the following sequence is used:

// int foo(int *a) { return *a; }
// clang -O2 --target=aarch64-linux -fsanitize=hwaddress -c load.c
foo:
     0:       08 dc 44 d3     ubfx    x8, x0, #4, #52  // shadow address
     4:       08 01 40 39     ldrb    w8, [x8]         // load shadow
     8:       09 fc 78 d3     lsr     x9, x0, #56      // address tag
     c:       3f 01 08 6b     cmp     w9, w8           // compare tags
    10:       61 00 00 54     b.ne    #12              // jump on mismatch
    14:       00 00 40 b9     ldr     w0, [x0]         // original load
    18:       c0 03 5f d6     ret
    1c:       40 20 40 d4     hlt     #0x102           // halt
    20:       00 00 40 b9     ldr     w0, [x0]         // original load
    24:       c0 03 5f d6     ret

Alternatively, memory accesses are prefixed with a function call.
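For readers less familiar with AArch64, the bit manipulation in the listing can be written out in C. This is a rough equivalent rather than the code the compiler emits: the function names are hypothetical, and the shadow is passed in as an array so that the sketch runs on any host, whereas the listing uses the computed value directly as the shadow address.

```c
#include <stdint.h>

/* Mirrors "ubfx x8, x0, #4, #52": take bits 4..55 of the pointer, i.e. drop
 * the tag byte and divide the address by the 16-byte granule size. */
uint64_t shadow_address(uint64_t ptr) {
    return (ptr >> 4) & ((UINT64_C(1) << 52) - 1);
}

/* Mirrors "lsr x9, x0, #56": the tag lives in the top byte of the pointer. */
uint8_t address_tag(uint64_t ptr) {
    return (uint8_t)(ptr >> 56);
}

/* Mirrors "ldrb" + "cmp": 1 if the tags match, 0 where the listing would
 * branch to the hlt instruction. */
int tags_match(const uint8_t *shadow, uint64_t ptr) {
    return shadow[shadow_address(ptr)] == address_tag(ptr);
}
```

A check packaged as a function like tags_match is also roughly what the function-call variant of the instrumentation would invoke before each access.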

Heap

Heap memory and pointers are tagged by malloc. This can be based on any malloc that forces all objects to be N-aligned. On deallocation, free retags the memory with a different tag.
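A toy allocator makes the malloc/free behaviour concrete. The sketch below is an assumption-laden simulation, not the HWASAN runtime: it is a bump allocator over a simulated 1024-byte arena, it never reuses memory, a counter stands in for the random tag source, and every name (toy_malloc, toy_free, toy_access_ok) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE 16u       /* assumed tagging granularity N */
#define ARENA_SIZE 1024u  /* simulated heap size */
#define ADDR_MASK UINT64_C(0x00FFFFFFFFFFFFFF)

static uint8_t heap_shadow[ARENA_SIZE / GRANULE];
static size_t heap_top;       /* offset of the next free byte */
static uint8_t next_tag = 1;  /* stand-in for a random tag source */

/* Hands out tags 1..255 in sequence. */
static uint8_t fresh_tag(void) {
    uint8_t tag = next_tag;
    next_tag = (uint8_t)(next_tag == 255 ? 1 : next_tag + 1);
    return tag;
}

static size_t round_up(size_t size) {
    size_t rounded = (size + GRANULE - 1) / GRANULE * GRANULE;
    return rounded ? rounded : GRANULE;
}

/* Allocates an N-aligned object, tags its memory and its pointer with the
 * same tag. Returns 1 on success, 0 if the arena is exhausted. */
int toy_malloc(size_t size, uint64_t *out) {
    if (size > ARENA_SIZE || round_up(size) > ARENA_SIZE - heap_top) return 0;
    size_t rounded = round_up(size);
    uint8_t tag = fresh_tag();
    memset(&heap_shadow[heap_top / GRANULE], tag, rounded / GRANULE);
    *out = (uint64_t)heap_top | ((uint64_t)tag << 56);
    heap_top += rounded;
    return 1;
}

/* Retags the object's memory with a tag that differs from the pointer's. */
void toy_free(uint64_t ptr, size_t size) {
    uint8_t old_tag = (uint8_t)(ptr >> 56);
    uint8_t tag = fresh_tag();
    if (tag == old_tag) tag = fresh_tag();
    memset(&heap_shadow[(ptr & ADDR_MASK) / GRANULE], tag, round_up(size) / GRANULE);
}

/* 1 if the pointer tag matches the memory tag, 0 on mismatch.
 * The untagged offset must be below ARENA_SIZE. */
int toy_access_ok(uint64_t ptr) {
    return heap_shadow[(ptr & ADDR_MASK) / GRANULE] == (uint8_t)(ptr >> 56);
}
```

After toy_free, the stale pointer still carries the old tag, so a later access through it fails the check; this is how use-after-free shows up without a quarantine.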

Stack

Stack frames are instrumented by aligning all non-promotable allocas by N and tagging stack memory in the function prologue and epilogue.

Tags for different allocas in one function are not generated independently; doing that in a function with M allocas would require maintaining M live stack pointers, significantly increasing register pressure. Instead we generate a single base tag value in the prologue, and build the tag for alloca number M as ReTag(BaseTag, M), where ReTag can be as simple as exclusive-or with constant M.
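As a minimal sketch of the simplest ReTag mentioned above, the function below derives a per-alloca tag by exclusive-or of the base tag with the alloca's index. The name retag and the 8-bit tag width are assumptions for illustration; a real implementation may choose a different function.

```c
#include <stdint.h>

/* One possible ReTag: XOR the per-frame base tag with the alloca index.
 * Distinct indices give distinct tags as long as the index fits in the
 * tag width (8 bits here). */
uint8_t retag(uint8_t base_tag, uint8_t alloca_index) {
    return (uint8_t)(base_tag ^ alloca_index);
}
```

Because the index is a compile-time constant, only the base tag needs to stay live in a register across the function body.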

Stack instrumentation is expected to be a major source of overhead, but could be optional.

Globals

TODO: details.

Error reporting

Errors are generated by the HLT instruction and are handled by a signal handler.

Attribute

HWASAN uses its own LLVM IR Attribute sanitize_hwaddress and a matching C function attribute. An alternative would be to re-use ASAN's attribute sanitize_address. The reasons to use a separate attribute are:

  • Users may need to disable ASAN but not HWASAN, or vice versa, because the tools have different trade-offs and compatibility issues.
  • LLVM (ideally) does not use flags to decide which pass is used; ASAN or HWASAN is applied based on the function attributes.

This does mean that users of HWASAN may need to add the new attribute to the code that already uses the old attribute.
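For code that already opts out of ASAN, the extra annotation might look like the sketch below. It assumes Clang's no_sanitize function attribute with the "address" and "hwaddress" arguments as the C-level spelling; the macro names and the function read_raw are hypothetical, and the macros expand to nothing on other compilers.

```c
/* Assumption: Clang spells the per-function opt-outs this way. */
#if defined(__clang__)
#define NO_SANITIZE_ADDRESS __attribute__((no_sanitize("address")))
#define NO_SANITIZE_HWADDRESS __attribute__((no_sanitize("hwaddress")))
#else
#define NO_SANITIZE_ADDRESS
#define NO_SANITIZE_HWADDRESS
#endif

/* A function that was already excluded from ASAN needs the second
 * attribute to be excluded from HWASAN as well. */
NO_SANITIZE_ADDRESS NO_SANITIZE_HWADDRESS
int read_raw(const int *p) {
    return *p;
}
```

Keeping the two opt-outs separate lets a project drop one of them without affecting the other tool.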

Comparison with AddressSanitizer

HWASAN:
  • Is less portable than :doc:`AddressSanitizer` as it relies on hardware Address Tagging (AArch64). Address Tagging can be emulated with compiler instrumentation, but it will require the instrumentation to remove the tags before any load or store, which is infeasible in any realistic environment that contains non-instrumented code.
  • May have compatibility problems if the target code uses higher pointer bits for other purposes.
  • May require changes in the OS kernels (e.g. Linux seems to dislike tagged pointers passed from user space: https://www.kernel.org/doc/Documentation/arm64/tagged-pointers.txt).
  • Does not require redzones to detect buffer overflows, but the buffer overflow detection is probabilistic, with roughly (2**K-1)/(2**K) probability of catching a bug.
  • Does not require quarantine to detect heap-use-after-free, or stack-use-after-return. The detection is similarly probabilistic.
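To put numbers on the probabilistic detection, the formula (2**K-1)/(2**K) can be evaluated for the example tag widths. The helper below is a hypothetical illustration and assumes that tags are uniformly random and K is below 64.

```c
#include <stdint.h>

/* Probability that two independently chosen K-bit tags differ, i.e. that a
 * bad access lands on memory with a mismatching tag. */
double detection_probability(unsigned k) {
    double tags = (double)(UINT64_C(1) << k);
    return (tags - 1.0) / tags;
}
```

For K = 4 this gives 15/16 = 0.9375, and for K = 8 it gives 255/256, roughly 0.996.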

The memory overhead of HWASAN is expected to be much smaller than that of AddressSanitizer: 1/N extra memory for the shadow and some overhead due to N-aligning all objects.
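The two overhead terms can be estimated with simple arithmetic. The helpers below are a hypothetical sketch that assumes one shadow byte per N-byte granule and objects padded up to the next multiple of N.

```c
#include <stddef.h>

/* Shadow bytes needed to cover mem_bytes of memory at granularity n. */
size_t shadow_bytes(size_t mem_bytes, size_t n) {
    return (mem_bytes + n - 1) / n;
}

/* Bytes of padding added when an object is rounded up to a multiple of n. */
size_t alignment_padding(size_t object_size, size_t n) {
    return (n - object_size % n) % n;
}
```

With N = 16, 1 MiB of application memory needs 64 KiB of shadow (1/16, or 6.25%), and a 20-byte object wastes 12 bytes of padding.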

Related Work